[PATCH, ARM] attribute target (thumb,arm)

Discussion:

Christian Bruel

2014-09-29 11:03:30 UTC

Hi Ramana, Richard,

This patch implements the attribute target (and pragma) to allow
function based interworking.

as in the updated documentation, the syntax is:

__attribute__((target("thumb"))) int foo()
Forces thumb mode for function foo only. If the file was compiled with
-mthumb iit has no effect.

Similarly

__attribute__((target("arm"))) int foo()
Forces arm mode for function foo. It has no effect if the file was not
compiled with -mthumb.

and regions can be grouped together with

#pragma GCC target ("thumb")
or
#pragma GCC target ("arm")

a few notes
- Inlining is allowed between functions of the same mode (compilation
switch, #pragma and attribute)
- 'arm_option_override' is now reorganized around
'arm_option_override_internal' for thumb related macros
- I kept TARGET_UNIFIED_ASM to minimize changes. Although removing it
would avoid to switch between unified/divided asms
and simplify arm_declare_function_name. Should be considered at some
point.
- It is only available for Thumb2 variants (for thumb1 lack of interest
and a few complications I was unable to test, although this could be
added easily if needed, I think)

Tested for no regression for arm-none-eabi [,-with-arch=armv7-a]

OK for trunk ?

many thanks,

Christian

Ramana Radhakrishnan

2014-10-08 13:05:57 UTC

Permalink

Hi Christian,

Thanks for looking at this. I will need to read the code in detail but
this is a first top level reivew.

Post by Christian Bruel
Hi Ramana, Richard,
This patch implements the attribute target (and pragma) to allow
function based interworking.
__attribute__((target("thumb"))) int foo()
Forces thumb mode for function foo only. If the file was compiled with
-mthumb iit has no effect.

Indeed

Post by Christian Bruel
Similarly
__attribute__((target("arm"))) int foo()
Forces arm mode for function foo. It has no effect if the file was not
compiled with -mthumb.

Indeed.

Post by Christian Bruel
and regions can be grouped together with
#pragma GCC target ("thumb")
or
#pragma GCC target ("arm")
a few notes
- Inlining is allowed between functions of the same mode (compilation
switch, #pragma and attribute)

Why shouldn't we allow inlining between functions of ARM mode vs Thumb
mode ? After all the choice of mode is irrelevant at the time of
inlining (except possibly for inline assembler).

Perhaps an option is to try to disable inlining in the presence of
inline assembler or if not gate it from a command line option.

Post by Christian Bruel
- 'arm_option_override' is now reorganized around
'arm_option_override_internal' for thumb related macros

Looks like a reasonable start - We need a couple of tests to make sure
that __attribute__(("arm")) on a file compiled for the M profile results
in a syntax error. v7(e)m is Thumb2 only.

for bonus points it would be great to get __attribute__(("target"))
working properly in the backend. I suspect a number of the tuning flags
and the global architecture state needs to be moved into this as well to
handle cases where __attribute__(("arm")) used with M profile options is
error'd out.

Post by Christian Bruel
- I kept TARGET_UNIFIED_ASM to minimize changes. Although removing it
would avoid to switch between unified/divided asms

I know Terry's been trying to get Thumb1 to also switch by default to
unified asm. So I think a lot of the logic with "emit_thumb" could just
go away. Maybe we should just consider switching ARM state to unified
syntax and that would be as simple as changing TARGET_UNIFIED_SYNTAX in
arm.h to be TARGET_32BIT. Long overdue IMHO.

The only gotcha here is inline assembler but GAS is so permissive that
I'm not too worried about it in ARM state and Thumb2 state. I'm a bit
worried about Thumb1.

Post by Christian Bruel
and simplify arm_declare_function_name. Should be considered at some
point.

I think that can be done for a lot of newer cores - some of that logic
is dated now IIUC.

I remember why my original project failed - I couldn't get enough of the
backend in shape for the state to be saved and restored and then I moved
on to other more interesting things, so whatever is done here needs to
make sure that all ISA mode state is saved and restored.

One thing I experimented with while doing this work was adding something
like the mflip-mips16 option and then have the testsuite run with this
option to try and make sure enough of the backend state is saved and
restored properly.

This will give more testing coverage hopefully to the switching logic
for the attributes and hopefully expose any issues that are there with
respect to saving and restoring state. The problem you'll probably find
is that in some of the gcc.target/arm tests the flipping of state will
cause various interesting failures. The other side is making sure that
all state that is global is now captured and reinitialized everytime we
switch context between ARM and Thumb.

I'm not sure how to best test the pragma switching logic but given that
the 2 hang off each other, it should just work (TM) if one is right. In
any case adding some testcases that direct test this would be useful.

Post by Christian Bruel
- It is only available for Thumb2 variants (for thumb1 lack of interest
and a few complications I was unable to test, although this could be
added easily if needed, I think)

I don't think we should restrict this to Thumb1 or Thumb2 it should be
allowed if the architecture allows it.

For example __attribute__((thumb)) on a function compiled with
-march=armv5te -mfloat-abi=softfp -mfpu=vfpv3 should give a syntax error
as this is not supported. No VFP instructions in Thumb1. Explicit tests
for this would be appreciated.

IIRC all other cases should be accepted.

Post by Christian Bruel
Tested for no regression for arm-none-eabi [,-with-arch=armv7-a]
OK for trunk ?

Sorry not yet.

I would also like some documentation added to extend.texi for these
attributes.

So in summary.

1. Some documentation in extend.texi please.
2. TARGET_UNIFIED_SYNTAX to be turned on for ARM state too.
3. Testcases for this and some testing with a mflip-thumbness switch
(added only for testing).
4. Investigate further giving up restriction on Thumb1.
5. Investigate lifting inlining restriction for __attribute__(("arm"))
or __attribute__ (("thumb")) or maybe gate it off a command line option.
6. Split this patch into smaller logical chunks that can help with
review i.e. the restructuring from the attribute and pragma addition,
the testcases.
7. Tests for the diagnostics and error cases i.e. __attribute__(("arm"))
used with -march=armv7em -march=armv7m command line options.

regards
Ramana

Christian Bruel

2014-10-08 14:38:21 UTC

Permalink

Hi Ramana,

Thanks for your feedback. Just a few comments while you continue the review

1) about the documentation in extend.texi, it was in the patch already
: did I miss a part ?
* doc/extend.texi (arm, thumb): Document target attributes.
* doc/invoke.texi (arm, thumb): Mention target attributes.

2) about supporting thumb1
OK I'll suppress this limitation. But I covered the testing only for
thumb2 as I don't have a thumb1 platform, if it's OK with you thumb1
will only be covered by "visual checking". Can you help to test this mode ?

3) about inlining
I dislike inlining different modes, From a conceptual use, a user
might want to switch mode only when changing a function's hotness.
Usually inlining a cold function into a hot one is not what the user
explicitly required when setting different mode attributes for them,

The compiler would take a decision that is not what the user wrote. And
in addition if you consider the few instructions to modify R15 to switch
state that would end up with more code executed in the critical path,
voiding a possible size of speed gain.

4) about coverage.
Thanks for your idea about a mflip like internal option for the
testsuite. I'll give it a try. Note that in the meantime I gave a few
successful tries with LTO, and I'm in the process of running a
combinatorial exploration of a set of larger benchmarcks.
Thanks for you hint about testing the -march=armv7em -march=armv7m
error cases. This is indeed needed.

5) I'm still not sure about what to do with TARGET_UNIFIED_ASM. In one
hand I'm reluctant to bind to this development an improvement that
should be orthogonal (or a prerequisite), in another hand I don't really
like the logic with "emit_thumb". If your recommendation is to make
TARGET_UNIFIED_ASM the default for ARM that great, but I'm still
worrying for thumb1. Terry's feedback might be useful for this.

I'll resent the patch in different parts and thumb1 support. For your
inlining concern do you agree that inlining different modes might not be
mandatory (or even counter-productive) at this stage ?

Best Regards

Christian

Post by Ramana Radhakrishnan
Hi Christian,
Thanks for looking at this. I will need to read the code in detail but
this is a first top level reivew.

Indeed

Post by Christian Bruel
Similarly
__attribute__((target("arm"))) int foo()
Forces arm mode for function foo. It has no effect if the file was not
compiled with -mthumb.

Indeed.

Why shouldn't we allow inlining between functions of ARM mode vs Thumb
mode ? After all the choice of mode is irrelevant at the time of
inlining (except possibly for inline assembler).
Perhaps an option is to try to disable inlining in the presence of
inline assembler or if not gate it from a command line option.

Post by Christian Bruel
- 'arm_option_override' is now reorganized around
'arm_option_override_internal' for thumb related macros

Looks like a reasonable start - We need a couple of tests to make sure
that __attribute__(("arm")) on a file compiled for the M profile results
in a syntax error. v7(e)m is Thumb2 only.
for bonus points it would be great to get __attribute__(("target"))
working properly in the backend. I suspect a number of the tuning flags
and the global architecture state needs to be moved into this as well to
handle cases where __attribute__(("arm")) used with M profile options is
error'd out.

Post by Christian Bruel
- I kept TARGET_UNIFIED_ASM to minimize changes. Although removing it
would avoid to switch between unified/divided asms

I know Terry's been trying to get Thumb1 to also switch by default to
unified asm. So I think a lot of the logic with "emit_thumb" could just
go away. Maybe we should just consider switching ARM state to unified
syntax and that would be as simple as changing TARGET_UNIFIED_SYNTAX in
arm.h to be TARGET_32BIT. Long overdue IMHO.
The only gotcha here is inline assembler but GAS is so permissive that
I'm not too worried about it in ARM state and Thumb2 state. I'm a bit
worried about Thumb1.

Post by Christian Bruel
and simplify arm_declare_function_name. Should be considered at some
point.

I think that can be done for a lot of newer cores - some of that logic
is dated now IIUC.
I remember why my original project failed - I couldn't get enough of the
backend in shape for the state to be saved and restored and then I moved
on to other more interesting things, so whatever is done here needs to
make sure that all ISA mode state is saved and restored.
One thing I experimented with while doing this work was adding something
like the mflip-mips16 option and then have the testsuite run with this
option to try and make sure enough of the backend state is saved and
restored properly.
This will give more testing coverage hopefully to the switching logic
for the attributes and hopefully expose any issues that are there with
respect to saving and restoring state. The problem you'll probably find
is that in some of the gcc.target/arm tests the flipping of state will
cause various interesting failures. The other side is making sure that
all state that is global is now captured and reinitialized everytime we
switch context between ARM and Thumb.
I'm not sure how to best test the pragma switching logic but given that
the 2 hang off each other, it should just work (TM) if one is right. In
any case adding some testcases that direct test this would be useful.

Post by Christian Bruel
- It is only available for Thumb2 variants (for thumb1 lack of interest
and a few complications I was unable to test, although this could be
added easily if needed, I think)

I don't think we should restrict this to Thumb1 or Thumb2 it should be
allowed if the architecture allows it.
For example __attribute__((thumb)) on a function compiled with
-march=armv5te -mfloat-abi=softfp -mfpu=vfpv3 should give a syntax error
as this is not supported. No VFP instructions in Thumb1. Explicit tests
for this would be appreciated.
IIRC all other cases should be accepted.

Post by Christian Bruel
Tested for no regression for arm-none-eabi [,-with-arch=armv7-a]
OK for trunk ?

Sorry not yet.
I would also like some documentation added to extend.texi for these
attributes.
So in summary.
1. Some documentation in extend.texi please.
2. TARGET_UNIFIED_SYNTAX to be turned on for ARM state too.
3. Testcases for this and some testing with a mflip-thumbness switch
(added only for testing).
4. Investigate further giving up restriction on Thumb1.
5. Investigate lifting inlining restriction for __attribute__(("arm"))
or __attribute__ (("thumb")) or maybe gate it off a command line option.
6. Split this patch into smaller logical chunks that can help with
review i.e. the restructuring from the attribute and pragma addition,
the testcases.
7. Tests for the diagnostics and error cases i.e. __attribute__(("arm"))
used with -march=armv7em -march=armv7m command line options.
regards
Ramana

Ramana Radhakrishnan

2014-10-08 16:56:36 UTC

Permalink

Hi Christian,

Post by Christian Bruel
Hi Ramana,
Thanks for your feedback. Just a few comments while you continue the review

Sure - all thoughts are welcome, this isn't a trivial project and I'm
dredging tertiary storage in my brain and old notes for context on all
the gotchas involved in this.

Post by Christian Bruel
1) about the documentation in extend.texi, it was in the patch already
: did I miss a part ?
* doc/extend.texi (arm, thumb): Document target attributes.
* doc/invoke.texi (arm, thumb): Mention target attributes.

Indeed - I don't know how I missed it and as I said it was a brief
review. I am more interested in discussing the semantics up front.

Post by Christian Bruel
2) about supporting thumb1
OK I'll suppress this limitation. But I covered the testing only for
thumb2 as I don't have a thumb1 platform, if it's OK with you thumb1
will only be covered by "visual checking". Can you help to test this mode ?
3) about inlining
I dislike inlining different modes, From a conceptual use, a user
might want to switch mode only when changing a function's hotness.
Usually inlining a cold function into a hot one is not what the user
explicitly required when setting different mode attributes for them,

__attribute__((thumb)) should not imply coldness or hotness. Inlining
between cold and hot functions should be done based on profile feedback.
The choice of compiling in "Thumb1" state for coldness is a separate one
because that's where the choice needs to be made.

Post by Christian Bruel
The compiler would take a decision that is not what the user wrote. And
in addition if you consider the few instructions to modify R15 to switch
state that would end up with more code executed in the critical path,
voiding a possible size of speed gain.

I do not expect there to be any additional instructions needed to switch
state. If function x is inlined into function y the state would be lost
and the state would be in terms of the state of function x.

Obviously if the user doesn't want inlining - the user would add
attributes to disable inlining. You do have extensions such as
__attribute__((noinline)) and __attribute__((never_inline)) to give the
user that control and those bits need to be used in addition.

The attribute then purely reflects then the output instruction state of
the function if a copy of it's body is laid out separately in the output.

IMHO, the heuristics for inlining should be the best judge of when
functions should be inlined between one and another and we shouldn't be
second guessing that in the backend.

If there is a copy of the function to be put out by the compiler, only
then should we choose this based on the state of the "target" i.e. arm
or thumb.

Post by Christian Bruel
4) about coverage.
Thanks for your idea about a mflip like internal option for the
testsuite. I'll give it a try. Note that in the meantime I gave a few
successful tries with LTO, and I'm in the process of running a
combinatorial exploration of a set of larger benchmarcks.

Would be interesting to hear in terms of how you played with LTO.

Post by Christian Bruel
Thanks for you hint about testing the -march=armv7em -march=armv7m
error cases. This is indeed needed.

Adding some directed tests for the pragmas would also be required.

Post by Christian Bruel
5) I'm still not sure about what to do with TARGET_UNIFIED_ASM. In one
hand I'm reluctant to bind to this development an improvement that
should be orthogonal (or a prerequisite), in another hand I don't really
like the logic with "emit_thumb". If your recommendation is to make
TARGET_UNIFIED_ASM the default for ARM that great, but I'm still
worrying for thumb1. Terry's feedback might be useful for this.

I want to think about this carefully too.

Post by Christian Bruel
I'll resent the patch in different parts and thumb1 support. For your
inlining concern do you agree that inlining different modes might not be
mandatory (or even counter-productive) at this stage ?

More practically and from memory I remember that there maybe quite a lot
of false positives in the other parts of the testsuite if you put in the
-mflip-thumbness switch if inlining is turned off.

I think I've also explained my reasons for allowing inlining above.

Also with the target macros that you are changing between ARM and Thumb
it would be good to make sure that you capture everything. I am
concerned about the __ARM_ARCH_ISA_ARM and __ARM_ARCH_ISA_THUMB macros
with the M profile cores. Those probably also need handling please.

I'll review this again when the next patch set arrives with the
different parts.

regards
Ramana

Post by Christian Bruel
Best Regards
Christian