GCC inline asm annoyance

Doing some PowerPC work recently, I wanted to use the lwbrx instruction, which loads a little-endian word from memory. An asm statement wrapped in an inline function seemed like the simplest way to do this.

The lwbrx instruction comes with a minor limitation. It is only available in X-form, that is, the effective address is formed by adding the values of two register operands. Normal load instructions also have a D-form, which computes the effective address by adding an immediate offset to a register operand.
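
To make the difference between the two forms concrete, here is a small C model of how each one computes its effective address. It is only an illustrative sketch: the names ea_x_form and ea_d_form and the plain array standing in for the register file are hypothetical, and a 32-bit implementation is assumed. One architectural detail it reflects is that an RA field of 0 means the literal value zero rather than the contents of r0.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit PowerPC register file: gpr[0]..gpr[31]. */

/* X-form (e.g. lwbrx RT,RA,RB): EA = (RA|0) + (RB).
   An RA field of 0 means the literal value 0, not the contents of r0. */
static uint32_t ea_x_form(const uint32_t gpr[32], unsigned ra, unsigned rb)
{
    uint32_t base = (ra == 0) ? 0 : gpr[ra];
    return base + gpr[rb];
}

/* D-form (e.g. lwz RT,D(RA)): EA = (RA|0) + sign-extended 16-bit D. */
static uint32_t ea_d_form(const uint32_t gpr[32], unsigned ra, int16_t d)
{
    uint32_t base = (ra == 0) ? 0 : gpr[ra];
    return base + (uint32_t)(int32_t)d;
}
```

With r3 = 0x1000 and r4 = 0x20, the X-form address for (r3, r4) is 0x1020, while the D-form address for 8(r3) is 0x1008. An instruction that exists only in X-form has no way to encode the second kind of operand.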

This means that my asm statement cannot use a normal “m” constraint for the memory operand, as this would allow GCC to use D-form addressing, which this instruction does not allow. I thus go in search of a special constraint to request X-form. GCC inline assembler supports a number of machine-specific constraints to cover situations like this one. To my dismay, the manual makes no mention of a suitable constraint to use.

Not giving up hope, I head for Google. Google always has answers. Almost always. None of the queries I can think of return a useful result. My quest finally comes to an end with the GCC machine description for PowerPC. This cryptic file suggests an (undocumented) “Z” constraint might work.

My first attempt at using the newly discovered “Z” constraint fails. The compiler still generates D-form address operands. Another examination of the machine description provides the answer. When referring to the operand, I must use %y0 in place of the usual %0. Needless to say, documentation explaining this syntax is nowhere to be found.

After spending the better part of an hour on a task I expected to take no more than five minutes, I finally arrive at a working solution:

static inline uint32_t load_le32(const uint32_t *p)
{
    uint32_t v;
    asm ("lwbrx %0, %y1" : "=r"(v) : "Z"(*p));
    return v;
}
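
For code that also has to build on other targets, one possible arrangement is to keep the asm version behind a preprocessor check and fall back to plain C elsewhere. This is a sketch rather than part of the solution above: the name load_le32_any is made up, and the check assumes GCC's predefined __powerpc__ and __BYTE_ORDER__ macros (the latter requires a reasonably recent GCC). The byte-order test matters because on a little-endian PowerPC target lwbrx would reverse an already little-endian word.

```c
#include <stdint.h>

/* Hypothetical wrapper: lwbrx on big-endian PowerPC, plain C elsewhere. */
static inline uint32_t load_le32_any(const uint32_t *p)
{
#if defined(__GNUC__) && defined(__powerpc__) && \
    defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    uint32_t v;
    /* "Z" restricts the operand to indexed/indirect addressing and %y1
       prints it in the "RA,RB" form that lwbrx expects. Spelled __asm__
       so it also builds with -std=c99. */
    __asm__ ("lwbrx %0, %y1" : "=r"(v) : "Z"(*p));
    return v;
#else
    /* Byte-wise load: the lowest address holds the least significant byte. */
    const unsigned char *b = (const unsigned char *)p;
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
#endif
}
```

The fallback gives the same result on any host byte order, so it can also serve as a reference when checking the asm path.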


int func_m(int *p)
{
    int x;
    asm volatile ("lwbrx %0, %y1" : "=r"(x) : "m"(*(p+1)));
    return x;
}

int func_z(int *p)
{
    int x;
    asm volatile ("lwbrx %0, %y1" : "=r"(x) : "Z"(*(p+1)));
    return x;
}

00000000 <func_m>:
   0:   7c 63 24 2c     lwbrx   r3,r3,r4
   4:   4e 80 00 20     blr

00000008 <func_z>:
   8:   38 63 00 04     addi    r3,r3,4
   c:   7c 60 1c 2c     lwbrx   r3,0,r3
  10:   4e 80 00 20     blr

7 Responses to GCC inline asm annoyance

Johannes Rajala says:

Wednesday, 21st January, 2009 at 8:59 am

Nice find, thanks! I was having the same problem.

Andrew Pinski says:

Tuesday, 12th May, 2009 at 10:31 pm

The Z constraint has been documented since GCC 4.3.0 (as I added the documentation for it and all the missing constraints for PPC). Also, it is better to use __builtin_bswap32 for 4.3.0 and above, because GCC is able to optimize that better if the load does not need to happen (if the value is already in a register, it uses three instructions).

Here is the documentation for the Z constraint (from http://gcc.gnu.org/onlinedocs/gcc-4.3.0/gcc/Machine-Constraints.html):

Z
Memory operand that is an indexed or indirect from a register (`m’ is preferable for asm statements)

Thanks,
Andrew Pinski
- Mans says:
  
  Tuesday, 12th May, 2009 at 11:55 pm
  
  A few observations:
  – There is still no mention of the magic ‘y’ modifier.
  – GCC 4.3 is so buggy, particularly for PPC, that it would be unwise to use it widely.
  – The day GCC is able to optimise anything at all reliably will be a day to celebrate.
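
As a rough illustration of the __builtin_bswap32 suggestion in the comment above, a load might be written along these lines. This is a sketch under stated assumptions: the function name load_le32_bswap is hypothetical, the builtin requires GCC 4.3 or later (or a compatible compiler), and the byte-order check relies on GCC's predefined __BYTE_ORDER__ macro, which older releases do not provide.

```c
#include <stdint.h>

/* Hypothetical helper: read a little-endian word via the builtin
   instead of inline asm. Assumes __BYTE_ORDER__ is predefined. */
static inline uint32_t load_le32_bswap(const uint32_t *p)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    /* Big-endian host: swap after a normal load. The compiler decides
       whether to use a byte-reversing load or a register-only sequence. */
    return __builtin_bswap32(*p);
#else
    /* Little-endian host: the native load already has the right order. */
    return *p;
#endif
}
```

Unlike the asm version, this leaves instruction selection entirely to the compiler, so the quality of the result depends on the GCC version in use.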

Mike says:

Thursday, 28th May, 2009 at 7:41 pm

What’s wrong with the traditional version?

static inline uint32_t __lwbrx(const register uint32_t *p)
{
    register uint32_t v;
    asm ("lwbrx %0, 0, %1" : "=r"(v) : "b"(p));
    return v;
}

Manish says:

Thursday, 25th June, 2009 at 6:41 am

from the gcc page
Z
Memory operand that is an indexed or indirect from a register (`m’ is preferable for asm statements)

asm volatile("stwbrx %1,%y0": "=m"(*addr): "r"(reg): "memory" );
asm volatile("stwbrx %1,%y0": "=Z"(*addr): "r"(reg): "memory" );

Both gave the same assembly

   c:   81 3f 00 08     lwz     r9,8(r31)
  10:   7d 80 4d 2c     stwbrx  r12,0,r9

Mans says:

Monday, 3rd August, 2009 at 11:37 pm

A simple test like that is likely to produce the same code for both. Try something slightly more complicated instead:

This gives the following assembly:

I’m not quite sure what’s going on here, but func_m is obviously incorrect.
