Does gccvb intentionally ignore the use of r29 when determining how to assign registers? I’m calling the following function inside a loop and gccvb assigns r29 to the loop’s counter in spite of my explicitly assigning r29 in the broken case, and explicitly flagging r29 as clobbered in the latter working case:
#define WHY_DOES_THIS_CRASH

#ifdef WHY_DOES_THIS_CRASH
/* This doesn't work */
void CopyBits(u32 destBitOffset, u32 srcBitOffset, u32 stringLength, u32* dest, u32* src)
{
    register u32 destBitOffsetReg asm ("r26") = destBitOffset;
    register u32 srcBitOffsetReg asm ("r27") = srcBitOffset;
    register u32 stringLengthReg asm ("r28") = stringLength;
    register u32* destReg asm ("r29") = dest;
    register u32* srcReg asm ("r30") = src;

    asm("movbsu"
        : /* output */
        : "r" (destBitOffsetReg), "r" (srcBitOffsetReg), "r" (stringLengthReg), "r" (destReg), "r" (srcReg) /* input */
    );
}
#else
/* This works */
void CopyBits(u32 destBitOffset, u32 srcBitOffset, u32 stringLength, u32* dest, u32* src)
{
    register u32 destBitOffsetReg asm ("r26") = destBitOffset;
    register u32 srcBitOffsetReg asm ("r27") = srcBitOffset;
    register u32 stringLengthReg asm ("r28") = stringLength;
    register u32* destReg = dest;
    register u32* srcReg asm ("r30") = src;

    asm("mov r29,r1\n\t"   /* back up r29 in r1 */
        "mov %3,r29\n\t"   /* load dest into r29 */
        "movbsu\n\t"
        "mov r1,r29"       /* restore r29 */
        : /* output */
        : "r" (destBitOffsetReg), "r" (srcBitOffsetReg), "r" (stringLengthReg), "r" (destReg), "r" (srcReg) /* input */
        : "r1", "r29" /* clobbered */
    );
}
#endif
I’ve noticed that DanB’s Hunter engine follows this pattern as well, backing up r29 before executing movbsu. My question is: why is this necessary? Is this a bug in gccvb? I read through NEC’s official documentation for both the v810 and the v830 and, with the exception of the bitstring instructions, neither designates r29 as a first-class citizen. I’m kinda stumped and would prefer the compiler to handle register assignments, as it tends to be better at that than I am, especially when inlining.
GCC 2.95.2, using the sources and patches from this site. (Is there a newer version that works better?)
Just an update:
I built GCC 4.4.2 using the experimental patches found here and not only does this not solve the problem, I’m now getting an error message when compiling the “working” version (not very helpful):
main.c: In function 'CopyBits':
main.c:345: error: r29 cannot be used in asm here
Hmmmm… Why not?
Finally solved this issue.
According to this page, it looks like the v850 uses r29 as the frame pointer and this likely carried over into the v810 patches. The frame pointer is useful for debugging but without a working debugger it’s pretty pointless for VB dev. The easy fix is to build passing -fomit-frame-pointer as an argument to GCC. A more appropriate solution is to fix GCC to assign a register to the frame pointer that isn’t already reserved for internal use by the v810. I chose r25.
I’ve attached patches for both GCC 2.95.2 and GCC 4.4.2. HTH. 🙂
Replying to myself again… 😛
Here’s the final version of my routine for anyone interested:
void CopyBits(u32 destBitOffset, u32 srcBitOffset, u32 stringLength, u32* dest, u32* src)
{
    register u32 destBitOffsetReg asm ("r26") = destBitOffset;
    register u32 srcBitOffsetReg asm ("r27") = srcBitOffset;
    register u32 stringLengthReg asm ("r28") = stringLength;
    register u32* destReg asm ("r29") = dest;
    register u32* srcReg asm ("r30") = src;

    asm volatile("movbsu"
        : "=r" (destBitOffsetReg), "=r" (srcBitOffsetReg), "=r" (stringLengthReg), "=r" (destReg), "=r" (srcReg) /* output */
        : "0" (destBitOffsetReg), "1" (srcBitOffsetReg), "2" (stringLengthReg), "3" (destReg), "4" (srcReg) /* input */
        : "memory" /* clobbered */
    );
}
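If anyone wants to sanity-check callers on a PC before running on hardware, here is a rough portable sketch of what I understand the routine to do. CopyBitsRef is a made-up name, not part of gccvb or any library. It assumes bit 0 is the least significant bit of each 32-bit word and that the copy walks upward from the given offsets, so check both against the NEC manual before trusting it. It is a slow bit-by-bit loop meant only as a reference, not a replacement for movbsu.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* Hypothetical host-side reference for CopyBits (not from gccvb).
 * Copies stringLength bits from src, starting at srcBitOffset, to dest,
 * starting at destBitOffset, moving upward one bit at a time.
 * ASSUMPTION: bit 0 is the least significant bit of each word. */
void CopyBitsRef(u32 destBitOffset, u32 srcBitOffset, u32 stringLength,
                 u32 *dest, const u32 *src)
{
    u32 i;

    assert(dest != NULL && src != NULL);

    for (i = 0; i < stringLength; i++) {
        u32 s = srcBitOffset + i;              /* absolute source bit index */
        u32 d = destBitOffset + i;             /* absolute destination bit index */
        u32 bit = (src[s / 32] >> (s % 32)) & 1u;
        u32 mask = (u32)1 << (d % 32);

        /* Clear the destination bit, then set it to the source bit. */
        dest[d / 32] = (dest[d / 32] & ~mask) | (bit << (d % 32));
    }
}
```

For example, with src = { 0xA0000000, 0x00000003 } and dest = { 0xFFFFFFFF, 0x00000000 }, calling CopyBitsRef(4, 28, 8, dest, src) copies 8 bits that straddle the two source words and should leave dest[0] as 0xFFFFF3AF with dest[1] untouched.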
You’ll need to apply another patch to GCC to build this. In another likely carryover from the v850, r30 is renamed to “ep” (element pointer). The v810 doesn’t have an element pointer and gccvb partially acknowledges this; however, there are still bits of code left that translate r30 to ep in the intermediate assembly code, leading to errors at build time. I applied a quick patch (attached) that adds ep as a recognized register name mapped to r30, though a better solution would probably be to clean up any remaining references to the ep register.