Working on my project, building in GCC 4, I have finally run into the issue shown in the attached screenshot and described in this post: http://www.planetvb.com/modules/newbb/viewtopic.php?post_id=10819#forumpost10819 . What in C is an innocent call to a function I’ve defined compiles to a jump to 0x0800027c, somewhere in VRAM, where Mednafen happily executes whatever is there. =P
I’m willing to look into it and dig into the GCC code but I haven’t the first thought of where to start. Any ideas?
This music sample is great. Nice work. I’ve been playing around with compiling this and looking at the odd jal instruction. I don’t have any answers but I did find something interesting that may point someone else in the right direction. I simply modified the linking statement to have main.c.o linked before isr.c.o and the odd jal instruction goes away. After that change the disassembly line for the call to initISR looks like this
70002d6: 00 ac 6e 00 jal 7000344 <_initisr>
which is correct and I don’t find any 7400000+ addresses anywhere.
So my guess is that the linker has trouble when a call has to jump backwards to a target that lives in a different object file than the calling instruction. It seems to work fine when calling addresses that are within the same object file and come before the calling instruction. I haven’t tested this theory, though. It doesn’t really make sense to me, but maybe someone else can make more sense of it. My eyes are hurting from staring at the screen all afternoon so I’m going to bed…
From doing some internet searches, it seems linking order does matter. If A depends on B, then A needs to be linked first. Since main depends on ISR, main needs to be linked first. So, since memory is mirrored, maybe it somehow just found the next mirrored occurrence of the function. I’m guessing at this point that a modification to the makefile is all that’s needed to fix the issue.
Based on Greg’s idea, I did some tests by changing the makefile in blitter’s music demo (BTW, great music/conversion blitter!) in two ways:
First, I made it put main.c first. I thought this produced a change, but it must have been caused by a difference in compilers (I’m using one I built myself from blitter and Jorge’s patches which, apparently, only runs on my system :-P). When I built a ROM using the unchanged makefile, they were identical.
Next, I changed it to put isr.c after the other files. This did produce a difference. I’ve attached the three ROMs and their disassembler listings for those more experienced with v810 assembly to look at 😉
The makefile changes were pretty simple. I just changed the definition of the “CFILES” variable as follows:
# Original
CFILES = $(wildcard *.c) $(foreach dir,,$(wildcard $(dir)/*.c))
# main.c first
CFILES = main.c $(filter-out main.c,$(wildcard *.c) $(foreach dir,,$(wildcard $(dir)/*.c)))
# isr.c last
CFILES = $(filter-out isr.c,$(wildcard *.c) $(foreach dir,,$(wildcard $(dir)/*.c))) isr.c
RunnerPack wrote:
The makefile changes were pretty simple. I just changed the definition of the “CFILES” variable as follows:
I would actually be surprised if the CFILES order made a difference. I was actually referring to the $OFILES in the makefile.
I believe the command that needs changing is
$(ELFFILES): $(OFILES)
	$(LD) $(OFILES) $(LDFLAGS) -o $@
On my system, the output of the command looks like this:
/home/user/.wine/drive_c/vbde/gccvb/bin/v810-ld output/crt0.s.o output/isr.c.o output/main.c.o output/musicPlayer.c.o output/noiseChannel.c.o output/notemap.c.o output/SCIFI.SNG.o -L/home/user/.wine/drive_c/vbde/gccvb/lib -L/home/user/.wine/drive_c/vbde/gccvb/lib/gcc/v810/4.4.2 -Tvb.ld -nodefaultlibs -lgcc -gc-sections -o output/MusicTest.elf
If I modified the above statement and placed main.c.o before isr.c.o that seemed to fix the error (on my system anyway).
I believe your change is what is needed but to the $OFILES variable instead of the $CFILES variable.
If I understood makefiles better I would offer a solution, but I always use the bare minimum necessary to compile and haven’t really delved into the semantics of it.
- This reply was modified 11 years, 1 month ago by Greg Stevens.
RunnerPack wrote:
Next, I changed it to put isr.c after the other files. This did produce a difference. I’ve attached the three ROMs and their disassembler listings for those more experienced with v810 assembly to look at 😉
Also, I find it much easier to look at the .ELF file before the objcopy command is used on it since it still has the symbols in it.
I just use v810-objdump -d musictest.elf > disassembly.txt
and the listing is much easier to interpret at a glance.
Here is main with the incorrect function call, which shows up as __vbvectors_end+0x3f0311 since it went beyond the end of the linking area. I still don’t understand how it calculated a mirrored address correctly, though.
07000374 <_main>:
 7000374: 5f 01        mov lp, r10
 7000376: 00 ac ca 0f  jal 7001340 <__save_r31>
 700037a: 40 bd 00 02  movhi 512, r0, r10
 700037e: 4a a1 24 00  movea 36, r10, r10
 7000382: 6a c1 00 00  ld.b 0[r10],r11
 7000386: 6b b5 ff 00  andi 255, r11, r11
 700038a: 6b b1 01 00  ori 1, r11, r11
 700038e: 6a d1 00 00  st.b r11, 0[r10]
 7000392: 3f ac 7e ff  jal 7400310 <__vbvectors_end+0x3f0311>
 7000396: 00 ac 32 01  jal 70004c8 <_initmusic>
And here is the exact same section of code after switching isr.c.o and main.c.o in the v810-ld command:
070002b8 <_main>:
 70002b8: 5f 01        mov lp, r10
 70002ba: 00 ac 86 10  jal 7001340 <__save_r31>
 70002be: 40 bd 00 02  movhi 512, r0, r10
 70002c2: 4a a1 24 00  movea 36, r10, r10
 70002c6: 6a c1 00 00  ld.b 0[r10],r11
 70002ca: 6b b5 ff 00  andi 255, r11, r11
 70002ce: 6b b1 01 00  ori 1, r11, r11
 70002d2: 6a d1 00 00  st.b r11, 0[r10]
 70002d6: 00 ac 6e 00  jal 7000344 <_initisr>
 70002da: 00 ac ee 01  jal 70004c8 <_initmusic>
As you can see, _initISR now shows up with the correct address instead of the vbvectors+ address.
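For anyone who wants to check the raw bytes by hand, here is a minimal Python sketch (not from the original posts) that decodes the two jal encodings quoted in the listings above. It assumes the standard V810 Format IV layout: a 6-bit opcode followed by a 26-bit PC-relative displacement, stored as two little-endian halfwords. The function names are hypothetical.

```python
JR, JAL = 0b101010, 0b101011  # V810 Format IV opcodes (assumed layout)

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign) - sign

def decode_jump(addr, raw):
    """Decode a 4-byte jr/jal located at addr; return (mnemonic, target)."""
    hi = int.from_bytes(raw[0:2], "little")  # opcode + displacement bits 25..16
    lo = int.from_bytes(raw[2:4], "little")  # displacement bits 15..0
    opcode = hi >> 10
    if opcode not in (JR, JAL):
        raise ValueError("not a jr/jal instruction")
    disp26 = ((hi & 0x3FF) << 16) | lo
    target = (addr + sign_extend(disp26, 26)) & 0xFFFFFFFF
    return ("jal" if opcode == JAL else "jr"), target

# Bytes copied from the two disassembly listings in this post.
good = decode_jump(0x70002D6, bytes.fromhex("00ac6e00"))
bad = decode_jump(0x7000392, bytes.fromhex("3fac7eff"))
print(good[0], hex(good[1]))  # jal 0x7000344
print(bad[0], hex(bad[1]))    # jal 0x7400310
```

In the bad encoding the upper ten displacement bits are 0x03F rather than 0x3FF, so the value decodes as a large positive offset instead of a small negative one.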
Also if my theory is correct I should be able to cause more issues by moving musicPlayer.c.o before main.c.o and here is the result of that test
070011f8 <_main>:
 70011f8: 5f 01        mov lp, r10
 70011fa: 00 ac 46 01  jal 7001340 <__save_r31>
 70011fe: 40 bd 00 02  movhi 512, r0, r10
 7001202: 4a a1 24 00  movea 36, r10, r10
 7001206: 6a c1 00 00  ld.b 0[r10],r11
 700120a: 6b b5 ff 00  andi 255, r11, r11
 700120e: 6b b1 01 00  ori 1, r11, r11
 7001212: 6a d1 00 00  st.b r11, 0[r10]
 7001216: 3f ac fa f0  jal 7400310 <__vbvectors_end+0x3f0311>
 700121a: 3f ac 7a f2  jal 7400494 <__vbvectors_end+0x3f0495>
 700121e: c0 bc 00 07  movhi 1792, r0, r6
 7001222: c6 a0 08 41  movea 16648, r6, r6
 7001226: 3f ac 4e f1  jal 7400374 <__vbvectors_end+0x3f0375>
 700122a: 00 8a        br 700122a <_main+0x32>
So initMusic and playMusic now show the same odd jal instructions since I linked them before their calls in main.
I got most of my information from the following link in case anybody is interested…
http://stackoverflow.com/questions/45135/linker-order-gcc
- This reply was modified 11 years, 1 month ago by Greg Stevens.
Greg Stevens wrote:
I believe your change is what is needed but to the $OFILES variable instead of the $CFILES variable.
Well, that’s essentially what I did, since the order of the .o files in the OFILES variable is derived from that of the .c files in the CFILES variable. COBJS consists of the filenames in CFILES with “.o” stuck on the end of each, and OFILES contains COBJS, verbatim.
I knew about getting an assembly listing from the ELF, I just forgot the syntax and didn’t feel like looking it up before posting 😛 so thanks for mentioning it! 🙂 I used my customized version of David Tucker’s disassembler, instead.
Anyway, does using the right link order fix all of the known issues, or is gcc still doing other, unrelated weird stuff?
RunnerPack wrote:
Well, that’s essentially what I did, since the order of the .o files in the OFILES variable is derived from that of the .c files in the CFILES variable. COBJS consists of the filenames in CFILES with “.o” stuck on the end of each, and OFILES contains COBJS, verbatim.
Anyway, does using the right link order fix all of the known issues, or is gcc still doing other, unrelated weird stuff?
Ha, and apparently my understanding of make files was even less than I thought. I didn’t pick up on that. I should probably study the syntax at some point.
I have not seen any other anomalies but I don’t usually link multiple object files together either. I couldn’t find anything else in these test files.
Greg Stevens wrote:
I should probably study the syntax at some point.
I found a pretty good introduction:
Greg Stevens wrote:
here is main and the incorrect function call which looks like __vbvectors_end+0x3f0311 since it went beyond the end of the linking area. I still don’t understand how it calculated a mirrored address correctly though.
That part led me to the solution to a related problem I had, which crashed the game when running on the VB. I was getting the same references in my assembler code (__vbvectors_end+0xXXXXXXX ), but after compiling gccvb 4 with the patches blitter provided in the following thread, the linker now outputs the right addresses for the relative jumps.
http://www.planetvb.com/modules/newbb/viewtopic.php?topic_id=5328&post_id=26112#forumpost26112
I know that linker order can affect the visibility of symbols across object files, but I’ve never before seen a case where the order can directly affect the addresses generated for visible symbols. I fear something more sinister is at play here. At any rate, switching the linker order can appear to fix some instances of this problem, but I’ve run into this when building assembly-based projects too, where multiple symbols may be defined within a single translation unit, and a deliberate order determines where the assembly is placed in the binary. Changing the linker order here means changing the layout of the binary, which in my case could break execution since pieces of my code expect functions to live in certain areas of ROM.
That said, and since I have not the desire to dig into GCC to find the root cause of this (I’d rather be developing for the VB itself), I decided to hack my way around this problem by writing a small tool that patches ELF files with bad jumps. As an example:
mbp:jmptool blitter$ ./jmptool ../../isrtest/output/IsrTest.elf IsrTest_fixed.elf
Total ELF sections: 12
Last ELF section address is 0x0707FDE0, size is 0x00000220, last byte + 1 is 0x07080000
disp26 mask is 0x0007FFFF
disp26 shift is 7
Found a bad jump at 0x0700003A: jr 0x07400022 [(int26)0x003FFFE8]
Correcting to 0x07000022 [(int26)0x03FFFFE8]
Found a bad jump at 0x070000AC: jr 0x07400022 [(int26)0x003FFF76]
Correcting to 0x07000022 [(int26)0x03FFFF76]
Found a bad jump at 0x07003472: jal 0x07403404 [(int26)0x003FFF92]
Correcting to 0x07003404 [(int26)0x03FFFF92]
mbp:jmptool blitter$
Keen eyes may notice that the bad jumps differ from their corrected versions only in the top four bits of the displacement (0x003FFFE8 vs. 0x03FFFFE8): the negative displacements look as if they were sign-extended to 22 bits when they should be 26-bit. Maybe this can help somebody (dasi? Guy Perfect?) but I’ve given up solving this problem since GCC is quite the complicated beast and this tool works well enough when stuck between ld and objcopy. 🙂
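To make the correction concrete, here is a rough Python sketch of one way such a repair could work. It is not jmptool's actual algorithm (the source is in the attachment, not in this post); it assumes a jump is "bad" when its target falls outside ROM, that ROM starts at 0x07000000 (taken from the addresses in the listings), and that the fix is to re-extend the sign from bit 21 into bits 22 to 25. The end address comes from the tool output above; all function names are made up.

```python
ROM_START = 0x07000000  # assumption: start of ROM, as seen in the listings
ROM_END = 0x07080000    # "last byte + 1" from the jmptool output

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign) - sign

def jump_target(addr, disp26):
    """Target of a jr/jal at addr with the given 26-bit displacement field."""
    return (addr + sign_extend(disp26, 26)) & 0xFFFFFFFF

def repair_disp26(addr, disp26):
    """Return a displacement whose target lies in ROM, or raise if none is found."""
    if ROM_START <= jump_target(addr, disp26) < ROM_END:
        return disp26  # jump already lands inside ROM; leave it alone
    # Assumption: the sign was only extended to 22 bits; redo it for 26.
    candidate = sign_extend(disp26, 22) & 0x3FFFFFF
    if ROM_START <= jump_target(addr, candidate) < ROM_END:
        return candidate
    raise ValueError("jump at %#x cannot be repaired this way" % addr)

# The three bad jumps reported in the tool output.
for addr, disp in [(0x0700003A, 0x003FFFE8),
                   (0x070000AC, 0x003FFF76),
                   (0x07003472, 0x003FFF92)]:
    fixed = repair_disp26(addr, disp)
    print(hex(addr), hex(fixed), hex(jump_target(addr, fixed)))
```

Checking the target range first keeps legitimate forward jumps untouched, which matters because only backward jumps appear to be affected.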
I built this tool for OS X and also in Cygwin on my Windows 8.1 machine, but the source is included too just in case. Hope this helps someone else out there.
This looks like a great stop-gap solution, blitter; thanks!
I took the liberty of compiling it with Mingw, to remove any possible dependencies on the Cygwin DLLs (of which I’m not even sure there are any). While doing so, I got a warning about using a C++-only, implementation-dependent feature – the multi-character char constant used to test for the ELF signature – so I converted it to an int value.
D’oh, I meant MinGW. Whenever I use the GCC suite in Windows I’m conditioned to assume it’s Cygwin.
And yeah, 32-bit multi-char literals have been a staple of Mac development since the dawn of time. Tells you where I wrote it. 😉