In v810-op.c, there are typos in the masks of all but the addf.s instruction. This causes divf.s, mulf.s, and subf.s to disassemble (as when using objdump -D) as cmpf.s, which is obviously not correct behavior…and it confused me for a while until I realized the disassembler was broken.
Here are snippets of the relevant parts of the table. The instances of 0xFC000 should instead be 0xFC00 (only addf.s is correct).
{ "addf.s", two (0xf800, 0x1000), two (0xfc00, 0xfc00), {R1, R2}, 0, PROCESSOR_ALL },
{ "cmpf.s", two (0xf800, 0x0000), two (0xfc00, 0xfc000), {R1, R2}, 0, PROCESSOR_ALL },
{ "divf.s", two (0xf800, 0x1c00), two (0xfc00, 0xfc000), {R1, R2}, 0, PROCESSOR_ALL },
{ "mulf.s", two (0xf800, 0x1800), two (0xfc00, 0xfc000), {R1, R2}, 0, PROCESSOR_ALL },
{ "subf.s", two (0xf800, 0x1400), two (0xfc00, 0xfc000), {R1, R2}, 0, PROCESSOR_ALL },
A mischievous mysterious mad (h/cr)acker working on a gcc, stripped-down newlib, and ‘custom library whose name I have not decided upon’-based toolchain for PC-FX development.
Two(that I know of) toolchains already exist, one being the official NEC one, and the other being a modification to enable NEC’s libraries and header files to work with gcc. NEC’s libraries have unclear license terms, and horrid misspellings.
Aaaaand…cats go meow!
I found another bug in the assembler that breaks the gcc optimization control argument “-mzda=n” (gas, called by gcc during compilation, seg faults). In binutils-2.10/gas/config/tc-v810.c:
    case AREA_SDA:
      if (sbss_section == NULL)
        {
          sbss_section = subseg_new (".sbss", 0);
          bfd_set_section_flags (stdoutput, sbss_section, applicable);
          seg_info (sbss_section)->bss = 1;
        }
      break;
    case AREA_ZDA:
      if (zbss_section == NULL)
        {
          zbss_section = subseg_new (".zbss", 0);
          bfd_set_section_flags (stdoutput, sbss_section, applicable);
          seg_info (zbss_section)->bss = 1;
        }
      break;
bfd_set_section_flags() is called with the wrong section variable in case AREA_ZDA. Changing it to be zbss_section instead of sbss_section appears to fix the problem. Has anyone else encountered this bug?
Wow! Another great find from our new friend Mednafen!
Well, since I hadn’t even heard of the flag “-mzda=n” before this post, I never actually experienced this bug.
I wonder if it (and/or the previous one) are bugs in gcc or in the v810 patch. I suppose someone(s) should be trying to get a more mature version of gcc to output v810 code… It still irks me that someone felt the need to actually remove that target from the compiler >:( I thought OSS was about gaining features, not losing them!
I was thinking of at least incorporating your fixes into the patch from http://hp.vector.co.jp/authors/VA007898/pcfxga/develop/gcc.html and I still might.
Thanks, Mednafen!
EDIT:
I found those two errors in the patch included in this: http://www.vr32.de/content/tech/utilities/gccvb/files/vb_v810_gcc_03.tar.gz and fixed them.
I wanted to attach the changed file, but it wouldn’t let me, so I guess I’ll continue my plan to make a new tarball of the compiler/utils/patches with your fixes (and hopefully some others).
RunnerPack wrote:
Well, since I hadn’t even heard of the flag “-mzda=n” before this post, I never actually experienced this bug.
-mzda=n forces all global and static data variables/structures <= n bytes in size to be located in the first 32KiB of the V810's address space, so that "offset[r0]" type of addressing can be used, for smaller and faster code in many cases. Also, for some reason, the linker scripts appear to start the ZDA at 0x160 instead of 0x0. Maybe to conform to NEC's (un?)published V810 ABI? (Random rants: the V810 doesn't really suck, but it really needs clever usage of its I/O and memory maps for optimal code, that the PC-FX lacks in several cases...and I wonder if NEC's proprietary V810 compilers take into account DRAM page change penalties...)
RunnerPack wrote:
I wonder if it (and/or the previous one) are bugs in gcc or in the v810 patch. I suppose someone(s) should be trying to get a more mature version of gcc to output v810 code… It still irks me that someone felt the need to actually remove that target from the compiler >:( I thought OSS was about gaining features, not losing them!
-mzda=n is an architecture-specific option, so it’s not a problem in the core of gcc. I don’t believe V810 support was ever included in mainline gcc; AFAIK, it has simply existed as a patch (and some pre-patched unofficial tarballs, of course). V850 support has been, and still is, in gcc, though. I’ve made a list of major differences in the instruction sets between the two…
Mednafen wrote:
-mzda=n forces all global and static data variables/structures <= n bytes in size to be located in the first 32KiB of the V810's address space, so that "offset[r0]" type of addressing can be used, for smaller and faster code in many cases. Also, for some reason, the linker scripts appear to start the ZDA at 0x160 instead of 0x0. Maybe to conform to NEC's (un?)published V810 ABI? (Random rants: the V810 doesn't really suck, but it really needs clever usage of its I/O and memory maps for optimal code, that the PC-FX lacks in several cases...and I wonder if NEC's proprietary V810 compilers take into account DRAM page change penalties...)
*whooooosh*
That was the sound of most of that going right over my head.
I’m pretty sure ZDA means “zero data area” but I don’t understand why accessing stuff at lower addresses speeds up or shrinks the code… Why couldn’t you just put an address in, e.g. R15 and then use “offset[R15]” to achieve the same effect?
I also understood: “the first 32KiB of the V810’s address space”. I know you’re more interested in the PC-FX, but I don’t think that would work on the VB, since that area is used by the VIP (specifically, one of the left frame buffers and the first 512 characters). The “work” RAM area is the 64K at 0x0500XXXX.
Mednafen wrote:
-mzda=n is an architecture-specific option, so it’s not a problem in the core of gcc. I don’t believe V810 support was ever included in mainline gcc; AFAIK, it has simply existed as a patch (and some pre-patched unofficial tarballs, of course). V850 support has been, and still is, in gcc, though. I’ve made a list of major differences in the instruction sets between the two…
Actually, I misread an archived message from the newlib mailing list about dropping V810 support, but the anger still applies.
Anyway… Could one get the source for gcc 4.3.2 (http://ftp.gnu.org/gnu/gcc/gcc-4.3.2/) and apply (a modified version of) the V810 patch (from here)?
I.e. does that patch turn the V850 support into V810? Or does it not even care if the V850 support is there? If it adds it from scratch, the architecture may have changed enough that the whole patch will have to be made again…
I don’t know any specific reasons why I want a newer gcc. I just know that, generally (except in the case of Microsoft products :-P), a higher version number means a better program.
RunnerPack wrote:
I’m pretty sure ZDA means “zero data area” but I don’t understand why accessing stuff at lower addresses speeds up or shrinks the code… Why couldn’t you just put an address in, e.g. R15 and then use “offset[R15]” to achieve the same effect?
The loading of the address into a register takes CPU cycles too, you know.
It probably won’t help much when you iterate over arrays, or use a lot of structures. It really helps when you have some global/static variables all over the place that you want to access quickly.
RunnerPack wrote:
I also understood: “the first 32KiB of the V810’s address space”. I know you’re more interested in the PC-FX, but I don’t think that would work on the VB, since that area is used by the VIP (specifically, one of the left frame buffers and the first 512 characters). The “work” RAM area is the 64K at 0x0500XXXX.
Bizarre.
RunnerPack wrote:
I.e. does that patch turn the V850 support into V810? Or does it not even care if the V850 support is there? If it adds it from scratch, the architecture may have changed enough that the whole patch will have to be made again…
AFAIK, the patch only adds files and modifies build scripts/makefiles; it doesn’t touch the V850 support at all.
Speaking of iterating, the V850 had a good idea with the SLD and SST instructions: load/store in 16-bit-long instructions (normal LD/ST instructions are 32 bits long), with a fixed base register and a greatly reduced offset range, perfect for iteration!
Some of you may not be aware of Mednafen’s previous work here:
http://mednafen.sourceforge.net/
Mednafen wrote:
The loading of the address into a register takes CPU cycles too, you know.
Okay, 1) I knew that, and 2) it only has to be done once, so I still don’t see the big improvement, but whatever… That’s far from being the only thing I don’t know, so let’s just leave it at that.
Mednafen wrote:
AFAIK, the patch only adds files and modifies build scripts/makefiles; it doesn’t touch the V850 support at all.
That’s what I thought… Just too lazy to confirm it.
The vucc compiler has an option that sets the value of the global pointer, gp (r4), so that the bss and data sections can be accessed by gp offset.
-msda=n “[puts] static or global variables whose size is n bytes or less into the small data area that register gp points to”, but vb.ld would need fixing up to get it working.