Original Post

In my (nonexistent) spare time, I’d like to take a look at trying to fix GCCVB bugs. This means that I’d need a copy of the GCC source code to compile/make changes as well as the relevant patches to add the v810 backend.

First things first: before I attempt to fix ANY bugs, I need to compile the darn thing for Windows. I know there are instructions on how to compile GCCVB for Cygwin, but the 4.x version I have is in fact for MinGW. Does anyone have instructions on how to compile for MinGW (I think RunnerPack did this port)?

Perhaps some of us could collaborate to create a document on how to build gccVB for Cygwin, Linux, and BSD users?

From what I know, GCCVB 4.x at present will sometimes generate code that points to code offsets in RAM, even when the linker script explicitly gives the correct code regions and the user hasn’t copied any code to RAM themselves (not that passing a function pointer to memcpy is legal C anyway :P). Are there any other bugs anyone has run into?

16 Replies

Thanks for offering to look into this, cr1901!

If your MinGW installation includes the MSYS bash terminal, you should be able to follow the steps in this post, perhaps with a few minor modifications depending on whether you get errors and what they are.

The rest of that thread contains a lot of other information that might also be of use.

I don’t recall where I got the actual patches I used, but they’re in an archive attached to one of the gcc4-related threads, so the search function should get you there fairly quickly.

Off the top of my head:

1) Biggest problem so far is that GCCVB 4.x will sometimes generate relative jumps that are offset by some power of two ahead of where their proper destination should be. This becomes a huge problem when these jumps are executed repeatedly, as eventually the PC will wrap and wind up in video memory or elsewhere. I wrote a tool that operates on the final .elf file and attempts to correct these bogus jumps, but it’s more of a bandaid than a solution. http://www.planetvb.com/modules/newbb/viewtopic.php?post_id=31203#forumpost31203

2) Optimization is lousy. It’s better in 4.x but still carries the baggage from the 2.x patches upon which the v810 support is based. So some operations such as 32-bit loads and subroutine calls are hardcoded as a certain sequence of assembly instructions when the compiler might be able to be clever about it (for example, doing several consecutive loads of RAM addresses– movhi hi(0x05000000),rN should only need to be done once)

3) Found a bug recently in the version of gas that ships with GCCVB 4.x where sometimes a bne instruction– written in the assembly source *as a bne instruction*– will assemble to a be and a jr instruction. There is no reason whatsoever that I can think of why this should happen.

4) The frame pointer is assigned to register r29 I think, which is normally used by the bitstring instructions. I’ve never seen documentation for the v810 that says where the frame pointer should go, but since the frame pointer is irrelevant for VB development, I just pass -fomit-frame-pointer when compiling. I mention it here for completeness.

blitter wrote:
See also this post: http://www.planetvb.com/modules/newbb/viewtopic.php?post_id=30370#forumpost30370

Oh look, me bringing the exact same topic up months ago and not following through XD. What else is new in my life?

Erm, do you know if the 4.8 patches are floating around?

I have a Linux box in the living room. I think I want to massage the patches out on that before I attempt a MinGW build. One problem with MSYS for me is that builds using ./configure and make are slow as molasses due to POSIX emulation.

EDIT: I found your download link, which includes just the patches. This lets me apply them to an unmodified source tree, which is exactly what I wanted. Nice!

Where did the files binutils-2.20.1-vb.patch, gcc-4.4.2-vb.patch, and newlib-1.17.0-vb.patch (without authors appended) come from? Were those from back when v810 support was in the main tree?

EDIT 2: Is there any particular reason that the newlib patch needs to be 300kB? Most of the patch seems to be autoconf macros and other build system BS.

  • This reply was modified 9 years, 7 months ago by cr1901.

Using an Ubuntu 12.04 machine…

Binutils compiled ok with blitter’s script, since -Wno-error was on :P.

JWeinberg’s patch to the predicates machine description file does not apply correctly:

What gcc/config/v810/predicates.md actually contains (line 79):

(define_predicate "special_symbolref_operand"
  (match_code "symbol_ref")
{
  if (GET_CODE (op) == SYMBOL_REF)
    return (SYMBOL_REF_FLAGS (op) & (SYMBOL_FLAG_ZDA | SYMBOL_FLAG_TDA | SYMBOL_FLAG_SDA)) != 0;
  else if (GET_CODE (op) == CONST)
    return (GET_CODE (XEXP (op, 0)) == PLUS
            && GET_CODE (XEXP (XEXP (op, 0), 0)) == SYMBOL_REF

            && GET_CODE (XEXP (XEXP (op, 0), 1)) == CONST_INT
            && CONST_OK_FOR_K (INTVAL (XEXP (XEXP (op, 0), 1))));

  return FALSE;
})

What JWeinberg’s patch thinks is at that location:

(define_predicate "special_symbolref_operand"
   (match_code "symbol_ref")
 {
+  if (GET_CODE (op) == CONST
+      && GET_CODE (XEXP (op, 0)) == PLUS
+      && GET_CODE (XEXP (XEXP (op, 0), 1)) == CONST_INT
+      && CONST_OK_FOR_K (INTVAL (XEXP (XEXP (op, 0), 1))))
+    op = XEXP (XEXP (op, 0), 0);
+
   if (GET_CODE (op) == SYMBOL_REF)
-    return (SYMBOL_REF_FLAGS (op) & (SYMBOL_FLAG_ZDA | SYMBOL_FLAG_TDA | SYMBOL_FLAG_SDA)) != 0;
-  else if (GET_CODE (op) == CONST)
-    return (GET_CODE (XEXP (op, 0)) == PLUS
-           && GET_CODE (XEXP (XEXP (op, 0), 0)) == SYMBOL_REF
-           && ENCODED_NAME_P (XSTR (XEXP (XEXP (op, 0), 0), 0))
-           && GET_CODE (XEXP (XEXP (op, 0), 1)) == CONST_INT
-           && CONST_OK_FOR_K (INTVAL (XEXP (XEXP (op, 0), 1))));
+    return (SYMBOL_REF_FLAGS (op)
+           & (SYMBOL_FLAG_ZDA | SYMBOL_FLAG_TDA | SYMBOL_FLAG_SDA)) != 0;

   return FALSE;
 })

Perhaps this was an oversight that wasn’t caught? I only found it by accident, tbh.

v810 support hasn’t been in GCC mainline since version 2.x IIRC. I got the 4.4.2 patches from dasi; perhaps he wrote them originally but I honestly have no idea. I think the v810 patches are largely based on existing support for the v850, which is similar.

The 4.8 patches are part of “devkitV810,” which as far as I can tell is just a fancy name for GCC + some auxiliary tools. It’s supposedly being worked on by dasi and/or Guy Perfect, but I haven’t seen or heard anything from either of them regarding that project in a long time. I think they might be floating around out there on a Git server somewhere…

I’m just going to go ahead and use my best judgment to combine the incompatible patches.

Most of the work is obviously done… my first step is to examine the machine description language and see if I can’t figure out what’s going on.

Against my better judgment, I may attempt to just port against a clean GCC tree- I recently figured out how to add a dummy i8086 (not a typo) target just for kicks, and it’s not like the .md format has changed significantly. It’s more work though :/.

Also, I want to make something clear- this is a project that will take me months on and off working on it. Mainly getting used to the GCC source tree and finding things that work and things that don’t. But it’s worth a try, just to say I did it :P.

How about porting V810 to LLVM instead of trying to fix the gcc port? Added benefit of re-usability, clang, etc.

cYa,

Tauwasser

Well, I’m not good with C++11, and in my experience dealing with GCC’s machine descriptions is easier than dealing with LLVM IR.

Nevertheless, since this is a months-long thing, it’s worth a try, and I already have my own personal fork of LLVM to play with.

One of the members here already started on an implementation of LLVM/v810… Maybe it was Parasyte? dasi? As I recall there wasn’t much progress on it.

blitter wrote:
Off the top of my head:

1) Biggest problem so far is that GCCVB 4.x will sometimes generate relative jumps that are offset by some power of two ahead of where their proper destination should be. This becomes a huge problem when these jumps are executed repeatedly, as eventually the PC will wrap and wind up in video memory or elsewhere. I wrote a tool that operates on the final .elf file and attempts to correct these bogus jumps, but it’s more of a bandaid than a solution. http://www.planetvb.com/modules/newbb/viewtopic.php?post_id=31203#forumpost31203

3) Found a bug recently in the version of gas that ships with GCCVB 4.x where sometimes a bne instruction– written in the assembly source *as a bne instruction*– will assemble to a be and a jr instruction. There is no reason whatsoever that I can think of why this should happen.

I found a valid reason for 3): Branch instructions have a 9-bit displacement. Relative jumps have a 26-bit displacement. If it’s necessary to jump further than +/-256 bytes, you’ll have to do something like: be/jr.

As for 1), I haven’t looked into it, but is it possible that the gas port doesn’t take into account jumps that span further than +/-32MB? I don’t think GCC would know/care about that information. But binutils certainly would.
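A minimal sketch in C of the range check described in this post. It is not the actual gas relaxation code. The names fits_signed and pick_branch_form are made up for illustration; the field widths (9 and 26 bits) are the ones quoted above. As a simplification it uses one displacement for both forms, although in a be/jr pair the jr sits one instruction after the branch, so a real assembler would have to adjust for that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if disp fits in a signed two's-complement field of `bits` bits. */
bool fits_signed(int32_t disp, unsigned bits)
{
    assert(bits >= 2 && bits <= 31);
    int32_t min = -((int32_t)1 << (bits - 1));
    int32_t max = ((int32_t)1 << (bits - 1)) - 1;
    return disp >= min && disp <= max;
}

enum branch_form {
    BRANCH_SHORT,   /* plain conditional branch, 9-bit displacement */
    BRANCH_RELAXED, /* inverted condition skipping over a jr, 26-bit displacement */
    BRANCH_TOO_FAR  /* out of reach for any PC-relative form */
};

/* Hypothetical helper: pick the encoding for a conditional branch. */
enum branch_form pick_branch_form(int32_t disp)
{
    if (fits_signed(disp, 9))
        return BRANCH_SHORT;     /* roughly +/-256 bytes */
    if (fits_signed(disp, 26))
        return BRANCH_RELAXED;   /* roughly +/-32MB */
    return BRANCH_TOO_FAR;
}
```

With these widths, a bne whose target is 300 bytes away lands in the BRANCH_RELAXED case, which matches the be plus jr pair seen in the assembled output.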

cr1901 wrote:
I found a valid reason for 3): Branch instructions have a 9-bit displacement. Relative jumps have a 26-bit displacement. If it’s necessary to jump further than +/-256 bytes, you’ll have to do something like: be/jr.

Derp. Looking at the v810 architecture manual, you are absolutely right. On the bright side, that’s one less bug to fix in GCC. 🙂

Yes, I’m glad I could fix a bug by recognizing that it is, in fact, not a bug :P. I do understand why it could be seen as a bug. I haven’t taken a look at the .md as to why this particular code sequence is generated.

Since a repeated jump misses its destination by a power of two, I can’t help but wonder if that is related to the jump range of +/-32MB…

Okay, I’m officially confused now as to what GCC 4.4.2 is thinking here…

Consider the following code- loadfail.c:

#include "libgccvb/libgccvb.h"

void test_load()
{
	WA[31].head = WRLD_END;
}

-O0 produces the following:

	.file	"loadfail.c"
	.section .text
	.align 2
	.global _test_load
	.type	_test_load, @function
_test_load:
	add -4,sp
	st.w r29,0[sp]
	mov sp,r29
	movhi hi(_WA),r0,r10
	movea lo(_WA),r10,r10
	ld.w 0[r10],r10
	addi 992,r10,r10
	movea lo(64),r0,r11
	st.h r11,0[r10]
	mov r29,sp
	ld.w 0[sp],r29
	add 4,sp
	jmp [r31]
	.size	_test_load, .-_test_load
	.ident	"GCC: (GNU) 4.4.2"

Makes sense to me. -O1 to -O3 however, produce something rather weird to me.

	.file	"loadfail.c"
	.section .text
	.align 2
	.global _test_load
	.type	_test_load, @function
_test_load:
	movhi hi(_WA),r0,r10
	ld.w lo(_WA)[r10],r10
	movea lo(64),r0,r11
	st.h r11,992[r10]
	jmp [r31]
	.size	_test_load, .-_test_load
	.ident	"GCC: (GNU) 4.4.2"

Maybe it’s because it’s 4:30AM, but I can’t see how a load from a control register address is going to help get the right address of the World Attributes base register into r10 by any stretch of the imagination (since _WA=0x0003D800, r10 holds 0x00030000 by the time ld.w lo(_WA) rolls around…).

This isn’t a fluke either. By -O2, all address loads in all files seem to be replaced with this optimized form. I can’t really test right now, but doing a load from memory to generate a base address (instead of using movea) seems wrong. Can anyone possibly explain why this works?
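A small sketch in C of how a 32-bit address is usually split between movhi and a 16-bit displacement. It assumes the v810 gas port keeps the v850 meaning of hi() and lo(), where lo() is sign-extended by the CPU and hi() is bumped by one to compensate; nothing in this thread confirms that for the v810 patches. The function names hi16, lo16 and recombine are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* lo(): low 16 bits, as the sign-extended value the CPU will add. */
int32_t lo16(uint32_t addr)
{
    uint16_t low = (uint16_t)(addr & 0xFFFFu);
    return (low & 0x8000u) ? (int32_t)low - 0x10000 : (int32_t)low;
}

/* hi(): high 16 bits, plus 1 when the low half is negative
   (v850-style behaviour, assumed here for the v810 port). */
uint16_t hi16(uint32_t addr)
{
    return (uint16_t)((addr >> 16) + ((addr >> 15) & 1u));
}

/* Rebuild the address the way movhi plus a displacement would. */
uint32_t recombine(uint32_t addr)
{
    uint32_t base = (uint32_t)hi16(addr) << 16;   /* movhi hi(addr),r0,rN */
    uint32_t ea = base + (uint32_t)lo16(addr);    /* lo(addr)[rN] */
    assert(ea == addr);                           /* the split must round-trip */
    return ea;
}
```

Under that assumption, hi16(0x0003D800) is 0x0004 and lo16(0x0003D800) is -10240, so the movhi would leave 0x00040000 in r10 and the displacement would bring the effective address back to 0x0003D800.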

EDIT: Wait a second, it’s in the unoptimized version too!

ld.w 0[r10],r10
addi 992,r10,r10

WHY is there a load to r10?! There’s absolutely no guarantee that _WA will be 0, and the addi might generate a bad address!
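For reference, a sketch in C of where the constants 992 and 64 in the listings would come from. The WORLD layout below (16 halfwords, head first) is a guess that happens to give 31 * 32 = 992; the real libgccvb definition may differ. The sketch also assumes, without confirmation from this thread, that WA is a pointer object living in memory rather than a macro. If that were the case, the ld.w would simply be fetching the pointer's value before the offset is applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 16 halfwords (32 bytes) per world entry. */
typedef struct {
    uint16_t head;
    uint16_t rest[15];
} WORLD;

#define WRLD_END 0x0040u /* 64, the value the listing stores with st.h */

/* Byte offset of table[index].head from the start of the table. */
size_t head_offset(size_t index)
{
    return index * sizeof(WORLD) + offsetof(WORLD, head);
}

/* Mimics the listing: fetch the pointer, add the offset, store a halfword. */
void mark_end(WORLD *const *table_ptr, size_t index)
{
    WORLD *base = *table_ptr;       /* ld.w 0[r10],r10 */
    assert(base != NULL);
    base[index].head = WRLD_END;    /* st.h r11,992[r10] when index is 31 */
}
```

With this layout, head_offset(31) evaluates to 992, matching both the addi in the -O0 output and the displacement in the optimized st.h.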

My two cents: I agree, it makes no sense. I’ve been compiling one of my projects with -O, so I tried it with -O2 and it stops working. No errors, just no output from the same code that works fine with -O. I haven’t dug into it at all, but the -O2 optimizations clearly break what I’m working on. That doesn’t really explain anything, but it might be some evidence that there is a bug there.

FWIW, in my test code, -O3 works correctly in Mednafen. So whatever that ld.w is doing doesn’t seem to be breaking anything. I won’t be able to test on real hardware until later.

 
