Original Post

I’m looking for any general ideas for increasing frame rate. I’ve been working on wireframe graphics for a while now, and I’m able to import OBJ files and display models without too many issues. However, my FPS is pathetic. I’ve done most of the obvious optimizations (I think), like enabling the instruction cache around my heavy loops, setting the WCR to 1, etc. I don’t have exact numbers, but I don’t really need them yet, since I’m comparing my output to that of Red Alarm and I can count my frame rate by eyeballing it at this point. Has anybody reverse engineered the Red Alarm code yet to see how they are managing such high frame rates?

I started out writing directly to the frame buffers, then I wrote to a single frame buffer array in memory and copied it to the 4 frame buffers. I’m not sure I’m fully grasping the frame buffer timings either. For example, is it possible to write to the right frame buffer while the left one is being displayed?

At this point I’m just looking for general ideas, nothing specific, but keep in mind that I am not using any of the VIP objects (Characters, Background Maps, Worlds, Objects) to display graphics. It’s all being done through code that writes directly to the frame buffers, so I can’t really take advantage of the natural processing power of the VIP (at least I don’t see a way yet).

I’ve been using the code from the hunter game written by DanB for my screen refreshes:
static u16 pgflip = XPBSY1;
pgflip = ~pgflip & XPBSYR;
VIP_REGS[XPCTRL] = VIP_REGS[XPSTTS] | XPEN;
while(!(VIP_REGS[XPSTTS] & pgflip));
VIP_REGS[XPCTRL] = VIP_REGS[XPSTTS] & XPENOFF;//Disable drawing.
This is the last bit of code called at the end of my game loop and it updates the screen just fine but the frame rate is really slow.
It’s also possible that I could speed things up by organizing my model data in better ways, but I’m using such small models that, regardless of how badly the data is arranged, it shouldn’t be affecting me too much.

  • This topic was modified 11 years, 10 months ago by Greg Stevens.
9 Replies

What sort of line drawing algorithm are you using?

This was the cleanest, most concise one I could find. I’m also using fixed-point arithmetic, so F_NUM_DN is just (x>>7).

	
	/**************************
	The following algorithm was taken from stack overflow
	http://stackoverflow.com/questions/5186939/algorithm-for-drawing-a-4-connected-line
	**************************/
	dx=abs(vx2-vx);
	dy=abs(vy2-vy);
	
	sx = (vxp)>>PARALLAX_SHIFT));
		e1=e+dy;
		e2=e-dx;
		if(abs(e1)
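
The listing above is incomplete as posted. For reference, here is a minimal sketch of the 4-connected line algorithm from the Stack Overflow question linked above, written so it runs on a host machine. It is not the poster's exact code: `plot_point`, `draw_line4`, and the `plotted_*` arrays are hypothetical names, and the fixed-point and parallax handling from the post is left out.

```c
#include <stdlib.h>  /* abs */

/* Hypothetical pixel sink: records plotted points instead of touching
   a frame buffer, so the sketch can run anywhere. */
#define MAX_POINTS 1024
static int plotted_x[MAX_POINTS];
static int plotted_y[MAX_POINTS];
static int plotted_count = 0;

static void plot_point(int x, int y)
{
    if (plotted_count < MAX_POINTS) {
        plotted_x[plotted_count] = x;
        plotted_y[plotted_count] = y;
        plotted_count++;
    }
}

/* 4-connected line: every step moves one pixel in x or in y, never both.
   The loop plots dx+dy points starting at (x0, y0); the end point itself
   is not plotted, so the caller can plot it separately if needed. */
static void draw_line4(int x0, int y0, int x1, int y1)
{
    int dx = abs(x1 - x0);
    int dy = abs(y1 - y0);
    int sx = (x0 < x1) ? 1 : -1;   /* step direction in x */
    int sy = (y0 < y1) ? 1 : -1;   /* step direction in y */
    int e = 0;                     /* running error term */
    int i;

    for (i = 0; i < dx + dy; i++) {
        int e1, e2;
        plot_point(x0, y0);
        e1 = e + dy;               /* error if we step in x */
        e2 = e - dx;               /* error if we step in y */
        if (abs(e1) < abs(e2)) {
            x0 += sx;
            e = e1;
        } else {
            y0 += sy;
            e = e2;
        }
    }
}
```

The inner loop calls `abs()` twice per pixel, which is why the question of inlining matters for the frame rate.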
        
    

Are drawPoint() and abs(), and all functions they invoke, inline?

drawPoint is defined as an inline function, yes, and it calls no other functions. abs() is the C library version, so it’s not my function. I’m not an expert on C, so I’m not sure whether it gets inlined, and I haven’t checked the disassembly to be sure, but I would have to think that GCC would do that automatically.
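
Whether GCC expands abs() in place depends on the build flags (for example, it normally treats abs as a builtin, but not when builtins are disabled), so checking the disassembly is the only sure answer. One way to take the question off the table is a local helper. The sketch below is an assumption, not code from the thread, and `iabs` is a hypothetical name.

```c
/* Hypothetical helper: absolute value without a library call.
   Like abs(), the result is undefined for the most negative int. */
static inline int iabs(int v)
{
    return (v < 0) ? -v : v;
}
```

Swapping this in for abs() inside the line loop keeps the comparison in the same translation unit as the loop, so the compiler can expand it in place when optimization is enabled.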

I’ve been messing around some more, and I seem to have squeezed out a little more speed by using bit string operations for buffer copies and clears instead of loops, but it’s still nothing compared to Red Alarm.

Partially-unrolled loops operating on multiple 32-bit units per iteration will most likely be faster than movbsu for your use cases here.

As in something like:

uint32 *ptr = BLAHBLAHBLAH;
uint32 *ptr_bound = ptr + length_in_32bitunits;

while(ptr < ptr_bound)
{
 ptr[0] = 0;
 ptr[1] = 0;
 ptr[2] = 0;
 ptr[3] = 0;
 ptr += 4;
}
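
The loop above assumes the length is a multiple of four 32-bit units. A self-contained variant that also handles a leftover tail might look like the sketch below; `clear_words` is a hypothetical name, and `uint32_t` from `<stdint.h>` stands in for the `uint32` type used above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: zero length_in_words 32-bit words starting at ptr.
   The main loop is unrolled four times; a short tail loop handles
   lengths that are not a multiple of four. */
static void clear_words(uint32_t *ptr, size_t length_in_words)
{
    uint32_t *bound4 = ptr + (length_in_words & ~(size_t)3); /* end of unrolled part */
    uint32_t *bound  = ptr + length_in_words;                /* end of buffer */

    while (ptr < bound4) {
        ptr[0] = 0;
        ptr[1] = 0;
        ptr[2] = 0;
        ptr[3] = 0;
        ptr += 4;
    }
    while (ptr < bound) {
        *ptr++ = 0;
    }
}
```

If the buffer length is known to be a multiple of four, the tail loop never runs and can be dropped.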

Why would you clear the buffers manually by copying data, when the VIP will clear them to BKCOL for you each frame? And if you’re manually copying the same buffer to both eyes, you can’t have any 3D effect either, of course…

Sorry, I wasn’t clear about that. I tried defining a memory buffer like so:
u32 buff[0x1800];
I tried writing to that and then moving that buffer all at once to the actual frame buffers (using bit string move). It is the above buffer that I clear, not the VB frame buffers.
(It was just an experiment to see if writing all my information once and then copying it using bit string moves was more efficient than writing the individual pixels to each buffer.) Like I said, it seemed to improve things a little, but not so noticeably that it is going to be a solution for me. I’ve actually already undone that code.

I have not explored unrolling my loops yet so that’s probably something I’ll try next.

Thanks.

u32 buff[0x1800];

That’s over 1/3 of available memory. 🙁
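
The arithmetic behind that estimate, as a small sketch. It assumes `u32` is a 32-bit type and that "available memory" means the Virtual Boy's 64 KB of work RAM; the constant and function names are hypothetical.

```c
#include <stdint.h>

typedef uint32_t u32;  /* assumed to match the 32-bit u32 used in the post */

enum {
    BUFF_WORDS = 0x1800,                           /* 6144 words */
    BUFF_BYTES = BUFF_WORDS * (int)sizeof(u32),    /* 6144 * 4 = 24576 bytes */
    WRAM_BYTES = 64 * 1024                         /* assumed work RAM size */
};

/* Share of work RAM taken by the buffer, in tenths of a percent. */
static int buff_share_tenths_of_percent(void)
{
    return BUFF_BYTES * 1000 / WRAM_BYTES;
}
```

Under those assumptions the buffer takes 24,576 of 65,536 bytes, or 37.5% of work RAM, before the stack and any model data are counted.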

 
