As my ongoing audio initiative continues, the advent of easy-access music on Virtual Boy draws near. More information on that project will be on its way, but of course it led me into music theory, so let’s talk music theory!
Have you ever sat down and tried to formulate some technical specification to answer the question “what is music”? If one is to implement music on a computer such as the Virtual Boy, that’s the first question that has to be answered. I boiled the question down to its most fundamental elements, and came up with the following: music is some arrangement of one or more notes of an audible tone, each with a starting time, a duration and a frequency.
Therefore, in its simplest form, music on Virtual Boy needs to be a program capable of producing such notes. Of course, other characteristics of notes are also useful, such as the exact sound used as a tone, the amplitude or volume of the note, and the note’s ability to change its properties while it’s playing (such as fading out or rising in pitch). Either way, we’re looking at musical notes that start and end.
I know, I know, this seems obvious, but you’ll find that happens a lot in programming. You have to think about the simple things because you have to tell the program about those simple things. Computers don’t make assumptions.
Virtual Boy supplies audio hardware that gives us some useful features, but also imposes some limitations. We can use five PCM buffers to specify single cycles of waveforms, which is useful for producing different sounds of our own design. However, these buffers are only 32 samples in size and can’t be written to while sound is being generated. The hardware gives us five channels that can play simultaneously using any of the five waves, which is wonderful, but at the end of the day limits us to five simultaneous notes. There is also a sixth channel that produces pseudorandom noise, useful for its own little subset of audio effects.
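To make that 32-sample constraint concrete, here’s a small sketch (in Python for clarity, though the real engine would be C or assembly) that builds single-cycle waveforms at that size. The 6-bit unsigned sample range (0–63) is my assumption about the buffer format, so double-check it against the hardware documentation before relying on it:

```python
WAVE_SAMPLES = 32   # one cycle per PCM buffer
SAMPLE_MAX = 63     # assuming 6-bit unsigned samples

def make_square():
    # High for the first half of the cycle, low for the second.
    return [SAMPLE_MAX if i < WAVE_SAMPLES // 2 else 0
            for i in range(WAVE_SAMPLES)]

def make_triangle():
    # Ramp up for half the cycle, then back down.
    half = WAVE_SAMPLES // 2
    up = [i * SAMPLE_MAX // (half - 1) for i in range(half)]
    return up + up[::-1]
```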
There are some other features in the audio hardware that aren’t especially useful for music, or are things that can be simulated through software, so I won’t mention them here.
A simple musical track would initialize the PCM buffers with some waveforms, then have a schedule of notes to play on each channel. This is basically what MIDI does. But I think we can take it a step further, which leads us into technical territory…
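To illustrate the “schedule of notes” idea, here’s a minimal sketch. The field names and the Python representation are my own invention for illustration, not the engine’s actual data format:

```python
from dataclasses import dataclass

@dataclass
class NoteEvent:
    start: int    # start time, in engine ticks
    length: int   # duration, in ticks
    channel: int  # VSU channel 0-5
    wave: int     # which of the five PCM waveforms to use
    freq: float   # pitch in Hz

# A track is just a time-ordered list of events, MIDI-style.
track = [
    NoteEvent(start=0,  length=48, channel=0, wave=0, freq=261.63),  # C4
    NoteEvent(start=48, length=48, channel=0, wave=0, freq=329.63),  # E4
]
```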
__________
One of the specific goals of my project is efficiency. Not just fast code, but small data too. I want it to be so technically insignificant that anyone can ask themselves, “Do I have the resources to incorporate this into my project?” and answer yes every time. To that end, a few optimizations can be made.
Data compression is an example of the archetypal “double-edged sword”. While it reduces the number of bytes needed to represent something, it necessarily increases the processing cost by requiring a decoding routine, which often needs extra memory to work in. I’m not keen on using something like deflate on Virtual Boy, since that’s a pretty big code and memory requirement that doesn’t really fit with my project goals.
Most data compression works by eliminating redundancy in the data. This is easy to do, because the data can simply specify “Remember that block of bytes from earlier? Let’s do that another 20 times” instead of actually including that block of bytes 20 more times. Think of something like a drum loop. In many musical tracks, long stretches of percussion are literally the same notes over and over. So instead of storing them who knows how many times, why not store them once and play them repeatedly?
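As a toy model of that idea, here’s a hedged sketch of a stream format with just two opcodes: one plays a note, the other replays everything emitted so far. It’s only meant to show how a reference can stand in for repeated data, not to mirror the engine’s real format:

```python
def expand(stream):
    # ("play", note) emits one note.
    # ("repeat", n) replays everything emitted so far n more times.
    out = []
    for op, arg in stream:
        if op == "play":
            out.append(arg)
        elif op == "repeat":
            out.extend(out * arg)  # 'out * arg' is built before extending
    return out

# Five stored entries expand into an 80-note drum loop.
drum_loop = [("play", 36), ("play", 42), ("play", 38), ("play", 42),
             ("repeat", 19)]
```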
The working draft I’ve got in front of me here allows the music engine to read data directly from ROM without decompressing it or anything, but still eliminate redundancy. This should get the best of both worlds, fingers crossed.
We’re also talking video game music, so I want to incorporate some features that are specifically tailored for that. I don’t want to spill all the beans right now because I want to be able to hype it up later, but just consider a few things…
• Video game music often repeats to an extent, but generally won’t repeat an intro.
• Some music changes a bit depending on game events. Mario underwater almost always sounds different.
• Conversely, some game events are dependent on the music.
__________
The topic of frequency raises an interesting technical challenge. I’m kinda proud of this one. (-:
When interpolating between two values, such as when a note fades out, you smoothly transition from a starting value (hereafter called “Left”) to an ending value (“Right”). For volume in a fade, this can be done linearly with the usual interpolation formula:
Value = Left + (Right – Left) * Percent
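In code, that’s a one-liner (sketched in Python here):

```python
def lerp(left, right, percent):
    # Straight-line interpolation; percent runs from 0.0 to 1.0.
    return left + (right - left) * percent

# lerp(0, 10, 0.25) -> 2.5
```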
Frequency doesn’t work quite like that, though. At least not in the sense of an audible tone. To increase a note by one octave is to double its frequency. For example, raising 100hz by one octave is 200hz. Raising that another octave is 400hz. Since 100hz is two octaves below 400hz, it could be said that 50% of the way between them is just one octave, or 200hz. This is the intuitive way to notate the frequencies on a keyboard or piano roll.
Using the linear interpolation formula, you’d wind up with 250hz, so that’s a snag. What we need is of course an exponential interpolation, and its formula is remarkably similar. You just kick all the operators up a notch:
Value = Left * (Right / Left) ^ Percent
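Transcribed directly into floating point, that looks like this (fine on a PC; the catch on real hardware comes next):

```python
def xerp(left, right, percent):
    # Geometric interpolation: equal steps of percent give equal
    # musical intervals, so halfway from 100 Hz to 400 Hz is 200 Hz.
    return left * (right / left) ** percent

# xerp(100, 400, 0.5) -> 200.0, one octave above 100 Hz
```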
Easy enough in concept, but have you ever tried to do exponentiation with non-integer powers algorithmically? Yuck. Even using the industry-standard pow() function A) requires double-precision floats which VB doesn’t natively support, and B) involves an infinite series and is thus quite slow. It’s just really icky all around, but I won’t yield!
Further complicating matters is the fact that the Virtual Boy audio hardware doesn’t accept hertz directly. It samples from the PCM buffers on a 5MHz timer with a delay given by the CPU. Since there are 32 samples per PCM buffer, one could reason that the tone cycle frequency is 5MHz / 32 = 156250hz. The delay is 11-bit and inverted so that lower values yield lower frequencies, making the actual tone formula the following:
ToneHz = 156250 / (2048 – Value)
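That formula and its inverse are easy to sanity-check in a quick sketch:

```python
BASE_HZ = 156250  # 5 MHz / 32 samples per cycle

def tone_hz(value):
    # Frequency produced by an 11-bit register value (0-2047).
    return BASE_HZ / (2048 - value)

def hz_to_value(hz):
    # Inverse mapping, rounded to the nearest register value.
    return 2048 - round(BASE_HZ / hz)

# tone_hz(1423) -> 250.0
# hz_to_value(440) -> 1693
```

Note the quantization: neighboring register values don’t land exactly on musical pitches, so a round trip through `hz_to_value` only gets within a fraction of a hertz at low frequencies.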
In other words, to interpolate between audible frequencies in this music engine, I have to take the following equation:
Left * (Right / Left) ^ Percent = 156250 / (2048 – Value)
… and solve for Value. Fortunately, a keen associate of mine helped me work through it, and I’m pleased to report that we devised an algorithm to get there in fewer than 25 CPU cycles for any input of Left, Right and Percent. The data stored in ROM does not increase in size, and the lookup data is minimal.
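The fast fixed-point algorithm itself isn’t spelled out here, but a straightforward floating-point version makes a handy correctness reference. This sketch just composes the two formulas above and rounds at the end; it is not the sub-25-cycle routine the engine uses:

```python
def interp_value(left_hz, right_hz, percent):
    # Geometrically interpolate in Hz, then convert the result
    # to the VSU's inverted 11-bit register value.
    hz = left_hz * (right_hz / left_hz) ** percent
    return 2048 - round(156250 / hz)

# interp_value(100, 400, 0.5) -> 1267 (the register value for ~200 Hz)
```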
I need to come up with some way to reward the other guy for being such a big help.
__________
While I finalize this draft of the spec, let’s take a moment to talk music. What problems and solutions have you guys been a part of in your history of working with the technical nitty-gritty of music?
Hello,
Thank you for posting. Do you happen to know what the sound chip is on the VB? I caught your paragraph on the sound channels and am curious about composing for new VB homebrews. Are people doing this? Please let me know; I am curious about this machine and its capabilities. Thank you!
The sound chip is called the “Virtual Sound Unit”, or “VSU” for short. The chip was made by Nintendo and used only in the Virtual Boy. The chiptune community considers it a ripoff of the HuC6280, used in NEC’s PC Engine.
The VSU has a mirrored 8-bit address space and an 8-bit data bus. (I say “mirrored” because the VB pairs a 32-bit RISC processor with 8-bit VSU registers, so each register appears four times in the address space.)
from VB Development Wiki
The silicon chip is housed in a 28-pin SOIC package labeled “U3” and located on the top of the VB’s mainboard.
The VSU is described in this block diagram:
Databus -> Master Control -> Channel 1–6 enables -> VSU Control Registers

Ch 1: Frequency Timer -> Wave -> Envelope ----------------> Stereo -> Mixer
Ch 2: Frequency Timer -> Wave -> Envelope ----------------> Stereo -> Mixer
Ch 3: Frequency Timer -> Wave -> Envelope ----------------> Stereo -> Mixer
Ch 4: Frequency Timer -> Wave -> Envelope ----------------> Stereo -> Mixer
Ch 5: Sweep -> Frequency Timer -> Wave -> Envelope -> Modulation -> Stereo -> Mixer
Ch 6: Frequency Timer -> LFSR -> Envelope ----------------> Stereo -> Mixer
There’s more into the VSU chip, but that’s all I have to say for now!
Really? The VSU is a rip-off of the PC Engine sound chip…? Well, that’s music to my ears! My favorite composer is Yuzo Koshiro, and he cut his teeth on PC-88 synth. He continued to use that sound chip for years after it lost relevance, and he grew so attached to it that he still uses it today whenever he’s allowed to. Even for his modern games, he tries to get permission to make alternate PC-88 versions of the soundtrack.
Yuzo Koshiro’s the legend who made the soundtracks for Streets of Rage, ActRaiser, Shinobi, Ys I & II, at least some of the Oasis games, 7th Dragon, Wangan Midnight, and of course the series I came to know him through, Etrian Odyssey.
Although the EO series is for the DS, some of his Original Soundtracks include PC-88 FM synth versions of the music. Here are four examples:
EO III opening, “That’s the Beginning of Adventure”:
One of my faves, “Those That Slay and Fall”
“Black and Red,” so… it seems appropriate:
But I like “Dyed in Scarlet” better:
Below are the full playlists:
If the PC Engine soundchip can do this, and the Virtual Boy is similar to the PC Engine soundchip… then yeah… imagine music like this from a Virtual Boy game! 😀
Oh dear, I do believe you’re (understandably) mistaken on your terminology ^^;
PC-Engine and the PC-x8 line of computers are not the same, even though both were developed by NEC (in conjunction with Hudson, relating to the PC-E).
PC-Engine is the TurboGrafx-16, which used the HuC6280A, a chip based mostly on wavetable synth rather than FM synth. This is closer to the sound you could probably get out of the VB’s VSU-VUE: https://youtu.be/9t80SrX-nXk
Or perhaps this… if you wanna juggle faux sample playback because VSU doesn’t have PCM output like the HuC does: https://youtu.be/8108XHXqitg
I do hope that clears things up a little 😀