Making a Game Boy Game! (Part 3)
Over the past few weeks I've decided to work on a Game Boy game that I'm having a bit of fun making. The working title of it is "Aqua and Ashes". It's open source, hosted at https://github.com/InvisibleUp/AquaAndAshes. In this third part I'll talk about how I made my sprite drawing less of a pain to work with by using the power of sprite mappings. Here is a link to the previous part.
Fantastic Sprites and How to Store Them
Last time, I had just finished rendering some sprites to the screen. It was just done in a very ad-hoc and messy fashion. Basically, I ahd to specify in code what to display where. This made animation pretty much impossible, it was consuming a lot of CPU time, and it was unmaintainable. I needed a better way to do this.
Specifically, I wanted something where with a only an animation number, a frame number, and a timer I could iterate through every single animation. If I wanted to change the animation, I only wanted to change the animation and reset the frame counter. The animation routine that ran every frame should just pick the right sprites to display and throw them at the screen without any effort on my part.
As it turns out, this is a pretty solved problem. What I was looking for was known as sprite mappings. Sprite mappings are data structures that (really roughly) contain a list of sprites. Each sprite mapping contains all the sprites to render one object. Also related are animation mappings, which is a list of sprite mappings with information on how to loop them.
Sonic 2's Mapping Format
Sonic 2 was the particular game I was targeting, as I was toying around with the idea of making a Genesis hack before this came along. Sonic 1 and 3K are mostly the same, but to keep things simple I'll stick to discussing 2.
First, let's start with sprite mappings. Here's a fairly typical sprite for Tails, his halfway blinking animation. (I wanted to be different. Go for sprite #2 instead of sprite #1.)
The Genesis does sprites a bit differently. A tile on the Genesis (called a "pattern" by most programmers) is 8x8, just like on the Game Boy. A sprite consists of a rectangle of up to 4x4 tiles, a lot like the Game Boy's 8x16 mode, just more flexible. The catch there is that the tiles have to be next to each other in memory. With Tails, the developers of Sonic 2 wanted to reuse as many tiles from the standing frame as for the blinking frame. Therefore, Tails is split into 2 hardware sprites consisting of 3x2 tiles; one for his head and one for his body. Here they are.
The top half of this dialog are the attributes for the hardware sprites. It contains their position relative to the origin (the negative numbers are clipped; they're actually -16 and -12 for the first sprite and -12 for the second), the starting tile it uses in VRAM, the width and height of the sprite, and some various status bits for flipping the sprite or changing the palette.
The bottom half shows the tiles as they're loaded into VRAM from the ROM. There's not enough space in VRAM to contain all of Tails' sprites, so the necessary tiles have to be copied into memory every frame. These are known as Dynamic Pattern Load Cues. We can skip over these for the time being, though, as they're mostly independent of the sprite mappings and therefore easy to add later.
On the animation side, things are a touch simpler. An animation mapping in Sonic is a list of sprite mappings with 2 pieces of metadata; a speed value and an action to do when the animation ended. The three most-used actions were to loop all frames, loop the last N frames, or go to another animation altogether (for instance, when transitioning from Sonic's standing animation to his impatient foot-tapping). There's also a couple commands that set internal flags in the object memory, but not many things used them. (I may, now that I think about it, set a bit in object RAM to 1 when the animation loops. That would be useful for sound effects and stuff.)
If you were to look at the disassembly for Sonic 1 (Sonic 2's disasm was too big to link to), you'd notice that animations aren't referred to by any sort of ID. Instead, a list of animations is given for each object, and an animation index is kept in memory. To render a particular animation, the game takes the index, looks it up in the animation list, and then renders that. This keeps things a bit simpler, as you don't have to scan through animations to find the one you're looking for.
Distilling Struct Soup
Let's review our mapping types:
- Sprite Mappings: A list of sprites, which consist of a starting tile, the number of tiles, a position, flip status, and a palette.
- DPLCs: A list of tiles from ROM to load into VRAM. Each entry in a DPLC consists of a starting tile and a length; each entry is placed after the last in VRAM.
- Animation Mappings: A list of animations, which consist of a list of sprite mappings, a speed value, and a loop action.
- Animation List: A list of pointers to each animation action.
Given the fact we're on a Game Boy, we can make a few simplifications. For sprite mappings, we know that there will always be 2 tiles in an 8x16 tile. Everything else we need to keep, though. We can, for the time being, drop DPLCs entirely and just keep everything in VRAM. This isn't sustainable, but again, it's an easy enough fix. Lastly, we can probably get away with throwing out the speed value if we assume every animation runs at the same speed.
Let's start figuring out how we'd implement a system like that in my game.
Follow along with commit 2e5e5b7!
Let's start with the sprite mappings. Each entry in a mapping should mirror OAM (Object Attribute Memory, the sprite VRAM), that way it becomes a simple loop and memcpy to display an object. As a reminder, an OAM entry consists of Y, X, a starting tile, and an attributes byte. I simple just need to make a list of these. For the attribute byte, I just prepared them ahead of time using the assembler's EQU pseudo-op, so I had a readable name for each possible attribute combination. (You'll notice in the previous commit that I had swapped the Y/X with the tile in the mappings. This was because I was dumb and misread the OAM specs. I also added a sprite count, so that I knew how long to loop for.)
One thing you'll notice is that the tail and the body for the arctic fox are stored separately. If they were stored together, there would be a lot of redundancy, as every single animation would have to be duplicated for every tail state. This would add up quickly. Sonic 2 had the same problem with Tails. In that, the problem was solved by making Tails' tails their own object with their own animation state and timer. I don't really want to do that, as I don't really want to deal with the issues of keeping the tail in the proper position compared to the fox.
Instead, I solved the problem in the animation mappings. If you look at my (only) animation mapping, there's 3 pieces of metadata. There's the number of animation mappings, so I know when it ends. (Sonic instead checked if the next animation was invalid, like the concept of a null byte in C strings. Sonic's approach frees a register but introduces a comparison, which would have worked against me.) There's the loop action, of course. (I condensed this from Sonic's 2-byte approach to a 1-byte number with bit 7 as the mode bit.) But there's also the number of sprite mappings. Sonic didn't have this. Having multiples sprite mappings per animation frame allows me to reuse mappings across multiple animations, which I think will save a lot of precious storage space. You'll also note that the animations are duplicated for each direction. This is because the mappings are different for each direction, and they have to be included.
Dancing with the Registers
Follow along with this particular file in commit 1713848.
Let's start with drawing a single sprite to the screen. Well, okay, that's a lie. As a reminder, we can't write to the screen outside of VBlank. And this whole process lasts too long to place in VBlank. So we have to write to the region of memory we have set aside for DMA. Ultimately this changes nothing, but it's important to write to the right place.
Let's start counting off registers. We have 6 registers on the GBZ80, A through E, H, and L. H and L are special, as they're really good for iterating across memory. (Because they're used together, they're referred to as HL.) In one opcode, I can write to the memory address that HL contains and add one to it. This is hard to beat. I can either use this for the source, or the destination. I used it for the destination, and the register combo BC for the source, because that worked out the best. That leaves only A, D, and E. I need A for math operations and the like. So what will DE be used for? I'll use D as a loop counter, and E as scratch space. And now we're out of registers.
So, as an example, let's say we had 4 sprites. We set D (our loop counter) to 4, HL (our destination) to the address of our OAM buffer, and BC (our source) to whatever location in ROM contains our mappings. At this point... I wish I could have called memcpy. Slight issue, though. Remember those X and Y coordinates? They're relative to the origin, the center of the object used for collision and such. If we were to write them as-is, every object would be displayed in the top-left of the screen. That's no good. So to fix that, we need to add the object's X and Y to the sprite's X and Y.
Quick side-note, I've been talking about "objects", but I don't think I've introduced the concept. An object is just a set of attributes related to a thing in the game. The attributes include position, speed, direction, what the thing is, etc. I bring this up because I need to pull the X and Y data from these objects. This would require a third register set that points to wherever in object RAM the coordinates are. And then we would need to store the X and Y somewhere. Also the direction, because that helps us determine which way to face the sprites. Also, we want to render all the objects, so we need a loop counter for the objects, too. And we haven't even got to animations yet! This is getting out of hand fast...
Rethinking Our Approach
Okay, I've gotten way, WAY ahead of myself there. Let's rewind and think about every piece of data I need to keep track of, and where I can put it.
First off, let's split this up into "stages". Each stage should only serve to fetch data for the next, except for the last which does the copying.
- Object (Loop) - Figure out if object needs to be rendered and, if so, render it.
- Animation List - Determine which animation needs to be displayed. Also fetch object attributes.
- Animation (Loop) - Determine which mapping list needs to be used and render each mapping in it.
- Mapping (Loop) - Iterate through each sprite in the sprite list
- Sprite - Copy sprite attributes to OAM buffer
For each of these, I listed the variables they needed, the roles they played, and where they were to be stored. That table looks something like this.
|Current Byte||1||Sprite||Scratch||Mapping Src||E|
|Animation Mapping Start||2||Sprite Mapping||Pointer||Stack3||DE|
|Mapping Source||2||Sprite Mapping||Pointer||[DE]||BC|
|Sprites Remaining||1||Sprite Mapping||Scratch||Mapping Src||D|
|OAM Buffer||1||Sprite Mapping||Pointer||HL||HL||Stack1|
|Animation Mapping Start||2||Animation||Scratch||BC/Stack3||BC||Stack3|
|Mappings Remaining||1||Animation||Scratch||Anim Start||HiRAM|
|Total Mapping Count||1||Animation||Variable||Anim Start||HiRAM|
|Mappings per Frame||1||Animation||Variable||Anim Start||UNUSED!!!|
|Mapping Pointer||2||Animation||Pointer||AnimStart + Dir*TMC + MpF*F#||BC||DE|
|Animation Table Start||2||Animation List||Scratch||Hardcoded||DE|
|Object Source||2||Animation List||Pointer||HL||HL||Stack2|
|Frame #||1||Animation List||Variable||Obj Src||HiRAM|
|Animation #||1||Animation List||Scratch||Obj Src||A|
|Obj X||1||Animation List||Variable||Obj Src||HiRAM|
|Obj Y||1||Animation List||Variable||Obj Src||HiRAM|
|Obj Direction||1||Animation List||Variable||Obj Src||HiRAM|
|Animation Mapping Start||2||Animation List||Pointer||[Anim Table + Anim #]||BC|
|OAM Buffer||2||Animation List||Pointer||DE||Stack1|
|Object Source||2||Object Loop||Pointer||Hardcoded/Stack2||HL|
|Objects Remaining||1||Object Loop||Variable||Calculated||B|
|Objects Active Bitfield||1||Object Loop||Variable||Calculated||C|
|OAM Buffer||2||Object Loop||Pointer||Hardcoded||DE|
Yes, that is a mess. To be perfectly honest, I only made that while writing this post to explain things a bit better, but I'm already getting use out of it. Regardless, I'll try and walk you through it. Start from the bottom and work your way up. You can see every piece of data I start with, that being the object source, the OAM buffer, and some pre-computed loop variables. For each loop, we start with that and only that, with the exception that the Object Source gets updated every loop.
For every object we render, we have to determine which animation needs to be shown. While we're at it, we might as well store the X, Y, Frame #, and Direction attributes before incrementing the object pointer to the next object and storing it on the stack to pick back up when exiting. We use the animation number combined with a hardcoded animation table offset to find where the animation mapping begins. (I'm making a simplification here by having every object share the same animation table. This limits me to 256 animations for the whole game, but is that really anything I'll go over?) We also stash the OAM buffer to save some registers.
Once we have out animation mapping, we need to find where the sprite mapping list for the given frame and direction and how many mappings to render. You'll notice that the mappings per frame variable isn't used. This is because, looking over my code, I wasn't thinking and hardcoded that to 2. I should fix that. We also grab our OAM buffer back from the stack. You'll also notice the complete absence of anything involving loop control. That's done in a separate, much simpler subroutine that can skip a lot of the register juggling here.
From there it's relatively straightforward. Our mapping is a bunch of sprites, so loop over the sprites and draw them using our stored X and Y coordinates. However, we re-store the OAM pointer at the end of the sprite list so that the next mapping can start where that left off.
The end result of all of this? Exactly what we had before. Arctic fox, waving tail in the darkness. But adding a new animation or sprite is a lot easier. Next time, fancy backgrounds and parallax scrolling! Now if you excuse me, I have a lot of code I need to clean up, because I can so use those registers better...