I'm taking a look at the chapter on sprites from a NES programming guide at famicom.party. There is a little table which describes what the different sprite attribute flags do:

Bit # Purpose
7 Flips sprite vertically (if "1")
6 Flips sprite horizontally (if "1")
5 Sprite priority (behind background if "1")
4–2 Not used
1–0 Palette for sprite
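
To make the table concrete, here is a minimal sketch of how the attribute byte could be unpacked. The bit positions come from the table above; the names `SpriteAttributes` and `decode_attributes` are made up for illustration and are not from the guide.

```python
from dataclasses import dataclass


@dataclass
class SpriteAttributes:
    flip_vertical: bool      # bit 7
    flip_horizontal: bool    # bit 6
    behind_background: bool  # bit 5
    palette: int             # bits 1-0, a value from 0 to 3


def decode_attributes(byte: int) -> SpriteAttributes:
    """Unpack one sprite attribute byte; bits 4-2 are ignored."""
    return SpriteAttributes(
        flip_vertical=bool(byte & 0x80),
        flip_horizontal=bool(byte & 0x40),
        behind_background=bool(byte & 0x20),
        palette=byte & 0x03,
    )


# Both flips set, drawn in front of the background, palette 1.
print(decode_attributes(0b11000001))
```

Setting any of bits 4–2 makes no difference to the decoded result, which is what "not used" means here.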

I'm generally amazed by how well the memory space is utilised in the console, so I was surprised to see that there are 3 bits per sprite which are not used. Two of these bits could specify a rotation orientation of the sprite. Why was this not implemented on the hardware?

Were these bits used by game programmers for their own sprite-specific information, potentially unrelated to graphics? Were they left unused intentionally for this reason?

user3840170
Jojo
  • Weird, that domain has a bajillion open ports, and responds to ping, but won't serve me a webpage. Are you using https or http? – Omar and Lorraine Nov 09 '21 at 09:40
  • The link in my question directs to 'https://famicom.party/'. I don't really understand the difference between http and https, but when I click on that link it opens for me. – Jojo Nov 09 '21 at 09:47
  • I don't know then; maybe it's a problem my end. – Omar and Lorraine Nov 09 '21 at 09:53
  • Link works fine for me. But it’s something of an annoyance (and a bit spammy too) to link to the front page instead of the specific chapter in question, so I edited that. – user3840170 Nov 09 '21 at 12:22
  • Note that you'd only need one bit for rotation. If it's set, rotate 90° to the right after applying any flipping. Rotating 180° is already possible with two flips, and 270° would be possible with two flips and a 90° rotation. – Radvylf Programs Nov 09 '21 at 18:34
  • Oh yes I can see that, there are 8 = 2^3 possible transformations which requires only 3 bits altogether and not 4. – Jojo Nov 09 '21 at 20:41
  • Incidentally, if one looks at the die layout for the PPU, the memory array used for the OAM omits the storage transistors for the unused bits. From a software standpoint, giving functions to those bits would have been "free", but each extra bit would have expanded the size of the array by about 3%. That having been said, I think the functions I'd have liked to have seen given an extra OAM bit would have been per-sprite control of height, and per-sprite interrupt enable (versus having only sprite 0 support collision detection). – supercat Nov 10 '21 at 20:55
  • @supercat that's super interesting! So the answer to 'were these bits used by game developers for non-graphics purposes' is 'no, because the hardware didn't actually contain those bits'? – Jojo Nov 10 '21 at 23:09
  • That book you linked to is great, nice find! – chiliNUT Nov 10 '21 at 23:13
  • @Jojo Even if the PPU OAM did contain those bits, game developers wouldn't use them because reading from OAM is very buggy -- it's nearly impossible to do without corrupting memory, and it's not possible at all on hardware sold earlier in the NES's lifespan. It also would be very slow and of limited utility, since it could only be read during VBlank. – NobodyNada Nov 11 '21 at 01:39
  • (Virtually?) every game stores a complete copy of OAM in the CPU memory and DMA's it to the PPU every frame, so a game could conceivably make use of the extra bits in the CPU's copy. However, reading back individual sprite data is not really all that useful. Sprites have to be constantly shuffled around and must be (usually randomly) omitted when there's not enough slots left in OAM, so most games will use separate data structures to store information about players, enemies, etc. and then render the sprites as a final step in the game loop. – NobodyNada Nov 11 '21 at 01:45
  • @NobodyNada: I wonder how the OAM evolved? If there wasn't a need to update items individually, I think a 29x64 shift register could probably have been implemented more compactly. The DRAM-based design requires that the OAM memory cells be able to hold data over a millisecond, but a shift-register design could reduce the required hold time to a few microseconds. – supercat Nov 11 '21 at 19:55
  • @supercat Although the CPU cannot reliably random-access OAM, the PPU needs to during the sprite evaluation process. I'm not sure if that would work using a shift-register-based design. The advantage of byte-addressable OAM is that, on each scanline, the PPU only needs to read the Y-position byte of most sprites (and only has to spend clock cycles reading the rest of the sprite if its position is in range). – NobodyNada Nov 11 '21 at 20:06
  • Also, it's worth noting that the CPU is intended to be able to random-access individual OAM entries, but that feature is extremely buggy: due to an unintended interaction with the DRAM refresh circuitry, changing the CPU's OAM address pointer causes the in-progress DRAM refresh to write to the wrong location. So, the only reliable thing to do after writing OAMADDR is to immediately overwrite all 256 bytes of OAM using OAMDATA or OAMDMA (because you don't know what might have been corrupted). – NobodyNada Nov 11 '21 at 20:11
  • @NobodyNada: Having the hardware read out a 29-bit OAM entry every four dot clocks would be simpler than having it selectively try to skip entries. If the design intention from the get-go was that random access be allowed, that would have precluded any consideration of a shift-register-based design, but if the ability to support random access was a feature that Nintendo tried to add later, then a shift-register design might have made sense. – supercat Nov 11 '21 at 20:23

1 Answer

For each sprite displayed on a scanline, the hardware fetches two bytes from memory, and then clocks the pixels out one by one. The sprite is eight pixels wide, and each pixel is two bits, which is why it's two memory accesses per sprite per scanline. You can imagine that this arrangement just needs a couple of shift registers per sprite to clock the pixels out.

Now, flipping the sprite about its vertical axis is easy: you just clock the pixels out in reverse order! Similarly, flipping the sprite about its horizontal axis is also easy: you just fetch the rows of the bitmap from memory in reverse order. Rotating by 180° is of course the same as flipping both horizontally and vertically.
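
As a rough software sketch of those two flips (not a model of the actual PPU circuitry), assume the usual NES tile layout: 16 bytes per 8×8 tile, with the eight low-bit-plane bytes first, the eight high-bit-plane bytes after them, and bit 7 as the leftmost pixel. The function name `fetch_row` is hypothetical.

```python
def fetch_row(tile, y, flip_h=False, flip_v=False):
    """Return the 8 pixel values (0-3) for row y of a 16-byte tile.

    Either flip still touches exactly two bytes: one per bit plane.
    """
    # Vertical flip: fetch the mirrored row instead.
    row = 7 - y if flip_v else y
    lo, hi = tile[row], tile[row + 8]
    # Horizontal flip: shift the same two bytes out in the opposite order.
    bit_order = range(8) if flip_h else range(7, -1, -1)
    return [((hi >> i) & 1) << 1 | ((lo >> i) & 1) for i in bit_order]


# Test tile: top-left pixel has value 1, bottom-right pixel has value 2.
tile = bytes([0x80, 0, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 0, 0, 0x01])

print(fetch_row(tile, 0))                # [1, 0, 0, 0, 0, 0, 0, 0]
print(fetch_row(tile, 0, flip_h=True))   # [0, 0, 0, 0, 0, 0, 0, 1]
print(fetch_row(tile, 0, flip_v=True))   # [0, 0, 0, 0, 0, 0, 0, 2]
```

In every case the function reads only `tile[row]` and `tile[row + 8]`, which is why the flips cost nothing extra in memory accesses.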

But if you wanted to rotate the sprite 90° or 270°, that's much harder. Each row of the rotated sprite is a column of the original, so the hardware would need to pick one pixel out of each of the tile's eight pairs of bytes, which means fetching 16 bytes from memory instead of just two. There is not enough time on this slow hardware to do that. Incidentally, the limited time available for sprite fetches on each scanline is also where the maximum number of sprites per scanline comes from.
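
A small sketch can show why the rotated case needs the whole tile. It assumes the same 16-byte layout as real NES tiles (low bit plane in bytes 0–7, high bit plane in bytes 8–15, bit 7 leftmost); `rotated_row` is a hypothetical helper that also records which byte offsets it had to read.

```python
def rotated_row(tile, y):
    """Row y of the tile rotated 90 degrees clockwise.

    Returns (pixels, touched), where touched is the set of byte
    offsets within the tile that had to be read.
    """
    pixels, touched = [], set()
    for x in range(8):
        # Pixel (x, y) of the rotated tile comes from column y of
        # source row 7 - x, so every output pixel is on a different row.
        src_row, src_col = 7 - x, y
        lo, hi = tile[src_row], tile[src_row + 8]
        touched.update((src_row, src_row + 8))
        bit = 7 - src_col
        pixels.append(((hi >> bit) & 1) << 1 | ((lo >> bit) & 1))
    return pixels, touched


# Test tile: top-left pixel has value 1, bottom-right pixel has value 2.
tile = bytes([0x80, 0, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 0, 0, 0x01])

pixels, touched = rotated_row(tile, 0)
print(pixels)        # [0, 0, 0, 0, 0, 0, 0, 1]
print(len(touched))  # 16
```

One output row touches all 16 bytes of the tile, compared with two bytes for an unrotated or flipped row.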

It's a similar story on the Commodore 64, the Atari 2600, and many others: These platforms can do some simple 2D manipulations on sprites like stretching and sometimes flipping like you've seen, but never rotations.

Omar and Lorraine
  • "I don't think there's enough time on this slow hardware to do that." There absolutely is not enough time. See the NESDev Wiki's frame timing description and diagram: each memory access takes 2 cycles, and there is only one idle cycle per scanline. So adding another 14 bytes * 2 cycles/byte * 8 sprites per scanline = 224 cycles per scanline would certainly not be feasible. – NobodyNada Nov 09 '21 at 19:32
  • One can imagine a more complex PPU design that could access memory in one cycle, or gradually loads sprite tiles over a period of 8 lines before they're needed -- but any method to rotate sprites would have greatly increased the cost and complexity of the PPU for very little benefit, and keeping costs down was extremely high-priority during the NES's development. – NobodyNada Nov 09 '21 at 19:32
  • @NobodyNada: Exactly. If you need to rotate a sprite by 90°, the fast and easy solution is to make a set of pre-rotated bitmaps. Sure, it doubles the amount of ROM needed for the sprite, but that's the only cost, and I doubt that it would be breaking for most games that need it. And if you don't need it, you won't need to pay the cost. – Ilmari Karonen Nov 09 '21 at 22:36
  • Note that this is somewhat true today. Simple GPU settings can set tiling to x=-1 (horizontal flip) or y=-1 (vertical flip) or both (180° rotation). A 90° or 270° rotation requires playing with the matrices. – Owen Reynolds Nov 10 '21 at 00:50
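
For reference, the cycle arithmetic from the comments above can be written out as a short calculation. The figures (2 cycles per memory access, 8 sprites per scanline, one idle cycle per scanline) are the ones quoted in the comments; the function name `extra_fetch_cycles` is made up for illustration.

```python
def extra_fetch_cycles(bytes_per_rotated_row=16, bytes_per_normal_row=2,
                       cycles_per_access=2, sprites_per_scanline=8):
    """Extra PPU cycles per scanline if every sprite row needed the full tile."""
    extra_bytes = bytes_per_rotated_row - bytes_per_normal_row  # 14 per sprite
    return extra_bytes * cycles_per_access * sprites_per_scanline


IDLE_CYCLES_PER_SCANLINE = 1  # spare cycles, as stated in the comment

needed = extra_fetch_cycles()
print(needed)                             # 224
print(needed - IDLE_CYCLES_PER_SCANLINE)  # 223 cycles short
```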