22

It seems that 3D games, especially simulations like Falcon, did not run faster (in frames per second) on the Amiga than on the Atari ST, and were even a bit slower due to the lower CPU clock.

I was wondering why this is the case, since Agnus looks to me like an early form of GPU, capable of drawing vector lines and even filling polygons. Was it not capable of handling the typical number of 3D objects in those games, or was it simply not used because it would have required a completely different 3D rendering code path for the Amiga?

Ola Ström
Marco
  • What does "faster" mean? more frames per second or something? – Omar and Lorraine Sep 19 '18 at 07:46
  • yes, more fps, I update the question – Marco Sep 19 '18 at 08:23
  • Would you mind supporting your assumption with some data? Perhaps link the frame rates achieved and how they differ? – Raffzahn Sep 19 '18 at 10:39
  • 2
    That is a little tough for me, since I did not do the comparison myself. I have a print magazine from 1990 that covered only cross-system simulations, and for almost every game it stated that the 3D aspects on the Amiga and the Atari ST were more or less the same – Marco Sep 19 '18 at 11:15
  • Keep in mind that 3D wasn't something handled by hardware (other than the CPU, that is). Also, these machines were rather 'hard-coded': timing was made to fit the fixed 50/60 Hz frame rate, not independent units operating and synchronizing as they do today, and without all the in-between layers. So unless special effects of the Amiga were used, it all comes down to the CPU manipulating the bitmap. And when they were used, these games were usually not portable in any way. – Raffzahn Sep 19 '18 at 11:35
  • There is definitely a long-held perception that the Atari is faster for 3d games; I also can't prove it with examples and therefore cannot say whether it is true, but I'm happy to support that it was commonly said. – Tommy Sep 19 '18 at 11:43
  • 2
    Would it be worth (someone's) while to add an answer talking about how as 3d graphics algorithms developed, they were frequently better able to be implemented on graphics systems that used a "chunky" approach vs the Amiga's (and many other older systems) "planar" approach? – Damien_The_Unbeliever Sep 19 '18 at 13:47

5 Answers

30

There are a few reasons why multi-platform 3D games turned out to be no faster on the Amiga than on other 68000-based platforms without its blitter:

The developers may have targeted the lowest common denominator system and not taken advantage of the blitter when they ported to the Amiga. Early games and poor conversions tended to be like this.

A naive implementation by somebody unfamiliar with the hardware may well fail to use the blitter's line-drawing and space-filling operations efficiently, thus losing the benefit. If one draws and fills individual polygons and then blits them to the framebuffer, that is a lot of wasteful work. AmigaOS itself didn't set a good example here.

Finally, the blitter is a pure 2D device which offers no acceleration for 3D calculations, so the main CPU needs to do those. This involves a lot of matrix multiplication, and the 68000's MULU/MULS instructions are very slow, taking up to about 70 cycles. (The exact number of cycles is data-dependent because the microcode uses an iterative algorithm.) At 7.09MHz that's a shade over 100,000 multiplies per second, or about 2,000 per 50Hz field. Even if the blitter were infinitely fast, this still sets an upper limit of a few hundred points on screen, or forces a compromise on frame rate.
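The multiply budget above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it uses the 70-cycle figure and the 7.09MHz clock quoted above, and the count of nine multiplies per vertex (a plain 3x3 rotation with no shortcuts) is an assumption for illustration, not a measurement of any particular game.

```python
CPU_HZ = 7_090_000       # PAL Amiga 68000 clock, as quoted above
CYCLES_PER_MUL = 70      # MULU/MULS cost used in the estimate above
FIELD_HZ = 50            # PAL field rate
MULS_PER_VERTEX = 9      # assumption: unoptimised 3x3 rotation per vertex


def multiply_budget(cpu_hz=CPU_HZ, cycles=CYCLES_PER_MUL, field_hz=FIELD_HZ):
    """Return (multiplies per second, multiplies per field), rounded down."""
    per_second = cpu_hz // cycles
    per_field = per_second // field_hz
    return per_second, per_field


per_second, per_field = multiply_budget()
# Upper bound on vertices if the CPU did nothing but multiply.
vertices_per_field = per_field // MULS_PER_VERTEX
print(per_second, per_field, vertices_per_field)
```

Since the CPU also has to do additions, loads, stores and the game logic, the real figure would be lower than this bound.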

Brian H
pndc
  • thx for the explanations... I had not thought about the necessary matrix stuff, and that of course makes complete sense, then – Marco Sep 19 '18 at 11:18
  • 5
    "that's a shade over 100,000 multiplies per second" ;-) – Nigel Touch Sep 19 '18 at 12:56
  • 1
    I would dare imagine that most optimised titles do not use MULU/MULS, preferring a more limited numerical range and the lookup table options that permits. But that's not to argue with your main point about calculation versus drawing costs. – Tommy Sep 19 '18 at 14:02
  • 2
    Actually the biggest performance hit in 3D calculations usually comes from the perspective divide, as divide instructions are much slower than multiply instructions. Of course like Tommy said with multiplies, in practice games would avoid using an actual divide instruction. –  Sep 19 '18 at 14:22
  • I guess one somewhat mitigating feature here is that if you do the 3D math and the blitting at the same time, the blitter will be free to do its thing while the CPU is grinding those slow math instructions (and therefore not using up any bus cycles). But yeah, multiplication and division are quite painfully slow on the 68000 (and not much better on the other 680x0 series chips until the very late 68060, really). The rather awkward planar graphics layout the Amiga used didn't really help 3D performance, either. – Ilmari Karonen Sep 19 '18 at 14:39
  • @RossRidge I guess that if you're fully 3d, no special cases, you're talking about nine multiplications and nine additions for transform per vertex, then two divides? Assuming you're conveniently not worrying about clipping, of course, e.g. because your objects are small enough that culling at z=1 is sufficient and the rest can be handled in 2d. If so then I guess it's whether a divide costs more or less than 4.5 multiplies? Unless and until you simplify your camera and object movement, of course. – Tommy Sep 19 '18 at 18:37
  • @Tommy Yes: I mainly see a lot of adds, some multiplications, and quite a few table lookups for trig. But no divisions, actually. – tofro Sep 19 '18 at 19:20
  • @Tommy The standard 3d transforms multiply a 4 element vector by a 4x4 element matrix, so in the worst case it's 16 multiplies, but in practice a lot of the elements are going to be either 0, 1 or -1 (eg. the perspective transformation matrix only has 4 elements that aren't 0, 1 or -1). Potentially some of the remaining elements are constants that can be approximated well enough with a few shifts and adds or small lookup tables. –  Sep 19 '18 at 19:44
  • @tofro The perspective divide is what makes things smaller the farther away they are. –  Sep 19 '18 at 19:46
  • 1
    @RossRidge but both the final column and final row are invariably 0, 0, 0, 1 assuming you're doing only linear transforms. So you can just multiply by the inner 3x3 and then add the final column as a translation. I think another common optimisation is spotting that most of the time you're just doing rotation, so elements are in the range [-1, 1], giving a great opportunity for locally-limited-precision arithmetic. E.g. 8x8 matrix arithmetic into a 16.16 space. But then the divides have to occur in the objective space, which further helps your point — 8x8 multiply vs at least 16x16 divide! – Tommy Sep 19 '18 at 20:09
  • 3
    The AtariST has a slightly higher clock than the Amiga, but most of the cpu time was spent in drawing the picture. The Amiga had an edge in theory because the blitter has a fill mode so you could draw lines and use the blitter to do the fill, but the constraints made it so that it wasn't that useful in games, however it was used a lot in demos – Thomas Sep 19 '18 at 23:56
  • 1
    If a lot of objects are cuboids, you could probably exploit the fact that your x, y or z coordinates are used more than once but I don't know if any games did this. – user3570736 Sep 20 '18 at 07:46
  • @RossRidge You can get rid of a lot of the divides by cleverly sizing your models and introducing some "horizontal fog". – tofro Sep 20 '18 at 07:52
  • What does @tofro mean by Horizontal Fog? – Omar and Lorraine Sep 20 '18 at 12:39
  • @Wilson reducing the viewing distance by just letting far-away objects vanish in fog (or nothing...) – tofro Sep 20 '18 at 12:44
  • @user3570736 with a unit cube you can also exploit the fact that your edges are of length 1. Which is a really easy number to multiply by! But, more useful, I think a lot of titles use symmetrical models to halve transform costs. Elite's are all symmetrical, possibly with the exception of the asteroid (?) – Tommy Sep 20 '18 at 22:34
  • 1
    The z-division was usually done by bitshifting: the eye plane distance was usually hardcoded to 128 so you could shift the fixpoint numbers by 7 bit. Nice thing was that the 68000 had 32 bit registers so you could use a single register for each coordinate. – Peter Parker Oct 30 '18 at 10:36
  • I wasn't aware the Blitter in either machine had a line drawing or area filling capability; can anyone provide an informative link? The chip type in general is aimed towards shifting rectangular blocks of data around in memory, with some ability to flip one/other/both ways, distort by copying into a different shaped block, and of course perform a variety of addition/subtraction/XOR functions between source and destination. Plus the Atari one has a somewhat funky halftone/smear function that can make some nice effects, but you have to be a bit of a wizard to get it right... – tahrey Oct 28 '19 at 00:40
  • As for clock speed, 8MHz vs 7.1MHz is hardly "slightly better", it's a good 11%. PC era videocard wars have been fought and won for smaller margins. Given that (Blitter concerns aside) drawing filled polys in both cases is pretty much a case of some simple Bresenhams and then spamming 1s or 0s into the relevant bitplanes (itself possibly easier on the ST due to different memory organisation?), which is a memory intensive thing that a higher clock rate also accelerates, it's more than enough to compensate for other areas the machine is sluggish in. – tahrey Oct 28 '19 at 00:43
  • (Naturally I may be a bit biased given the name, but this one area my childhood toy can claim victory in is quite long established... it's not like the Amigans can't claim honest wins in a hundred other areas, so let us have this one, eh? :) – tahrey Oct 28 '19 at 00:45
  • From memory, 3D engines back in the day did not use matrices, they used Euler angles or Quaternions. These were computationally lighter than full matrix transforms. – Renee Cousins Feb 03 '22 at 14:23
13

The thing the Agnus has that speeds up 3D games is the polygon fill feature. Blitting by itself is not so much a standard operation driving 3D performance. It can help 2D games and windowed GUIs a lot, however, which is one of the reasons the Atari ST also got a (much simpler) blitter in later releases.

3D games are CPU-heavy; or rather, performance is driven by how fast you can do vector arithmetic and trigonometric functions (which would mainly be done by table lookups in older games). Agnus doesn't help here, and even a 68881 wouldn't help much, as fixed-point integer maths done by a 68k is still faster. We're talking mainly integer operations here, and that is driven by raw CPU speed and bus contention.
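To illustrate what table-driven fixed-point trigonometry looks like, here is a minimal Python sketch of a 2D rotation. The table size (256 steps per turn), the 14 fractional bits and the function names are all assumptions chosen for the example; actual games picked their own formats and would have written this in 68000 assembly.

```python
import math

FRAC_BITS = 14               # assumption: sine values scaled by 2**14
ONE = 1 << FRAC_BITS
TABLE_SIZE = 256             # assumption: 256 angle steps per full turn

# Built once; a game would ship this as a precomputed data table.
SIN_TABLE = [round(math.sin(2 * math.pi * i / TABLE_SIZE) * ONE)
             for i in range(TABLE_SIZE)]


def fsin(angle):
    """Fixed-point sine by table lookup; angle is in table steps."""
    return SIN_TABLE[angle % TABLE_SIZE]


def fcos(angle):
    """Cosine is the same table read a quarter turn ahead."""
    return SIN_TABLE[(angle + TABLE_SIZE // 4) % TABLE_SIZE]


def rotate_z(x, y, angle):
    """Rotate integer point (x, y) using only integer multiplies and shifts."""
    s, c = fsin(angle), fcos(angle)
    rx = (x * c - y * s) >> FRAC_BITS   # shift drops the fixed-point scale
    ry = (x * s + y * c) >> FRAC_BITS
    return rx, ry


print(rotate_z(100, 0, 64))   # quarter turn: 64 of 256 steps
```

The only runtime costs are two lookups, four integer multiplies, two adds and two shifts per point, which is why a floating-point coprocessor offers little benefit for this style of code.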

tofro
  • Polygon fill doesn't really help with 3D though, because you have to draw the boundaries, fill the polygon and then later blit it into place, all stealing memory access cycles away from the CPU. It's faster to use the CPU to fill, rather than doing so many blitter operations (3 line draws, one fill, blit into target image). – user Sep 20 '18 at 08:10
  • @user Yep - That's what I was trying to convey with "bus contention". Fill, however, can be made a lot faster or slower by the layout of your video memory and allocation of bitplanes - The more linear, the better. The Amiga was a bit complicated in that respect, I seem to recall. – tofro Sep 20 '18 at 08:16
10

Agnus doesn't help much with 3D acceleration. It has a blitter which is capable of filling polygons, but they must then be copied into the display bitmap, so it's actually slower than just filling with the CPU. Additionally, the blitter requires the horizontal limits of the polygon to be rendered first, meaning an additional line draw per vertex.

The biggest performance limitation for 3D on the Amiga is memory bandwidth. In fact, all 16-bit home machines of the time were the same: memory bandwidth was the biggest challenge.

The maximum fill rate can be achieved by using unrolled CPU loops that write directly into the display bitmaps. There is then no copy stage, and no need to draw the polygon outlines first.
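As a rough sketch of what writing a span directly into a bitplane involves, the Python below fills pixels `x0..x1` of one row stored as 16-bit words. The function name and the list-of-words representation are assumptions for illustration; the plain `for` loop over the middle words stands in for what would be an unrolled run of word or long-word moves in 68000 assembly.

```python
WORD_BITS = 16


def fill_span(row, x0, x1):
    """Set pixels x0..x1 (inclusive) in one bitplane row.

    row is a list of 16-bit words; pixel 0 is the most significant
    bit of row[0], as in a planar display.
    """
    first, last = x0 // WORD_BITS, x1 // WORD_BITS
    # Partial words at the two ends need masks; whole words do not.
    left_mask = 0xFFFF >> (x0 % WORD_BITS)
    right_mask = (0xFFFF << (WORD_BITS - 1 - x1 % WORD_BITS)) & 0xFFFF
    if first == last:
        row[first] |= left_mask & right_mask
        return
    row[first] |= left_mask
    for i in range(first + 1, last):    # the part that would be unrolled
        row[i] = 0xFFFF
    row[last] |= right_mask


row = [0, 0, 0, 0]
fill_span(row, 4, 35)
print([hex(w) for w in row])
```

For a multi-plane colour the same span has to be written (or cleared) in each bitplane, which is where the memory bandwidth goes.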

The general strategy used by most high-performance 3D code on the Amiga is to calculate all vectors in one go, keeping as much data as possible in CPU registers to avoid memory reads. The display is then rendered by the CPU alone, as outlined above.

Since the Amiga had the same CPU as the Atari ST, running at a similar speed but with actually a little less memory bandwidth, performance was about the same. The Amiga could do some tricks to improve things, such as variable colour depth on different parts of the screen and sprites for low-overhead overlays, but in practice they tended not to be that significant performance-wise.

user
6

I did 3D on both machines, and the optimizations were different on each. I didn't do very advanced stuff at the time, mostly rotating simple objects and the like, but no games or complex scenes.

On the Amiga, you would use the blitter to draw lines and fill the bitplanes. The different bitplanes were not interleaved but sequential, so the size of the blitter pass you had to do depended on the colors used. You would draw the same lines in all the planes needed and then do a blitter fill, but filling plane 0 and plane 3 in two passes was slower than filling planes 0 and 1 in a single pass, which in turn was slower than filling only plane 1, for example.

The blitter really sped up the filling, but the bitplane layout forced color palette organization in order to make sure the largest surfaces of your objects were the fastest to fill.
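The two ideas above, outline-then-fill and the link between colour index and bitplanes, can be modelled in a few lines of Python. This is a simplified model, not a description of the actual hardware registers: it works on a list of single bits rather than words, the function names are made up for the example, and the right-to-left scan direction is an assumption based on the blitter's fill mode running in descending order.

```python
def edge_fill_row(bits):
    """Fill between pairs of outline bits in one row (inclusive of edges).

    Scans right to left, toggling an 'inside' flag at every set bit.
    """
    out = [0] * len(bits)
    inside = 0
    for i in range(len(bits) - 1, -1, -1):
        if bits[i]:
            inside ^= 1
        out[i] = inside | bits[i]
    return out


def planes_for_colour(colour, depth=4):
    """Bitplanes that must contain the shape for a given colour index."""
    return [p for p in range(depth) if colour & (1 << p)]


print(edge_fill_row([0, 1, 0, 0, 1, 0, 1, 1]))
print(planes_for_colour(9), planes_for_colour(3), planes_for_colour(2))
```

Colour 9 touches planes 0 and 3, colour 3 touches the adjacent planes 0 and 1, and colour 2 touches only plane 1, which matches the ordering of costs described above and shows why palette layout mattered.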

On the Atari ST, it was a different story: you had to draw your picture as you were building your triangles, so you needed to sort the vertices vertically and then, as you were doing line interpolations, you were filling up the screen. The edges were tricky to work with, but large surfaces could be filled in a single pass.
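The sort-then-interpolate approach described above can be sketched as follows. This is a generic scanline triangle walk in Python, assuming integer screen coordinates; the function name is hypothetical, and the per-line division is used only for clarity, where period code would typically step each edge incrementally.

```python
def triangle_spans(v0, v1, v2):
    """Return a list of (y, x_left, x_right) spans covering a triangle.

    Vertices are (x, y) integer tuples in any order.
    """
    # Sort vertices top to bottom, as described above.
    a, b, c = sorted((v0, v1, v2), key=lambda v: v[1])

    def edge_x(p, q, y):
        """Integer x position of edge p->q at scanline y."""
        if q[1] == p[1]:
            return p[0]
        return p[0] + (q[0] - p[0]) * (y - p[1]) // (q[1] - p[1])

    spans = []
    for y in range(a[1], c[1] + 1):
        x_long = edge_x(a, c, y)                  # edge spanning full height
        x_short = edge_x(a, b, y) if y < b[1] else edge_x(b, c, y)
        spans.append((y, min(x_long, x_short), max(x_long, x_short)))
    return spans


spans = triangle_spans((0, 0), (8, 4), (0, 8))
print(spans[2], spans[4], spans[6])
```

Each span can be written to the screen as soon as it is computed, so the picture is built in a single pass with no separate outline or fill stage.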

In short, the ST required more CPU work, but drawing the picture could be done in a single pass. On the Amiga, the process was generally faster, but the memory layout generated a lot more memory traffic.

Despite these differences, the two came remarkably close in performance.

Thomas
2

It's because of the particular things each system was best at, and could better accelerate. 3D games were actually one of the few cases (plus a few other number-crunching sims?) where games ran faster (not just "at the same speed") on the ST than on the Amiga. And it's entirely because of the reason you first posit: the Atari had an ~11% faster CPU clock.

The games in question were very heavily CPU-bound: their simulation, and their drawing as well, relied almost entirely upon numeric calculation (and getting stuff in and out of RAM, which was also faster). The actual moving-data-to-the-screenbuffer part was a relatively small piece of the whole setup. I daresay that the Amiga's custom chipset still made that operation somewhat faster than the same job on the ST, at least for the parts of the job it was actually capable of accelerating. It's not really much of a GPU; it's pretty much just a console-style hardware-sprite-and-scroll engine with a few fancy tricks, like line-by-line auto-fill (which helps a bit with rendering solid polygons if the programmer knows how to, or can be bothered to, make use of it) or the register bit-banging provided by the Copper. But the ST was already entirely capable of shoving the calculated and rendered output to the screen pretty much as fast as it was produced (by double buffering, so the final "painting" of the screen was a single register write), so the best the chipset could have done was maybe shave one scan-time off the construction and display of each frame.

There's no line or box or other primitive drawing acceleration, no large-area auto fill, no bezier curves, absolutely nothing that would count as even the most rudimentary 3D acceleration, the sprites or second playfield only appear as overlays (and for the most part are irrelevant here, other than PF2 being used as a cockpit/HUD for a fairly minor speedup), hardware scrolling is pretty much useless, as are most Copper tricks, and even the (less efficient than the ST's - assuming the Atari had one fitted of course) Blitter is of limited usefulness.

If the complexity of the scene was such that it took the Amiga CPU fully 8 scans to finish computing and rendering the image, but the ST only needed 7 (entirely plausible given that a lot of 3D polygon games really struggled to do any better than maybe 6~7fps, which is a 7 to 8 "frameskip" on a PAL monitor, and worse on NTSC), then even that minor improvement to the rest of the system latency wouldn't do anything more than bring them back to level-pegging.
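The effect of rounding up to whole display fields can be shown with a little arithmetic. In this Python sketch the clock figures are the nominal ones from the discussion (7.09MHz and 8MHz), while the workload of 1,100,000 CPU cycles per frame is a made-up number chosen only so that the result lands on the 8-versus-7 case described above. It also ignores bus contention and anything the custom chips might contribute.

```python
FIELD_HZ = 50                    # PAL field rate
AMIGA_HZ = 7_090_000             # nominal PAL Amiga clock
ST_HZ = 8_000_000                # nominal Atari ST clock
WORK_CYCLES = 1_100_000          # hypothetical CPU cost of one frame


def fields_needed(work_cycles, cpu_hz, field_hz=FIELD_HZ):
    """Whole display fields needed to finish one frame (rounded up)."""
    cycles_per_field = cpu_hz // field_hz
    return -(-work_cycles // cycles_per_field)   # integer ceiling division


def frames_per_second(fields, field_hz=FIELD_HZ):
    return field_hz / fields


amiga_fields = fields_needed(WORK_CYCLES, AMIGA_HZ)
st_fields = fields_needed(WORK_CYCLES, ST_HZ)
print(amiga_fields, frames_per_second(amiga_fields))
print(st_fields, round(frames_per_second(st_fields), 2))
```

Because a double-buffered frame can only be shown on a field boundary, a modest clock advantage either makes no visible difference or, as in this contrived case, gains a whole field per frame.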

Pretty much any other type of contemporary game, sure, the Amiga had it comfortably in the bag. But for sims and other 3D stuff, the ST just about nosed ahead.

tahrey