18

AFAIK the VGA mode 13h palette has only 64 possible levels (6 bits) per channel.

One obvious way to map those 64 colors to 256 colors is to multiply them by 4 (since 4 * 64 = 256):

8_bit = 6_bit * 4;

This is however not accurate since biggest value is 63 and 4 x 63 = 252 not 255.

Some online resources I found suggest doing the following:

8_bit= (6_bit * 255) / 63;
8_bit= (6_bit << 2) | (6_bit >> 4);

This is much better (63 is now mapped to 255). However those are not equivalent (it will not give the same result, for example 48 is mapped to 194 and 195 respectively).
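To see where the two formulas part ways, here is a minimal sketch that runs both over all 64 inputs. The function names (`expand_div`, `expand_rep`, `count_mismatches`) are made up for illustration and are not from any of the resources mentioned.

```c
/* Two common 6-bit -> 8-bit expansions from the question. */

/* Scale by 255/63 with truncating integer division. */
int expand_div(int v6) { return (v6 * 255) / 63; }

/* Shift left by 2 and copy the top two bits into the low two bits. */
int expand_rep(int v6) { return (v6 << 2) | (v6 >> 4); }

/* Count the inputs in 0..63 for which the two methods disagree. */
int count_mismatches(void)
{
    int n = 0;
    for (int v6 = 0; v6 < 64; v6++)
        if (expand_div(v6) != expand_rep(v6))
            n++;
    return n;
}
```

Both variants agree at the end points (0 and 63) and differ by at most 1 in between, for example at 48.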

I am wondering if someone knows how VGA hardware works (how those 6-bit values are mapped to analog signals by the DAC). The last formula seems the most plausible since it's fairly easy to implement in hardware.

tigrou
  • 2
    These are linear voltage DAC, where values are linearly converted into voltages. 8bits palettes just have more intermediate steps. Actual light intensity isn't linear (gamma), but the correct conversion from 6bits to 8bits is still to spread linearly the 6bits values over the 8bits range. – Grabul Jul 24 '23 at 22:08
  • @ilkkachu moved the comment to answer with code and result LUT ... – Spektre Jul 26 '23 at 13:40
  • I'm not sure how a brightness increase of 1.6% qualifies as "much better". I would call it insignificant, especially considering you have to give up linearity to achieve it. – Dmitry Grigoryev Jul 27 '23 at 13:18
  • I know exactly how the video hardware works for this kind of conversions: I've implemented all the variety of conversions between 4bit, 5bit, 6bit, 8bit and 10bit colors in HW. See my answer on the approach to this. – lvd Jul 27 '23 at 14:21
  • It seems to just be me but since I was a kid programming Z80 and m68k assembler such solutions involving multiplying by one less than a power of two always felt wrong to me. I remember agonizing about it a lot back in the day and finally settled on adding one, then shifting, then subtracting one. It felt to me like multiplying/dividing by the highest value instead of the number of possible values was a kind of off-by-one error. That was of course fast on CPUs with no mul but probably slower today. I'm not an engineer or mathematician but it felt like the right solution and results were great. – hippietrail Jul 28 '23 at 04:04
  • (The above is a comment on the whole thread, not on the question or comments above. But it seemed best to put it here since it comes up in multiple answers.) – hippietrail Jul 28 '23 at 04:05
  • 8_bit= (6_bit << 2) | (6_bit >> 4); - this makes 0 map to 0 and 63 to 255. Isn't this good enough? – Thorbjørn Ravn Andersen Jul 28 '23 at 11:22

5 Answers

20

The most accurate way of course depends on what is the purpose. No way is perfect and there is no one true method for all purposes.

First of all, the original VGA used the INMOS G171 RAMDAC. As a real-world IC with real-world tolerances, it is not that accurate: it can have a full-scale error of 4% and a 2% error between channels, there are non-linearities, and the other real-world components around it that set the reference current have tolerances and non-linearities too.

But in an ideal world, the DAC simply has 64 equally spaced levels, with the lowest and highest set to correspond to 0V and 0.7V. At least on paper. In practice the white level may not be exactly 0.7V, and since it has some tolerance anyway, the exact voltage does not really matter.

Now, if that signal is fed to a digital monitor (or TV) with VGA input, it has to sample the analog voltages and convert them to digital, with the imperfections of both the source and the monitor being accounted for. Therefore, when a VGA card outputs level 63 for white, there is no guarantee that an 8-bit monitor samples it as 255 or 252; 63 may be sampled below 255, but it may also be that even 62 already saturates at 255, so the information is lost. So, as said, the exact value is not that important.

When time passed on and PCs started to have 8-bit RAMDACs, they were still compatible with 6-bit colours, and could be set to just use 6 bits for VGA modes instead of 8 for VESA modes, and they just left the 2 LSBs as zeroes. So if a BT476 RAMDAC was set to output exactly 0.7V for 8-bit code 255, the 6-bit code would be 252 as 8-bit code and thus voltage less than 0.7V.
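A rough sketch of the arithmetic behind that, assuming an ideal DAC and the nominal 0.7V full scale. The names `FULL_SCALE_V`, `dac6_volts` and `dac8_in_6bit_mode_volts` are hypothetical and not taken from any datasheet.

```c
/* Nominal full-scale video level in volts (ideal value, ignores tolerances). */
#define FULL_SCALE_V 0.7

/* Ideal 6-bit DAC: code 63 reaches full scale. */
double dac6_volts(int code6)
{
    return FULL_SCALE_V * code6 / 63.0;
}

/* Ideal 8-bit DAC in 6-bit mode: the 6-bit code is shifted left by two,
   so the two LSBs of the 8-bit code are always zero. */
double dac8_in_6bit_mode_volts(int code6)
{
    int code8 = code6 << 2;
    return FULL_SCALE_V * code8 / 255.0;
}
```

With these assumptions, 6-bit white on the 8-bit DAC comes out at 0.7V * 252/255, which is roughly 1.2% below full scale.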

If you simply want to make an emulator or just a screenshot and convert 6-bit colors to 8-bit colors, just multiply by 4. So what, if white is (252,252,252) instead of (255,255,255). The same thing is done in modern digital video processing too, to convert 8-bit values to 10 (or 12) bits, e.g. 8-bit white level is 235 and 10-bit white level is 940. This guarantees that the step between each level is equal in size: 4, which is most accurate in that sense, and it is also easy to convert from 10-bit to 8-bit by just ignoring the two LSBs.
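As a small illustration of that point, widening by two bits is just a left shift, and the same operation covers both the 6-bit to 8-bit case and the 8-bit to 10-bit video case. The helper names here are invented for the example.

```c
/* Widen a sample by appending two zero bits: every step becomes exactly 4. */
unsigned widen_by_2_bits(unsigned v) { return v << 2; }

/* Narrow again by dropping the two LSBs. */
unsigned narrow_by_2_bits(unsigned v) { return v >> 2; }
```

For instance, `widen_by_2_bits(63)` gives 252, and the limited-range video levels 16 and 235 become 64 and 940; narrowing returns the original values.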

If for some reason you must map a (0..63) range to (0..255) or (0..100% for any bit depth) then any method is fine. It just means you get 100% white if you want, but steps are not always equal, as step is either 4 or 5. There is no reason to target for 255 anyway. Whether you truncate, round, or just use the 2 MSBs as the expanded LSBs, it only changes where the jump-of-5 codes happen, so any one of them is good enough and none of them is the ultimately best method.
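The "step is either 4 or 5" claim can be checked with a short sketch like the one below. The three mapping functions and `count_steps` are illustrative names, and the rounding term 31 (half of 63, rounded down) is my own choice for the round-to-nearest variant.

```c
/* Three ways to map 0..63 onto 0..255. */
int map_truncate(int v6)  { return (v6 * 255) / 63; }        /* truncate */
int map_round(int v6)     { return (v6 * 255 + 31) / 63; }   /* round to nearest */
int map_replicate(int v6) { return (v6 << 2) | (v6 >> 4); }  /* MSBs reused as LSBs */

/* Count how many of the 63 steps between adjacent codes have the given size. */
int count_steps(int (*map)(int), int size)
{
    int n = 0;
    for (int v6 = 1; v6 < 64; v6++)
        if (map(v6) - map(v6 - 1) == size)
            n++;
    return n;
}
```

All three mappings end up with 60 steps of 4 and 3 steps of 5; they only differ in where the steps of 5 sit.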

Perhaps best to use the simple multiply by 4 method, as that is how cards with 8-bit DACs emulated a 6-bit DAC.

Sep Roland
Justme
  • 2
    When I tried it, I got best results by if (0) then 0 else fill low bits with 1s. – Joshua Jul 25 '23 at 17:08
  • "The same thing is done in modern digital video processing too to convert 8-bit values to 10 (or 12) bits, e.g. 8-bit white level is 235 and 10-bit white level is 940." - wait, what? Why 235 and 940 instead of 255 and 1020? – user2357112 Jul 26 '23 at 08:19
  • 3
    @user2357112 Computer graphics use "full range" of 0..255 and digital video uses "limited range" of 16..235. 16 for nominal reference black and 235 for nominal reference white, allowing for undershoots below 16 and overshoots above 235 when digitizing an analogue source to digital, and to have some processing headroom for e.g. digital filtering. Some digital interfaces (SDI) have codes 0 and 255 reserved not for video but for synchronizing information and as e.g. audio data packet headers. Which means that in 10-bit world video range is from 4 to 1019, 0..3 and 1020..1023 being unavailable. – Justme Jul 26 '23 at 08:38
  • Where does your last claim come from? – lvd Jul 27 '23 at 14:18
  • 1
    If you want an algorithm to convert full-on to full-on when extending bit depth, then do it by repeating bits from the beginning. – davolfman Jul 27 '23 at 16:20
  • @davolfman As already said, you do that if that is what you want. But, as already said, as done by 8-bit RAMDACs and digital video ever since it was standardized in the early 80s, you might not want to do that. – Justme Jul 27 '23 at 18:21
  • @lvd last claim? It was already said that that's how 8-bit VGA RAMDACs already did, and that's how standard digital video does it, and it keeps the step size constant, and you won't see the difference of 252 not being as bright as 255. However, you might see an uneven step of 5 in the middle of steps of 4 in the darkest regions. – Justme Jul 27 '23 at 18:24
  • So it is exactly my question -- where can it be seen how exactly the vga (every kind of vga compatible hardware probably?) does that conversion? Regarding what will or won't be seen, some example pictures with simple gradients might be more helpful in deciding that. – lvd Jul 29 '23 at 17:22
  • @lvd RAMDAC datasheets tell how they handle the 6-bit to 8-bit conversion. And I only skimmed through a few discrete RAMDAC data sheets, BrookTree and AnalogDevices, and one of them even warns that you get 1.2% less than full 8-bit scale so it does not surprise the user. – Justme Jul 30 '23 at 20:20
  • Which ramdacs? Maybe some links to bitsavers? – lvd Aug 01 '23 at 13:48
  • @lvd To name a few, Sierra SC11488, Winbond W82C478, Brooktree Bt477. – Justme Aug 01 '23 at 22:15
  • @Justme all those datasheets contain a text about zeroing LSB bits as well as they state that it will give "1.5% lower levels" while in reality it would be just 1.2% -- (255-63*4)/255 is ~1.2%. Now I'm guessing, apart from the fact they were copying text one from another (or all from some unknown source), which else statements might be not true there? – lvd Aug 02 '23 at 11:16
  • @lvd Maybe they used 4/256 instead of 3/256, who knows, the important thing is that (a) the two LSB bits are zeroed and (b) even if some DACs do compensate for the drop, they compensate it on the analog side, so the digital steps are still all equal in size. Remember, the user has the brightness and contrast knobs. Also fun fact is that based on multiple IBM PS/2 model schematics, the output is not even 0.7V, it's at least 2% less, and the INMOS RAMDACs are used with reference current that is below suggested minimum for proper operation. – Justme Aug 02 '23 at 11:40
15

Justme detailed some of the reasons that the exact conversion method probably doesn't matter. Another reason is that you are probably going to interpret the 8-bit output according to the sRGB standard, which postdates VGA by a decade. I'm not so sure that VGA monitors of the late '80s had primaries or transfer functions that were very close to sRGB, or each other.

All that aside, in principle, to scale a 6-bit channel linearly to 8 bits, you should multiply by 255/63 = 85/21 and round to the nearest integer, which in a C-like language with truncating or floor division works out to

eight_bit = (six_bit * 85 + 10) / 21;

or equivalently

eight_bit = (six_bit * 259 + 33) >> 6;

Likewise, when converting from 24-bit to 48-bit color you should multiply by 65535/255 = 257, not 256, and when converting from 24-bit sRGB to a linear colorspace, you should divide by 255, not 256, to get a properly normalized value to plug into the transfer function.
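For anyone who wants to convince themselves that the two expressions really agree, a brute-force check over all 64 inputs is enough. This is only a sketch; the function names are mine.

```c
/* Round-to-nearest scaling by 255/63 = 85/21, using truncating division. */
int scale_div(int six_bit) { return (six_bit * 85 + 10) / 21; }

/* The shift-based variant of the same scaling. */
int scale_shift(int six_bit) { return (six_bit * 259 + 33) >> 6; }

/* Returns 1 if both formulas agree for every 6-bit input, 0 otherwise. */
int formulas_agree(void)
{
    for (int v = 0; v < 64; v++)
        if (scale_div(v) != scale_shift(v))
            return 0;
    return 1;
}

/* 8 bits to 16 bits per channel: 65535/255 is exactly 257. */
unsigned widen_8_to_16(unsigned v8) { return v8 * 257u; }
```

The factor 257 has a nice property: multiplying an 8-bit value by it is the same as writing the byte twice, so 0xAB becomes 0xABAB.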

benrg
  • Nitpick: floor division does not necessarily produce the same result as floating-point division (or exact arithmetic division) followed by rounding to the nearest integer. But of course, floor division is likely faster, and the difference overall is just in where the 5-step increments occur. – John Bollinger Jul 25 '23 at 14:14
  • @JohnBollinger They're equivalent in this case because n/21 can't be halfway between integers. 10/21 rounds down and 11/21 rounds up. – benrg Jul 25 '23 at 15:36
  • You don't have to hit exactly halfway. With floor division, 10/21 truncates down, and 11/21 also truncates down. This is exactly the point: 11/21 does not round up. – John Bollinger Jul 25 '23 at 15:41
  • 3
    @JohnBollinger I added 10 before dividing. By "with floor division" I just meant that the language has an integer floor (/truncating) division operator. It's not always easy to simulate floating-point behavior with it, but in this case it is. – benrg Jul 25 '23 at 15:59
  • 1
    An alternative would be to simply copy the top two bits to the bottom two bits. This would yield larger steps between 15-16, 31-32, and 47-48, as opposed to between 10-11, 31-32, and 52-53, but that's just a slight rounding difference. – supercat Jul 25 '23 at 17:52
  • 1
    @supercat That was one of the options that OP listed, and it's probably good enough. This answer (after the first paragraph) is just about getting the theoretically most accurate answer to this narrowly defined problem. – benrg Jul 25 '23 at 18:24
  • Gamma values for VGA monitors were wildly variable. Mine had a gamma of 2.9 (compare to the sRGB standard of 2.2). – Mark Jul 25 '23 at 23:54
  • @Mark on the other hand sRGB was carefully designed to match the average monitor of the day, so even if it doesn't match a specific one chances are it's good enough for an emulation. – Mark Ransom Jul 26 '23 at 03:17
  • The sRGB standard may not be relevant here. Basically that's how monitor interprets the voltage on connector to brightness/intensity. My point being, given two video cards, for the same 6-bit RAMDAC code they should output same voltage, and since more modern RAMDACs are 8-bit, the voltage depends on how the RAMDAC interprets 6-bit codes into 8-bits. – Justme Jul 26 '23 at 15:36
  • The video DACs more modern than those used in 90ies (S)VGA cards lack the ability to 'interpret' 6-bit codes, since they only accept 8 bits per color component. – lvd Aug 02 '23 at 11:11
8

[Preface: This question isn't really about retrocomputing but about hardware in general, so it might be moved]

TL;DR:

Absolute brightness on screen is not defined by the signal, but by the screen's brightness setting (*1). For all practical means it's only important that the steps used are of equal distance and in total reasonably close to the desired range (*2).

For real displays it's a non-issue, thus no one would have cared.

The situation may be different if this is about intermediate handling - which the question doesn't seem to be about.


One obvious way to map those 64 colors to 256 colors is to multiply them by 4 (since 4 * 64 = 256):

Yes, that's the obvious and also usual approach. Although any hardware designer would see this rather as shifting than multiplying.

This is however not accurate since biggest value is 63 and 4 x 63 = 252 not 255.

True as well, but a real-world application usually wouldn't care. After all, the maximum error is not only just 1.2% (3/256) but also linear, that is, the error is continuous, increasing monotonically over all values, thus not changing any relations. This is most important, as human vision is more receptive to relative differences than to absolute values.

Not to mention that absolute brightness is not controllable by the source signal, but only provided by the screen. That means it's only important that each of the encoded steps is of the same size.

I am wondering if someone knows how VGA hardware works (how those 6-bit values are mapped to analog signals by the DAC).

Well, that's rather simple: A VGA compatible output signal is 'encoded' as voltage of 0.0 to 0.7 Volt at 75 Ohm. That is 0.0 Volt for no colour, 0.7 Volt for maximum level. With 6 bits on the input side this means the voltage level is subdivided into 64 levels.

Some online resources I found suggest to do the following:

8_bit= (6_bit * 255) / 63;
8_bit= (6_bit << 2) | (6_bit >> 4);

Both are great approximations - if one really wants to close in on that 1.2% (*3).

This is much better (63 is now mapped to 255). However those are not equivalent (it will not give the same result, ...

Both reduce the error to about 0.4% (*4), but at different points. That's because the first is an arithmetical approximation, while the second is a mapping in 4 groups. The arithmetic distribution is rather continuous, while the mapping distributes the error in discrete intervals. They are overall the same, with a similar result, but of course not identical.

... for example 48 is mapped to 194 and 195 respectively).

Well, it's about spreading, making three of the 63 steps 5/256th instead of 4/256th.

Arithmetic Mapping    Group Mapping
     20/ 80              15/ 60
     41/165              31/125
     62/250              47/190

The last formula seems the most plausible since it's fairly easy to implement in hardware.

Yeah, as the mapping

 0..15 ->   0.. 60   (or In * 4 + 0)
16..31 ->  65..125   (          + 1)
32..47 -> 130..190   (          + 2)
48..63 -> 195..255   (          + 3)

is simply done by routing:

In1 -> Out1 -> Out7
In2 -> Out2 -> Out8
In3 -> Out3
In4 -> Out4
In5 -> Out5
In6 -> Out6

No electronics needed :))

(In1..In6/Out1..Out8 are bits in decreasing value)
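The routing can also be mimicked in software, one wire at a time. The following is only a sketch of the wiring list above, with bits numbered 1..n from the most significant end as in the note; `get_bit` and `route_6_to_8` are hypothetical names.

```c
/* Bit k of a value that is `width` bits wide, numbering bits 1..width
   from the most significant bit (so k = 1 is the MSB). */
int get_bit(int value, int width, int k)
{
    return (value >> (width - k)) & 1;
}

/* Build the 8-bit output by routing: Out1..Out6 come from In1..In6,
   Out7 and Out8 are second taps on In1 and In2. */
int route_6_to_8(int in6)
{
    int out8 = 0;
    for (int k = 1; k <= 8; k++) {
        int src = (k <= 6) ? k : k - 6;   /* Out7 <- In1, Out8 <- In2 */
        out8 = (out8 << 1) | get_bit(in6, 6, src);
    }
    return out8;
}
```

This reproduces the four groups listed above, for example 15 gives 60, 16 gives 65 and 63 gives 255.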


*1 - For practical purposes the brightness setting of a screen can be seen as an analogue mapping of the input signal toward the screen.

*2 - Being close is all that counts, not how close, as all analogue circuitry is designed to work best with values around that 0.0..0.7 Volt range.

*3 - Since brightness can easily differ by 20% between screens, not to mention different curves of adaption, looking for where to place those 3 uneven steps is in reality rather useless.

*4 - By being quantized that's the lowest possible error anyway (1/256).

Sep Roland
Raffzahn
  • 1
    Heh. Let's not even get into "blacker than black" when displaying video content (e.g., DVD) on a VGA display ... – davidbak Jul 24 '23 at 21:44
  • @davidbak h yeah, the collision between how TV and VGA frame the 0..0.7V signals ;)) – Raffzahn Jul 24 '23 at 21:47
  • @Raffzahn Well PAL defines black as 0V and white as 0.7V. As long as the syncs are -0.3V it is compatible. – Justme Jul 24 '23 at 22:38
  • @Justme it's relative, so shifting the base doesn't make an argument. – Raffzahn Jul 24 '23 at 22:43
  • @Raffzahn To be fair, most RAMDACs that could output 1Vpp composite signal used data codes 00..FF for the 0.7V video as usual and the sync input brought the signal down to the sync level. – Justme Jul 24 '23 at 22:54
  • This is not even nearly exact compared to the "do the exact math, round to nearest" approach. – lvd Jul 27 '23 at 14:15
  • I edited the English earlier but wasn't 100% sure about two parts so left them: 1: "but not of course not the same" - should it be "but not of course the same", "but of course not the same", or should it really have both "nots"? 2: Being close is only relevant - should it be "only relative"? – hippietrail Jul 28 '23 at 10:56
3

I would use simple linear search to construct a conversion LUT with minimal error. Here is simple C++ code for conversion from 6-bit to 8-bit:

#include <cstdint>
#include <cmath>

uint8_t LUT[64];

void build_LUT()
{
    int i,j,jj;
    float ai,aj,d,dd,di=1.0/63,dj=1.0/255;

    for (ai=0.0,i=0;i<64;i++,ai+=di)          // all 6-bit values
    {
        jj=0; dd=1.0;
        for (aj=0.0,j=0;j<256;j++,aj+=dj)     // all 8-bit values
        {
            d=fabs(ai-aj);                    // compute difference
            if (dd>d){ dd=d; jj=j; }          // remember smallest difference
        }
        LUT[i]=jj;                            // store it to LUT
    }
}

It's possible to optimize it a lot by search around interpolated value instead of full search or use binary search. However I see no point in it as the runtime is pretty fast and in final usage I would use a hard-coded LUT anyway ... Here is the hard-coded result from the code above:

uint8_t LUT[64] = 
   { 
   0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 45, 49, 
   53, 57, 61, 65, 69, 73, 77, 81, 85, 89, 93, 97, 
   101, 105, 109, 113, 117, 121, 125, 130, 134, 138, 
   142, 146, 150, 154, 158, 162, 166, 170, 174, 178, 
   182, 186, 190, 194, 198, 202, 206, 210, 215, 219, 
   223, 227, 231, 235, 239, 243, 247, 251, 255 
   };

So convert like this:

8bit_value = LUT[6bit_value];

As you can see it's mostly incrementing by 4 except a few times (at indexes: 11,32,53) where it increments by 5 so it's possible to convert this to an equation something like:

8bit_value = (4*6bit_value) + ((6bit_value+10)/21);

Which leads to the same results ...

Another option is to use:

8bit_value = ((6bit_value*255)+32)/63

As @RichF suggested in his/her answer, however (s)he was missing the rounding +32 term before the division. Anyway this also leads to the same results ...
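A quick way to check the two closed forms against each other, and to locate the steps of 5, is a loop over all inputs. This is just a sketch with made-up helper names, not part of the code that produced the LUT.

```c
/* The two closed-form variants from the text. */
int lut_formula_a(int v6) { return 4 * v6 + (v6 + 10) / 21; }
int lut_formula_b(int v6) { return (v6 * 255 + 32) / 63; }

/* Returns 1 if both formulas give the same value for all 64 inputs. */
int closed_forms_agree(void)
{
    for (int v6 = 0; v6 < 64; v6++)
        if (lut_formula_a(v6) != lut_formula_b(v6))
            return 0;
    return 1;
}

/* Stores the indexes where f increments by 5 into out[] (at most max_out
   entries) and returns how many such indexes were found in total. */
int find_steps_of_5(int (*f)(int), int out[], int max_out)
{
    int n = 0;
    for (int v6 = 1; v6 < 64; v6++) {
        if (f(v6) - f(v6 - 1) == 5) {
            if (n < max_out)
                out[n] = v6;
            n++;
        }
    }
    return n;
}
```

Run on either formula, `find_steps_of_5` should report the same three indexes as the hard-coded table.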

Sep Roland
Spektre
  • 1
    That has got to be quite extreme way of calculating it - it simply maps 64 steps to 256 steps so that 0..63 maps to 0..255. The step will always be 4 in 60 cases and in 3 cases it is 5. The question is what algorithm to use to distribute the 3 cases, unless the answer is to not distribute them at all and just always use out=4*in. – Justme Jul 26 '23 at 15:28
  • @Justme there are 3 approaches in there one LUT and two using linear interpolation all with the same accuracy I see no extremism in here – Spektre Jul 26 '23 at 16:14
  • 3
    There's no need to "search" anything. It really boils down to rounding x*255/63 to nearest integer. And the question is not about using a LUT, but what is the "most correct" way of scaling/mapping the 6-bit values to 8-bit values i.e. for calculating the LUT values. – Justme Jul 26 '23 at 16:24
  • 1
    I would concur that using a lookup table for something that can be done with few basic instructions is kinda extreme. – Raffzahn Jul 26 '23 at 18:26
  • @Raffzahn depends on circumstances like used HW and purpose in low level gfx LUTs are often used as its much faster in HW 64x8 bit LUT is not that much stretch in comparison to adder + divider, in SW its just 64 Bytes and on MCUs it does not even need to be in RAM on modern CPU based machines there is usually enough RAM so 64 Bytes its not a concern. – Spektre Jul 29 '23 at 02:20
  • @Spektre In Hardware it doesn't need any LUT, but a bunch of wires, and in software it's 2-7 instructions depending on CPU, which again is less than a table. – Raffzahn Jul 29 '23 at 12:15
  • @Raffzahn my experience in SW is that LUT is faster especially on older HW – Spektre Jul 29 '23 at 23:30
  • @Spektre As mentioned, in hardware it's just a bunch of wires. And for everything worthwhile in real-world usage it's a shift by two positions left, ignoring that 3 value issue as it doesn't matter at all. Can't see how a shift by two gets beaten by any LUT logic. Also, 64 bytes is quite a lot on older HW. – Raffzahn Jul 29 '23 at 23:38
2

The general formula, as given by OP, is actually the most exact:

$$v_8 = v_6 \cdot \frac{255}{63}$$

This formula describes the (obvious) idea that the most black color component keeps being the most black, and the most saturated component keeps being such too.

Obviously enough, the intermediate values are non-integer. If we can multiply and divide, the things are easy -- do the math, round to nearest integer and you're done. If we can't or we don't want to multiply and divide, however, other approaches must be explored.

Let's play a little with the formula and try to get rid of the division:

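$$v_8 = v_6 \cdot \frac{255}{63} = v_6 \cdot \frac{2^{2}\,(1 - 2^{-8})}{1 - 2^{-6}} = v_6 \cdot \left(2^{2} + 2^{-4} - 2^{-6} + 2^{-10} - 2^{-12} + \dots\right)$$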

Now we can see that the OP's second option (that is, v8=(v6<<2)+(v6>>4)) corresponds to the first two terms of the infinite expansion without rounding to nearest, the same as @Raffzahn's proposal.

However, taking more terms and doing proper round to nearest would make the conversion more accurate. To demonstrate that, here's a simple C program doing conversions within several different numbers of terms:

#include <stdio.h>
#include <math.h>

double exact(int v6)
{
    return ((double)v6)*255.0/63.0;
}

double shr6(int v6)
{
    return (double)((v6<<2) + (v6>>4));
}

double t3(int v6)
{
    // implements v6<<2 + v6>>4 - v6>>6 then round to nearest
    return (double)(((v6<<8) + (v6<<2) - v6 + 32)>>6);
}

double t4(int v6)
{
    // implements v6<<2 + v6>>4 - v6>>6 + v6>>10 then round to nearest
    return (double)(( (v6<<12) + (v6<<6) - (v6<<4) + v6 + 512 )>>10);
}

int main(void)
{
    double err_shr6=0, err_t3=0, err_t4=0;

    for(int v6=0;v6<64;v6++)
    {
            double r_exact = exact(v6);
            double r_shr6  = shr6 (v6);
            double r_t3    = t3   (v6);
            double r_t4    = t4   (v6);

            printf("v6=%2d, exact=%7.3f, shr6=%5.1f, t3=%5.1f, t4=%5.1f\n", v6, r_exact, r_shr6, r_t3, r_t4);

            // accumulate squared errors against the exact value
            err_shr6 += (r_exact-r_shr6)*(r_exact-r_shr6);
            err_t3   += (r_exact-r_t3  )*(r_exact-r_t3  );
            err_t4   += (r_exact-r_t4  )*(r_exact-r_t4  );
    }

    // note: the sqrt makes these root-mean-square errors
    printf("shr6 mean squared error: %f\n", sqrt(err_shr6/64.0));
    printf("t3   mean squared error: %f\n", sqrt(err_t3  /64.0));
    printf("t4   mean squared error: %f\n", sqrt(err_t4  /64.0));

    return 0;
}

Its output is:

v6= 0, exact=  0.000, shr6=  0.0, t3=  0.0, t4=  0.0
v6= 1, exact=  4.048, shr6=  4.0, t3=  4.0, t4=  4.0
v6= 2, exact=  8.095, shr6=  8.0, t3=  8.0, t4=  8.0
v6= 3, exact= 12.143, shr6= 12.0, t3= 12.0, t4= 12.0
v6= 4, exact= 16.190, shr6= 16.0, t3= 16.0, t4= 16.0
v6= 5, exact= 20.238, shr6= 20.0, t3= 20.0, t4= 20.0
v6= 6, exact= 24.286, shr6= 24.0, t3= 24.0, t4= 24.0
v6= 7, exact= 28.333, shr6= 28.0, t3= 28.0, t4= 28.0
v6= 8, exact= 32.381, shr6= 32.0, t3= 32.0, t4= 32.0
v6= 9, exact= 36.429, shr6= 36.0, t3= 36.0, t4= 36.0
v6=10, exact= 40.476, shr6= 40.0, t3= 40.0, t4= 40.0
v6=11, exact= 44.524, shr6= 44.0, t3= 45.0, t4= 45.0
v6=12, exact= 48.571, shr6= 48.0, t3= 49.0, t4= 49.0
v6=13, exact= 52.619, shr6= 52.0, t3= 53.0, t4= 53.0
v6=14, exact= 56.667, shr6= 56.0, t3= 57.0, t4= 57.0
v6=15, exact= 60.714, shr6= 60.0, t3= 61.0, t4= 61.0
v6=16, exact= 64.762, shr6= 65.0, t3= 65.0, t4= 65.0
v6=17, exact= 68.810, shr6= 69.0, t3= 69.0, t4= 69.0
v6=18, exact= 72.857, shr6= 73.0, t3= 73.0, t4= 73.0
v6=19, exact= 76.905, shr6= 77.0, t3= 77.0, t4= 77.0
v6=20, exact= 80.952, shr6= 81.0, t3= 81.0, t4= 81.0
v6=21, exact= 85.000, shr6= 85.0, t3= 85.0, t4= 85.0
v6=22, exact= 89.048, shr6= 89.0, t3= 89.0, t4= 89.0
v6=23, exact= 93.095, shr6= 93.0, t3= 93.0, t4= 93.0
v6=24, exact= 97.143, shr6= 97.0, t3= 97.0, t4= 97.0
v6=25, exact=101.190, shr6=101.0, t3=101.0, t4=101.0
v6=26, exact=105.238, shr6=105.0, t3=105.0, t4=105.0
v6=27, exact=109.286, shr6=109.0, t3=109.0, t4=109.0
v6=28, exact=113.333, shr6=113.0, t3=113.0, t4=113.0
v6=29, exact=117.381, shr6=117.0, t3=117.0, t4=117.0
v6=30, exact=121.429, shr6=121.0, t3=121.0, t4=121.0
v6=31, exact=125.476, shr6=125.0, t3=125.0, t4=125.0
v6=32, exact=129.524, shr6=130.0, t3=130.0, t4=130.0
v6=33, exact=133.571, shr6=134.0, t3=134.0, t4=134.0
v6=34, exact=137.619, shr6=138.0, t3=138.0, t4=138.0
v6=35, exact=141.667, shr6=142.0, t3=142.0, t4=142.0
v6=36, exact=145.714, shr6=146.0, t3=146.0, t4=146.0
v6=37, exact=149.762, shr6=150.0, t3=150.0, t4=150.0
v6=38, exact=153.810, shr6=154.0, t3=154.0, t4=154.0
v6=39, exact=157.857, shr6=158.0, t3=158.0, t4=158.0
v6=40, exact=161.905, shr6=162.0, t3=162.0, t4=162.0
v6=41, exact=165.952, shr6=166.0, t3=166.0, t4=166.0
v6=42, exact=170.000, shr6=170.0, t3=170.0, t4=170.0
v6=43, exact=174.048, shr6=174.0, t3=174.0, t4=174.0
v6=44, exact=178.095, shr6=178.0, t3=178.0, t4=178.0
v6=45, exact=182.143, shr6=182.0, t3=182.0, t4=182.0
v6=46, exact=186.190, shr6=186.0, t3=186.0, t4=186.0
v6=47, exact=190.238, shr6=190.0, t3=190.0, t4=190.0
v6=48, exact=194.286, shr6=195.0, t3=194.0, t4=194.0
v6=49, exact=198.333, shr6=199.0, t3=198.0, t4=198.0
v6=50, exact=202.381, shr6=203.0, t3=202.0, t4=202.0
v6=51, exact=206.429, shr6=207.0, t3=206.0, t4=206.0
v6=52, exact=210.476, shr6=211.0, t3=210.0, t4=210.0
v6=53, exact=214.524, shr6=215.0, t3=214.0, t4=215.0
v6=54, exact=218.571, shr6=219.0, t3=219.0, t4=219.0
v6=55, exact=222.619, shr6=223.0, t3=223.0, t4=223.0
v6=56, exact=226.667, shr6=227.0, t3=227.0, t4=227.0
v6=57, exact=230.714, shr6=231.0, t3=231.0, t4=231.0
v6=58, exact=234.762, shr6=235.0, t3=235.0, t4=235.0
v6=59, exact=238.810, shr6=239.0, t3=239.0, t4=239.0
v6=60, exact=242.857, shr6=243.0, t3=243.0, t4=243.0
v6=61, exact=246.905, shr6=247.0, t3=247.0, t4=247.0
v6=62, exact=250.952, shr6=251.0, t3=251.0, t4=251.0
v6=63, exact=255.000, shr6=255.0, t3=255.0, t4=255.0
shr6 mean squared error: 0.345033
t3   mean squared error: 0.287384
t4   mean squared error: 0.286086

You can play with it interactively here: https://godbolt.org/z/sMr5azYM8

From the results we see that:

  • The "shr6" method, that is, (v6<<2)+(v6>>4) gives the worst results,
  • "t3", the three-terms method with explicit round-to-nearest is significantly better,
  • "t4", four-term method, gives exact integer result.

The similar approach could be used for any other conversions of color depth.
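As a rough sketch of what such a general conversion could look like with plain multiply and divide (not the shift-and-add form above), assuming bit depths between 1 and 16. The function name `rescale_bits` is hypothetical.

```c
#include <stdint.h>

/* Rescale v from a from_bits-wide range to a to_bits-wide range so that
   0 maps to 0 and the maximum maps to the maximum, rounding to nearest.
   Assumes 1 <= from_bits, to_bits <= 16 and v <= (1 << from_bits) - 1. */
uint32_t rescale_bits(uint32_t v, unsigned from_bits, unsigned to_bits)
{
    uint64_t max_from = (1u << from_bits) - 1u;
    uint64_t max_to   = (1u << to_bits) - 1u;

    /* Adding half of the divisor before dividing rounds to nearest. */
    return (uint32_t)((v * max_to + max_from / 2u) / max_from);
}
```

For example, `rescale_bits(53, 6, 8)` gives 215 like the "t4" column, and the same call shape handles 4-bit, 5-bit or 10-bit components.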

lvd
  • This has been suggested by many - what makes one of these the most correct in theory? i.e. what is the theory, or standard, or proof, other than 0..63 must be mapped to 0..255 (which is only a must if you want that, and you may not even want that, based on how 8-bit RAMDACs already mapped 6-bit values to 8-bit, or how digital video standardized how conversions from 8-bit to 10-bit should be done). – Justme Jul 27 '23 at 21:40
  • When there's a need to perform further calculations with the values, having the most accurate value is actually the best thing to do. – lvd Jul 29 '23 at 17:30
  • Yes but if there is a need to keep the steps most accurate, then you multiply by four exactly. Multiplying by 255/63 and then truncating misaligns the steps (a bit). If you need to do further calculations, most accurate thing is to not truncate and then do more calculations with the truncated values. Please note that it does not matter a lot if white is simply 4x63, as that's already what most 8-bit RAMDACs have already done for years converting 6-bit codes to 8-bit, so why that isn't that the most accurate then? Why give so much weight on converting 63 to 255 which gives unequal steps? – Justme Jul 29 '23 at 19:15
  • What application could ever require "keep the steps most accurate"? – lvd Aug 01 '23 at 13:47