You might want to take it further: render an image as the human eye would capture it, or even as a human being would perceive it.
There are two ways to interpret this. I'll do both.
Interpretation 1: Render an image that looks perceptually realistic.
At the end of the day, your image still needs to be displayed somewhere. Here's the key: you want to render your image in such a way that when you *display* that image on a particular display device, it will produce the same sensation the original radiometric image would have produced.
Here's how to unpack that idea.
In the real world, radiometric spectra (i.e., real distributions of light) enter your eye and stimulate approximately¹ four types of light receptors. The stimulation of those receptors produces the sensations of color we associate with images.
In rendering, we don't have arbitrary control over the spectra we produce. Fortunately, since we (usually) have only three types of cones, each of which produces only a scalar response, color vision can be reproduced using exactly three primaries. The bottom line is that you can produce any color sensation with a linear combination of just three wavelengths (up to a few colors whose weights would have to be negative, in which case you just use different primaries).
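To state that a bit more formally (the symbols below are just labels I'm introducing for the cone sensitivity curves and the incoming spectra): two spectra $\Phi_1(\lambda)$ and $\Phi_2(\lambda)$ produce the same color sensation, i.e., they are *metamers*, exactly when they stimulate each cone type equally:

$$\int \Phi_1(\lambda)\,\bar{c}_i(\lambda)\,d\lambda \;=\; \int \Phi_2(\lambda)\,\bar{c}_i(\lambda)\,d\lambda, \qquad i \in \{L, M, S\},$$

where $\bar{c}_L$, $\bar{c}_M$, $\bar{c}_S$ are the three cone sensitivity curves. Since only these three integrals matter, a weighted mix of three fixed primaries can match any reachable triple of cone responses (modulo the negative-weight caveat above).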
You don't get a choice of primaries, though. Almost all color display devices use the sRGB standard, which specifies three primaries (which, in practice, are not single wavelengths). That's fine, because it turns out it's all abstracted away and you don't have to care.
To clarify the mess that is perceptually accurate rendering, here's the algorithm:
- Render your image using correct radiometric calculations. You trace individual wavelengths of light or buckets of wavelengths. Whatever. In the end, you have an image that has a representation of the spectrum received at every point.
- At each pixel, take the spectrum you rendered and convert it to the CIE XYZ color space. This works out to integrating the product of the spectrum with the standard observer functions (see the CIE XYZ definition; a sketch in code follows this list).
- This produces three scalar values: the pixel's CIE X, Y, and Z coordinates.
- Use a matrix transform to convert this to linear RGB, and then from there use a linear/power transform to convert linear RGB to sRGB.
- Convert from floating point to uint8 and save, clamping values out of range (your monitor can't represent them).
- Send the uint8 pixels to the framebuffer.
- The display takes the sRGB colors and applies the inverse transform to recover an intensity for each of its three primaries. Each intensity scales the output of the picture element it is responsible for. The picture elements light up, producing a spectrum that is (hopefully) a metamer of the spectrum you originally rendered.
- You perceive the spectrum as you would have perceived the rendered spectrum.
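To tie the middle steps together, here's a minimal sketch in Python/NumPy. The matrix and transfer function are the standard sRGB (D65) ones; the sampled observer curves (`xbar`, `ybar`, `zbar`), the wavelength step, and any exposure/normalization scaling are assumptions you'd have to supply from your own renderer.

```python
import numpy as np

# XYZ -> linear sRGB matrix (sRGB standard, D65 white point).
XYZ_TO_LINEAR_SRGB = np.array([
    [ 3.2406, -1.5372, -0.4986],
    [-0.9689,  1.8758,  0.0415],
    [ 0.0557, -0.2040,  1.0570],
])

def spectrum_to_xyz(spectrum, xbar, ybar, zbar, dlam):
    """Integrate the pixel's spectrum against the CIE standard observer curves.

    spectrum, xbar, ybar, zbar: arrays sampled on the same wavelength grid.
    dlam: wavelength step of that grid (nm). Exposure/normalization is up to you.
    """
    X = np.sum(spectrum * xbar) * dlam
    Y = np.sum(spectrum * ybar) * dlam
    Z = np.sum(spectrum * zbar) * dlam
    return np.array([X, Y, Z])

def linear_to_srgb(c):
    """Piecewise linear/power sRGB transfer function, with clamping."""
    c = np.clip(c, 0.0, 1.0)  # out-of-range values: your monitor can't show them
    return np.where(c <= 0.0031308,
                    12.92 * c,
                    1.055 * np.power(c, 1.0 / 2.4) - 0.055)

def xyz_to_srgb_uint8(xyz):
    """Matrix transform to linear RGB, encode to sRGB, quantize to uint8."""
    linear_rgb = XYZ_TO_LINEAR_SRGB @ xyz
    return np.round(linear_to_srgb(linear_rgb) * 255.0).astype(np.uint8)
```

What you send to the framebuffer is the uint8 sRGB triple; the display then effectively applies the inverse of `linear_to_srgb` and the inverse matrix on its end.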
Interpretation 2: Attempt to simulate the data the human eye would actually receive, for visualization purposes or to compensate for LDR displays.
This one has a less useful meaning, I think. Essentially, you're trying to produce an image that tweaks the way the brain perceives it for fun/profit.
For example, there was a paper at SIGGRAPH this year where they simulated afterimages and color reduction to make images appear perceptually different. Of course, the only reason to do this at all is that the displays we're working with are all low-dynamic-range (LDR). The point is to simulate, as actual image data, the effects someone would see if exposed to a real high-dynamic-range (HDR) display.
In practice, this turns out not to work very well. Take afterimages: we see them because a very bright stimulus fatigues the photoreceptors. If you instead try to simulate the effect with a fake afterimage, it might look kind of similar, but since it's produced by a completely different mechanism, it's not very convincing.
This sort of graphics is actually underexplored in the literature, if you want to make a go at it. The paper mentioned above is an example of more or less the most state-of-the-art approach we have. I think the current consensus, though, is that it isn't really worth trying to simulate (at least at this time): at best you'd only be approximating real vision effects by substituting different mechanisms for them, and that doesn't really work.
¹ Rod + 3×cones, the usual case. Approximate because humans may have as few as zero functional light receptors, up to a conjectured maximum of seven (with the highest ever observed being five).