Optimal Data — The Short Story

With any digital camera, you collect data. This seems like a simple enough thing to understand, but it's surprising to me how many people don't grok that, or worse, acknowledge it but then go on to make incorrect assumptions.

The data you collect is the data you collect. This, too, seems like a simple thing to understand, but it's often skipped over. The data you obtain will not be the optimal set of data that could be collected, though these days our image sensors and lenses are getting remarkably good and do a better job than ever before. Wait, I just added lenses into the equation. That's right: the actual equation for "resolution"—which seems to be a term those of you seeking "better" always bring up—is a chained equation, a series of things that all contribute, not a single thing. But let's not get too far from the thread: you get one chance to collect data. Once collected, that's the data you have to work with.
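To illustrate that chained equation, here's a minimal sketch using one common back-of-the-envelope model (combining each component's resolution limit in quadrature). The specific numbers and the `system_resolution` helper are mine for illustration, not from any particular camera or lens:

```python
# Sketch: "resolution" is a chain of contributions, not one number.
# A common rough model combines each component's limit in quadrature:
#   1/R_system^2 = 1/R_lens^2 + 1/R_sensor^2 + ...

def system_resolution(*component_lp_mm):
    """Rough combined resolution (line pairs/mm) of chained components."""
    return sum(1 / r**2 for r in component_lp_mm) ** -0.5

# Hypothetical numbers: a lens limited to 100 lp/mm on a sensor
# limited to 80 lp/mm yields roughly 62 lp/mm for the system.
r = system_resolution(100, 80)
assert r < 80  # the chain always costs you: worse than either component
```

The point of the model: the system can never out-resolve its weakest link, and every link subtracts something.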

It's easy to discard data. Indeed, if you take JPEG images, you're doing just that—discarding data—and considerably so. The fact that you can't see a difference in the visual result isn't meaningful: by using JPEG you just threw out a ton of data. You did so with bit depth, and you did so again with compression. Those two factors together can reduce 30MBs of data to 2MBs.
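Just the bit-depth part of that loss is easy to see. A toy sketch (linear requantization only; real JPEG pipelines apply a gamma curve first, and compression discards still more):

```python
# Sketch: why reducing bit depth discards data irreversibly.
# A 14-bit raw value (0..16383) squeezed into 8 bits (0..255)
# maps 64 distinct input values onto every single output value.

def to_8bit(raw14):
    """Linear requantization of a 14-bit value to 8 bits."""
    return raw14 >> 6  # drop the 6 least-significant bits

# Two clearly different 14-bit captures...
a, b = 8192, 8255
assert a != b
# ...become identical after the 8-bit conversion:
assert to_8bit(a) == to_8bit(b)  # both map to 128
```

Once those 64-to-1 mappings happen, no software can tell you which of the 64 original values a pixel held.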

Whatever data you discard, you can't get back. Here we hit the crux of my long-time argument, and why I try to collect optimal 14-bit uncompressed (or losslessly compressed) data in the first place. If you end up with 2MBs of JPEG data in the camera, you can't get back the other 28MBs that might have been in the raw file. Moreover, there's a related issue that Sony users have to understand: setting lens corrections in camera changes the raw data. You can't get back to what was actually collected either way, whether you discard or change the data.

If you add data, you've definitely entered an alternate universe to the "real" data. A lot of people seem to think that there is some magical software that can add back data. Once thrown away, data can't be restored (my previous point). But to my point, this notion of adding "faux" data comes up with resizing products. Yes, Topaz Gigapixel AI is a pretty good way to increase the number of pixels you have to work with, as is Adobe's Super Resolution function. Both make (mostly) visually appealing additions to your data set. But now you're in the realm of making up data, not recreating data that should have been there. Indeed, the whole raw conversion process for Bayer and X-Trans image sensors is already doing a bit of this, as both require the software to interpolate (make up a reasonable value for) missing color data at each photosite position.
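That demosaic interpolation can be sketched in a few lines. This is the simplest possible (bilinear) estimate, with made-up neighbor values; real converters use far more sophisticated methods, but the principle is the same:

```python
# Sketch: demosaicing invents ("interpolates") values that were never
# measured. At a red photosite in a Bayer array, no green was captured,
# so the green value is estimated from the four green neighbors.
# Toy values, not a real raw pipeline.

def interpolate_green(up, down, left, right):
    """Bilinear estimate of the missing green sample."""
    return (up + down + left + right) / 4

g = interpolate_green(100, 104, 98, 102)
# g is a plausible value (101.0), but it was fabricated, not captured
```

Two-thirds of the color data in a typical Bayer image is produced this way: reasonable, but made up.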

So here's the thing: I talk all the time about optimal capture of data and optimal processing of data. I don't want to discard data, I don't want to add (interpolate) new data if I can avoid it, and I don't want to have to change data, either.

It's "changing data" that gets a lot of people in trouble, and fast.

The most common situation I see is someone complaining that the dynamic range of their camera is inadequate. But when I take a close look at their image data, they've been underexposing. They didn't collect optimal data during the capture. Then they took the data they did collect—which often wasn't the optimal data set they could have collected—and moved the pixel values to make up for that (changing data). Then they complain about what they see, such as more noise. (And obviously, if you see noise at all, your camera must be rotten ;~).

Other examples abound: missed white balance settings, use of non-neutral demosaics, missed focus points, shutter speeds too long or too short for the subject, use of the wrong ISO value, and many, many more. Get the data capture slightly wrong and you give yourself downstream issues to correct. Process the data slightly wrong, and you generate further downstream issues you'll be correcting.

There are only two currently defensible positions in terms of image data: (1) what you collected looks visually good enough to you, so you have no issues and do no or minimal processing; and (2) the data you collected was optimally captured and optimally processed, but you probably still have very small issues you want to address (or would want the camera/processor to eliminate).

The in-between position is where all the arguments start. False flag arguments, in many cases. Which makes it even more difficult to discern the real problems and solutions in the cantankerous world of the Internet.

I recently tried to catalog all the issues I could have with data collection. I gave up after I had a list of over 100 items, some of which are (mostly) out of my control (e.g. fixed pattern noise). And this was if I did everything “perfectly” in data collection. 

So, as you ponder updating your gear this holiday season, consider something else, too: what are you doing to guarantee that you're getting the best possible data collection as you photograph? The more you find wanting in your answer to that question, the less you need a new camera and the more you need more training and practice.


A copy of this article has been saved in the Technique section of this site.


bythom.com: all text and original images © 2024 Thom Hogan
portions Copyright 1999-2023 Thom Hogan
All Rights Reserved — the contents of this site, including but not limited to its text, illustrations, and concepts,
may not be utilized, directly or indirectly, to inform, train, or improve any artificial intelligence program or system.