
A question for programmers about debayering algorithms



The order is R, G, B. So what you can do is very simple...

You might want to rethink that.

A single photosite is not a pixel, nor vice-versa. RAW files do not have pixels. They hold photosite values.

 

In order to reconstruct a RAW file, you need to deconvolve four photosites from one single pixel, and just adding four zeroes onto an 8-bit value does not give you a 12-bit value. That would be a totally pointless piece of coding.


Bill- I did not work with Michael Kriss. I worked with digital sensors in the 80s; in the 90s, optical networks. I enjoyed collaborating with my wife, as she was doing a lot of image processing in the 90s. Now I write custom code for my CCD-based Leica cameras just for fun, and write code for embedded systems for a living in the 21st century.

 

Karim- I'm not sure, but what I think you are trying to do falls under "SEI" (Specific Emitter Identification), i.e. identifying which specific camera produced a digital image. The metadata fields placed in each file are easily stripped off. Doing this with a reduced-size JPEG loses too much information to backtrack, and going from 12~14 bits down to 8 bits is going to wipe out any pattern noise that might relate the image to a particular sensor. So back to "rephotography" of an image- using some sort of image scene metrics, feature extraction, etc., it would probably be possible to identify a reproduced image as coming from someone's original file.


You might want to rethink that.

A single photosite is not a pixel, nor vice-versa. RAW files do not have pixels. They hold photosite values.

 

In order to reconstruct a RAW file, you need to deconvolve four photosites from one single pixel, and just adding four zeroes onto an 8-bit value does not give you a 12-bit value. That would be a totally pointless piece of coding.

 

Whether you call it a pixel or a photosite (or a doohickey or a doodad) doesn't matter. What matters is that each doohickey corresponds to a geographic location in the rectangular sensor on which it was captured and in which it will ultimately be rendered for viewing. The file of 0s and 1s that arises at capture in the sensor (14 such bits for each doohickey in most high-end sensors today) leads, via mathematical transforms, to the final 0s and 1s of a TIFF or JPEG file that cause the display to show an image. The OP's question was: can one reverse this mathematical transform?

 

Each photosite on the sensor does in general correspond directly to a pixel. So for instance on my Nikon Z7, the sensor is a rectangular grid of 8256 x 5504 doohickeys = 45.4 million total. Each doohickey returns 14 bits. When I take a photograph in RAW full-size mode, the .NEF file size (if uncompressed) should be 45.4M x (14/8) bytes = 79.5 MB. The file that actually downloads to my computer is approximately 60 MB, but this is explained by lossless compression. If I instead set my in-camera capture mode to process the raw data and output a full-size TIFF, the image is still 8256 x 5504 (just like the sensor). Now each of the 45.4M sites, via the mathematical transforms of de-Bayering, rendering, etc., contributes 3 bytes to the resulting file = 3 x 45.4M = 136 MB. Sure enough, when I download the TIFF from the camera, the file is approximately 136 MB.

 

Note that the TIFF from the camera has more bits than the RAW, despite the fact that the TIFF came from the RAW. This seems counterintuitive. How can the TIFF have "more information" than the sensor that provided it in the first place? It certainly cannot have more image information. I think the explanation is that the extra data in the TIFF file is information that aids in displaying: TIFF files are displayable quickly and easily in a host of user software applications. The RAW file from which it arose is a sort of compressed version of the TIFF file. Note I said 'sort of'; no need to explain compression algorithms to me.

 

So since the TIFF has more bits than the original RAW, it should be possible to reverse the process. Again, as I said in my earlier post, it won't be what Nikon (or whichever manufacturer supplied the RAW input) explicitly did, but it should still be possible to take the 24 bits in each element of the 45.4M-long RGB array and down-convert to 14 bits. Note that this conversion will not be done element by element (trivially one could take the 24 bits and throw away the lower ten, or the upper ten, or the middle ten). The conversion applied to each array element will be based on the values of neighboring elements as well. I leave the details and the actual coding to the interested reader. :)


The TIFF stores RGB values for each pixel, using some interpolation scheme. In the old days, the original value for the site would be stored, and the two other color planes would be interpolated using nearest-neighbor. With that scheme, the original RAW image could be reconstructed using only the RGB values that were not interpolated.

Demosaic routines can be much more complex, including eliminating aliasing in the interpolation algorithm. Every value of the image, in all three image planes, is then an interpolated value. With that scheme it is not possible to reconstruct the raw values exactly; there will be "round-off" error in the interpolation.

I wrote a DNG processor to add a gamma curve to my M Monochrom files. I converted the 14-bit values to 16 bits using the curve, so that the non-linear transform would output a unique value for every unique input value; the created image could be reversed to the same input values, since the slope of the gamma curve was never greater than 4. If it had been 14-bit to 14-bit, some different input values would have been translated to the same output value. So you can get something back in the original raw format, but it is likely not to contain the same information as the original raw file.
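A minimal sketch of that reversible-curve idea (the actual curve used above isn't given, so a simple power-law gamma stands in as an assumption): build a 14-bit-to-16-bit lookup table and verify directly that no two inputs collide, so the transform can be backed out exactly.

```python
# Sketch of an invertible tone-curve LUT: 14-bit input -> 16-bit output.
# The gamma value is an assumption; the point is only that mapping into a
# wider container can keep every distinct input value distinct.
GAMMA = 1.0 / 2.2
IN_MAX, OUT_MAX = (1 << 14) - 1, (1 << 16) - 1

# One 16-bit output for each possible 14-bit input value.
lut = [round((v / IN_MAX) ** GAMMA * OUT_MAX) for v in range(IN_MAX + 1)]

# Reversible only if the table is injective (no two inputs share an output).
assert len(set(lut)) == IN_MAX + 1

# The inverse table recovers the exact original 14-bit values.
inv = {out: v for v, out in enumerate(lut)}
assert all(inv[lut[v]] == v for v in range(IN_MAX + 1))
```

A 14-bit-to-14-bit table built the same way fails the injectivity assertion, which is the collision problem described above.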

Each photosite on the sensor does in general correspond directly to a pixel.

Sorry, but it doesn't.

Each pixel is an amalgamation of four photosites - in a standard Bayer-mosaic sensor. That amalgamation then moves on by one photosite and combines the next cluster of four RGGB photosites... and so on.

 

There is no direct relationship between a particular pixel in the final image and one particular photosite. How can there be, when each photosite is filtered only red, green or blue, and a pixel requires all three colours to describe it?

 

In its simplest form the creation of a pixel goes: take the average (digitised) value of two diagonally adjacent green photosites, add the red and blue digital values of the two photosites that occupy the other diagonal, raise those values to the power of the inverse gamma of the target colour space, and store the resulting three values as the colour descriptor for one pixel. Now move one photosite horizontally or vertically and repeat the process.
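As a rough illustration only (the array name, the RGGB layout and linear 0..1 values are all assumptions here, and edge handling is ignored), that recipe in code might look like:

```python
import numpy as np

def simple_pixel(mosaic: np.ndarray, y: int, x: int, gamma: float = 2.2):
    """Build one RGB pixel from the 2x2 photosite square whose top-left
    corner is (y, x), assuming an RGGB Bayer layout with linear values
    in 0..1. A sketch of the scheme described above, nothing more."""
    r = mosaic[y, x]                                   # red photosite
    g = (mosaic[y, x + 1] + mosaic[y + 1, x]) / 2.0    # average the two greens
    b = mosaic[y + 1, x + 1]                           # blue photosite
    # Encode with the inverse gamma of the target colour space.
    return tuple(c ** (1.0 / gamma) for c in (r, g, b))
```

Sliding that 2x2 window one photosite at a time then yields roughly one pixel per photosite, minus the border.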

 

The only relationship between photosite 'doohickeys' and pixel 'doohickeys' is that you end up with approximately the same number of them. Except that you can't start creating pixels at photosite number one, because it has no neighbouring photosites above it or to its left. And neither does the last photosite in the matrix have neighbours below it or to its right.

 

An easier way to imagine the process is that one pixel replaces the central joining point of a square consisting of two green 'tiles' and one red and one blue 'tile' tessellated together.

 

Once the colour of those tiles is mixed, and the colour-space gamma, white-balance shift and other necessary hue and tonal shifts have been applied, it becomes quite a complex task to get back to anything close to the original photosite values that created the finished pixel. Remember that each photosite affects the colour value of four pixels, and therefore the tone and hue of four adjacent pixels need to be considered in the reconstruction of one photosite.

Edited by rodeo_joe|1

Note that the TIFF from the camera has more bits than the RAW, despite the fact that the TIFF came from the RAW. This seems counterintuitive. How can the TIFF have "more information" than the sensor that provided it in the first place?

That's easily explained.

TIFF files only come in a few types - 24-bit (3 x 8 bits/channel) and 48-bit (3 x 16 bits/channel) being two common ones. So any colour depth over 8 bits/channel from the camera needs to be packed into a 48-bit TIFF, which naturally takes up more space than the 12 or 14 bits/channel that pop out of the camera's analogue-to-digital converter.
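A tiny sketch of the packing (numpy assumed; left-shifting by two places is one common convention, scaling by 65535/16383 is another):

```python
import numpy as np

# 14-bit samples straight from the ADC (values 0..16383).
raw14 = np.array([0, 1000, 16383], dtype=np.uint16)

# Packed into a 16-bits-per-channel TIFF word: shift left 2 places so full
# scale lands near full scale. This adds container bits, not information.
tiff16 = raw14 << 2
print(tiff16)  # [    0  4000 65532]
```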


I thought this whole thing was answered with this thread by BeBu Lamar?

 

Of course there are many ways. The basics of it you can read here:

Demosaicing - Wikipedia

No- the question was whether you could take an image that had gone through the demosaic process and reconstruct the original raw image. The answer to that is "probably not, with most modern algorithms", and "what is the purpose- why do you want to do this?" If the purpose is image authentication and preventing theft of intellectual property, there are some other methods that may be of interest.


You might want to rethink that.

A single photosite is not a pixel, nor vice-versa. RAW files do not have pixels. They hold photosite values.

 

In order to reconstruct a RAW file, you need to deconvolve four photosites from one single pixel, and just adding four zeroes onto an 8-bit value does not give you a 12-bit value. That would be a totally pointless piece of coding.

Firstly, files have pixels; sensors have photosites. I'll accept 'photosite value' as well, as it's obviously more accurate. But it is not wrong to say that RAW files have pixels.

 

Secondly, I made a greater error than the one you pointed out - I used two bytes from a TIFF file to describe two pixels in a reconstructed RAW file. Oops. And I also did not take into account the maths behind debayering, although my simple method would yield results which may be of interest to some. Converting 8 bits to 12 (or 14, or 16) is not pointless if you want to create a file that a RAW converter can read without any extra steps.

 

The formula, n / 255 x 4095 (note that I had to correct the equation) does not merely place four trailing zeroes on a number, as you can demonstrate to your own satisfaction.

 

Every value of the image, in all three image planes, is then an interpolated value. With that scheme it is not possible to reconstruct the raw values exactly.

But - how close can you get, and can you tell that it was done? I don't know of anyone who has done such a thing.

 

If the purpose is image authentication and preventing theft of intellectual property, there are some other methods that may be of interest.

Yes, that is the problem I'm trying to think about. I asked myself, "Can an image authenticate itself?" Probably not. It's (almost) like asking, "Can a file embed its own hash?" Even if it could, a fake file could also embed its own hash. So it's not a simple problem.

 

It seems that the RAW file can almost solve this problem - media companies may have to start uploading only RAW files, which would be debayered by Web browsers. This is not exactly a perfect guarantee of authentication, but it's a starting point.


But it is not wrong to say that RAW files have pixels

It is.

A RAW file holds photosite values only, and has to be de-Bayered, or otherwise de-mosaiced and extensively processed before being visible as a collection of pixels.

 

If you can't see the clear difference between a pixel and a photosite value, or don't know that a RAW file contains only photosite values and not pixels, then there's almost no chance of this venture being successful!

The formula, n / 255 x 4095 (note that I had to correct the equation) does not merely place four trailing zeroes on a number.

If n is integer or truncated to 8 bits it does. It's exactly the same as multiplying by 16 or shifting the binary number 4 places to the left and leaving four trailing zeroes. In fact even if n is floating, it's still exactly the same as multiplying by 16 or shifting four binary places to the left. It adds no precision whatsoever.

 

BTW 4095/255 does indeed give a non-integer multiplier, but it's wrong, and still a constant.

Edited by rodeo_joe|1

Once again: it can be said that files have pixels, and that sensors have photosites.

 

If n is integer or truncated to 8 bits it does. It's exactly the same as multiplying by 16 or shifting the binary number 4 places to the left and leaving four trailing zeroes. In fact even if n is floating, it's still exactly the same as multiplying by 16 or shifting four binary places to the left. It adds no precision whatsoever.

 

BTW 4095/255 does indeed give a non-integer multiplier, but it's wrong, and still a constant.

I stated in an earlier post that you would not gain precision by converting an 8-bit value to a 12-bit one. The only purpose of that exercise would be to create a file with the values a RAW converter expects.

 

Secondly, I will demonstrate below that such a conversion does not merely add four trailing zeroes to a number (it can, but not always).

 

We will assume a pixel value of 100 out of 256 possible values, working in decimal to start with:

100 / 255 = 0.392156862745098

0.392156862745098 x 4095 = 1605.882352941176

Rounded to the nearest integer: 1606

The binary number for 100 is 1100100. Adding 4 trailing zeroes to it gives the 12-bit number 11001000000, which is 1600 in decimal. But the binary number for 1606 is 11001000110, which does not have four trailing zeroes. Therefore converting 8-bit values to 12-bit values may not be done by merely adding four trailing zeroes.
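Putting the two conversions side by side makes the disagreement concrete; a quick check of the arithmetic above:

```python
# Rescaling 0..255 to 0..4095 versus appending four binary zeroes (x16).
for n in (100, 135, 255):
    scaled = round(n / 255 * 4095)    # full range maps to full range
    shifted = n << 4                  # four trailing zeroes
    print(n, scaled, bin(scaled), shifted, bin(shifted))

# 100 -> 1606 0b11001000110  vs  1600 0b11001000000
# 255 -> 4095 0b111111111111 vs  4080 0b111111110000
```

The two differ because 4095/255 is slightly more than 16: rescaling sends 255 to 4095, while shifting sends it to 4080 and leaves the top of the 12-bit range unused.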


Yes, yes, yes.

You're still incorrectly multiplying by 4095/255 to get a totally false value that means nothing.

 

4095/255 = 16.05882353

A meaningless multiplier.

 

Let's work in decimal to make it easier to understand your error.

 

Say we have a calculator that only shows 6 digits. If we want to increase the capacity to 8 digits, we would have to add two more digit positions or get an 8-digit calculator. This would make absolutely no difference to any 6-digit number entered into either calculator, and would require no mathematical operation to be carried out on that number.

 

For example: the number 12.3456 would remain 12.3456 on our expanded 8-digit calculator, with no need to multiply it by 9999/99, which is the equivalent of what you're suggesting we do in binary.

 

All that's needed to make an 8 bit binary number 'fit' into a 12, 14, or 16 bit calculation is to add leading zeroes to the number. Not multiply it by anything, especially not 4095/255.

 

E.g. decimal 135 in binary is 10000111, and even if we put that byte into a 16-bit word, it remains 0000000010000111.

Whereas, if we do as you propose, it becomes 100001111000, or 2168 in decimal. Which means what?

Once again: it can be said that files have pixels, and that sensors have photosites.

Files contain data - full stop. It's what those data represent that's important, and the data in a RAW file represent the brightness values of photosites. They do not represent full pixels. Otherwise they wouldn't be RAW files.

Edited by rodeo_joe|1

A hash value is typically used to authenticate that a file has not been changed, MD5 being the most common. With a hash, the most minute change in the file will produce a vastly different output hash, making changes easy to spot. An image that has gone through the demosaic process has pixel values that are dramatically different from the raw values, except in the trivial case where an original value is stored in the TIFF file without being truncated or interpolated. As for terminology: 40 years ago, when this stuff was still in the research lab, the term pixel ("picture element") was already in use. So I will stick with calling them pixels.
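A two-line illustration of that hash property, using Python's hashlib (MD5 shown because it's the algorithm named above; SHA-256 would be the modern choice):

```python
import hashlib

data = bytearray(b"14-bit photosite values would go here...")
print(hashlib.md5(data).hexdigest())

data[0] ^= 0x01                        # flip a single bit
print(hashlib.md5(data).hexdigest())   # a vastly different digest
```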

 

You lose precision when going from 14 bits to 8 bits. If any transforms have been applied- "Photoshop curves", histogram processing, etc.- you just cannot get back with a meaningful, simple numerical conversion. You would need to know all of the operations applied to get close- such as backing out a curve- and even then the best you can do is a "jagged" reconstruction, as you get round-off errors in the integer math.

 

SO- if someone presented this problem of how to authenticate an image, I can think of how to compare a processed image against a raw image by using statistical comparisons of the features in the image: things like frequency and phase from a Fourier transform, and spatial comparisons from a wavelet transform- things I learned by listening to my wife.
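One hedged sketch of what such a frequency-domain comparison could look like, assuming numpy and two equal-sized greyscale arrays (magnitude only; phase could be compared similarly, and the wavelet side would need a library such as PyWavelets):

```python
import numpy as np

def spectral_similarity(img_a: np.ndarray, img_b: np.ndarray) -> float:
    """Correlate the log-magnitude Fourier spectra of two equal-sized
    greyscale images. A feature-comparison sketch, not a complete
    authentication scheme."""
    fa = np.log1p(np.abs(np.fft.fft2(img_a)))
    fb = np.log1p(np.abs(np.fft.fft2(img_b)))
    fa = (fa - fa.mean()) / fa.std()   # standardise so the result is a
    fb = (fb - fb.mean()) / fb.std()   # correlation coefficient in [-1, 1]
    return float((fa * fb).mean())     # near 1.0 = spectra match closely
```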

 

"A bit related", I modeled the portion of the image thrown away by the Leica compression scheme used in the M8 and M9.

 

Leica - I don't like lossy image compression...

 

I use M8RAW2DNG for my M8, and never used compressed images on the M9. What is shown is the "chaotic" behavior of the scheme, meaning the lost information is related to the contents of the image.

Edited by Brian

[Two attached images: the as-shot frame and the processed frame.]

 

As shot on the M8 with a B&W 090 red filter (about the same as a Kodak Wratten R25), alongside the processed image. The blue channel gets IR only, green gets mostly IR, and the red channel is mostly red plus about 5% IR. Histogram equalization was used to boost blue and green to the same level as red.

 

Going from the processed image back to the original makes my head hurt- and I wrote the code that does this conversion on the DNG file directly.

To compare the two images using features- the "spatial content"- convert both to monochrome and run an auto-correlation.
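A sketch of that last step, assuming numpy and two equal-sized monochrome arrays (strictly a cross-correlation between the two frames, computed via the FFT):

```python
import numpy as np

def correlation_peak(mono_a: np.ndarray, mono_b: np.ndarray) -> float:
    """Circular cross-correlation of two equal-sized monochrome images via
    the FFT; the normalised peak is near 1.0 when the spatial content
    matches, even if one image has been shifted."""
    a = mono_a - mono_a.mean()
    b = mono_b - mono_b.mean()
    corr = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
    return float(corr.max() / (np.linalg.norm(a) * np.linalg.norm(b)))
```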


SO- if someone presented this problem of how to authenticate an image, I can think of how to compare a processed image against a raw image by using statistical comparisons of the features in the image: things like frequency and phase from a Fourier transform, and spatial comparisons from a wavelet transform- things I learned by listening to my wife.

Yes, but you'd have to have the two images to compare. If you use RAW publishing plus a hash plus blockchain timestamps, you've pretty much done as much as you can... I think. Keep in mind that any image, fake or real, can have a hash and a blockchain timestamp. The key is to get in first with the original RAW file, and even then it's not a 100% guarantee.

 

Edit: Sorry I haven't commented on the actual images, as I'm so focused on the authentication problem! I will give those a proper look so I can try and understand what you're doing there.

 

 

 

All that's needed to make an 8 bit binary number 'fit' into a 12, 14, or 16 bit calculation is to add leading zeroes to the number. Not multiply it by anything, especially not 4095/255.

I will let an engineer or scientist settle the matter. But I am sure I'm correct, because I'm translating values from one scale to another, and that is the correct method. It's how you get percentages, which is grade 6 level stuff.


Each cell in a raw image contains an 8- or 16-bit number representing a luminosity value. Any value of 8 bits or fewer is stored as an 8-bit number; anything from 9 to 16 bits is stored as a 16-bit number, and so on. Most digital cameras have 12-14 bit depth, saved as 16-bit numbers. In converting the raw image to a TIFF file, color from the corresponding Bayer filter element is composited with color data from adjacent cells, so each cell becomes a pixel ("picture element") described by three numbers corresponding to the RGB values assigned to that cell.

 

Data is added in this process; thus a TIFF image is approximately three times the size of the original RAW image.

 

The RAW cell data is probably a coulometric measurement- based on the voltage across a known capacitance- which is then converted to an integer for further processing. It's possible to actually count electrons coulometrically, but that's more likely in deep-space photography, where photons trickle in over minutes or hours.


Yes, but you'd have to have the two images to compare. If you use RAW publishing plus a hash plus blockchain timestamps, you've pretty much done as much as you can... I think. Keep in mind that any image, fake or real, can have a hash and a blockchain timestamp. The key is to get in first with the original RAW file, and even then it's not a 100% guarantee.

 

I will let an engineer or scientist settle the matter. But I am sure I'm correct, because I'm translating values from one scale to another, and that is the correct method. It's how you get percentages, which is grade 6 level stuff.

 

You can of course register a raw image, and if anyone simply produces a JPEG straight from it, you can show that the same JPEG would be produced by the original. Once any processing is applied- contrast, curves, color corrections, etc.- then you must do a more meaningful comparison to relate the two images.

 

If you want to license the digital image: encrypt the main body, leave the embedded thumbnail unencrypted, and sell the password that gives access to the raw image. Grant the buyer limited rights to distribute the image as JPEG.
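A sketch of that scheme using the third-party cryptography package; the filenames are placeholders, and a real product would derive the key from a purchaser's password rather than shipping a raw Fernet key:

```python
from cryptography.fernet import Fernet

key = Fernet.generate_key()        # this is what the buyer purchases
fernet = Fernet(key)

# Encrypt the main body of the raw file (placeholder filename).
with open("IMG_0001.DNG", "rb") as f:
    raw_body = f.read()
with open("IMG_0001.locked", "wb") as f:
    f.write(fernet.encrypt(raw_body))

# The unencrypted thumbnail would be stored alongside for browsing.
# Buyer side: the key recovers the exact original bytes.
with open("IMG_0001.locked", "rb") as f:
    assert fernet.decrypt(f.read()) == raw_body
```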

 

If you are trying to prevent theft of IP, then you will have the two images to compare: the original image and the allegedly stolen image. If you want to automatically scan the Internet for stolen images, testing each published image against the blockchain-protected original, then trivial changes to the image being examined will break the blockchain hash.

 

The images I posted are "extreme", as the processing applied performs histogram equalization across the RGB channels. Going backwards from this- from the JPEGs shown to the original DNG file as shot with the camera- would be difficult.


Just to add: even scaling the image from 8 bits back to 14 bits with a simple shift operation, you still have the loss in precision. If the original image had simply been bit-shifted to go from 14 bits to 8 bits, you would end up with the upper bits of the 14-bit values being the same. But the demosaic process is much more complex than this, so when going from RAW to JPEG it's not just bit-shifting; your 8 bits shifted back into position will differ from the upper 8 bits of the RAW image. Take several points on a curve and run a third-order polynomial fit over them to produce an interpolated curve, then right-shift all of the Y values on the curve 6 places: you will most likely not recover the original values by shifting the bits 6 places back to the left.
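The loss is easy to see even in isolation; a minimal demonstration of the round trip:

```python
# 14 -> 8 -> 14 bits by shifting: the low six bits never come back.
for v in (16383, 10922, 12345):      # three sample 14-bit values
    v8 = v >> 6                      # down to 8 bits
    back = v8 << 6                   # shifted back into position
    print(v, v8, back, v - back)     # the residual is the lost precision
```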

Image authentication using global and local features - IEEE Conference Publication

 

A quick google search using:

image+authentication+using+feature+extraction

 

Shows that others have thought along the same lines of using image features for an authentication process. Nice to know.

 

Back to asking about the purpose of the question: authenticating an image and developing methods to protect intellectual property, OR making a processed JPEG look like a RAW image. For the latter, you can convert the JPEG to a RAW format- take it back to a mosaic image- but too much information has been lost to authenticate it against the original image.

Edited by Brian

But I am sure I'm correct, because I'm translating values from one scale to another, and that is the correct method. It's how you get percentages, which is grade 6 level stuff.

And I'm absolutely certain that you're incorrect.

 

What you're proposing makes no sense whatsoever.

Dividing by 255 and then multiplying the result by 4095 is exactly the same as simply multiplying by 4095/255.

 

n/255 x 4095 = n x 4095/255 = n x 16.05882353.

 

So where does multiplying by some random constant (16.05882353) get you? Whether you do it in decimal, binary, hex or octal, it's all the same.

 

It's not the same as turning a fraction into a percentage. You can't get a percentage just by multiplying the numerator of a fraction by a constant. That's grade 6 level stuff as well!


You can't get a percentage just by multiplying the numerator of a fraction by a constant. That's grade 6 level stuff as well!

 

250 is what % of 500?

250 / 500 x 100 = 50. Therefore, 250 is 50% of 500.

How do you translate the value 100 on a scale of 256 values to a scale of 4096 values?

100 / 255 x 4095 = 1605.882

 

I'm not sure what you mean by 'constant'. I don't think I ever mentioned one.


Back to asking about the purpose of the question: authenticating an image and developing methods to protect intellectual property, OR making a processed JPEG look like a RAW image. For the latter, you can convert the JPEG to a RAW format- take it back to a mosaic image- but too much information has been lost to authenticate it against the original image.

I'd love to see what a reconstructed RAW file would look like once it's debayered.

 

BTW, have you heard of fotoforensics.com? It's interesting, but a good compositor will easily fool it. I used it once to demonstrate to myself the non-authenticity of a photograph of Madeleine McCann, as well as comparing it with the original. I might do a thread about that, seeing as I'm very interested in that case.


I'm not sure what you mean by 'constant'. I don't think I ever mentioned one.

A constant is simply a number that stays constant throughout a range of calculations. Like your 4095/255 = 16.05882353

 

I see what you're trying to do, but I can't see any reason to do it. You're trying to scale 255 to 4095, but why?

 

OK, let's walk through what happens with your 'algorithm'.

8-bit levels 1 to 8 get scaled to 16, 32, 48, etc., up to 128, being rounded to the nearest integer in the process. Then at level 9 there's a jump in the result to 145, and the increments of 16 continue until level 26, which gets multiplied and rounded up to 418.

You see what's happening? The steps aren't even: effectively all you're doing is multiplying by 16 (= adding 4 trailing zeroes in binary), except at levels 9, 26, 43, 60, 77, etc., where an odd increment of 17 (binary 10001) is added. This makes no sense at all. And it certainly doesn't add any precision to the multiplied values; in fact it makes them less precise through rounding error.
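A quick enumeration confirms that step pattern (assuming round-to-nearest):

```python
# Flag every level where round(n * 4095/255) steps by 17 instead of 16.
prev = 0
for n in range(1, 256):
    cur = round(n * 4095 / 255)
    if cur - prev == 17:
        print(n, cur)                # prints levels 9, 26, 43, 60, 77, ...
    prev = cur
```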

 

So unless you have an algorithm that actually adds precision, you might just as well leave the 8-bit numbers alone entirely.

 

Adding that precision to an individual photosite value would mean looking at the values of the 4 pixels it contributes to, and at the 16 pixels that its neighbouring photosites contribute to. Beyond that, I think there's little point in trying to gain precision, since any fractional brightness change is probably below the threshold of detection of the human eye.

