Decompression depression...
I've finally written a program where I try to follow the logic laid out in the patent application with diagnostic output all along the way. Despite my best efforts (so far
) I can't get it to replicate the compression that's going on for the PV2 Bayer data. I'll lay out various parts of my thinking - maybe someone can help:
+ The patent app. documents using every other sample to "reduce entropy". This is a method similar to the "predictor" sample subtraction used in TIFF. The predictor is limited (true?) to use with LZW compression - which is similar, sorta, to arithmetic compression.
The data I'm seeing looks like it is going through this alternate sample subtraction. Looking at the first bayer line of a solid white image we see: FF FF FF FF FF... The alternate sample subtraction turns this into: FF FF 00 00 00... And then the compressed version is 55 21 FE FF 7F... which seems consistent since it has some high data (FF's) compressing to low numbers followed by a very long string of zero data (00's) compressing to a very long string of 1's (with a 0 stuffed in every 8 bits - more on this later).
So, then I wander over to the cutline all-black where the data is: 00 00 00 00 00... Subtraction gives 00 00 00 00 00. And then, consistent with the previously described compression, a stream of 00 bytes compress to a stream of 1-bits (with stuffed 0's). Sure enough, allblack3.raw shows FE FF 7F BF DF, etc. Consistent again - yay!
+ A second thing that seems to match up is an arithmetic coding implementation technique called bit-stuffing. This technique is somewhat described in the article that j_tetazoo linked as "underflow bits". There is a clearer (I thought) article describing arithmetic coding named "Arithmetic Coding revealed" at http://www.data-compression.info/Algorithms/AC/
Arithmetic compression is based on coding a sequence of things into a really long and totally unique fraction (i.e. number between 0.0 and 1.0). But, computers don't do fractions that well. So, a fairly standard implementation of arithmetic coding using integers has been developed - a main source on that seems to be a paper by Witten-Neal-Cleary published in Communications of the ACM in June 1987.
This algorithm details how to re-scale things repeatedly so that it works using integers. But, there a situation where the algorithm has some trouble - when it finds a sequence which compresses to a long series of 1's in the output. Data being compressed "downstream" could end up needing to flip the whole series of 1's to 0's - very much like adding 1 to 9,999,999,999. Software running with a lot of memory could handle this one way - but "old" (circa 1981) hardware implementations needed a different approach since they didn't have much memory around. Also, it was desirable to find a way to do this with fixed hardware registers. Remember, this is long before megabytes of memory were plentiful like today...
The compromise solution was to stick a 0-bit into the output stream every so often to force an end to the sequence. This decreased the compression efficiency a bit, but allowed for coding and decoding with fixed hardware implementations. I think the basic algorithm for this is in patent 4,463,342 and it is also mentioned later in patent 5,418,532 in conjunction with a "guard register" (just like our patent application).
So, seeing data compressed to a stream of 1's with a 0 stuffed in after every eight 1-bits is typical output from arithmetic compression that has been implemented in hardware. This is mentioned in section [0097] of the patent app - so it seems to tie together.
- But... on the other hand... something looks really wrong for arithmetic compression. When I look at the "edges" of arithmetic compression, I believe that a sequence of 00 bytes should end up compressing to a sequence of 0-bits and a sequence of FF bytes should end up compressing to a sequence of 1-bits. This would also hold true for a 3-bit "channel" like the data byte "channels" described in the patent app: 0 values would compress to 0-bits and 7 values (the highest) would compress to 1-bits.
However, I see two things in the compressed PV2 images that contradict my theories. The cutline black image is a series of 00 bytes - but it compresses to all 1-bits. Same with the white images, high value data tends to compress producing 0-bits and low value (i.e. 00 byte) data produces 1-bits.
So, there seems to be an inversion from what I'm expecting. I can't figure out where this inversion is taking place. On one hand it could be that the compressed data is just a 1's complement of the actual arithmetic compression output - but that doesn't really work with the bit-stuffing algorithm which is designed to handle a series of 1-bits (i.e. there wouldn't need to be any stuffing if they really were a series of 0-bits...) So, other possible places to invert things are the histogram model or 'most anywhere in between.
This is really scrambling my brain.
- Arithmetic compression of the sequence FF FF 00 00 00... - using a histogram model following the patent application's apparent intentions - gives me a few 1-bits followed by an endless sequence of 0-bits. The long string of 0-bits doesn't match up to the long string of 1-bits in the PV2 compression. But, worse, the FF FF sequence starting things shouldn't ever compress to anything other than a series of 1-bits - _based on the described histogram model_.
- What's the deal with where the compressed data starts in the RAW file? When the header says 9C, the data seems to start at 9D. When the header says C8 the data seems to start at C9. Or does it... is that first byte of 00's part of the compressed image data or not? Bah!
So, when I ponder where I'm "off base" in my compression things seem to get out of reach very quickly. The compressed output doesn't seem to possibly match up with the histogram model described in the patent app - so I have to assume that this model isn't being used. So... what model IS being used?
It's almost impossible to detect if the data is being compressed directly or broken up into "channels" for some calculations and then "multiplexed" back together for compression as described in the patent app. There are three pieces which are unknown: histogram model algorithms, channel div/multiplex, and compression details. If I had a clear description of any two (or maybe just one) piece then I could figure out the rest. But with this many unknowns I feel lost.
I am losing some (not all) hope for figuring out the compression algorithm by following the patent application. I think it's just too vague to be completely useful.
The use of bit-stuffing and the patent application are indications that the compression is implemented in hardware. If so, there's little use in combing through the camera's firmware for the compression routine. I know that the bayer image is present in the flash - there, I suspect, for generating the thumbnail - so perhaps there is a decompress in the firmware to get the Bayer from the hardware-compressed RAW data?
But, there's definitely a decompress in the TWAIN driver and/or image viewer application provided for the Radio Shack (and other) SMaL-based cameras. I've disassembed the whole FlatFoto TWAIN driver using "PE Explorer", but the decoding routine hasn't jumped out at me yet. Is anyone else looking along this avenue?
- sailpix _/) _/)