<< ctxmodel.net
## Introduction to Rangecoding
## 1. Combinatorics
Suppose we have a bit string with
## 2. Enumeration-based codingOne simple way to perform such an enumeration is recurrent: C(n0,n1) = (n0+n1)!/n0!/n1! for strings with first bit: 0 - use interval 0..C(n0-1,n1)-1 1 - use interval C(n0-1,n1)..C(n0,n1)-1 (Accidentally, C(n0-1,n1)+C(n0,n1-1)=C(n0,n1))so s[i] = i'th bit of the string a(s,x) = Sum( s[i], i=0..x-1 ) // count of 1s in string prefix code(s) = Sum( s[i]*C(n0-(i+1-a(s,i)),n1-a(s,i)), i=0..n0+n1-1 )which can be rewritten like this: low = 0 range = C(n0,n1) for bit in s[] { rnew = range*n0/(n0+n1); // C(n0-1,n1) if( bit==1 ) { n1--; low += rnew; range -= rnew; // range*n1/(n0+n1) } else { n0--; low += 0; range = rnew; } }This, with long arithmetics, would be a completely precise enumeration algorithm, assigning each of numbers in the [0..C(n0,n1)-1] interval to some of the string permutations.
But even with range rounding and other relaxations,
there's apparently not that much redundancy, and its much
faster to compute.
Though a multiplication-free version with table lookups
for C(n0,n1) values might be still worth consideration
in some cases.
## 3. Rangecoding
To use a limited precision for the
Basically, {0}{N x 1}{k-bit mantissa}and with k-bit range added to that, changes won't
propagate any further than first 0 (which would turn to 1
on k bits overflow).
{0}{N x 1}{k-bit mantissa} + {k-bit range} ---------------------------- {1}{N x 0}{k-bit new mantissa}with overflow, or without overflow: {0}{N x 1}{k-bit mantissa} + {k-bit range} ---------------------------- {0}{N x 1}{k-bit new mantissa}So, we can just count the number of 1 bits at msb of low mantissa, not actually discarding them, but using a more compact representation. And only output/discard these bits (and cached 0/1 before) after ensuring that they won't change anymore.
That's how the plain arithmetic coding works, but memory
i/o is bytewise on the modern platforms, so its not quite
efficient to use bits there. {0..b-2}{N x b-1}{k-digit mantissa}Its just the same thing about mantissa overflow propagation. So, b=256 seemed convenient and was called "rangecoding" by M.Schindler who proposed an open source implementation, but 2 and 256 are certainly not the only possibilities, of which MarcDemo (any base in 2..256) and rc_sh2d.inc (65536) are good examples.
## 4. Carryless coding
All this stuff with variable-length
Also, the redundancy can be significantly reduced by using
a higher precision for However, a considerable inconvenience of carryless coding is the requirement for symmetrical processing in decoding, which makes a carryless decoder fairly complex comparing to "normal" (with carries).
But then, there's a somewhat more exotic alternative
to usual carryless implementations - instead of cutting
2013-07-26 20:03:18 >2014-11-15 02:12:45 >2014-11-26 04:03:45 >2015-01-11 01:23:41 > |