There has been some talk of counters as lossy compression of histories,
like in "On universal counters".
...However, there's only that same statement there, nothing more.
But anyway, the point is that some simple logic applies to counters:
- We want to derive a probability estimate from a context history.
- The best (most precise) way to do that is to keep the whole context
history and run a model through it each time a probability estimate
is needed.
- Context histories are redundant, so the next reasonable step
would be to compress these histories.
- But even compressed, these histories can take a lot of memory,
so, for practical use, lossy compression is necessary.
- Lossy compression is where we discard the least useful information.
For example, this is lossy coding too:
cstate[p0].s[0] = Init_FSM_States( x+1, (y+3)/4 );   // next state after a 0 bit: the 0-count (x) grows, the 1-count (y) decays
cstate[p0].s[1] = Init_FSM_States( (x+3)/4, y+1 );   // next state after a 1 bit: the 1-count grows, the 0-count decays
Here only the 0/1 counts (x and y) are taken into account, so the order
of bits in the history is lost, and some bits are cut off on each
update. Still, we can visualize the update of such a counter by
printing an "approximated history".
Like, let's take some string like 001101001 aka '9':
<> <> +0
<0> <0> +0
<00> <00> +1
<001> <01> +1
<0011> <011> +0
<00110> <001> +1
<001101> <011> +0
<0011010> <001> +0
<00110100> <0001> +1
<001101001> <011>
Like this, if there's no mistake
(these are the approximated histories for the ccm counter quoted above).
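Just to make the table reproducible, here's a minimal sketch of the same
counter. The x/y roles (0-count and 1-count) and the rendering of the
approximated history as x zeroes followed by y ones are inferred from the
table above, not taken from ccm's source:

#include <stdio.h>
#include <string>

struct Counter { int x, y; };  // x = number of 0s, y = number of 1s (both decayed)

// same update rule as the Init_FSM_States arguments quoted above
Counter update( Counter c, int bit ) {
  Counter r;
  if( bit ) { r.x = (c.x+3)/4; r.y = c.y+1;     }
  else      { r.x = c.x+1;     r.y = (c.y+3)/4; }
  return r;
}

// "approximated history": x zeroes followed by y ones (assumed rendering)
std::string approx( Counter c ) {
  return std::string(c.x,'0') + std::string(c.y,'1');
}

int main( void ) {
  const char* hist = "001101001";
  Counter c = {0,0};
  std::string real;
  for( int i=0; hist[i]; i++ ) {
    printf( "<%s> <%s> +%c\n", real.c_str(), approx(c).c_str(), hist[i] );
    c = update( c, hist[i]-'0' );
    real += hist[i];
  }
  printf( "<%s> <%s>\n", real.c_str(), approx(c).c_str() );
  return 0;
}

Running it should print exactly the table above.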
Now, there are supposedly lots of ideas on how to improve this ;)
By doing the bit removal not purely ad hoc ;)
Well, here's some more explanation:
- Context history and context model are unrelated things.
We don't have to use the same model for history (lossy)
updates and for probability estimation, which is well
demonstrated by delayed counter implementations
(see the state-map sketch after this list).
- Because of the memory limit, we have to lose some
information in history updates.
The most reasonable approach seems to be removing the least
useful bit.
That is, to estimate the compressed size of some data in the
context of a given bit history and of its versions with various
bits removed, and to keep the version which compresses that data
best (see the bit-removal sketch after this list).
Also, in this case, the entropy of the history itself can be
considered too: we have to limit it to <24 bits (or lower -
otherwise the lookup tables would be too large).
- The slowness of these entropy-based decisions doesn't matter at all.
We can safely make them as slow as we like, and use the
most advanced models to compress the history, because
after effectively quantizing it to 24 bits with the lossy
constraint, we'd be able to generate a lookup table for all
counter states in memory, and then further cluster the states
until reaching a reasonable number of them (eg. 8 or 16
bits) - see the lookup-table sketch after this list.
- The model for counter updates is completely unrelated
to the modelling of the actual data though - it's just a more
precise approximation of a single history (compared to an ad hoc counter).
But actual models would have SSE etc. anyway, and likely
there's no need to take that into account.
Of course, the state clustering stage is not really necessary,
but plain lossy entropy coding of histories would then produce
an inconvenient number of states - something like 1639 unique states
with a 12-bit entropy limit, which would be usable too, but not
exactly efficient.
Anyway, it's likely that some contexts deserve more states
than one can fit into a byte.
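State-map sketch: a minimal illustration of the history/model separation
(not an actual delayed counter - the names and the crude quantization are
made up here). The (x,y) update is the same lossy history model as above,
while the probability used for coding comes from a separate, independently
adaptive table indexed by the state:

#include <stdio.h>

// history side: the lossy (x,y) counter; its update rule is one model...
struct Counter { int x, y; };
Counter update( Counter c, int bit ) {
  Counter r;
  if( bit ) { r.x = (c.x+3)/4; r.y = c.y+1;     }
  else      { r.x = c.x+1;     r.y = (c.y+3)/4; }
  return r;
}

// ...while the state->probability mapping is a second, independent model
struct StateMap {
  int p[64][64];                     // P(bit==1) per state, 16-bit fixed point
  StateMap() {
    for( int i=0; i<64; i++ )
      for( int j=0; j<64; j++ ) p[i][j] = 1<<15;
  }
  static int q( int v ) { return v<63 ? v : 63; }   // crude quantization for the sketch
  int  P1( Counter c ) const { return p[q(c.x)][q(c.y)]; }
  void Update( Counter c, int bit ) {
    int& pp = p[q(c.x)][q(c.y)];
    pp += ((bit<<16) - pp) >> 5;     // adapt the mapping; the history update is untouched
  }
};

int main( void ) {
  const char* bits = "001101001";
  Counter c = {0,0};
  StateMap sm;
  for( int i=0; bits[i]; i++ ) {
    int bit = bits[i]-'0';
    printf( "state=(%d,%d) p1=%5d\n", c.x, c.y, sm.P1(c) );
    sm.Update( c, bit );             // estimation model adapts to the observed bit
    c = update( c, bit );            // history model does its own lossy update
  }
  return 0;
}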
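Bit-removal sketch: here the "data compressed in the context of the history"
is stood in for by a sample bit string, and the scoring model is a toy
order-1 Laplace estimator warmed up on the candidate history - the real
thing could (and should) use much better models, as said above, and the
<24-bit entropy constraint on the history itself is left out. All names
are hypothetical:

#include <math.h>
#include <stdio.h>
#include <string>

// estimated code length (bits) of `sample`, coded by a toy order-1 Laplace
// estimator that is first warmed up on `history` (warm-up itself isn't counted)
double CodeLen( const std::string& history, const std::string& sample ) {
  double n[2][2] = { {1,1}, {1,1} };            // n[prev][bit]
  int prev = 0;
  for( size_t i=0; i<history.size(); i++ ) {
    n[prev][history[i]-'0']++; prev = history[i]-'0';
  }
  double bits = 0;
  for( size_t i=0; i<sample.size(); i++ ) {
    int b = sample[i]-'0';
    bits += -log2( n[prev][b] / (n[prev][0]+n[prev][1]) );
    n[prev][b]++; prev = b;
  }
  return bits;
}

// drop the "least useful" bit: the memory limit forces a removal, so only
// the shortened versions compete; keep the one that compresses the sample best
std::string DropLeastUsefulBit( const std::string& history, const std::string& sample ) {
  std::string best;
  double bestlen = 1e300;
  for( size_t i=0; i<history.size(); i++ ) {
    std::string cand = history.substr(0,i) + history.substr(i+1);
    double len = CodeLen( cand, sample );
    if( len < bestlen ) { bestlen = len; best = cand; }
  }
  return best;
}

int main( void ) {
  std::string h = DropLeastUsefulBit( "001101001", "0110" );
  printf( "kept history: %s\n", h.c_str() );
  return 0;
}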
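Lookup-table sketch: whatever happens offline, the runtime counter ends up
as a small state index plus precomputed tables. The table contents below
are placeholders just to make the sketch run - the real tables would come
from the offline quantization/clustering described above:

#include <stdio.h>

// the product of the (arbitrarily slow) offline stage: for each clustered
// counter state, a precomputed transition and a precomputed probability
enum { NSTATES = 256 };                        // eg. 8-bit states after clustering
static unsigned char  next_state[NSTATES][2];  // next_state[state][bit]
static unsigned short state_p1[NSTATES];       // P(bit==1) per state, 16-bit fixed point

// runtime side: counter update and probability estimation are plain lookups
struct TableCounter {
  unsigned char s;
  TableCounter() : s(0) {}
  int  P1() const        { return state_p1[s]; }
  void Update( int bit ) { s = next_state[s][bit]; }
};

int main( void ) {
  // placeholder table contents, only so that the sketch runs
  for( int i=0; i<NSTATES; i++ ) {
    next_state[i][0] = (unsigned char)(i>0 ? i-1 : 0);
    next_state[i][1] = (unsigned char)(i<NSTATES-1 ? i+1 : NSTATES-1);
    state_p1[i] = (unsigned short)( (i*65535)/(NSTATES-1) );
  }
  TableCounter c;
  const char* bits = "001101001";
  for( int i=0; bits[i]; i++ ) {
    printf( "state=%3d p1=%5d\n", c.s, c.P1() );
    c.Update( bits[i]-'0' );
  }
  return 0;
}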
2013-08-13 11:46:49