<< ctxmodel.net

Q. How to make a multi-threaded implementation of order0 coder?

The main problem is decoding - its quite easy to thread the compression part as much as you want because it can access whole data.
And a multi-threading implementation of sequential decoding might be sensible at some higher orders using speculative execution (that is, to start processing the updates for most probable symbol before really testing that your guess is right), because it would be 50-70% hits at high orders.
But with order0 I think that most reasonable would be to simply process the data blocks in different threads - its even possible to avoid any compression degradation by compressing the frequency tables for each block. Then, another possible approach would be the one I talked about in fpaq0pv4B thread... separating the bit sequences.
These kinds of coders can be easily threaded by separating high and low bits. Eg. we can use two rangecoders with two buffers, one of which would encode bits4..7 and other bits0..3. Higher bits are generally more predictable, so their (de)coding would run faster, so most of execution time can be overlapped.
That is, instead of sequentially compressing bit7-0 for each byte, we first compress all bit7's, then all bit6's using already known bit7's, etc. Like that it would be easy to perform in parallel and I think that it won't affect the compression at order0. But using this technique it would be really troublesome to keep the strong compression at higher orders.
2013-08-09 12:44:56                 >
2014-11-26 15:29:54                 >
2015-01-11 13:36:08                 >

Write a comment:

Name: