Context Modelling

2008-09-17 11:20:41 chornobyl       > 
Now that's what I call REAL compression, 
not some synthetic computer stuff :)

2009-07-13 22:32:58 Yuri+G.         > This is a CatPressor >)

2009-12-30 17:37:43 Chris           > 
Hmm, is that a good book? 
Can anyone suggest a good one for PPM and CM compressors, 
or any other learning resources? 
Thx

2010-01-04 09:41:12 Shelwien        > 
It's not bad as an overview, but there are only about 30 pages 
on PPM, and it's PPM from 20 years ago. 
And well, there's not much written about modern algorithms 
anywhere, and it's not easy to explain them in words anyway. 
There are sources, though, and some articles.

2010-01-12 14:52:19 Chris           > 
Thanks, Shelwien. 
Could you point me to some sources or articles 
that can help me get started. 
I have no idea where to begin context mixing. 
my knowledge of ppm is very new. 
 
thanks and here is my mail 
seenu4all @ gmail DOT com

2010-01-12 17:14:40 Shelwien        > 
Unfortunately, that email doesn't seem to work; 
I don't know what I'm doing wrong. 
My email is shelwien@gmail.com, if you need it. 
 
>> Could you point me to some sources or articles 
>> that can help me get started. 
 
We have a forum on data compression, 
and even an IRC channel: http://encode.dreamhosters.com/ 
 
>> I have no idea where to begin context mixing. 
>> my knowledge of ppm is very new. 
 
1) It might be a good idea to start with PPMd: 
 
http://ctxmodel.net/files/PPMd/ 
 
PPMd is featured as the "text compression method" in most 
popular archivers now, and there's my rewrite of the PPMd source, 
which is hopefully easier to read, and also Shkarin's 
article describing the algorithm. 
 
ppmonstr, which is PPMd with some additional logic, 
actually still holds a world record on large text compression 
(see http://cs.fit.edu/~mmahoney/compression/text.html); 
"durilca" there = ppmonstr with text filters. 
 
2) As for CM specifically, you can look at Mahoney's work: 
http://mattmahoney.net/dc/ 
(there are good algorithm descriptions in the sources) 
 
or toffer's m1: 
http://ctxmodel.net/files/m1/ 
 
or some of my coders: 
 
BWT+CM: 
http://ctxmodel.net/files/mix_test/BWTmix_v1.rar 
 
direct CM with logistic mixing: 
http://ctxmodel.net/files/mix_test/mix_test_vC.rar 
 
hashed CM with linear mixing: 
http://ctxmodel.net/files/MIX/mix_v3.rar 
 
old bytewise CM with unary coding and secondary estimation: 
http://ctxmodel.net/files/ASH/ash04.rar 
 
even older and very simple bytewise CM: 
http://compression.ru/sh/ppmy_3c.rar 
 
or a new prefix tree template with some trivial model plugged in: 
http://ctxmodel.net/files/tree_v1.rar 
 
3) As for good texts on modern CM, I'm not aware of any, except for 
Mahoney's algorithm descriptions and various posts on the forum.
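
To give a rough idea of what "logistic mixing" in the list above refers to, here is a minimal sketch, not code from any of the linked coders. It assumes the usual formulation from Mahoney's descriptions: each model's bit probability is mapped through stretch(p) = ln(p/(1-p)), the stretched values are summed with weights, the sum is squashed back into a probability, and the weights are nudged in proportion to the prediction error. The class name, the learning rate, and the floating-point arithmetic are illustrative assumptions; real coders typically use fixed-point integers.

```python
import math


def stretch(p):
    """Map a probability in (0, 1) to the logistic domain."""
    return math.log(p / (1.0 - p))


def squash(x):
    """Inverse of stretch: map a real number back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


class LogisticMixer:
    """Hypothetical helper: mixes several P(bit=1) estimates into one."""

    def __init__(self, n_inputs, learning_rate=0.02):
        # Start from a plain average in the stretched domain (an assumption).
        self.weights = [1.0 / n_inputs] * n_inputs
        self.learning_rate = learning_rate
        self.inputs = [0.0] * n_inputs

    def mix(self, probs):
        # Clamp inputs so stretch() never sees exactly 0 or 1.
        self.inputs = [stretch(min(max(p, 1e-6), 1.0 - 1e-6)) for p in probs]
        dot = sum(w * s for w, s in zip(self.weights, self.inputs))
        return squash(dot)

    def update(self, p_mixed, bit):
        # Gradient step that reduces the coding cost of the observed bit.
        err = bit - p_mixed
        self.weights = [
            w + self.learning_rate * err * s
            for w, s in zip(self.weights, self.inputs)
        ]


mixer = LogisticMixer(2)
for _ in range(200):
    p = mixer.mix([0.9, 0.1])  # model 0 favours bit 1, model 1 favours bit 0
    mixer.update(p, 1)         # the actual bit is always 1 here
print(mixer.weights)           # weight of model 0 grows, model 1 shrinks
```

In this toy run the mixer learns to trust the model that keeps being right, which is the whole point of mixing instead of picking one model up front.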

2010-01-13 05:19:49 Chris           > 
Thanks, Shelwien, this will help me a lot to start understanding 
PPM and CM. 
Strangely, my Gmail ID doesn't seem to work; I have added your Gmail address from my alternate, pmcontext AT gmail DOT com 
 
Thanks again :D

2010-01-13 05:33:45 Chris           > 
I have experience with LZ77, Huffman, arithmetic coding, symbol ranking (Matt's), and BWT (I don't like BWT that much).

2010-01-13 23:55:39 Shelwien        > 
BWT is useful for testing CM components (counters, mixers, etc.). 
It lets you get reasonable compression without dynamic allocation or complex data structures. 
As for "arith"... what do you think about my rangecoding article 
here? 
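
For readers who have not seen the transform itself, here is a deliberately naive sketch of a forward and inverse BWT over rotations. It is only meant to show what a CM postcoder would receive as input: it sorts rotations directly, which is slow and memory-hungry, so it is not how a real BWT stage would be written. The function names and the (last column, primary index) return format are assumptions made for this example.

```python
def bwt_forward(data: bytes):
    """Return (last_column, primary_index) for the sorted rotations of data."""
    n = len(data)
    if n == 0:
        return b"", 0
    doubled = data + data
    # Sort rotation start positions by the rotation they begin (naive, slow).
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    last = bytes(doubled[i + n - 1] for i in order)
    # primary_index is the row holding the original, unrotated string.
    return last, order.index(0)


def bwt_inverse(last: bytes, primary: int) -> bytes:
    """Rebuild the original data from the last column and primary index."""
    n = len(last)
    if n == 0:
        return b""
    # A stable sort of positions by symbol links each first-column entry
    # to the same symbol occurrence in the last column.
    link = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = primary
    for _ in range(n):
        row = link[row]
        out.append(last[row])
    return bytes(out)


transformed, primary = bwt_forward(b"banana")
print(transformed, primary)               # b'nnbaaa' 3
print(bwt_inverse(transformed, primary))  # b'banana'
```

The output groups symbols that share a following context, so even simple counters and mixers see fairly predictable input.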

2010-01-14 15:29:18 pmcontext       > 
BWT compression is good without complicated coding, 
but my dislike for it is due to the sorting, I guess. 
The simplicity of Schindler's range coder is very 
attractive. I don't know much about combinatorics, but the binary 
coder implemented in fpaq0 is clear enough for me to understand.
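
As an illustration of the kind of binary coder being discussed, here is a small sketch of a carryless binary arithmetic coder with a 32-bit interval and 12-bit probabilities, written in the general style of fpaq0 but not copied from it. The model (one adaptive probability per partial-byte context, shift-based update), the 4-byte flush, and all names are assumptions chosen to keep the example short and self-contained.

```python
MASK = 0xFFFFFFFF


class Predictor:
    """Order-0 bitwise model: one 12-bit P(bit=1) per partial-byte context."""

    def __init__(self):
        self.ctx = 1
        self.p = [2048] * 256

    def predict(self):
        return self.p[self.ctx]

    def update(self, bit):
        p = self.p[self.ctx]
        # Shift-based adaptation; p stays strictly inside (0, 4096).
        self.p[self.ctx] = p + ((4096 - p) >> 5) if bit else p - (p >> 5)
        self.ctx = self.ctx * 2 + bit
        if self.ctx >= 256:  # a full byte has been seen, start over
            self.ctx = 1


def compress(data: bytes) -> bytes:
    pred, x1, x2, out = Predictor(), 0, MASK, bytearray()
    for byte in data:
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            # Split [x1, x2] in proportion to P(bit=1).
            xmid = x1 + ((x2 - x1) >> 12) * pred.predict()
            if bit:
                x2 = xmid
            else:
                x1 = xmid + 1
            pred.update(bit)
            # Emit leading bytes that both bounds already agree on.
            while ((x1 ^ x2) & 0xFF000000) == 0:
                out.append(x2 >> 24)
                x1 = (x1 << 8) & MASK
                x2 = ((x2 << 8) & MASK) | 0xFF
    out += x1.to_bytes(4, "big")  # flush the whole low bound (assumption)
    return bytes(out)


def decompress(blob: bytes, n_bytes: int) -> bytes:
    pred, x1, x2, out = Predictor(), 0, MASK, bytearray()
    x, pos = int.from_bytes(blob[:4], "big"), 4
    for _ in range(n_bytes):
        byte = 0
        for _ in range(8):
            xmid = x1 + ((x2 - x1) >> 12) * pred.predict()
            bit = 1 if x <= xmid else 0
            if bit:
                x2 = xmid
            else:
                x1 = xmid + 1
            pred.update(bit)
            byte = byte * 2 + bit
            while ((x1 ^ x2) & 0xFF000000) == 0:
                x1 = (x1 << 8) & MASK
                x2 = ((x2 << 8) & MASK) | 0xFF
                nxt = blob[pos] if pos < len(blob) else 0
                pos += 1
                x = ((x << 8) & MASK) | nxt
        out.append(byte)
    return bytes(out)


data = b"abracadabra" * 50
packed = compress(data)
print(len(data), len(packed), decompress(packed, len(data)) == data)
```

The decoder mirrors the encoder step for step, which is why the same predictor must be updated with the same bits on both sides. The original length is passed separately here only to keep the sketch short.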

2010-01-14 18:42:56 Shelwien        > 
Ordinary BWT compression (like bzip) is only good compared to LZ, 
so advanced CM postcoders still improve it a lot. 
As for rangecoders, Matt's coder is really simple, but it can 
become really troublesome if you try to do something unusual 
with it, including speed optimizations. 
For example, it's not compatible with precise probabilities 
(>=16 bits), bytewise coding, or 16-bit i/o, and it's also 
very hard to unroll its renormalization loop: it would need 
5 iterations, compared to 1-2 in my implementation.
