<< ctxmodel.net

> I'm trying to make something like GPU based compression.

I'm interested in GPU programming too. In fact, I wrote an MD5 cracker for CUDA (well, I started with CUDA, but switched to its PTX "assembly" really fast).

But I don't expect GPUs to have much effect on compression, especially fast compression.

I think that with a GPU it might become reasonable to use some super-complex models (say, 10x the computation of paq8) and to perform parameter optimization in real time.

But I don't believe that it's possible to improve the performance of common and relatively fast LZ and PPM algorithms with a GPU. BWT might be an exception, but I doubt even that, because global memory access is really slow on a GPU, and there's no such thing as implicit caching. Also, the performance of average GPUs isn't that impressive either: I got ~230M/s MD5 on my 8800GT (512M) and ~4x25=100M/s MD5 on a Q9450, and note that MD5 computation doesn't require any significant memory I/O.

Basically, it seems that a GPU helps only with things like cryptographic functions, which are computationally heavy and don't require any global memory access (NVIDIA GPUs only have a lot (8k) of fast registers, plus global memory which is 100x slower). And even there it's not much help, because of the lower clock frequency and the simpler architecture of a single unit compared to a CPU.

So the compression-related GPU applications that I can think of are something like keeping all the statistics in GPU memory, running ~120 alternative submodels in parallel, and mixing their predictions with another group of units.
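
The post doesn't say how the predictions would be mixed. A minimal CPU-side sketch, assuming paq-style logistic mixing (an assumption, as are all names, the floating-point arithmetic and the learning rate): each submodel's probability is stretched, weighted and summed, and the weights are then nudged toward whatever would have predicted the actual bit better. The per-submodel steps are independent of each other, which is the part that could be spread over parallel units.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic helpers. Inputs to stretch() must lie strictly between 0 and 1.
inline double stretch(double p) { return std::log(p / (1.0 - p)); }
inline double squash(double x)  { return 1.0 / (1.0 + std::exp(-x)); }

// Mixes N submodel predictions for one bit (hypothetical illustration).
// Each stretch() and each weight update is independent of the others;
// only the dot product is a reduction.
struct Mixer {
    std::vector<double> w;   // one weight per submodel
    std::vector<double> st;  // stretched inputs from the last mix() call
    double rate;             // learning rate (arbitrary choice)
    double pmix = 0.5;       // last mixed P(bit == 1)

    explicit Mixer(std::size_t n, double learningRate = 0.02)
        : w(n, 0.0), st(n, 0.0), rate(learningRate) {}

    // Combine the submodels' P(bit == 1) into a single probability.
    double mix(const std::vector<double>& p) {
        assert(p.size() == w.size());
        double dot = 0.0;
        for (std::size_t i = 0; i < p.size(); ++i) {
            st[i] = stretch(p[i]);
            dot += w[i] * st[i];
        }
        pmix = squash(dot);
        return pmix;
    }

    // After the real bit is known, move each weight along the error gradient.
    void update(int bit) {
        double err = bit - pmix;
        for (std::size_t i = 0; i < w.size(); ++i) w[i] += rate * err * st[i];
    }
};
```

With ~120 submodels, `mix()` would take a vector of 120 probabilities per bit; a submodel that keeps predicting well ends up with a larger weight.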

> Actually, I won't try it on CPU. I have asked this question
> for GPU side implementation. It will be "after-BWT" stage.

Well, as I said, in such a case it's reasonable to run one GPU "thread" per order-1 context and, e.g., store their predictions (probabilities) in GPU global memory. Then maybe even do the encoding with those predictions on the GPU too, blockwise; there'd be 1-4 bytes of redundancy per block, but that's only 4x128 bytes max. Hmm... maybe I should write such a thing myself :)
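
The per-context split can be sketched on the CPU in plain C++. This is only an illustration under assumptions: the bitwise counters with a shift-5 update, the `Probs` layout and all function names are made up here, and the 256 work items run in an ordinary loop rather than as GPU threads. The point it shows is that each order-1 context touches only its own counters and its own output slots, so the result matches a normal sequential model no matter what order the contexts are processed in.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One 12-bit P(bit == 1) per input bit, in coding order.
using Probs = std::vector<uint16_t>;

// Predict and update the 8 bits of byte c with the bit-tree counters in t,
// writing the probability used for each bit to out[0..7].
inline void modelByte(uint16_t* t, uint8_t c, uint16_t* out) {
    int node = 1;
    for (int i = 7; i >= 0; --i) {
        int bit = (c >> i) & 1;
        *out++ = t[node];
        if (bit) t[node] += (4096 - t[node]) >> 5;
        else     t[node] -= t[node] >> 5;
        node = node * 2 + bit;
    }
}

// Reference: ordinary sequential order-1 model (context = previous byte).
Probs predictSequential(const std::vector<uint8_t>& data) {
    std::vector<uint16_t> counters(256 * 256, 2048);
    Probs probs(data.size() * 8);
    uint8_t prev = 0;
    for (std::size_t pos = 0; pos < data.size(); ++pos) {
        modelByte(&counters[prev * 256], data[pos], &probs[pos * 8]);
        prev = data[pos];
    }
    return probs;
}

// Work item for a single context: it touches only its own counters and only
// the output slots of positions that follow byte ctx, so no two work items
// ever write to the same place.
void predictContext(const std::vector<uint8_t>& data, uint8_t ctx, Probs& probs) {
    assert(probs.size() == data.size() * 8);
    std::array<uint16_t, 256> t;
    t.fill(2048);
    for (std::size_t pos = 0; pos < data.size(); ++pos) {
        uint8_t prev = pos ? data[pos - 1] : 0;
        if (prev == ctx) modelByte(t.data(), data[pos], &probs[pos * 8]);
    }
}

// Runs the 256 work items one after another, deliberately in reverse order,
// to show that the result doesn't depend on scheduling.
Probs predictPerContext(const std::vector<uint8_t>& data) {
    Probs probs(data.size() * 8);
    for (int ctx = 255; ctx >= 0; --ctx)
        predictContext(data, static_cast<uint8_t>(ctx), probs);
    return probs;
}
```

Each work item here rescans the whole input, which is wasteful; bucketing positions by context first would avoid that. The stored probabilities could then be fed to an arithmetic coder in a second pass.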

> Another thing is about my another question: "order-0 modelling in multithreading".
> Actually, I had asked only about compression, not decompression.

Parallel compression is too simple. You can easily split a CM compression algorithm into as many threads as you'd like, because the counters are independent and any of them can be processed in a separate thread, given that you have access to all the data.

Also, the compression part can be significantly optimized even without multithreading (but using the same logic). E.g. for fpaq0pv4B I had the idea of making a template function with the byte value as a template parameter, so it would be possible to build a table containing EncodeByte<0>...EncodeByte<255>. The compiler would be able to optimize that much better, maybe even merge some of the interval arithmetic.
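
A rough sketch of that table idea, under assumptions: the coder below is a generic fpaq0-style carryless binary range coder with an order-0 bit-tree model, not the actual fpaq0pv4B source, and `Encoder`, `encodeByteT`, `kEncodeTable` and the other names are made up for illustration. It only shows the mechanics of instantiating one function per byte value and dispatching through a table; whether the compiler really merges any interval arithmetic would have to be checked in the generated code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Order-0 bitwise model plus carryless binary range coder, fpaq0-style.
// Names and constants are illustrative; this is not the fpaq0pv4B source.
struct Encoder {
    uint32_t x1 = 0, x2 = 0xffffffffu;  // current interval [x1, x2]
    std::array<uint16_t, 256> p;        // 12-bit P(bit == 1) per bit-tree node
    std::vector<uint8_t> out;           // compressed bytes

    Encoder() { p.fill(2048); }

    void encodeBit(int node, int bit) {
        assert(node >= 1 && node < 256);
        uint32_t xmid = x1 + ((x2 - x1) >> 12) * p[node];
        if (bit) { x2 = xmid;     p[node] += (4096 - p[node]) >> 5; }
        else     { x1 = xmid + 1; p[node] -= p[node] >> 5; }
        while (((x1 ^ x2) & 0xff000000u) == 0) {  // top byte is settled
            out.push_back(static_cast<uint8_t>(x2 >> 24));
            x1 <<= 8;
            x2 = (x2 << 8) | 255;
        }
    }

    void flush() {  // write out x1 so the final interval can be identified
        for (int i = 0; i < 4; ++i) {
            out.push_back(static_cast<uint8_t>(x1 >> 24));
            x1 <<= 8;
        }
    }
};

// Baseline: bits and node indices are computed at run time.
void encodeByteGeneric(Encoder& e, int c) {
    int node = 1;
    for (int i = 7; i >= 0; --i) {
        int bit = (c >> i) & 1;
        e.encodeBit(node, bit);
        node = node * 2 + bit;
    }
}

// Specialized: for a fixed C every node index and bit is a compile-time
// constant, so the compiler is free to unroll and simplify each instance.
template <int C>
void encodeByteT(Encoder& e) {
    for (int i = 7; i >= 0; --i)
        e.encodeBit((256 | C) >> (i + 1), (C >> i) & 1);
}

using EncodeFn = void (*)(Encoder&);

template <std::size_t... I>
constexpr std::array<EncodeFn, sizeof...(I)> makeTable(std::index_sequence<I...>) {
    return {{ &encodeByteT<static_cast<int>(I)>... }};
}

// Table of encodeByteT<0> ... encodeByteT<255>, indexed by byte value.
constexpr auto kEncodeTable = makeTable(std::make_index_sequence<256>{});

std::vector<uint8_t> compressWithTable(const std::vector<uint8_t>& data) {
    Encoder e;
    for (uint8_t c : data) kEncodeTable[c](e);
    e.flush();
    return e.out;
}

std::vector<uint8_t> compressGeneric(const std::vector<uint8_t>& data) {
    Encoder e;
    for (uint8_t c : data) encodeByteGeneric(e, c);
    e.flush();
    return e.out;
}
```

Both paths should produce byte-identical output, so `compressGeneric` can serve as a check on the table-driven version before measuring whether the specialization actually helps.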

Also look at this: http://compression.ru/sh/parcoder.rar
I wrote it in 2001, and it seems that with modern CPUs/GPUs it has finally become relevant :)


2010-03-20 13:42:10 Carlos          >
Hello! My name is Carlos and I'm from Brazil. Nice text you've got here. It helped me and gave me some ideas. I've just started the second phase (the development phase) of my final essay for Computer Science. My essay will be about trying to write CUDA compression software. Could you please help me and send me any progress you have with the compression software? Or maybe let me look at how you did the MD5 on the GPU; that would be really appreciated as well. My email is carlos.rivin@gmail.com

If you have some updates on this software please let me know.

Thanks a lot.

Carlos
2010-03-20 23:45:04 Shelwien        >
I don't really have any progress, but here are some links:

http://encode.dreamhosters.com/showthread.php?t=569
http://encode.dreamhosters.com/showthread.php?t=511
http://encode.dreamhosters.com/showthread.php?t=160
http://cuetools.net/doku.php/flacuda
http://majuric.org/software/cudamd5/
http://3.14.by/en/read/md5_benchmark
http://golubev.com/hashgpu.htm
