Compression and decompression benchmark


emanuel1972
20th July 2008, 05:35 AM
Skeptics,

I just finished an extensive compression test suite on my girlfriend's Ubuntu machine. The contestants are lzo, gzip, bzip2, lzma, lrzip and rar. 6.5 GB of data were processed.

Read all about it here: jel-desktop.blogspot.com/2008/07/compression-and-decompression-shoot-out.html

Thank you for your attn!

Bob Blaylock
20th July 2008, 05:42 AM
BlogSpot URLs…
Appear commonly in spam.
Here's an example.

emanuel1972
20th July 2008, 06:05 AM
BlogSpot URLs…
Appear commonly in spam.
Here's an example.

OK, what's spam about my link? I'm not trying to sell you anything. I just want feedback.

If you're so bothered by the blogspot.com URL, I could host the page on my own website. Not that it would make any difference. The medium is not the message.

ddt
20th July 2008, 06:09 AM
You do realize that the kind of data in those files heavily influences compression ratios and may also skew the relative merits of the various algorithms? The data from your Windows machine seems to be mostly programs; the data from your Linux machine seems to be mainly what you have in your home directory.
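
To illustrate the point about input data, here is a minimal sketch using Python's standard-library codecs (zlib, bz2, lzma). It is not the benchmark from the blog post; the sample inputs and the helper name `compression_ratios` are assumptions made up for the example.

```python
import bz2
import lzma
import os
import zlib


def compression_ratios(data: bytes) -> dict:
    """Return compressed size divided by original size for each codec."""
    codecs = {
        "zlib": zlib.compress,
        "bz2": bz2.compress,
        "lzma": lzma.compress,
    }
    return {name: len(fn(data)) / len(data) for name, fn in codecs.items()}


# Hypothetical inputs of equal size: one highly redundant, one incompressible.
samples = {
    "repetitive text": b"the quick brown fox jumps over the lazy dog\n" * 2000,
    "random bytes": os.urandom(88_000),
}

for label, data in samples.items():
    ratios = compression_ratios(data)
    print(label, {name: round(value, 3) for name, value in ratios.items()})
```

The repetitive input shrinks to a small fraction of its size with every codec, while the random input does not shrink at all, so the mix of file types in a corpus can dominate the ranking.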

You also realize that using the FUSE NTFS file system, which resides partly in user space, also heavily skews the results, as it makes disk access times a lot higher in comparison to the (de)compression times?

Bob Blaylock
20th July 2008, 06:13 AM
OK, what's spam about my link? I'm not trying to sell you anything. I just want feedback.

If you're so bothered by the blogspot.com URL, I could host the page on my own website. Not that it would make any difference. The medium is not the message.


Did you join this forum to participate in the discussions, or did you join it to advertise your blog? On Usenet, at least, it is a very safe bet, whenever you find a BlogSpot-hosted URL in a posting, that the person who posted it has no interest whatsoever in the topic of the newsgroup in which the posting appears, and is merely using the newsgroups to advertise his blog.

The same appears to be true here, of your use of this forum.

This forum isn't an advertising service.

emanuel1972
20th July 2008, 06:17 AM
Did you join this forum to participate in the discussions, or did you join it to advertise your blog?

I joined the forum in order to lurk, mostly.

emanuel1972
20th July 2008, 06:21 AM
You do realize that the kind of data in those files heavily influences compression ratios and may also skew the relative merits of the various algorithms? The data from your Windows machine seems to be mostly programs; the data from your Linux machine seems to be mainly what you have in your home directory.

Yes, of course. This is inescapable. No corpus is perfect.


You also realize that using the FUSE NTFS file system, which resides partly in user space, also heavily skews the results, as it makes disk access times a lot higher in comparison to the (de)compression times?

Which is why I included a cat baseline. Sys time is ~7 seconds (out of 111). "Skews", yes, but I would seriously dispute the "heavily" qualifier.
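
For readers who want to try the baseline idea themselves, here is a rough sketch in Python. It is not the script used for the blog post. The helper names (`timed`, `read_only`, `read_and_compress`) and the generated test file are assumptions for illustration, and zlib stands in for whichever compressor is being measured.

```python
import os
import tempfile
import time
import zlib


def timed(fn, *args):
    """Run fn(*args); return (result, wall, user, sys) in seconds for this process."""
    t0, w0 = os.times(), time.perf_counter()
    result = fn(*args)
    t1, w1 = os.times(), time.perf_counter()
    return result, w1 - w0, t1.user - t0.user, t1.system - t0.system


def read_only(path, chunk=1 << 20):
    """Baseline: read the file and discard it, like `cat file > /dev/null`."""
    total = 0
    with open(path, "rb") as f:
        while block := f.read(chunk):
            total += len(block)
    return total


def read_and_compress(path, chunk=1 << 20):
    """Read the file and feed it through a streaming zlib compressor."""
    comp = zlib.compressobj()
    out = 0
    with open(path, "rb") as f:
        while block := f.read(chunk):
            out += len(comp.compress(block))
    out += len(comp.flush())
    return out


if __name__ == "__main__":
    # Hypothetical test file of roughly 17 MB of repetitive data.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"some repetitive sample data for timing\n" * 450_000)
        path = tmp.name
    try:
        for name, fn in [("baseline read", read_only), ("read + zlib", read_and_compress)]:
            _, wall, user, system = timed(fn, path)
            print(f"{name:14s} wall={wall:.3f}s user={user:.3f}s sys={system:.3f}s")
    finally:
        os.remove(path)
```

Subtracting the baseline row from the compressor row gives a rough estimate of the time spent in the compressor itself. Note that a file read twice in a row is usually served from the operating system's page cache the second time, so a fair comparison needs either a cold cache for every run or a warm cache for every run.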