About bzip2

bzip2 is a free and open-source lossless data compression algorithm and program developed by Julian Seward. Its first public release, version 0.15, appeared in July 1996. Its popularity grew over the following years, and bzip2 is now one of the standard tools for data compression.

Like many (though not all) data compression algorithms, bzip2 operates on data in blocks. The block size is selectable between 100 and 900 kB. Within each block, the Burrows-Wheeler transform rearranges the data so that frequently recurring character sequences become runs of identical letters; a move-to-front transform and Huffman coding are then applied.
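
The first two stages can be illustrated with a toy sketch in Python. This is not the code bzip2 itself uses: the function names bwt and move_to_front are hypothetical, the rotation sort is deliberately naive, and the run-length and Huffman stages are left out. It only shows how the Burrows-Wheeler transform groups identical letters and how move-to-front then turns those runs into small numbers.

```python
def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive Burrows-Wheeler transform (illustrative only, quadratic memory).

    Returns the last column of the sorted rotation matrix and the row
    at which the original input appears, which is needed to invert it.
    """
    n = len(data)
    # Sort the starting offsets of all rotations by the rotation contents.
    order = sorted(range(n), key=lambda i: data[i:] + data[:i])
    # The last byte of the rotation starting at i is the byte just before i.
    last_column = bytes(data[(i - 1) % n] for i in order)
    return last_column, order.index(0)


def move_to_front(data: bytes) -> list[int]:
    """Move-to-front transform over the 256 possible byte values."""
    table = list(range(256))
    output = []
    for byte in data:
        index = table.index(byte)
        output.append(index)
        # Move the byte just seen to the front of the table.
        table.pop(index)
        table.insert(0, byte)
    return output


transformed, row = bwt(b"banana")
print(transformed, row)            # b'nnbaaa' 3
print(move_to_front(transformed))  # [110, 0, 99, 99, 0, 0]
```

The repeated letters produced by the first stage show up as zeros after move-to-front, and a stream dominated by a few small values is what makes the final Huffman coding stage effective.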

The bzip2 algorithm is asymmetric: decompression is faster than compression. A modified version called pbzip2, created in 2003, uses multi-threading to achieve a nearly linear speedup on multi-CPU and multi-core systems.
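
Because blocks are compressed independently, the work can be spread across threads. The following is a minimal sketch of that general idea using Python's standard bz2 module, not a description of how pbzip2 is implemented. The function name parallel_compress and its parameters chunk_size and workers are assumptions made for illustration. Each chunk is compressed as a separate bzip2 stream and the streams are concatenated, which bz2.decompress accepts.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def parallel_compress(data: bytes, chunk_size: int = 900_000,
                      workers: int = 4, compresslevel: int = 9) -> bytes:
    """Compress chunks of data in parallel and concatenate the streams.

    compresslevel 1 to 9 selects a bzip2 block size of 100 to 900 kB.
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the streams stay in sequence.
        streams = list(pool.map(
            lambda chunk: bz2.compress(chunk, compresslevel), chunks))
    return b"".join(streams)


original = b"bzip2 compresses data in blocks. " * 50_000
compressed = parallel_compress(original, chunk_size=300_000)
# bz2.decompress handles several concatenated streams in one call.
assert bz2.decompress(compressed) == original
```

Splitting into more chunks than the serial compressor would use can cost a little compression ratio, since each stream carries its own header and starts with no context; any actual speedup depends on the machine and the data.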

bzip2 is only a compressor, not an archiver. It compresses a single file or stream, so the standard procedure for multiple files is to bundle them with the tar archiver first and then compress the resulting archive.
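
As a sketch of the archive-then-compress procedure, Python's standard tarfile module can write a tar archive through a bzip2 filter in one step using the "w:bz2" mode. The helper names create_tar_bz2 and list_tar_bz2 and the file names in the usage lines are hypothetical.

```python
import os
import tarfile


def create_tar_bz2(archive_path: str, source_paths: list[str]) -> None:
    """Bundle the given files into a tar archive compressed with bzip2."""
    with tarfile.open(archive_path, "w:bz2") as archive:
        for path in source_paths:
            # Store each entry under its base name rather than its full path.
            archive.add(path, arcname=os.path.basename(path))


def list_tar_bz2(archive_path: str) -> list[str]:
    """Return the member names stored in a bzip2-compressed tar archive."""
    with tarfile.open(archive_path, "r:bz2") as archive:
        return archive.getnames()


# Example usage, assuming notes.txt and data.csv exist in the current directory:
# create_tar_bz2("backup.tar.bz2", ["notes.txt", "data.csv"])
# print(list_tar_bz2("backup.tar.bz2"))
```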