BIRLE

BIRLE is a bijective Run Length Encoding (RLE) compressor implemented in Java.

Like BIAC, it is a port of a compressor that was originally written by Mark Nelson and then made bijective by David Scott.

I've modified it in such a way that it is no longer compatible with the original. I felt the original was not conservative enough - and had too great a tendency to cause expansion on perfectly normal files. As a consequence, I changed it to treat as a run anything starting with three repeated characters (rather than the original two).

This modification increased the total compression ratio on the corpus from 11.49% to 12.72%. Increasing the run threshold further appeared to have deleterious results overall.
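To make the effect of the run threshold concrete, here is a minimal sketch of a threshold-based RLE scheme. It is not BIRLE's actual format and it is not bijective: the class name, the THRESHOLD constant and the one-byte count layout are all assumptions made for illustration. Bytes are copied through unchanged until three equal bytes appear in a row; those three are then followed by a single byte giving the number of further repeats.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative only: a plain (non-bijective) threshold RLE, not BIRLE's real format.
class ThresholdRleSketch {
    // Assumed value: a run is anything starting with this many equal bytes.
    static final int THRESHOLD = 3;

    static byte[] encode(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < in.length) {
            byte b = in[i];
            int run = 1;
            // Measure the run, capped so the extra count fits in one byte.
            while (i + run < in.length && in[i + run] == b && run < THRESHOLD + 255) {
                run++;
            }
            if (run >= THRESHOLD) {
                // Emit THRESHOLD copies, then the number of additional repeats.
                for (int k = 0; k < THRESHOLD; k++) out.write(b);
                out.write(run - THRESHOLD);
            } else {
                // Short runs pass through untouched, so they never expand.
                for (int k = 0; k < run; k++) out.write(b);
            }
            i += run;
        }
        return out.toByteArray();
    }

    static byte[] decode(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < in.length) {
            byte b = in[i];
            int same = 1;
            while (same < THRESHOLD && i + same < in.length && in[i + same] == b) {
                same++;
            }
            for (int k = 0; k < same; k++) out.write(b);
            i += same;
            if (same == THRESHOLD) {
                // A count byte must follow; its absence shows this scheme is not bijective.
                if (i >= in.length) {
                    throw new IllegalArgumentException("missing run count");
                }
                int extra = in[i] & 0xFF;
                for (int k = 0; k < extra; k++) out.write(b);
                i++;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = "aaaaabcc".getBytes(StandardCharsets.ISO_8859_1);
        byte[] packed = encode(data);
        System.out.println(data.length + " -> " + packed.length + " bytes");
    }
}
```

In this sketch a run of exactly three bytes grows by one byte, while pairs such as "cc" are left alone. With a threshold of two, every doubled character would cost an extra byte, which is the kind of expansion on ordinary files that the change to three is meant to reduce.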

This code is intended to act as a pre-processor for other compression schemes. It leaves the overall format of most files more-or-less intact (and just eliminates runs) leaving the floor open for subsequent compressors of different types.

The improvement that can be expected is relatively small on normal files - but is increased if the files in question can be expected to contain many runs.

For example, BIAC reduces the files in the corpus from 3,312,291 bytes to 1,841,675 bytes - a reduction of some 44.39%.

If BIRLE is applied first, the reduction is to 1,821,284 bytes, making an overall compression ratio of 45.01%. The saving is 20,391 bytes - or 0.62% of the original total.
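The figures above can be checked with a few lines of arithmetic. The sketch below assumes that "compression ratio" means the percentage by which the total size is reduced; the class and method names are hypothetical, and only the byte counts come from the text.

```java
// Illustrative arithmetic only; the byte counts are those quoted in the text.
class CorpusRatioSketch {
    // Percentage by which 'compressed' is smaller than 'original'.
    static double reductionPercent(long original, long compressed) {
        return 100.0 * (original - compressed) / original;
    }

    public static void main(String[] args) {
        long corpus = 3_312_291L;        // uncompressed corpus size
        long biacOnly = 1_841_675L;      // BIAC alone
        long birleThenBiac = 1_821_284L; // BIRLE applied first, then BIAC

        long saved = biacOnly - birleThenBiac;
        System.out.printf("BIAC alone:      %.2f%%%n", reductionPercent(corpus, biacOnly));
        System.out.printf("BIRLE then BIAC: %.2f%%%n", reductionPercent(corpus, birleThenBiac));
        System.out.printf("Extra saving:    %d bytes (%.2f%% of the corpus)%n",
                saved, 100.0 * saved / corpus);
    }
}
```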

There is a GUI front end as well as a command-line interface.

Here's a snapshot of it in action:

Download the executable Jar file.

Download the Java source code.

Browse the source code.

Browse the associated javadoc.

Results on the Calgary Corpus test suite can be found here.




tim@tt1.org | http://mandala.co.uk/