Talk:Run-length encoding

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

LZ77 is NOT a mere generalization of RLE[edit]

LZ77 is much more powerful. LZ77 is capable of finding non-local repeats, and repeats do not have to be contiguous in the string either. LZ77 finds any repeating substrings, not just consecutive repeats. Moreover, LZ77 does not compress a string such as BWWBWWBWW... as a triplet (3,"BWW"), but rather as something like BWW,[-3,3],[-6,6],..., where [-3,3] reads "copy starting at current position minus three, for three characters long".

It eventually generates codes close to the entropy, but the process is quite different than simple RLE. — Preceding unsigned comment added by Steven.Pigeon (talkcontribs) 01:02, 15 July 2011 (UTC)[reply]

iiii[edit]

juste wanna know how to do if the image is like :
wkwkwkwkwkwkwkwkwkw
kwkwkwkwkwkwkwkwkwk
... —Preceding unsigned comment added by 82.238.53.157 (talk)

Define a new symbol: wk as % for example and then 15% —Preceding unsigned comment added by 88.232.103.154 (talk)

Sounds something like Lempel-Ziv-Welch alogoritm.. RLE was designed to be very simple to implement and should give some results on normal files. —Preceding unsigned comment added by 213.216.227.63 (talk)

Outputs needed[edit]

I cannot say what to do about the code, I'm a programmer, lot's of people don't know programming, but there has to be some way to explain for the common reader what an algorithm is... But output from executing the programs is heavily needed, in order to exhibit to ordinary people what the programs are supposed to do. ... said: Rursus (bork²) 12:06, 19 January 2009 (UTC)[reply]

Too much example code[edit]

Does this article really need examples in four different programming languages? Even taking into account imperative vs. functional languages, there are two more examples than are really needed. I would suggest moving the examples to Rosetta Code and just linking to them. If no one objects, I'll do that within the next few days. — DataWraith (talk) 13:37, 14 April 2009 (UTC)[reply]

Done. I kept the Java example here because it uses strings instead of a dictionay, which is closer to the way the article describes RLE. — DataWraith (talk) 10:20, 24 April 2009 (UTC)[reply]
I don't see why there is any language specific example, and Java is far from a ubiquitous language. If an implementation example is necessary, it should be in pseudocode. Spacexplosion[talk] 19:00, 28 July 2010 (UTC)[reply]

Usually also non-continuous blocks are marked[edit]

The efficiency on hardly compressed images is greatly improved if also non-continuous blocks are marked. For example windows bitmap uses this. Without it, random data size is getting bigger after compression. Ie. WWWABCDWWW would be coded as 3 times W, next 4 chars as displayed, 3 times W. So do not to code 3W1A1B1C1D3W, but instead 3W4ABCD3W —Preceding unsigned comment added by 213.216.227.63 (talk)

Assuming a color is in a single byte, and a number is in a single byte, it's impossible to use a standard unsigned number to store the numbers. This because the information, whether the numbers mean repeats or asdisplayed is needed. Your ASCII example is faulty for this reason. 121.111.40.101 (talk) 07:22, 20 March 2018 (UTC)[reply]

Visualization: Render sample image[edit]

It could be discussed if a color stream (BWWWWBBW) is more general than a typical bitstream (1000110); anyway, I'd suggest to visualize the bitstream by some (tiny) image (vertical line?). —Preceding unsigned comment added by 84.60.65.146 (talk) 15:53, 27 November 2009 (UTC)[reply]

Variations[edit]

RLE is a popular device to use during data compression, particularly in combination with transforms like the Burrows-Wheeler Transform, which has a tendency to produce long runs. However, the way runs are encoded varies broadly, with some designed to work better with specific additional transforms. One of the most common forms I've seen outputs single-letter runs as single letters, then replaces runs of two or more characters as two characters followed by the total run length. Other algorithms split the run length away from the main data stream, as the run length notation can interrupt the local character context and decrease compressibility. If the run length stream is separated from the main stream, different techniques have been used to compress this stream; for instance, some form of notation may be used to describe the number of bits needed to define a specific run length, followed by the value up to but excluding the most significant bit. A straightforward RLE like what's shown on this page, separating all characters into runs and outputting their run lengths is often weak for compression purposes, since it adds so many characters for single-letter runs. At the very least there should be some mention of common alternatives.

Storage[edit]


How are colors stored, and how are numbers stored? It needs to be in the article.

— when there are only 2 colors, black and white, an uncompressed format would store it in a bit. Will run–length use bytes for numbers, combining single bits and full bytes? (of course, storing pixels and runlengths separately will fix byte misalignment)
—— another form of run–length with 2 colors I've seen on the Internet is that a nibble stores the number (0—15) of black pixels, then next nibble stores white pixels, and it alternates. When there are more than 15 black pixels in a row, I suppose the format would be F0F0F0...   .
——— a Huffman code to store the number instead of a fixedsize nibble?
— when there are 256 colors, an uncompressed format would store it in a byte. If bytes are also used as numbers, you get the idea.