Talk:Fountain code

Fountain code received a peer review by Wikipedia editors, which is now archived. It may contain ideas you can use to improve this article.

Untitled[edit]

Regarding the practical considerations:

I am currently working on my implementation of Online Codes for Ruby (http://rubyforge.org/projects/archipelago/ contains a gem for that), and I am actually achieving an average overhead of about 8-10% for 2000 packets.

I admit that the speed of my implementation is far from good enough at the moment, but the data overhead is quite alright. Does anyone care to comment?

==========================[edit]

Sorry, but this is totally jargon-filled article. It is not suitable for an encyclopedia, but more for a specialized journal. I am an IT guy with tewnty years of experience and I cannot really understand the article (and I really want to!). I mean stuff like: "if and only if the BP algorithm over a check matrix H (or triangularization of a check matrix H) can recover most of input symbols." Oy!! Whut??

I understand that this is a complex subject, but please see if you can define many of the terms, or at least hyperlink them to where they might be understood by us amateurs. Rchakrav 16:58, 28 September 2007 (UTC)[reply]

Regarding the complaint for being too technical[edit]

The article explains quite well the basic properties of a Fountain code, including applications. There are not any technical or mathematical descriptions at all in the current state of the article. And I don't think it well help at all over-trivializing a topic; it just requires a bit of background knowledge to truly understand - and for this the article provides both external and internal references. But then, the OP was probably just trolling... Nageh (talk) 10:48, 14 February 2010 (UTC)[reply]

I'm going to prefix this by saying that if I'm completely off-base here, I apologize. I don't mean to sound aggressive, and I'm pretty new to Wikipedia editing. Please correct me if I am wrong.

I tried making this article a bit easier to understand, without sacrificing much accuracy (The last few edits, while not logged in). In my opinion, the current state of the article is pretty bad. I'll give some specific gripes:

"Fountain codes are flexibly applicable at a fixed code rate, or where a fixed code rate cannot be determined a priori, and where efficient encoding and decoding of large amounts of data is required."

This sentence uses some dense language to say very little. I'd translate it to something like "Fountain codes can be used with a fixed code rate, or one determined on the fly. Encoding and decoding are both fast enough to handle large workloads.". The sentence adds no substance to the article, though, so I'd remove it.

(The first example in Applications)

I tried to come up with a good example to illustrates where this type of encoding scheme would be well-suited, and I think I did a good job. The passive receiver / broadcasting transmitter example can illustrate both the advantages of fountain codes in a multicast scenario, and the advantages of fountain codes over a lossy wireless connection. The data carousel example is not very concrete, and needs to reference another article (data carousel) to be understood. The referenced article is also fairly low-quality (and has been tagged as such since 2007).

"[...]are a class of erasure codes with the property that a potentially limitless sequence of encoding symbols can be generated from a given set of source symbols such that the original source symbols can ideally be recovered from any subset of the encoding symbols of size equal to or only slightly larger than the number of source symbols."

I find this opening sentence pretty hard to parse. I need to leave, though, so I'll stop here. Thanks!

--Meyermagic (talk) 02:53, 20 February 2012 (UTC)[reply]

Always good to have feedback. Let's get through each of the issues you have brought up and your suggestions.

"Fountain codes are flexibly applicable at a fixed code rate, or where a fixed code rate cannot be determined a priori, and where efficient encoding and decoding of large amounts of data is required." I am not particularly happy with the wording, but it intends to convey two key advantages compared to ordinary FEC codes. First, that you do not have to predefine (or even dynamically select) a code rate, and that you can use them to encode large amounts of data (without breaking them up into smaller chunks). Note that you do not even need to select a code rate "on the fly" as once you have started out sending encoding symbols you can always and easily create new encoding symbols that can all be used by a receiver for decoding – that's why they are called rateless codes. While you may think that the sentence adds no substance, both a fixed code rate and the suitability for the encoding of only small objects are important limitations of ordinary FEC codes in the applications examples given below.
Concerning the first example in the Applications section, this is an important scenarios for multiple reasons. It is actually used in practice in commercial products (such as in DVB-H broadcast systems), it highlights the benefits of using fountain codes while pointing out both of the drawbacks of traditional FEC codes (their short block length and their fixed code rate), and avoids original research by not coming up with an own example. Now, let's have a look at the text you had suggested in your earlier edit. What you are discussing is the difference between traditional connection-oriented, ARQ-based file transfer and multicast transmission using a FEC code for error resilience. While it is true that in the FEC version of your example scenario you assume use of a fountain code, the example fails to contrast this with ordinary FEC codes. The problem of the latter is highlighted in the data carousel scenario given in detail in the Applications scenario, that is, the coupon collector's problem. I am sorry that the data carousel article is that bad and does little to explain the concept, but that should be motivation for improvement of the latter article.
Regarding the lead section, I agree with the sentiment that the first sentence is somewhat hard to parse and could need improvement. However, I do not agree that your proposed rewording provided clarification. Notably, it fails to stress the important concept of ratelessness. Also note that "encoding symbol" is standard terminology in coding theory, while a chunk refers to the parts of a larger object, each of which is to be separately encoded.

Gotta go now, too. I'll come back later. Nageh (talk) 10:29, 20 February 2012 (UTC)[reply]

Thanks for the friendly reply. You are correct in most of your criticisms of my suggestions, thanks for pointing all that out.

In regards to the introduction to the Applications section, I think now more than before that the first sentence is misleading. By saying that a fixed code rate can't be determined a priori, one might take it to mean that a fixed code rate could be determined a posteriori (taken here to mean after we've started encoding). This is not correct, as you pointed out. It isn't just that code rate is determined "on the fly" (as I wrote without thinking), but that the encoding system is inherently rateless.
As for my example, original research is a good point. I also didn't make it very clear how the example system would benefit from fountain codes, rather than traditional FEC (especially since you might have a pretty good idea of the error rate in advance, in the given scenario).
I considered that use of the word "chunk" might be misleading, you are probably right. I don't think my version does a poorer job of conveying the concept of ratelessness, though. Both versions note how collecting (one original message / number of source symbols)'s worth of encoding symbols or a slightly more allows you to retrieve the encoded information. I added information about how probability of a successful decoding changes with the number of collected symbols above the number of source symbols which, thinking about it now, I only know to be true specifically for Raptor Codes.

--Meyermagic (talk) 23:49, 20 February 2012 (UTC)[reply]

Help i am not understand my data can change. Mr.mark363 (talk) 12:04, 2 May 2019 (UTC)[reply]

Apparent conflict of interest re BitRipple[edit]

User:The mike luby added a link to BitRipple, where he says he is a cofounder and CEO. That seems like advertising his own company. 2001:14BB:6BA:95EA:E10A:5CF9:D186:7CC1 (talk) 22:48, 6 January 2021 (UTC)[reply]