Talk:Online algorithm

Super-online?

Is there a word for an online algorithm that has the additional property that partial results can be combined? That is, if I want to run an algorithm in parallel, I want to be able to combine partial results. For example, the sum of a set of numbers is the sum of the partial sums. —Ben FrantzDale 01:00, 7 December 2006 (UTC)

Hi Ben,
1). This might be a stretch of your question... If you had a program that could determine the competitiveness of a randomized algorithm against any adaptive online adversary on one processor and you used a different processor to find the competitiveness of the algorithm against any oblivious adversary, then you could combine these results using Thm 2.2 from:
S. Ben-David, A. Borodin, R. Karp, G. Tardos, A. Wigderson (1994). "On the Power of Randomization in On-line Algorithms". Algorithmica 11: 5–6.
Thm 2.2 states: If G is a c-competitive randomized algorithm against any adaptive online adversary, and there is a randomized d-competitive algorithm against any oblivious adversary, then G is a randomized (c * d)-competitive algorithm against any adaptive offline adversary.
2). Your question sounds more like a practical question than a theoretical one. If you can give an example of the kind of problem you are talking about, perhaps I can give additional information.
Hope this helps,
Oravec 04:03, 7 December 2006 (UTC)
A very slightly more complicated example is computing an average. You can keep track of a sum and a count, and can combine two partial results by adding the sums and the counts. A more complicated example is computing the standard deviation of a bunch of numbers. It is definitely possible to compute this as an online algorithm, as shown in Algorithms for calculating variance. The internal state of that algorithm is a count, the current mean, and the current sample variance. From that page, though, it isn't clear how you would combine partial results. Obviously it's easy to combine the sums and the means, but it isn't obvious how you'd combine the variances.
In general I'm just interested in O(N) algorithms that have O(1) internal state, but where the internal state for partial results can be combined, so that each of m processors could compute a partial result in O(N/m) time and those results could then be combined quickly to get the full solution. —Ben FrantzDale 05:01, 7 December 2006 (UTC)
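On combining the variances: if the single-pass state is kept as (count, mean, M2), as in Algorithms for calculating variance, two partial states can be merged with the standard pairwise/parallel variance formulas. A minimal sketch in Python (the function name and state layout are illustrative, not taken from that article):

def merge(state_a, state_b):
    """Merge two partial (count, mean, M2) states.
    The sample variance of the combined data is then M2 / (n - 1)."""
    n_a, mean_a, m2_a = state_a
    n_b, mean_b, m2_b = state_b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n                    # weighted combination of the two means
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n   # extra term corrects for the differing means
    return n, mean, m2

Each of the m processors runs the one-pass algorithm on its own chunk, and the m resulting states are merged in O(m) time, which matches the O(N/m)-per-processor scheme described above.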
In machine learning, this is quite a common paradigm for large-scale learning: you train a bunch of linear models in parallel, then average their parameters, weighted by the number of samples each learner got (or you train several random forests and concatenate them). I'm not aware of a generic term for this, though. Usually you'll hear this being called "distributed online learning" or something similar. QVVERTYVS (hm?) 10:14, 21 July 2013 (UTC)
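A minimal sketch in Python of the parameter-averaging step described above (illustrative only, not tied to any particular library; the function name and data layout are assumptions):

def average_parameters(partial_results):
    """partial_results: list of (weights, n_samples) pairs, where each
    weights entry is a list of model parameters of the same length.
    Returns the average of the weight vectors, weighted by sample count."""
    total = sum(n for _, n in partial_results)
    averaged = [0.0] * len(partial_results[0][0])
    for weights, n in partial_results:
        for i, w in enumerate(weights):
            averaged[i] += w * (n / total)
    return averaged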

Algorithms for variance

I had added "Online algorithms for calculating mean and variance" to the see-also section, but it got removed with a comment that "streaming algorithm" was a more appropriate place for it. While I'm not 100% clear on the difference between a streaming and an online algorithm, computing the mean and standard deviation of a sequence in a single pass seems like an ideal simple example of an online algorithm. At first glance it is obvious that the mean can be computed in a single pass with O(1) storage, but it isn't obvious that the same holds for the variance. Could someone explain why these are better examples of streaming algorithms than of online algorithms? —Ben FrantzDale (talk) 20:58, 30 April 2009 (UTC)
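For reference, the variance does fit in a single pass with O(1) storage; here is a minimal sketch in Python of the kind of update described in Algorithms for calculating variance (Welford-style; the function name and state layout are illustrative):

def update(state, x):
    """One online update step; state is (count, mean, M2).
    The running sample variance is M2 / (count - 1) for count > 1."""
    n, mean, m2 = state
    n += 1
    delta = x - mean
    mean += delta / n
    m2 += delta * (x - mean)   # uses both the old and the new mean
    return n, mean, m2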

Proposed merge with Streaming algorithm

Seems to describe the same class of algorithms. QVVERTYVS (hm?) 09:59, 21 July 2013 (UTC)

Then you don't understand what they are about, because they are very different. Online algorithms are about decision making in the face of uncertainty, and making a sequence of decisions that would not necessarily be globally optimal if you knew the whole input at once, but that are as close to optimal as you can make them with your limited knowledge. Streaming algorithms are about solving problems optimally (or close to optimally) when the whole input is given, but using sublinear memory to do it. —David Eppstein (talk) 16:31, 21 July 2013 (UTC)

Misleading title for the "Definition" Section

I think the title of that section should be replaced, though I'm not sure what a better title would be. The content of the section does not give a definition of what an online algorithm is (it does provide a definition of the term "competitive ratio"); rather, it discusses some (not necessarily the main) properties of some (not necessarily all) online algorithms. --Shay Zakov (talk) 08:08, 6 August 2020 (UTC)