Talk:Variation of information

Metric?

The given definition fails the first condition for being a metric, the identity of indiscernibles, which requires d(X,Y) > 0 for X != Y.

For instance, if X is 0 with probability 1/3 and 1 with probability 2/3, and Y is 1 - X, then clearly X != Y while VI(X,Y) = 0.
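
To spell out why VI vanishes here: with the definition VI(X,Y) = H(X|Y) + H(Y|X), the relation Y = 1 - X means each variable determines the other, so

H(X|Y) = 0 and H(Y|X) = 0, hence VI(X,Y) = 0 + 0 = 0 ,

even though X != Y as random variables.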

It is a metric on partitions, not on random variables. The variables X and Y induce the same partition and accordingly convey the same information. 2A02:1210:2642:4A00:8C85:4B63:B849:BF6D (talk) 08:09, 16 February 2024 (UTC)

Universal?

Even more, it is a universal metric, in that if any other distance measures two items as close-by, then the variation of information will also judge them close.

Seems doubtful. There must be some (or many) conditions on the "other" distance.

In any case, mathematics doesn't generally admit "universal" metrics. Topologies, Banach space norms, etc. can generally be coarser or finer without bound. The only "universal" metric in the sense defined above is the zero metric.

See also more extensive comments at Talk:Mutual_information#Distance_is_.22universal.22 where I've quoted from the original article. 129.132.211.9 (talk) 19:59, 22 November 2014 (UTC)

Probability measure

There was something wrong with the previous definition. It defined VI in terms of entropy concepts in other articles, but didn't draw attention to the fact that you have to use a uniform probability measure on the underlying set to make the other concepts useful. I wrote an independent definition in the spirit of the probability values p_i that were previously defined in the article, but never used. This required defining things for q_j as well and naming the overall set A. I hope the definition is okay. 178.38.151.183 (talk) 18:49, 29 November 2014 (UTC)
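
To make the uniform-measure definition concrete, here is a minimal sketch in Python (the function name and the intersection values r_ij are my own illustration, not taken from the article). With two partitions of a finite set A, p_i, q_j and r_ij are the relative sizes of the blocks X_i, Y_j and of their intersections, and VI = -sum_ij r_ij [log(r_ij/p_i) + log(r_ij/q_j)]:

  from math import log2

  def variation_of_information(part_X, part_Y, A):
      # VI between two partitions of the finite set A under the uniform measure.
      n = len(A)
      vi = 0.0
      for X_i in part_X:
          for Y_j in part_Y:
              p_i = len(X_i) / n         # relative size of block X_i
              q_j = len(Y_j) / n         # relative size of block Y_j
              r_ij = len(X_i & Y_j) / n  # relative size of their overlap (illustrative notation)
              if r_ij > 0:
                  vi -= r_ij * (log2(r_ij / p_i) + log2(r_ij / q_j))
      return vi

  # Example: two partitions of A = {0, ..., 5}
  A = set(range(6))
  print(variation_of_information([{0, 1, 2}, {3, 4, 5}], [{0, 1}, {2, 3}, {4, 5}], A))

Applied to the partitions induced by X and Y = 1 - X from the section above, the two partitions have the same blocks and the sketch returns 0, consistent with the metric-on-partitions remark there.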

Strange reference

As far as I know, this "variation of information" is usually called the Crutchfield information metric, having been proposed by Jim Crutchfield in 1989 (http://csc.ucdavis.edu/~cmg/compmech/pubs/IAIMTitlePage.htm). This article doesn't include this reference and instead, for some reason, references a 2003 ArXiv preprint that is about a different quantity. (The reference only briefly mentions the Crutchfield metric, giving a reference to the standard textbook by Cover and Thomas, but then simply says it is "not appropriate for our purposes.") The page also references, without a citation, a 1973 paper that appears to be about a completely different topic.

In short, the references for this article seem to be a random selection of irrelevant articles, and the obvious primary reference is not included. I hope that an expert (possibly me when I have time) can address these issues.

Nathaniel Virgo (talk) 05:43, 19 March 2018 (UTC)


Zurek would seem to precede Crutchfield. See [1] and [2]. Polnasam (talk) 21:13, 15 November 2019 (UTC)

In relation to the 1973 reference: it is actually an excellent reference, and it corrects my previous comment on who was first. I added a subsection in the definition that bridges the gap with that paper. — Preceding unsigned comment added by Polnasam (talk · contribs) 22:15, 16 November 2019 (UTC)

References

  1. ^ W. H. Zurek, Nature, vol. 341, p. 119 (1989)
  2. ^ W. H. Zurek, Physical Review A, vol. 40, p. 4731 (1989)

Correction

The key step in proving that the VI distance is a metric, namely the triangle inequality, follows from the chain

H(X|Z) ≤ H(X,Y|Z) = H(X|Y,Z) + H(Y|Z) ≤ H(X|Y) + H(Y|Z) .

where the steps are justified as follows:

1) Adding a variable cannot decrease the joint conditional entropy: H(X|Z) ≤ H(X,Y|Z).
2) The chain rule for conditional entropy: H(X,Y|Z) = H(X|Y,Z) + H(Y|Z); it appears in standard texts such as the Cover and Thomas book mentioned above.
3) Conditioning on fewer variables cannot decrease the entropy: H(X|Y,Z) ≤ H(X|Y).
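
By symmetry, the same chain with X and Z interchanged gives

H(Z|X) ≤ H(Z|Y) + H(Y|X) ,

and adding the two bounds yields the triangle inequality

VI(X,Z) = H(X|Z) + H(Z|X) ≤ H(X|Y) + H(Y|X) + H(Y|Z) + H(Z|Y) = VI(X,Y) + VI(Y,Z) .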

2A02:1210:2642:4A00:8C85:4B63:B849:BF6D (talk) 08:15, 16 February 2024 (UTC)