Talk:Fractal tree index

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

Notability and COI[edit]

I've added notability and COI tags to the article. I've been unable to find articles on this topic independent of the group that invented the data structure. Without independent, in-depth reliable sources per WP:RS, this article may fail to satisfy notability guidelines per WP:GNG. The COI tag comes from the article creator also being one of the inventors of the algorithm and also an author on two of the cited papers: "Cache-Oblivoius streaming B-trees" and "The TokuFS Streaming File System". Per WP:COI, this can lead to problems with the neutrality of the article. Editors knowledgable about the topic should review the content for neutrality--although without independent reliable sources to rely on, this may be difficult. Thanks, --Mark viking (talk) 18:24, 23 April 2014 (UTC)[reply]

I've removed the notability tag. Database indexing is an important topic, B-Trees and their variants are very widely used despite their limitations. A new technique that overcomes them is notable.

I too would like to see more independent sources. Paul Foxworthy (talk) 05:55, 1 April 2016 (UTC)[reply]

@Paul Foxworthy: That's not how notability works, please see WP:GNG for the criteria. I am restoring the notability tag. -- intgr [talk] 08:29, 1 April 2016 (UTC)[reply]
@Intgr:I respectfully leave your judgement there. However, the GNG guidelines say "Notability is a property of a subject and not of a Wikipedia article". I agree with that sentiment.Paul Foxworthy (talk)
@Paul Foxworthy: I don't see how that quote applies here. My point is, notability is established by the existence of independent reliable sources about the subject, not hand-wavy claims like "A new technique that overcomes them is notable". A common and useful way to demonstrate notability is adding citations to such sources to the article, but you could also simply list them on the talk page. -- intgr [talk] 07:20, 4 April 2016 (UTC)[reply]

Terse line maybe?[edit]

The article says "There are several choices for how the buffers are flushed, all leading to similar I/O complexity."

It seems you are hinting at whether all buffers in a node are flushed or only the fullest buffer is flushed, and comparing/contrasting the costs resulting due to one of the 2 decisions. Is that so? If that is the case, then it is totally not clear that it is what is meant. Some more elaboration would be super helpful! — Preceding unsigned comment added by Dhruvbird (talkcontribs) 02:06, 28 April 2014 (UTC)[reply]


There are a bunch of choices for how and when to flush: as you point out, you can flush only the fullest buffer or all. But you can also flush-on-full only or also flush-on-query. None of these change the asymptotic analysis. It's all an engineering choice, based on typical workloads. Does that help? Farach (talk) 20:30, 28 April 2014 (UTC)[reply]


Thanks! That makes sense. However, I would like to see some proof or a sketch (maybe hand-wavey) of why it doesn't affect the asymptotics and whether the constants involved are different for either case. Dhruvbird (talk) 21:03, 30 April 2014 (UTC)[reply]


This looks like a cache-oblivious data structure/(crystalized) algorithm[edit]

However, there are no links to https://en.wikipedia.org/wiki/Cache-oblivious_algorithm — Preceding unsigned comment added by Dhruvbird (talkcontribs) 15:42, 28 April 2014 (UTC)[reply]


Although the COLA is cache oblivious, the current version of the FTI is not. It is based on the B-epsilon tree, which explicitly codes for a block transfer size. Farach (talk) 20:28, 28 April 2014 (UTC)[reply]


Ah! Yes, I forget - thanks for the clarification. Dhruvbird (talk) 21:03, 30 April 2014 (UTC)[reply]

Refs. for why the current version was chosen over the COLA?[edit]

You mentioned in one of the previous comments that the FT was earlier implemented using a COLA, but subsequent implementations have since been based on the B-epsilon tree. Are there any references to why this decision was made? What were the tradeoffs and why the current choice is a better one? — Preceding unsigned comment added by Dhruvbird (talkcontribs) 04:10, 1 May 2014 (UTC)[reply]