Wikipedia talk:Size in volumes

From Wikipedia, the free encyclopedia

GB to B conversion?

In the "Assumptions", the article says "Same source shows 19.83 GB (=20,498,960 B)". But 19.83 GB = 21 292 300 370 B. Httqm (talk) 09:26, 25 August 2013 (UTC)[reply]

At the moment the period (US decimal point) has been turned into a comma. I haven't checked the numbers, but the comma makes it 1000 times larger. "Same source shows 23,836 GB (=24,408,064,000 bytes)" should be "Same source shows 23.836 GB (=24,408,064,000 bytes)". I have not corrected this; I leave it to someone who also wants to double-check the math. But the comma/period needs to be right. — Preceding unsigned comment added by Monica Anderson (talk • contribs) 09:16, 6 March 2016 (UTC)

 Done Rwessel (talk) 23:42, 6 March 2016 (UTC)[reply]
No, this still isn't correct. TimD1 (talk) 19:26, 8 March 2023 (UTC)[reply]
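For anyone double-checking the conversions in this thread, here is a minimal sketch. The helper name `to_bytes` and the constants are illustrative only. The last line is an assumption about how the quoted 24,408,064,000 could have been produced (mixing a factor of 1024 with factors of 1000); the article does not say how it was derived.

```python
GIB = 2**30   # binary gigabyte (gibibyte): 1,073,741,824 bytes
GB = 10**9    # decimal gigabyte: 1,000,000,000 bytes


def to_bytes(size, bytes_per_unit):
    """Convert a size expressed in some unit to a whole number of bytes."""
    return round(size * bytes_per_unit)


# 19.83 "GB" read as binary and as decimal gigabytes
binary_1983 = to_bytes(19.83, GIB)
decimal_1983 = to_bytes(19.83, GB)

# 23.836 "GB" read both ways
binary_23836 = to_bytes(23.836, GIB)
decimal_23836 = to_bytes(23.836, GB)

# Assumption: one mixed-unit route that reproduces the quoted byte count
mixed_23836 = 23836 * 1024 * 1000

if __name__ == "__main__":
    for name in ("binary_1983", "decimal_1983", "binary_23836",
                 "decimal_23836", "mixed_23836"):
        print(f"{name:>14}: {globals()[name]:,} bytes")
```

If the mixed-unit guess is right, the quoted byte count matches neither the binary nor the decimal reading of 23.836 GB, which may be why the figure still looks wrong.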

What about the images?

The size calculation appears to ignore images. You might want to add a note about that, or else do some additional calculations to include them. --Tagishsimon (talk) 17:45, 23 April 2008 (UTC)[reply]

No updates?

Am I doing something wrong, or has the statistics page at http://stats.wikimedia.org/EN/TablesWikipediaEN.htm seriously not been updated since Oct. 06? Is there a more updated version? 74.250.127.214 (talk) 10:00, 27 April 2008 (UTC)[reply]

No, it hasn't been updated for that long. Apparently the software used has trouble coping with database dumps above a certain size. However, the number of volumes is based on the live article count (which does update every time you look at the page) and the average number of words per article (which uses the somewhat old page you linked to).
Sanity check, Dec 2013: I counted the bytes in the decompressed current-articles dump: bzcat enwiki-20131202-pages-meta-current.xml.bz2 | wc -c gives 101657271884 bytes, or 94.7 gigabytes (binary). Of course there is some overhead in the dump (metadata). Still, the discrepancy with the count in the article is big. Erik Zachte (talk) 09:31, 4 December 2013 (UTC)
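The same byte count can be reproduced without the shell pipeline. This is a minimal sketch using Python's standard `bz2` module; the function names are illustrative, and the dump path passed in is whatever file you have downloaded.

```python
import bz2


def decompressed_size(path, chunk_size=1 << 20):
    """Return the number of bytes in the decompressed content of a .bz2 file.

    The file is streamed in chunks so the whole dump never sits in memory.
    """
    total = 0
    with bz2.open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def to_gib(n_bytes):
    """Express a byte count in binary gigabytes (2**30 bytes each)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    import sys
    size = decompressed_size(sys.argv[1])
    print(f"{size} bytes = {to_gib(size):.1f} GiB")
```

Note that this counts the XML dump as a whole, so it includes the same metadata overhead as the `wc -c` figure.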

Graphic incorrect

The graphic (as of Aug. 26, 2008) displays 627 while the live article count is 827. I would change the graphic if I knew how; does anybody have any thoughts on how to change the graphic to reflect the live article count? --HJKeats (talk) 14:57, 26 August 2008 (UTC)[reply]

Done. Tompw (talk) (review) 20:53, 26 August 2008 (UTC)[reply]

The graphic is incorrect again, as of June 21, 2009 the number of volumes calculated is 954 as noted in the article but the graphic shows 1154.--HJKeats (talk) 00:51, 22 June 2009 (UTC)[reply]

The graphic is still off by 200 volumes (August 19, 2009).--HJKeats (talk) 23:10, 19 August 2009 (UTC)[reply]

It's happened again: there should be 1419 volumes but there's only 1219. —Anonymous person on the internet (talk) 05:34, 29 July 2010 (UTC)[reply]

Filesize?

So Wikipedia is only 4.4GB?? That seems rather small. The entire thing could fit on an iPhone. --69.151.28.135 (talk) 05:23, 22 January 2009 (UTC)[reply]

I think that figure is just the text. Images would vastly increase it, as would storing all the page revisions. 79.64.251.192 (talk) 18:23, 24 January 2009 (UTC)[reply]

Inconsistency

This article says that it would take 910 articles, but the WP: size article says it takes 909. Which is correct??? Wise dude321 (talk) 16:28, 10 March 2009 (UTC)

909 1/2? Randy Kryn (talk) 21:48, 12 July 2021 (UTC)[reply]

update

We finally have an updated version of the stats to work with.[1] Geni 12:54, 15 April 2010 (UTC)

I've updated the number of words per article from 435 to 562 based on the new stats. The number of volumes has thus been increased by about 300. It is very nice to know that we have been writing longer articles over the last 4 years and not just creating millions of stubs! --Tango (talk) 11:25, 17 April 2010 (UTC)[reply]
An image estimating the size of a printed version of Wikipedia as of August 2007. (Up-to-date image using volumes of Encyclopædia Britannica)

Wikipedia:Size of Wikipedia shows this image, and I have added an explanation because the counting methods are different. The image is from 2007, and, using 400 pages per volume, printing the current Wikipedia would take about 2647 volumes (~15000 MB of text), almost double the 2007 size. I have contacted Nikola, but I think he is busy. If someone with SVG skills could update the image to 2647 volumes, that would be nice. Thanks. emijrp (talk) 10:10, 7 September 2010 (UTC)

Why is it showing 1587 and 1585.8 down below?

I understand that if you wanted to show the number of volumes you would need to hold WP, rather than the closest library size, you'd round up, but then it should be 1586. — Preceding unsigned comment added by 208.120.216.35 (talk) 21:20, 8 October 2011 (UTC)

How many PediaPress volumes?

PediaPress books.

I think the current statistic is pointless. We now have the possibility of printing collections of articles as PDFs or with PediaPress, including pictures, tables, etc. I've done several tests clicking on "Random article", and it seems that on average 408 articles can be printed in a 76 MB PDF of 800 pages; therefore, to print en.wiki you'd need 9,221 PDF files containing over 7.3 million pages and occupying more than 680 GB of disk space. Unfortunately, PediaPress's actual typesetting and layout is different, and that's why I'm here asking for some help. I know that the size of a PediaPress book is 216 mm x 140 mm, but I don't know the exact weight and thickness of a ~800-page book, or the average number of random articles it contains. With more details in our hands we'll be able to obtain fun stats like the shelf length needed, the total cost, the square meters of paper used, the time a single person needs to read all the books, and so on. We can even ask PediaPress for the amount of glue and ink required! Don't you think it would be nice to calculate how many PediaPress volumes we'd really need in order to print Wikipedia? --Bibliofanny (talk) 23:33, 8 October 2011 (UTC)

I did some checking, and 15 pages/mm is typical for a paperback textbook. Your figures imply 0.51 articles per page in October 2011. There are currently 6,820,444 articles, so the total thickness would be (6,820,444 articles) / (0.51 articles/page) / (15 pages/mm) ≈ 891,561 mm ≈ 892 m. So, we're talking nearly 0.9 kilometers of shelving!
In case you're wondering how that squares with the smaller amount depicted on this page, remember that Encyclopaedia Britannica volumes are 19.2 x 12.4 inches (238 sq inches), while PediaPress volumes are 8.5 x 5.5 inches (46.75 sq inches), roughly a fifth of the size. (And probably with larger type to boot.) Tompw (talk) (review) 02:07, 6 July 2013 (UTC)
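The PediaPress arithmetic in this thread can be checked with a short sketch. All inputs are the figures quoted above (408 articles per 800-page, 76 MB PDF; 15 pages/mm; the two page sizes in inches); the function and variable names are illustrative.

```python
def shelf_length_mm(articles, articles_per_page, pages_per_mm):
    """Total spine thickness, in millimetres, needed to shelve all articles."""
    return articles / articles_per_page / pages_per_mm


# From the random-article PDF test: 408 articles in 800 pages
articles_per_page = 408 / 800

# Shelf length for the article count quoted in the reply
length_mm = shelf_length_mm(6_820_444, articles_per_page, 15)

# Totals for the 9,221 PDF files of the October 2011 estimate
pdf_files = 9221
total_pages = pdf_files * 800
total_megabytes = pdf_files * 76

# Page area of a PediaPress volume relative to the quoted Britannica page size
area_ratio = (8.5 * 5.5) / (19.2 * 12.4)

if __name__ == "__main__":
    print(f"articles per page: {articles_per_page:.2f}")
    print(f"shelf length:      {length_mm / 1000:.1f} m")
    print(f"total pages:       {total_pages:,}")
    print(f"total size:        {total_megabytes:,} MB")
    print(f"page area ratio:   {area_ratio:.2f}")
```

The thickness figure of 15 pages/mm is the estimate given in the reply, not a PediaPress specification, so the result is only as good as that assumption.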

It's missing a stack

Yes it is. 1634 volumes, at 20 per shelf and 10 shelves per stack, should be 8 complete stacks, plus one partial one (of 34 volumes). I have contacted the author of the graphic with a proposed correction. Rwessel (talk) 21:15, 16 February 2012 (UTC)[reply]
Per the discussion at Wikipedia:Help desk/Archives/2012 February 29 #Missing stack in User:Tompw/bookshelf, Wikipedia:Size in volumes etc., I have gone ahead and made the above changes. If anyone notices any further discrepancies, please send me a note on my talk page. Rwessel (talk) 03:50, 8 March 2012 (UTC)[reply]
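As a quick check of the stack count discussed above, here is a minimal sketch assuming the layout stated in the thread (20 volumes per shelf, 10 shelves per stack). The function name is illustrative and this is not the template code used for the graphic.

```python
import math

VOLUMES_PER_SHELF = 20
SHELVES_PER_STACK = 10


def stack_layout(volumes):
    """Return (complete stacks, volumes in the partial stack).

    A fractional volume count is rounded up, since part of a volume
    still occupies a whole book on the shelf.
    """
    whole_volumes = math.ceil(volumes)
    per_stack = VOLUMES_PER_SHELF * SHELVES_PER_STACK
    return divmod(whole_volumes, per_stack)


if __name__ == "__main__":
    full, partial = stack_layout(1634)
    print(f"{full} complete stacks plus a partial stack of {partial} volumes")
```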

Volume size compared to the human figure

Volumes of the Encyclopædia Britannica are roughly twice as high as a human hand. Either the depicted guy has really big hands or the volumes are paperback size. SpeakFree (talk)(contribs) 20:04, 4 April 2012 (UTC)[reply]

EB volumes are about 11.25 inches tall, so if they're two of your hands, you've got really small hands.  ;-) The figure is a bit large (I'd guess about 6' 3" or 6' 4" - the thickness of the "shelf" is not well defined, but assuming 3/4 inch, and minimal unused vertical space, that's his approximate height), and his hand does appear a bit large (say compared to his head). But it's not all that far off. FWIW, I'm 5' 10" or so, and my hand is just a hair under eight inches (or 71% of the height of an EB volume) from the heel of my wrist to the tip of my middle finger. Scaling him down a bit would be trivial, but I'm not sure to what. As I mentioned, the height of a shelf is ill defined, but scaling him to something closer to 5' 10" would make him rather more average in size. FWIW, if we assume the shelves are of zero thickness, he does actually work out close to 5' 10". Rwessel (talk) 05:19, 5 April 2012 (UTC)[reply]
I have no idea what I assumed for his height - it's been too long. He's 140px wide, which makes him 140*720/325 = 310px tall. The books are 35px tall, and nominally 25cm. So, 1 pixel = 25/37 = 0.67cm, making him 209cm = 6ft 10 inches tall! (The shelves are 3px = 2cm = 0.8 inches). I will adjust him down a bit. Tompw (talk) (review) 02:25, 6 July 2013 (UTC)[reply]
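The scaling of the figure can be worked through as a sketch. The comment above gives the book height as 35px but then divides 25cm by 37, and it is not clear which was intended, so both are computed here. The names are illustrative.

```python
CM_PER_INCH = 2.54


def figure_height_px(width_px, aspect_height=720, aspect_width=325):
    """Height of the figure in pixels, from its width and aspect ratio."""
    return width_px * aspect_height / aspect_width


def to_feet_inches(cm):
    """Convert centimetres to (whole feet, remaining inches)."""
    total_inches = cm / CM_PER_INCH
    feet = int(total_inches // 12)
    return feet, total_inches - 12 * feet


height_px = figure_height_px(140)

# A nominal 25 cm book drawn 37 px tall (the divisor used in the comment)
height_cm_37 = height_px * 25 / 37
# The same book drawn 35 px tall (the book height stated in the comment)
height_cm_35 = height_px * 25 / 35

# Hand length relative to an 11.25 inch volume, from the earlier reply
hand_ratio = 8 / 11.25

if __name__ == "__main__":
    for label, cm in (("37 px", height_cm_37), ("35 px", height_cm_35)):
        feet, inches = to_feet_inches(cm)
        print(f"book = {label}: {cm:.0f} cm = {feet} ft {inches:.1f} in")
    print(f"hand / volume height: {hand_ratio:.0%}")
```

Either way the figure comes out well over two metres tall, which supports scaling him down.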

Rate of increase?

How long does it take to add a "volume" roughly? Thanks! Woz2 (talk) 20:15, 6 April 2012 (UTC)[reply]

Found it! Looks like ~30M words/month, or ~22 "volumes"/month, assuming 1.3M words/volume. Woz2 (talk) 20:24, 6 April 2012 (UTC)

Drivel

The reason Wikipedia is so voluminous is that over 90% of it is trivial drivel. Wasp14 (talk) 14:12, 29 June 2012 (UTC)

Up to 98.3% of it could be trivial drivel before it becomes shorter than Britannica. — Preceding unsigned comment
If we got rid of everything but the start-class or better articles, we'd still have ~1.5 million articles. If we went further, and in addition to these, got rid of all the low-priority and unknown-priority articles, we'd still have about half a million articles of start-class or better and mid-importance or better. Oddly enough, this pretty much matches Wasp14's 10%-good figure, given the roughly 4.8 million articles we have now. Even then, we'd have 200 volumes of at least decent-quality articles about reasonably important subjects. In reality, not all articles marked as stubs or unassessed are actually poor-quality, and not all articles marked as low-importance or less are unimportant, and even material such as the many gazetteer-style place articles and trivia articles still has value. -- Impsswoon (talk) 19:18, 24 December 2014 (UTC)[reply]

Size in nickel discs

Here they say 270 6-inch nickel discs, each about 1mm thick.[2] --Nemo 18:35, 17 November 2012 (UTC)[reply]

Better reality check

I'd like to know the size of WP without the topic areas that Encyclopædia Britannica doesn't cover, like every TV show episode, every pro sports team and game ever played, and every trivial two-shop "chain" store (I'm looking at you, Stiletto Spy School). --Lexein (talk) 19:31, 20 August 2013 (UTC)

Well, to start with we would need a robust definition of "topic areas that Encyclopædia Britannica covers". Geni (talk) 07:12, 25 August 2013 (UTC)

Size of bookshelves display

As the number of bookshelves in the display is going to hit eleven in the next couple of days, I think it would be a good idea to make the graphic a bit smaller, especially since the last bookshelf is already scrolled way off to the right.

I’ve started a discussion at User_talk:Tompw/bookshelf#Size_of_bookshelves_display. Rwessel (talk) 19:03, 20 May 2014 (UTC)[reply]

Is "This" not a citation?

Why does the calculation of 590 words per article have 'citation needed', when the line begins with a pointer to the source of the data? — Preceding unsigned comment added by 80.7.233.136 (talk) 10:12, 16 December 2014 (UTC)[reply]

I'm guessing the problem is that the "Words" column here only goes up to January 2010. The article says "2,714 million words" without explaining where this data came from. I'll change the template a bit. 89.73.211.125 (talk) 17:33, 23 December 2014 (UTC)[reply]

Alternative attempt

7,600 volumes although they use a different size of volume and I think include images and tables.©Geni (talk) 15:46, 17 June 2015 (UTC)[reply]

He must be including category pages, dabs and whatnot - he cites 11.5 million articles. It's likely if he's printing these that he's getting considerable extra space from all the images in articles. Rwessel (talk) 17:44, 17 June 2015 (UTC)[reply]
Also gaining some space from including a table of contents and an author list.©Geni (talk) 19:49, 17 June 2015 (UTC)[reply]

There is a bug in the code

It shows 2,239.3 but there's a 2,241st book on the right. Sagittarian Milky Way (talk) 02:05, 21 January 2016 (UTC)[reply]

I've noticed that before. When there's an exactly full shelf (and ceil(2239.3) exactly fills the last shelf), you get an extra volume on the shelf above. I think it's the list #ifeq in User:Tompw/bookshelf/row20v (the "=0" one). Empty shelves are dealt with by the caller, so this appears to confuse the cases of 0 and 20 books on the shelf (and the latter will never happen because of the full-shelf logic in the caller). It should just be a matter of removing those two lines. I've gone ahead and made the change, but we need to keep an eye on this to make sure it handles the transition to 2241 volumes correctly. Rwessel (talk) 11:24, 21 January 2016 (UTC)
It appears to be correct now. Sagittarian Milky Way (talk) 04:41, 25 January 2016 (UTC)[reply]
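The boundary case described above (a volume count that exactly fills the last shelf) can be illustrated with a small sketch. This is a hypothetical Python analogue, not the parser-function code in User:Tompw/bookshelf/row20v; it only shows the intended behaviour, where a remainder of zero adds no extra shelf or volume.

```python
import math

VOLUMES_PER_SHELF = 20


def shelf_counts(volumes):
    """Return the number of volumes on each shelf, full shelves first."""
    whole_volumes = math.ceil(volumes)
    full_shelves, remainder = divmod(whole_volumes, VOLUMES_PER_SHELF)
    counts = [VOLUMES_PER_SHELF] * full_shelves
    if remainder:
        # Only add a partial shelf when something is actually left over;
        # a remainder of 0 must not be treated as a shelf of 20.
        counts.append(remainder)
    return counts


if __name__ == "__main__":
    for n in (2239.3, 2241):
        counts = shelf_counts(n)
        print(f"{n}: {len(counts)} shelves, {sum(counts)} volumes, "
              f"last shelf holds {counts[-1]}")
```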

Word count

How is the word count done? Is there any real statistic about this? -Theklan (talk) 11:46, 26 August 2017 (UTC)[reply]

New(?) stat: Words in all content pages

It seems that there is a new statistics field in Special:Statistics — Words in all content pages: 3 248 745 364. Dividing that by article count gives 566 (3248745364/5735447). Which is less than current estimated number: 640. --Papuass (talk) 07:48, 16 October 2018 (UTC)[reply]

Updating the number: 3288594906/5769742=570. --Papuass (talk) 12:20, 21 December 2018 (UTC)[reply]
One more update: 3347212525/5836517=573. I know it is not comfortable to go to a lower number (from current estimate used) :-) --Papuass (talk) 09:55, 4 April 2019 (UTC)[reply]
However unless there are objections I'll switch over to that once I work out how.©Geni (talk) 11:15, 1 August 2019 (UTC)[reply]
Nice! This helps counter the death of stats:EN/TablesDatabaseWords.htm. It seems we can thank mw:Extension:CirrusSearch for this (gerrit:392471), so the exact calculation method for the number of words is presumably found in ElasticSearch code. Previously discussed at m:Community Wishlist Survey 2017/Miscellaneous/Word count on statistics and mw:Analytics/Wikistats/DumpReports/Future_per_report. Nemo 12:04, 1 August 2019 (UTC)[reply]

I've updated it. The relevant number is at User:Tompw/bookshelf/volumes. In the long term I'd like to pull the whole thing out of userspace, but that would involve me understanding it better than I currently do. ©Geni (talk) 23:20, 6 August 2019 (UTC)
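The divisions quoted in this thread can be reproduced with a minimal sketch. The inputs are the Special:Statistics figures given in the comments above; the function name and the `samples` dictionary are illustrative.

```python
def words_per_article(total_words, article_count):
    """Average number of words per article."""
    return total_words / article_count


# (total words, article count) as quoted on each date in the thread
samples = {
    "2018-10-16": (3_248_745_364, 5_735_447),
    "2018-12-21": (3_288_594_906, 5_769_742),
    "2019-04-04": (3_347_212_525, 5_836_517),
}

averages = {date: words_per_article(*pair) for date, pair in samples.items()}

if __name__ == "__main__":
    for date, average in averages.items():
        print(f"{date}: {average:.1f} words per article")
```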

Starting with the reasoning

  • On 6 August 2019, Special:Statistics showed 3,417,737,301 words across 5,903,983 articles, implying an average of 579 words per article.
  • This showed that in January 2010 (the most recent statistics) there were 14 GB (=15,032,385,536 bytes) across 1,798 million words, implying 8.4 bytes/word. ASCII uses 1 byte/character, which in turn implies 8.4 characters/word. However, this includes wikimarkup, and 5 characters/word plus one for the space is standard, so 6 characters/word will be assumed.
  • There are currently 6,820,444 articles, which means 3,949,037,076 words, which means 2.3694222456×10^10 characters.
  • One volume: 25cm high, 5cm thick. 500 leaves, 2 pagefaces per leaf, two columns per pageface, 80 rows/column, 50 characters per row. So one volume = 8,000,000 characters, or 1,333,333 words, or 2,302.8 articles. (Pictures not included!)
  • Thus, the text of the English Wikipedia is currently equivalent to 2,961.8 volumes of the Encyclopædia Britannica.
  • Sanity check: Encyclopædia Britannica has 44 million words across 32 volumes, or 1,375,000 words per volume. This would imply 2,872 volumes for WP.
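The chain of multiplications in the list above can be sketched as follows. The inputs are the figures from the list (579 words per article, 6 characters per word, the stated volume layout); the function names are illustrative. The older estimate of 640 words per article mentioned elsewhere on this page is included for comparison, since the result is quite sensitive to that input.

```python
CHARS_PER_WORD = 6  # 5 characters plus one space, as assumed in the list


def chars_per_volume(leaves=500, faces_per_leaf=2, columns_per_face=2,
                     rows_per_column=80, chars_per_row=50):
    """Character capacity of one volume under the stated page layout."""
    return (leaves * faces_per_leaf * columns_per_face
            * rows_per_column * chars_per_row)


def volumes_needed(articles, words_per_article):
    """Number of volumes needed for the text of all articles."""
    characters = articles * words_per_article * CHARS_PER_WORD
    return characters / chars_per_volume()


ARTICLES = 6_820_444

total_words = ARTICLES * 579
volumes_579 = volumes_needed(ARTICLES, 579)
volumes_640 = volumes_needed(ARTICLES, 640)   # older per-article estimate

# Sanity check: Britannica packs 44 million words into 32 volumes
britannica_words_per_volume = 44_000_000 / 32
volumes_by_word_count = total_words / britannica_words_per_volume

# Bytes per word in the January 2010 statistics
bytes_per_word_2010 = 14 * 2**30 / 1_798_000_000

if __name__ == "__main__":
    print(f"characters per volume:        {chars_per_volume():,}")
    print(f"total words:                  {total_words:,}")
    print(f"volumes at 579 words/article: {volumes_579:.1f}")
    print(f"volumes at 640 words/article: {volumes_640:.1f}")
    print(f"volumes by Britannica words:  {volumes_by_word_count:.0f}")
    print(f"bytes per word (Jan 2010):    {bytes_per_word_2010:.2f}")
```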

Real number of words

We can know it: Special:Statistics. It is currently 4,221,575,110, notably higher than the 4.0961642378×10^9 estimate. Theklan (talk) 06:19, 19 October 2022 (UTC)