Wikipedia:Reference desk/Archives/Computing/2013 June 11

Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


June 11

PRISM stuff

If PRISM is truly a thing that can see what I'm doing, will it affect me, given that I currently live outside the US and am not a US citizen? They said that PRISM is tapping the internet backbone. If PRISM is real, and it spied on me, how can I protect myself? The worst thing would be to search for how to make a homemade bomb and get a Predator drone sniping me with an electrolaser from afar. 118.136.5.235 (talk) 09:08, 11 June 2013 (UTC)

Any email or phone protection you apply would have to be applied at both ends. If you use disposable SIM cards, they will still have access to your call if the person you call doesn't take the same precautions at their end. The same goes for email: disposable email addresses are pointless unless your contacts use them too.
Safest options: (1) Learn to live without human contact in a cave. (2) Wear a tinfoil hat and hope it all goes away.
The latter half of your message indicates option 2 is best in this case. Jenova20 (email) 09:14, 11 June 2013 (UTC)
Don't worry, I'm not that paranoid... but will Tor or a VPN work? 118.136.5.235 (talk) 10:49, 11 June 2013 (UTC)
Probably just asking the question is enough to get the CIA agents in Jakarta interested! -- Q Chris (talk) 11:07, 11 June 2013 (UTC)
The person who leaked the information on PRISM claimed that if these Men in Black were really interested in you, they would bug your computer. If that's the case, then Tor will be useless.
The safest option is to keep your secrets in your head. Written or computerised records are easier to find.
Thanks Jenova20 (email) 11:09, 11 June 2013 (UTC)
The people they really should chase are the terrorists who send coded messages to each other using edits that look like vandalism on Wikipedia. Other coded messages are sent to groups of them in spam messages, so there's no way of knowing which recipient is an actual member. If these miscreants were taken out by drones, I'm sure we'd all be very grateful to the security services. ;-) Dmcq (talk) 18:30, 11 June 2013 (UTC)
I'm going to stick with option 2. ;) --Yellow1996 (talk) 01:24, 12 June 2013 (UTC)
Well, this question got a virtual non-answer. It's also a virtual non-question, so not a problem. But so far very few people without clearance are talking about it. What's most alarming is that these programs violate what were the assumed limits of what was otherwise understood to be an extensive spying apparatus. It does nobody any favors to have glib one-off answers like Jenova's here. If you don't need privacy, then leave the bathroom door open to your stall in the airport. Shadowjams (talk) 11:44, 12 June 2013 (UTC)
Eh? There's very little verifiable information about this whole thing. So if you have some better advice about what PRISM is actually for and doing, then you should give it to the OP instead of just standing on the sidelines criticising others. One-off answers about unlocked bathroom doors and airport stalls do no one any good. Thanks Jenova20 (email) 13:46, 12 June 2013 (UTC)
We can't really answer the question with what's publicly known right now; it's all just speculation about how extensive the program and others like it are. Maybe more will emerge in time. Shadowjams (talk) 04:00, 13 June 2013 (UTC)

C++ bignum library in Quincy

I am trying to use the arbitrary precision math library from http://www.hvks.com/Numerical/arbitrary_precision.html in Quincy (see http://www.codecutter.net/tools/quincy/). I have written the following code to test the library:

#include <iprecision>
#include <iostream>

using namespace std;

int main()
{
    int_precision i1(1);
    int_precision i2(4);
    int_precision i3;

    i3 = i1 + i2;
    cout << "i3 is " << i3 << endl;
    return 0;
}

However, this does not compile. It seems that the iprecision library is not found. Any ideas how that could be resolved? What confuses me the most is that in Quincy I can specify the path of includes and libraries via Tools/Options. I placed the iprecision.h file in the include folder, where some other header files distributed with Quincy are also located, and that is the folder specified as the location of the includes in the options. Any ideas how I could get that library to work in Quincy? -- Toshio Yamaguchi 12:22, 11 June 2013 (UTC)

It doesn't compile or it doesn't link? 87.115.32.177 (talk) 12:37, 11 June 2013 (UTC)
Does not compile. -- Toshio Yamaguchi 12:41, 11 June 2013 (UTC)
Could you please paste in a copy of the compiler error? 209.131.76.183 (talk) 12:45, 11 June 2013 (UTC)

C:\Program Files (x86)\quincy\mingw\bin\gcc.exe -fno-rtti -fno-exceptions -std=c++98 -pedantic-errors -Wno-long-long -Wno-write-strings -gstabs -|C:\Program Files (x86)\quincy\include\\"-|"C:\Program Files (x86)\quincy\mingw\include\\"-|"C:\Program Files (x86)\quincy\include" -o "c:\users\....\iprecision_test.o" -c "c:\users\....\desktop\c++_projects\precision_test.cpp"
c:\users\....\precision_test.cpp:1:22: error: iprecision: No such file or directory
c:\users\....\precision_test.cpp:5: error: 'int_precision' does not name a type
c:\users\....\precision_test.cpp:6: error: 'int_precision' does not name a type
c:\users\....\precision_test.cpp:7: error: 'int_precision' does not name a type
c:\users\....\precision_test.cpp: In function 'int main()':
c:\users\....\precision_test.cpp:11: error: 'i3' was not declared in this scope
c:\users\....\precision_test.cpp:11: error: 'i1' was not declared in this scope
c:\users\....\precision_test.cpp:11: error: 'i2' was not declared in this scope

-- Toshio Yamaguchi 13:07, 11 June 2013 (UTC)

Unless something has gone wrong when you cut-and-pasted the error, it looks like the command line you're passing to gcc is using vertical bars ("pipe" characters, ASCII 0x7C) rather than the capital letter I (ASCII 0x49) for the include directives (that is, it should be -I"C:\Program Files (x86)\quincy\include", not -|"C:\Program Files (x86)\quincy\include"). -- Finlay McWalterTalk 13:37, 11 June 2013 (UTC)
Quincy does that automatically when compiling a program. -- Toshio Yamaguchi 15:40, 11 June 2013 (UTC)
Check your project or compiler settings. Perhaps somebody has misconfigured the compiler flags. The compiler is gcc, and it expects include directories to be passed with -I. Quincy is passing the wrong flag to gcc, and that's where your error is coming from. Nimur (talk) 15:45, 11 June 2013 (UTC)
(ec) That doesn't make it right. What does -| mean? I think it doesn't mean anything, and it's the root of your problem. -- Finlay McWalterTalk

Can someone download the math library, copy the program code, and check whether they can get it to work? If the problem really is with Quincy, then I have no idea how to fix it (I have successfully compiled some other programs in Quincy). I also have Wascana, and #include <iprecision> again produces a No such file or directory error there. Where does this library have to be located? (I am on a Windows machine.) -- Toshio Yamaguchi 16:51, 11 June 2013 (UTC)

Again, the problem isn't that the file is located in the wrong place; it's that the IDE is sending the wrong options to the compiler. Either fix the IDE project or just stop using an IDE altogether and compile with a batch file. Passing the correct options to the compiler is a central concern of C/C++ development, one you need to take ownership of. -- Finlay McWalterTalk 16:58, 11 June 2013 (UTC)
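For reference, a minimal batch file along those lines might look like the following. This is only a sketch: the paths mirror the ones in the error output above, the output file name is arbitrary, and it assumes Quincy's MinGW ships g++.exe alongside gcc.exe.

rem build.bat - hypothetical sketch: compile and link precision_test.cpp with MinGW
set QUINCY=C:\Program Files (x86)\quincy
"%QUINCY%\mingw\bin\g++.exe" -I"%QUINCY%\include" -o precision_test.exe precision_test.cpp

Note the capital -I: that is the flag gcc and g++ actually understand for adding an include search directory, and it is what the -| in the failing command line above should have been.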
I think I will try MinGW-w64 & Metapad. -- Toshio Yamaguchi 20:24, 11 June 2013 (UTC)

Intel 5000 GPU equivalent?

What would be the Nvidia or Radeon equivalent of the new Intel 5000 series GPUs? --76.94.124.25 (talk) 18:27, 11 June 2013 (UTC)

Nothing's "equivalent" except two units of the exact same product model! You need to compare different types of GPU along a few different axes: capabilities (supported APIs and features); performance (measured by benchmark); power efficiency (... which depends very much on the system as a whole); cost.
Start by looking at the feature-set of Intel HD Graphics systems: Supported graphics APIs and features on current Intel products. The 5000 series supports DirectX 11.1, OpenGL 4, OpenCL 1.2; so you can compare these to capabilities of comparable discrete graphics cards from other vendors.
Intel's products also support Intel-specific technologies, like Intel Wireless Display, accelerated video processing, ...
Nvidia GPUs - including new notebook products - support CUDA, so if this capability is required, there is no substitute.
Also consider the power consumption, the cost, and the total system performance, as measured by your favorite benchmark. Nimur (talk) 19:03, 11 June 2013 (UTC)[reply]
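If you want to check programmatically what a particular GPU actually reports, one option is to enumerate devices through OpenCL, which the HD 5000 series and comparable discrete cards all expose. A minimal sketch, assuming an OpenCL SDK is installed and you link against the OpenCL library (e.g. g++ gpu_caps.cpp -lOpenCL):

// gpu_caps.cpp - sketch: list OpenCL GPU devices and a few reported capabilities
#include <CL/cl.h>
#include <cstdio>

int main() {
    cl_platform_id platforms[8];
    cl_uint num_platforms = 0;
    if (clGetPlatformIDs(8, platforms, &num_platforms) != CL_SUCCESS) {
        printf("No OpenCL platforms found.\n");
        return 1;
    }
    for (cl_uint p = 0; p < num_platforms; ++p) {
        cl_device_id devices[8];
        cl_uint num_devices = 0;
        // Ask this platform for GPU devices only; skip it if it has none.
        if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_GPU, 8,
                           devices, &num_devices) != CL_SUCCESS)
            continue;
        for (cl_uint d = 0; d < num_devices; ++d) {
            char name[256] = {0};
            char version[256] = {0};
            cl_ulong mem = 0;
            clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof name, name, NULL);
            clGetDeviceInfo(devices[d], CL_DEVICE_VERSION, sizeof version, version, NULL);
            clGetDeviceInfo(devices[d], CL_DEVICE_GLOBAL_MEM_SIZE, sizeof mem, &mem, NULL);
            printf("%s | %s | %llu MB global memory\n",
                   name, version, (unsigned long long)(mem >> 20));
        }
    }
    return 0;
}

The CL_DEVICE_VERSION string tells you which OpenCL version each device supports, which is one concrete axis for the comparison above.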

PRISM data storage

On TV, I think I heard that the PRISM (surveillance program) storage facility in Utah can store 5 zettabytes of data. That seems unreasonably large - it would cost billions for the hard drives alone. Was it exabytes or something else? Bubba73 You talkin' to me? 18:38, 11 June 2013 (UTC)

I realized that they might not have it all on hard drives. It may be stored on something else and indexed so they can get at it if they need to. Bubba73 You talkin' to me? 19:21, 11 June 2013 (UTC)
This article[1] says that "a storage facility capable of holding five zeta bytes of data was constructed at Bluffdale, Utah - the Bluffdale Data Centre", so I think you heard right. It doesn't say what sort of storage they would be using. A Quest For Knowledge (talk) 19:30, 11 June 2013 (UTC)
There's also a cite for the zettabyte figure in the zettabyte article, which Bubba linked us to. Dismas|(talk) 19:42, 11 June 2013 (UTC)
The Utah Data Center article, though, cites exabytes. Dismas|(talk) 19:47, 11 June 2013 (UTC)
I've read these claims before and they seem questionable to me, at least the zettabyte one. [2] says 300,000 petabytes will ship in 2014. This says 350,000 petabytes of HDDs shipped in 2011 [3]. While HDD shipments have been decreasing, I'm not sure if total petabytes have been, or whether these figures just reflect different estimation methodologies (the text of the first article suggests it's probably the latter), but I think it's safe to say it was unlikely to ever be over 500,000 petabytes. If the Utah data centre had 5 zettabytes of capacity, this means it had more than 10 times the total capacity of a single year's shipments. It's difficult to imagine HDD vendors were hiding that much capacity. As Bubba hinted at, with 5 terabyte HDDs it also implies about 1 billion HDDs for that data centre. And if not HDDs, then what? Clearly not SSDs. Tapes? [4] from 2011 mentions a 1 exabyte library, although if you read the article it's only 500 PB (1 EB assumes a compression ratio of 2:1) using a 'revolutionary' 5 TB tape and 100k cartridges. So again you'd need 1 billion cartridges, or 1,000 full versions of these libraries. I didn't find tape shipments per year, but I did find [5], who make a big deal of shipping 550 PB in a 6-month period. Even if they're a small player in the market, it's difficult to imagine the market is much more than 100x this size, meaning we're still talking about 5 times the yearly market at a minimum for this one data centre. Of course, it's also possible this 5 zettabyte figure was highly misleading and was actually assuming a compression ratio of 10 times or more, which would make the figure slightly more realistic.
Yes, 5 exabytes I can believe. 5 zettabytes is 5 billion terabytes. Bubba73 You talkin' to me? 02:26, 12 June 2013 (UTC)
5 exabytes is more plausible, but it still implies the US may have used 1% of the HDD shipments for a year (or less, spread out over several years) for a single data centre. And this is also about the same as the total disk storage systems capacity in 2010, according to this estimate [6].
BTW, while researching this, although I didn't look for anything related to the NSA or the data centre, I came across [7] from 2009 about the same data centre, where they suggest plans for 1 yottabyte! It seems to me these figures could easily be misinformation or wishful thinking.
Nil Einne (talk) 22:08, 11 June 2013 (UTC)
That doesn't sound unreasonable as a collection rate. 1% of all HDD purchases (which is the only relevant storage... don't even think about tape for these purposes) sounds plausible, and that's before you incorporate compression (which is a big if... most reports want a big number). Shadowjams (talk) 07:34, 12 June 2013 (UTC)


I don't know anything about this, but since what they were storing was presumably logs with a lot of repeated information, could this zettabyte claim relate to the amount stored if it were to be decompressed? Text compresses really well, especially when using solid compression. For example, it's easy to compress many gigabytes of text into a fraction of that size using 7-Zip or similar software. It seems possible that they could have 5 ZB of compressed data on a couple of hundred 1 TB hard drives. 77.101.52.130 (talk) 22:40, 11 June 2013 (UTC)
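For scale (this arithmetic is mine, not from the posts above): 5 ZB on a couple of hundred 1 TB drives would imply a compression ratio of roughly

\frac{5\ \mathrm{ZB}}{200\ \mathrm{TB}} = \frac{5 \times 10^{21}\ \mathrm{bytes}}{2 \times 10^{14}\ \mathrm{bytes}} = 2.5 \times 10^{7}

i.e. about 25 million to one, which is orders of magnitude beyond what general-purpose compressors achieve even on highly redundant log text.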

That certainly sounds much more likely, if it isn't a mistake. Personally I can't see why they'd want to store 100 MB about each and every person in the world. Dmcq (talk) 10:16, 12 June 2013 (UTC)
You're clearly not thinking widely enough. 100 MB per person seems modest. Shadowjams (talk) 11:38, 12 June 2013 (UTC)
Yeah, you're right, they're ahead of me; 5 ZB is more like one gigabyte per person. They can get hi-res photos of you, your relatives and house, and your little dog too. Dmcq (talk) 13:36, 12 June 2013 (UTC)
Think big! Shadowjams (talk) 04:01, 13 June 2013 (UTC)
1 gigabyte per person sounds reasonable to me; but how many high-quality pictures, videos, etc. do they really need for one person? There's no doubt, though, that they would opt for hi-res. --Yellow1996 (talk) 01:21, 13 June 2013 (UTC)
Of course, I don't know what all they have, but phone records - number called, when, and length of call - which have been in the news, would not take anywhere near 1 GB per person. Bubba73 You talkin' to me? 02:52, 13 June 2013 (UTC)
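For reference (again my arithmetic, not from the posts above), dividing by a world population of roughly 7 billion:

5\ \mathrm{EB} / (7 \times 10^{9}) \approx 0.7\ \mathrm{GB\ per\ person}, \qquad 5\ \mathrm{ZB} / (7 \times 10^{9}) \approx 700\ \mathrm{GB\ per\ person}

so "about a gigabyte per person" matches the exabyte figure; the zettabyte figure would be closer to a terabyte per person.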

Well, we are all on the list now. :-) Bubba73 You talkin' to me? 23:51, 13 June 2013 (UTC)

Wikipedia database

Is there a way to get a file with all of a specific editor's contribs? The actual text of those contribs, I mean. No More Mr Nice Guy (talk) 22:44, 11 June 2013 (UTC)

Yes. Start by reading Help:User contributions (accessing), which provides the URL syntax for a user contributions page. For example, you can get my contributions by viewing Special:Contributions/Nimur. If (for reasons unspecified) you wanted to aggregate those contributions, you could write a script to download every one of the "diff" links. If you don't want to spider contributions in HTML format, you could use the API (additionally documented at MediaWiki.org). There are several formats available for such content, including HTML, XML, JSON, and so forth. Nimur (talk) 23:10, 11 June 2013 (UTC)
Thanks for the reply. I obviously wasn't being precise enough. I wanted to ask if there's a tool that does it, rather than me having to spider wikipedia.org. No More Mr Nice Guy (talk) 00:47, 12 June 2013 (UTC)
The API has ready methods to do it, and please don't spider wikipedia.org. There's no dump, though (not a public one, at least), of individual users' contributions. If you wanted to aggregate contribs from a dump, that's possible, but the format doesn't lend itself nicely to that, so it'll take some time. But it's quite possible. Shadowjams (talk) 02:55, 12 June 2013 (UTC)
Looking at the API, I see it's quite easy to get a list of contribs including revision IDs, but action=query&list=usercontribs doesn't include the actual text of the edits. It looks like I'd have to first get a contrib list and then fetch the actual edits one by one, which is not much better than scraping and spidering. Also, I can't seem to find the URL for just the edited text, rather than a diff. Any ideas? Is the API rate-limited? No More Mr Nice Guy (talk) 04:56, 12 June 2013 (UTC)
You can get the contrib diffs in one pass through the API. Shadowjams (talk) 07:29, 12 June 2013 (UTC)
Could you kindly give me the exact API call? No More Mr Nice Guy (talk) 19:02, 12 June 2013 (UTC)
I think there's a way to do it using the query command "generator=allpages", but it requires making another list call to usercontribs... I've been trying to make it work, though, with no success. I thought I remembered http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=user%7Ccontent&rvuser=Shadowjams working to grab user edits, but I need to provide it with something to work on. I thought a generator would do that, but I'm not totally up on how it works. It's easier to just do what Nimur suggests, I guess. The only other thing is I thought I had read that one shouldn't scrape the HTML. PERFORMANCE as an essay is about the encyclopedia for normal browsing, not about hammering the webserver. The API queries, though, I believe are fine. Shadowjams (talk) 04:19, 13 June 2013 (UTC)
I'm not aware of any API call that provides all the diffs for a user. I believe you actually must call usercontribs one or more times, and then iterate over the result set using n Revisions queries. I don't think there's any reason to worry about the extra load you add on the server. The relevant guideline is Wikipedia:PERF - don't worry about performance - if you find some way to create enormous load on the server using the public API, Wikimedia will deal with the technical details in an appropriate way. Usually, this means they will rate-limit queries, so it will be impossible to "overload" the server. In other words, if you want to spider the pages, go ahead; if you want to spider them in HTML format over HTTP, go ahead - the servers are just machines and don't care how you query them - but you might find that certain API calls respond more promptly, either due to intrinsic performance or due to rate-limiting. Nimur (talk) 02:53, 13 June 2013 (UTC)
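For what it's worth, a sketch of that two-step approach; the user name and revision IDs below are placeholders, and the parameter values are illustrative. First list the user's edits:

http://en.wikipedia.org/w/api.php?action=query&list=usercontribs&ucuser=ExampleUser&ucprop=ids|title|timestamp&uclimit=500&format=json

Then take the revid values from the result and fetch the full text of each revision (revids accepts up to 50 IDs per request, separated by |):

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&revids=12345|12346&rvprop=content&format=json

Note that this returns the complete page text as of each edit, not just the changed portion; to recover the change itself you would still have to diff each revision against its parent, which the parentid field from usercontribs identifies.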
Any idea why prop=revisions is showing me multiple revisions even when I set rvlimit to 1? I tried setting rvuser and rvend as well, but I still keep getting the 4 revisions preceding the one I want. No More Mr Nice Guy (talk) 18:52, 13 June 2013 (UTC)
Did I misunderstand "revision" to mean an edit, when in fact it means the state of the whole page? No More Mr Nice Guy (talk) 20:03, 13 June 2013 (UTC)
I don't think I'd agree with Nimur here. PERF is a good guideline, but it's primarily intended to tell editors not to worry about the effect on servers when using manual tools or when writing templates etc. (Note that, as the page says, it doesn't mean we don't consider performance, despite the name.) It doesn't really apply when you're running bots or otherwise using automated tools, for which there may be Wikipedia-wide policies or server-admin-imposed requirements. As in all cases when running bots, you are responsible for familiarising yourself with these policies or requirements and following them, in other words ensuring your bot is well behaved. You shouldn't be surprised to find yourself hard-blocked without any notification, or even clarity that you have been blocked, even if you follow the normal good practice of ensuring your bot provides contact details. Server admins ultimately have the final say in how you use the servers they administer. (It has happened to me before; in that case there was no public policy, my bot used a fairly low hit rate and followed robots.txt etc., and the site had no API.) I would expect this to apply to Wikimedia servers too; certainly everything I've read seems to imply it. In other words, while you don't have to worry about the effect on servers, and for something like this the chance that you as a single user, with (I'm presuming) a home connection, could have a noticeable effect on performance is slim, you do have to follow the requirements for such tools. So the relevant policy here is [8], which is linked from the PERF guidelines and has rate limits you should follow. You may also want to read Wikipedia:Creating a bot, although that is more geared toward editing bots. Nil Einne (talk) 04:41, 14 June 2013 (UTC)