Wikipedia:Reference desk/Archives/Computing/2011 December 16
December 16
Quicktime media soundfile
Hello, I need help in determining if a quicktime media soundfile can be broken down to find information about when the file was recorded and/or information on the recording device such as cell phone or other. The file in question is a text mms sent through a smart phone. Any help will be greatly appreciated. :)
Thank you,
AJ — Preceding unsigned comment added by 184.79.110.134 (talk) 00:24, 16 December 2011 (UTC)
- Maybe. The Quicktime container format does have support for metadata. Apple's documentation for that says there are optional metadata types for location (see the "location" section) and creation date (see the "QuickTime Metadata Keys" section). A program called Metadata Hootenanny (which I haven't used, so can't speak for its safety or quality) claims to be able to read this metadata. None of this means that the given quicktime file you have will actually have that metadata, or that it will be accurate. -- Finlay McWalterჷTalk 00:35, 16 December 2011 (UTC)
Can the government find who is behind my username?
Hi, suppose I live in a nice place that has been taken over by an Unpleasantocracy. Can the government find who I am from my edits on Wikipedia if I never edit under an IP? Let us assume they have complete control over the infrastructure within the country, but nothing outside, i.e. no spies among the Wikipedia bureaucrats. On the other hand, if I do edit under an IP, I assume it would be easy for them. Am I right here? Thanks in advance. IBE (talk) 02:59, 16 December 2011 (UTC)
- If you edit under the IP, then they can find out which ISP or Telco (or whoever) supplied the IP to you, and if they can access the company's records (and if the company keeps sufficient records) then they can tell which subscriber line used the IP at any time. The Unpleasantocracy probably has a law requiring ISPs within its borders to store & divulge such info. If you edit under your user name, and Wikipedia is a black box to them, they would not be able to identify you from historic edits. But, if they could do some sort of deep packet inspection on all internet packets going across their border, they could presumably catch you in action (e.g. by user name) and get the IP from the packet(s). That would be a monumental effort. --Tagishsimon (talk) 03:12, 16 December 2011 (UTC)
- So as I understand it, if they check everything, the moment they find a post under my username, when it's in transmission, they have the IP pretty much in front of them. Can they monitor all posts for certain keywords (eg. bomb, attack, "I hate the government") and then catch them? The article deep packet inspection you linked seems to suggest this, but I just want to check. Is that how they catch bloggers etc.? IBE (talk) 03:42, 16 December 2011 (UTC)
- Not in direct answer to your question, but you may find our article on Echelon interesting. Vespine (talk) 04:30, 16 December 2011 (UTC)
- Because of how packet-based transmissions work, it is very difficult to monitor all the Internet traffic and just put it back together to monitor usage. How bloggers (and idiots who call themselves "hacktivists") get caught is by sharing their information to the public on the Internet. They do things like go on Twitter and post their home town, make comments about where they go to school, make comments about where they work, and then make comments about how cool their latest blog entry or hack attempt was. Add to that the photos and it is far too easy for the government to track them down. -- kainaw™ 04:43, 16 December 2011 (UTC)
- Thanks, I think I get this now. The critical factor in packet-based transmission seems to be the maximum transmission unit, which on the Internet appears to be very small. It would seem that if they had billions of dollars at their disposal just for surveillance of Internet traffic, they might have a chance, but I doubt anyone does. If anyone knows how Echelon gets around this, I would be very curious. IBE (talk) 05:27, 16 December 2011 (UTC)
- The NSA budget for 1997-1998 has been disclosed as $27 billion per year, and it has probably increased considerably after 9/11. I am sure they have no problem funding their multiple internet surveillance programs. There have also been sporadic news about network surveillance by intelligence services in other countries. 130.233.228.9 (talk) 09:51, 16 December 2011 (UTC)
- On the NSA's post-9/11 efforts, see the Trailblazer Project. --Mr.98 (talk) 14:02, 16 December 2011 (UTC)
- Using Wikipedia:Secure server (which uses HTTPS) would help protect your identity. --Colapeninsula (talk) 10:29, 16 December 2011 (UTC)
Since it seems that there may be a risk, even given the technical obstacles, I should mention that I was of course thinking of countries like Iran and China. I kept it anonymous to avoid any trace of politics, but as there seems to be no single, general answer, I thought I should lift the veil, even if it was not exactly obscure. If anyone knows anything specific, I would be curious. Thanks for the answers so far. IBE (talk) 10:48, 16 December 2011 (UTC)
- In the case of Iran I'd say they definitely don't have the resources, but they don't even want people watching what we would call harmless television there (they care less about who is doing what online and more about who is getting online at all). ¦ Reisio (talk) 00:43, 21 December 2011 (UTC)
Also, this may sound paranoid, but could they tell that you are using the https server, and start targeting just you, to see what else you might be up to? Then they wouldn't have to spot you over the whole internet - they could just eavesdrop on your IP, and wait for you to send one non-encrypted message, and they would have something. Or they could just get you on trumped-up charges if they decide they don't like https. IBE (talk) 10:54, 16 December 2011 (UTC)
- Most governments assign people to find people who are doing bad things, including using the Internet to do bad things. Definitions of "bad" may include pornography, apostasy, racism, sabotage, espionage, fraud, drugrunning, terrorism and many others, all of whose definitions vary. Methods and countermeasures also vary, thus have become a vast area for study including secret study, thus are not answerable in a forum like this one. Jim.henderson (talk) 14:55, 16 December 2011 (UTC)
- I'm not sure what the MTU has to do with it. The hardest part of this kind of inspection is getting in between the transmissions. For instance, if you had access to a switch closet, you could monitor things going through it, but little else. To monitor the world of internet traffic you'd need to be at a central switching hub. Apparently this is what was done with the illegal surveillance program: they installed equipment in switching closets of major ISPs. That's not easy to do, and certainly not easy to do covertly, which is incidentally how the whistle got blown. There are certainly other ways to monitor traffic (subpoenaing Wikipedia's logs would be one way to map IPs to users). If you happen to get in between connections and can monitor them, then dealing with encryption and the volume of traffic is your next problem. That, if I understand it correctly, is what Trailblazer was all about: sorting through the mass of traffic to find what they were interested in. Shadowjams (talk) 08:57, 17 December 2011 (UTC)
Question about upgrading from XP 32 bit OEM to Windows 7 64 bit in entirely new system
Hey guys and gals,
So my sister's HP desktop bought the farm the other day. I suspect motherboard failure. Since I had spare parts lying around from a recent upgrade, I decided that I would build her a system from scratch, new case, new parts, new everything; but transfer her data to the new PC. My plan is to clone her HDD from the HP computer (with an OEM version of Windows XP Home Edition 32 bit) to the new SSD, put that SSD in the new build, and then upgrade to Windows 7 (64 bit) by booting from an upgrade disk to do a clean install. Will this work? The only issue I anticipate is that Windows might have an issue with the new components, but does that even matter? I'm not very tech savvy, so I'm not sure, but if I booted from the CD I should avoid any potential "activation" problems, correct?
Thank you in advance! Mtzen (talk) 05:41, 16 December 2011 (UTC)
- Well, make sure that the new system is powerful enough to run Windows 7 and that the processor is an x86_64. If you can, provide the specs here so we can all see what hardware you'll be working on. Other than that if all components work together you're good to go. What do you mean by 'an issue with the new components'? If you are referring to problems with drivers or compatibility, then you are running the risk that the system might just not work (properly). --Ouro (blah blah) 08:46, 16 December 2011 (UTC)
- I don't think this plan will work. I doubt that an old XP install will take too kindly to being cloned to radically different hardware, and it looks like you can't do an upgrade install of XP to 7 (I'm also pretty sure you can't upgrade 32 bit Windows to 64 bit). Why not just put the old HDD aside for the moment, build the new PC with a fresh copy of Windows 7 and get it all set up, then connect the old HDD up and copy off all the data you need? CaptainVindaloo t c e 23:49, 16 December 2011 (UTC)
- I've already copied all the data, etc. Since I have an upgrade disk, I wanted to use that to do a completely clean install of Windows 7 64 bit instead of shelling out $100-200 for a retail copy of Windows 7 or sticking with the old 32 bit XP. Essentially, this is a brand new computer, but I was wondering if I could take advantage of the fact that I have a legal Windows XP license, instead of paying the full price for Windows 7. My only concern is that XP, since it's OEM, could refuse to work with the new mobo/components and not "activate". However, I don't know if that matters when I boot off the disk. Mtzen (talk) 00:35, 17 December 2011 (UTC)
- Ah, I see. If the data is already safe, that's less of a worry. You might not be able to activate Windows on the new hardware - OEM Windows licences aren't supposed to be moved to different computers. Sorry for the bad news. Does your sister have Windows-dependent programs (games, etc)? If not, consider Ubuntu. CaptainVindaloo t c e 01:21, 17 December 2011 (UTC)
How can I download 40,000 web pages while re-pooling open connections?
Hi, for a research project I need to download ~40,000 static web pages from a research site. I don't want to inadvertently DoS the site, so I want to make sure that my script uses http-keep-alive so it can re-use the TCP connection. I'm planning on using wget like so:
for VARIABLE in 1 2 ... 40000
do
wget http:my_url
done
According to the --no-http-keep-alive section here:
http://www.gnu.org/software/wget/manual/html_node/HTTP-Options.html
It seems that wget does that. But I'm not sure how wget remembers the last connection since it runs and exits during each iteration of the for loop. Could someone explain if it will work or if there might be a better way to do this? — Preceding unsigned comment added by Rain titan (talk • contribs) 12:42, 16 December 2011 (UTC)
- keep-alive is a property of the HTTP connection, which persists only as long as the underlying TCP socket does. When a process ends, its sockets are disconnected. So, in your proposed scheme, the keep-alive will be futile, not (just) because wget can't remember the keep-alive info, but because the underlying connection has already been destroyed. The keep-alive is meaningful when you make multiple downloads from the same server in the same invocation of wget; you might choose to build a list of URLs and pass them to wget -i, for example. But an altogether better way to do this is to not use wget, but to write your own download script in something like perl or python (using say pycURL) - that way the HTTP sockets are explicit objects that persist until you destroy them - and it's easier to control what gets sent. -- Finlay McWalterჷTalk 12:59, 16 December 2011 (UTC)
- If you still want to use wget, see the Wget#Advanced examples section. It does not explain keeping the connection alive, but it gives an example of defining the whole set of URLs for one wget call, and of avoiding DoS on both the client and server side by randomizing download delays and by bandwidth throttling. --CiaPan (talk) 13:10, 16 December 2011 (UTC)
I wouldn't worry about inadvertently DoSing the site. Wget only uses one connection so you will never reach the level of a DoS unless you're running many instances of wget at the same time. 82.45.62.107 (talk) 11:29, 17 December 2011 (UTC)
- You can pipe a stream of URLs into wget, and you can also tell it to pause between requests: wget -i- -w 1. I have also used --limit-rate=30k (or some other number, to stop it using all the internet link capacity). Graeme Bartlett (talk) 04:45, 18 December 2011 (UTC)
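For anyone who prefers the "write your own script" route suggested above, here is a minimal Python sketch of the idea: one process, one persistent HTTP connection, and a pause between requests. It uses only the standard library's http.client. The function name fetch_all, the delay parameter and any URLs you pass in are assumptions made for illustration, not anything from the posts above. It also assumes plain HTTP, that all URLs are on the same host, and that the server honours keep-alive (if the server closes the connection, http.client simply reconnects on the next request).

```python
import http.client
import time
from urllib.parse import urlsplit


def fetch_all(urls, delay=1.0):
    """Fetch same-host HTTP URLs over a single reusable connection.

    Returns a dict mapping each URL to a (status, body_bytes) tuple.
    """
    if not urls:
        return {}
    first = urlsplit(urls[0])
    # One connection object for the whole run; the socket is reused
    # between requests as long as the server keeps it alive.
    conn = http.client.HTTPConnection(first.hostname, first.port or 80, timeout=30)
    pages = {}
    try:
        for i, url in enumerate(urls):
            parts = urlsplit(url)
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("GET", path)
            resp = conn.getresponse()
            # The body must be read completely before the next request
            # can be sent on the same connection.
            pages[url] = (resp.status, resp.read())
            if delay and i < len(urls) - 1:
                time.sleep(delay)  # be polite to the server
    finally:
        conn.close()
    return pages
```

You would call it with a list of URLs built in Python (for instance from a range of page numbers) and then write each body to disk. The one-second default delay mirrors the -w 1 suggestion above.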
Passwords?
Any tips for making strong passwords? Heck froze over (talk) 19:23, 16 December 2011 (UTC)
- Combine uppercase letters, lowercase letters, and numbers, with no English words. Use a mnemonic to remember it. For example, "my 2 cats names are Smokey and Alamo" becomes "m2cnaSaA". StuRat (talk) 19:26, 16 December 2011 (UTC)
- I would make that one "m2cnaS&A". HiLo48 (talk) 21:41, 17 December 2011 (UTC)
- Agreed, provided the ampersand is supported in passwords on the system you are using. StuRat (talk) 19:43, 18 December 2011 (UTC)
- Use a long password that is not made up only of common words (something like "this is my password" uses only common words), and toss in a few symbols. An example is "my 100% ultra-strong password". I find that by remembering four words, I can create long passwords that I don't forget. When I'm forced to use something like 10 characters with at least two numbers and one symbol, I forget the password in about 30 seconds. -- kainaw™ 19:30, 16 December 2011 (UTC)
- An anecdote: Because of the high concern of security in health data, my password for my health data is an entire lyric from a song that I like. It takes a long time to type in. When someone asks me for something, I purposely type it one word at a time with pauses between each word. They get very annoyed and rarely return to ask me to do more of the work they should be doing. -- kainaw™ 19:32, 16 December 2011 (UTC)
- There is a relevant XKCD comic on this, which leans much more towards the Kainaw approach than the StuRat approach; that is, indeed, what security experts of note advise these days. In general, passphrases are more secure than convoluted passwords. Unfortunately most secure sites of note do not support them, in my experience. (My bank won't let me make a password longer than 11 characters, and requires me to insert all sorts of difficult-to-remember stuff into it. I find this highly irritating, and I think it leads towards worse and worse practices, like using lots of the same, small, difficult-to-remember passwords on multiple sites.) For the stuff I really care about, and where I can use passphrases (e.g. with TrueCrypt containers), I pick random sentence fragments out of books I have on the shelves. I slyly dog-ear the page in case I forget it (there are enough books, and enough dog-eared pages, that retracing this would be prohibitively hard, much less figuring out which sentence fragment it was; in any case, I'm not worried about people in my house finding out my passwords). An example (not an actual passphrase of mine, but it might as well be): "it was also correct in the sense of agreeing with experiments". Long, very difficult for a computer to guess (lots of entropy), but very easy to remember, once I've written it out a few times. Even if I forget the exact phrasing, it's easy to find which sentence it was just by glancing at the page of the book in question, if I can remember even part of it. --Mr.98 (talk) 19:54, 16 December 2011 (UTC)
- A standard piece of advice is to turn a passphrase into an initialism. So, if a memorable phrase to you is "I am now a perfectly safe penguin, and my colleague here is rapidly running out of limbs!", you get "ianapspamchirrool", which is a reasonably strong password, seeing as there's little relationship between one first letter and the next. Paul (Stansifer) 20:29, 16 December 2011 (UTC)
- But isn't that even orders of magnitude less entropy than the amount in the first XKCD example? My understanding is that these days, that is not very hard to brute force — a few days of processing or so. --Mr.98 (talk) 21:18, 16 December 2011 (UTC)
- That depends a hell of a lot on who you talk to. Some people still claim that the best computers will take ages to brute-force a 10-character password that may have any of all the printable characters on the keyboard. Others have claimed to set up password hacking machines using multiple multi-GPU cards and parallelizing the whole thing. I saw one paper (wish I could dig it up right now) where a guy spent under $3000 on his password hacking box, used a GPU version of john in parallel mode, and was getting 8-character passwords in a matter of hours (not weeks, months, or years). While I don't remember the paper much, I remember bringing it up at a security conference and the experts claimed that even with 100 parallel processors, it would take months to crack the average 8-character password. So, you are left with a monstrous variance between what one expert claims and what another expert claims. -- kainaw™ 21:23, 16 December 2011 (UTC)
- A coworker just jogged my memory. The GPU password cracker came out of Georgia Tech research last year. They can do anything up to 11-characters in a reasonable amount of time (8-characters in minutes). -- kainaw™ 21:31, 16 December 2011 (UTC)
- You can hack a password that way only if you can decide reasonably fast whether a password is good or not. For a website, you'd need to ask the site for each possibility (unless of course you already have a hash for instance, but in that case you probably don't even need the password anyway). Major websites will not even allow you to try that many times, others will not be able to handle the load. "ianapspamchirrool" is much stronger than any password I've ever used, I'll switch to that :) Joepnl (talk) 02:36, 17 December 2011 (UTC)
- When I have to generate passwords for other people, who demand they be memorable, I usually use the following rubric: word number word number, where word is something I get from hitting the Wikipedia random page button and picking a word with my eyes closed, and number is from a random integer (0..99) generator (you could use random.org, but obviously it's best not to use an online source that theoretically could be intercepted). That yields stuff like christopher90bronson43, which people complain about but which they very quickly discover they can remember. That's a scheme fairly similar to the XKCD one - it's not perfect, but people do actually get reasonably strong passwords that don't appear in credible rainbow tables and that they actually can, and do, remember. For myself, especially for secure things, I use Keepass's password generator - which gives very secure passwords, but with which I'm relying on a single (pretty strong) password to protect the whole lot (because storing in Keepass is a fancy way of writing passwords down, if I'm honest). -- Finlay McWalterჷTalk 20:04, 16 December 2011 (UTC)
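The "word number word number" rubric above can be sketched in a few lines of Python using the standard secrets module, which draws from the operating system's random source rather than an online one. The function name word_number_password and the tiny SAMPLE_WORDS list are hypothetical, made up for illustration; a real word list would need to be far larger for the result to be strong.

```python
import secrets

# Hypothetical, deliberately tiny word list for illustration only.
SAMPLE_WORDS = ["christopher", "bronson", "harbour", "lantern", "meadow", "quartz"]


def word_number_password(words, pairs=2):
    """Build a password of the form word-number-word-number (no separators)."""
    parts = []
    for _ in range(pairs):
        parts.append(secrets.choice(words))        # random word from the list
        parts.append(str(secrets.randbelow(100)))  # random integer in 0..99
    return "".join(parts)


print(word_number_password(SAMPLE_WORDS))
```

With a list of N words, each word-number pair has N × 100 possibilities, so the strength comes almost entirely from how many words you are choosing from.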
- If you want to know the upper bound on the character space of a password: the number of possibilities = (number of possible characters)^(number of characters in the password). So an 8 character password using upper and lower case characters and numbers has 218,340,105,584,896 possible passwords. So to crack that in a month (on average) you'd need to process 42,118,076 passwords per second. That seems high for most desktops, but the GPU and FPGA type crackers might come close. Much of this depends on what scheme the password system is using. Some, like the old zip file format, were extremely weak. Others, like most modern key derivation algorithms, are quite good.
- More common crackers will use a dictionary and will adjust for common substitutions, like swapping O with 0, or changing cases, or suffixing/appending numbers. Shadowjams (talk) 05:33, 17 December 2011 (UTC)
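The arithmetic above can be reproduced with a short Python sketch. The function names are made up for illustration, and "a month" is assumed to mean 30 days. "On average" is taken to mean that a brute-force search finds the password after trying half the keyspace.

```python
import math

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def keyspace(alphabet_size, length):
    """Number of possible passwords of exactly `length` characters."""
    return alphabet_size ** length


def guesses_per_second(alphabet_size, length, seconds):
    """Rate needed to search half the keyspace (the average case) in `seconds`."""
    return keyspace(alphabet_size, length) / 2 / seconds


def entropy_bits(alphabet_size, length):
    """Upper bound on entropy, assuming every character is chosen at random."""
    return length * math.log2(alphabet_size)


# 26 lower case + 26 upper case + 10 digits = 62 possible characters
print(keyspace(62, 8))                                      # 218340105584896
print(round(guesses_per_second(62, 8, SECONDS_PER_MONTH)))  # 42118076
print(entropy_bits(62, 8))
```

The same functions can be used to compare other schemes, for example entropy_bits(26, 17) for a 17-letter lower case initialism, bearing in mind that this is only an upper bound and that real phrases are less random than that.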
- I use my army service number and since I left the army in 1952, I doubt that anyone (except the War Office) will know it! 85.211.148.143 (talk) 19:10, 17 December 2011 (UTC)
- How long is that service number? By restricting your possible keyspace to only numerals you make it dramatically less secure. This strikes me as a poor password choice. Shadowjams (talk) 21:16, 17 December 2011 (UTC)
would it be useful to use symbols like !@=, etc.? My current formula for passwords is to cryptically describe a math problem, for example:
(math problem) use four single number digits to make the number 20.
resulting password: 4x1#=20
4x for the four numbers, 1# representing the number of digits in each (if the digits are already specified then the 1 is unnecessary), =20 means that the 4 digits referred to earlier equal 20.
is this type of password very strong? Heck froze over (talk) 04:11, 20 December 2011 (UTC)
wii ip?
just curious, (i noticed my friend's wii has an IP address), do the ip addresses of nintendo wiis change? don't regular computers do this too? Heck froze over (talk) 19:47, 16 December 2011 (UTC)
- It depends on the Internet Service Provider, who is in charge of assigning the IP addresses to subscriber devices. Some ISPs rotate IPs pretty frequently, some keep them static for years. --Mr.98 (talk) 19:50, 16 December 2011 (UTC)
- The Wii will most likely have gotten its IP address using a DHCP request to your friend's router. That IP address will be local to your friend's home network. Computers often do the same, however computers can also be more directly connected to the outside world, so may get their IPs from the ISP. Both Wiis and computers can also be configured with a static IP on the home network, meaning it does not change. This can be more challenging to maintain, but does allow for additional configuration options in the router, such as allowing (or blocking) certain ports for specific IPs. --LarryMac | Talk 19:56, 16 December 2011 (UTC)
- Most DHCP servers will give the same devices the same IPs back if they can. Similarly, DHCP requests can ask for a particular IP. Shadowjams (talk) 05:23, 17 December 2011 (UTC)
- Just about every device that can connect to the internet will use Internet Protocol, so essentially has to follow "the same rules". Whether it be a computer, a smartphone, a PS3 or Wii. Vespine (talk) 00:10, 20 December 2011 (UTC)
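If you want to check whether the address shown on the Wii is one handed out by the home router rather than by the ISP, Python's standard ipaddress module can tell you whether an address falls in a private range. This is only a rough sketch: the function name and the example addresses are made up, and is_private is also true for a few other special ranges (loopback, for example), not just typical home-network ones.

```python
import ipaddress


def is_private_address(addr):
    """True if `addr` is in a private (non-public) range."""
    return ipaddress.ip_address(addr).is_private


# Hypothetical example addresses
print(is_private_address("192.168.1.23"))  # True: typical home-router range
print(is_private_address("8.8.8.8"))       # False: a public address
```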
Jumbo frames and non-jumbo frame devices together?
I have a small network of computers that are connected via gigabit ethernet to a switch that has jumbo frames support enabled. Most of these computers support jumbo frames also, but there are a few that don't. Is there any overall impact on the network of having non-jumbo frame computers connected? Any modern switch is smart enough to handle both types of frames at once, yes? I assume jumbo frame computers will be able to communicate with non-jumbo computers (and vice versa), much like gigabit devices can communicate with 100Base-T and 10Base-T devices (and vice versa), right? --76.79.70.18 (talk) 21:29, 16 December 2011 (UTC)
- To do this properly you need to use a routing function to do IP fragmentation. If you use Windows it may retry with a smaller size, but because the switch does not return the ICMP "fragmentation required" packet, the sender won't know there is a problem and so can't do anything about it. So the solution may be to VLAN-segment the switch, and route between the jumbo-enabled segment and the 1500 byte MTU part of the network. Graeme Bartlett (talk) 04:33, 18 December 2011 (UTC)
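To give a feel for what a router doing IP fragmentation has to do, here is a small Python sketch of the arithmetic. It assumes IPv4 with a plain 20-byte header (no options) and a packet that is allowed to be fragmented. The function name fragment_sizes is made up for illustration. Each fragment carries its own copy of the header, and every fragment except the last must carry a payload that is a multiple of 8 bytes.

```python
def fragment_sizes(packet_len, mtu, header_len=20):
    """Sizes (header included) of the IPv4 fragments needed to carry one packet."""
    payload = packet_len - header_len
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    per_fragment = (mtu - header_len) // 8 * 8
    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(chunk + header_len)
        payload -= chunk
    return sizes


# A full 9000-byte jumbo packet crossing into a 1500-byte MTU segment
print(fragment_sizes(9000, 1500))  # [1500, 1500, 1500, 1500, 1500, 1500, 120]
```

So one jumbo packet becomes seven ordinary ones, which is part of why routing between the two segments costs something compared with keeping jumbo and non-jumbo hosts apart.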
bsddb with regards to Unix and Python
I haven't heard of bsddb till recently. I'm assuming the links below are referring to the same thing.
<http://docs.python.org/library/anydbm.html#module-anydbm>
<http://docs.python.org/library/bsddb.html#module-bsddb>
<http://www.oracle.com/us/products/database/berkeley-db/index.html>
<http://en.wikipedia.org/wiki/Berkeley_DB>
I haven't been able to figure out what exactly it is. bsddb seems to be installed on my Mac but I definitely didn't install it. Is that because bsddb comes installed by default? Is that true on all Unix machines? Is the library on my machine Oracle's distribution? (I tried which bsddb but that didn't work).
Could anyone just generally explain bsddb, both in the context of why it's on my machine and also in terms of the Python libraries provided (anydbm and bsddb)? -- Preceding unsigned comment added by Rain titan (talk • contribs) 23:31, 16 December 2011 (UTC)
- I'll try to put those in some kind of order. Berkeley DB is indeed the product currently maintained by Oracle. It is a library that provides access to databases - storage of key/value pairs in disk files with efficient lookup by key. It includes programming interfaces for C and C++. "bsddb" is a Python module which acts as a bridge between Berkeley DB's C library and the Python language. "anydbm" is another Python module similar to bsddb but it also works with other dbm libraries (Berkeley DB's competitors, basically). The oldest unix library that did a job similar to Berkeley DB was called "libdbm", so now "dbm" is sometimes used as a generic term for any similar library. The "anydbm" module is called that because it can work with any of them.
- As for what's using Berkeley DB on your machine... I dunno, in the Linux world we have package managers that can answer that question directly so we don't have to ask strangers to guess. 68.60.252.82 (talk) 03:57, 17 December 2011 (UTC)
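To make the key/value idea concrete, here is a minimal Python sketch. Note that it uses the Python 3 module dbm, which is what anydbm was renamed to. bsddb itself is no longer part of the Python 3 standard library, so this sketch may well end up using a different backend than Berkeley DB. The function name dbm_roundtrip and the file name example_db are made up for illustration.

```python
import dbm
import os
import tempfile


def dbm_roundtrip(directory):
    """Store one key/value pair in a dbm file, then read it back."""
    path = os.path.join(directory, "example_db")
    # "c" opens the database for reading and writing, creating it if needed.
    with dbm.open(path, "c") as db:
        db[b"colour"] = b"blue"
    # Reopen read-only and look the value up by key.
    with dbm.open(path, "r") as db:
        value = db[b"colour"]
    # whichdb() guesses which dbm implementation created the file.
    return value, dbm.whichdb(path)


with tempfile.TemporaryDirectory() as tmp:
    value, backend = dbm_roundtrip(tmp)
    print(value, backend)
```

The second item printed is the name of whichever dbm implementation Python picked on your machine, which goes some way towards answering the "which library do I actually have" question from within Python.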