Wikipedia talk:Featured article candidates/Folding@home/archive3


Addressed comments from Crisco 1492

Prose comments from Crisco 1492

  • I feel a bit out of my depth, but I'll give some comments
"Work Units" - Is this a proper noun, or a normal one? If it's just a regular old noun, it shouldn't be capitalised.
Thomas Kuhn Paradigm Shift Award - Should link to the award, or a section in his article about it if there is not one on the award
"for his unique approach to employing advances in algorithms that make optimal use of distributed computing, which places his efforts at the cutting edge of simulations. The results have stimulated a re-examination of the meaning of both ensemble and single-molecule measurements, making Dr. Pande’s efforts pioneering contributions to simulation methodology." - Quote is rather long, perhaps trim or paraphrase a bit more of it.
More tomorrow. — Crisco 1492 (talk) 23:06, 19 September 2012 (UTC)
Thanks. Oftentimes "Work Unit" is abbreviated as "WU", but since that's not done in the article I put it in lowercase; it's more proper that way. I have created a section in the Thomas Kuhn article about the award, and linked to that. I have also trimmed off the first sentence of that quote since it's just a rewording of the previous award, so it's actually redundant. • Jesse V.(talk) 23:43, 19 September 2012 (UTC)
  • "Protein folding is naturally..." - Perhaps reword to avoid the ambiguity between "in nature" and "of course"
  • "The goal of the first five years of the project was to make significant advances in understanding folding" - Did it meet the goal?
  • "its relationship to disease" - Folding and misfolding are two different nouns, so "its" is not the right pronoun here.
  • I'm seeing a lot of overlinking (protein folding in the lede, aggregations, misfolding diseases, etc.). That should be minimised. — Crisco 1492 (talk) 03:26, 20 September 2012 (UTC)
  • I changed "naturally" to "normally", though another word I considered was "biologically".
  • I'll look around for sources regarding the results of that first goal, but I suspect that the source will be from the Pande lab itself.
The stated goal is a very subjective one, and so I'm not sure how to go about stating that the goal was met, or even if that's possible. I found [1], [2], [3], and [4] on the subject. I can state the goal itself as a fact. Is it important to state if the goal was met like you say? • Jesse V.(talk) 15:55, 20 September 2012 (UTC)
  • Perhaps note how the project feels it met its first five-year plan goals.
  • File:FAH-tflops.PNG - Any more up-to-date data?
  • I agree that more recent data would be nice, but I think the limited time domain highlights an important event in the history of Folding@home's computing power. That said, in the file the creator commented "I created this chart (image) myself using data compiled over the course of 10 months", so I assume the underlying data is publicly available.
I think the chart has even more significant issues. Some things to fix:
  1. The chart has an error: it uses "Tflops" where the unit should be "TFLOPS". Emw (talk) 01:04, 21 September 2012 (UTC)
  2. The axis labels need to be improved. The x-axis increments in the form "02/11/2006", "02/12/2006", "02/01/2007", "02/02/2007"; this makes it hard to read. The day portion of the date is superfluous and should be removed, especially since it's the same at each interval. The y-axis labels are too tightly packed. If every increment is going to be labeled, a smaller font should be used; or the font size could be kept the same and labeled only every other increment (e.g. 0, 200, 400, etc.); or the font could be kept the same and the y-axis increments set to 1 increment per 250 TFLOPS. Emw (talk) 01:04, 21 September 2012 (UTC)
  3. The chart area background's color gradient adds only visual noise. There should be no gradient. The background should be a very light, neutral color or solid white, since that maximizes visual contrast with the chart feature of interest (the actual plotted data). Emw (talk) 01:04, 21 September 2012 (UTC)
  4. The actual chart takes up about 1/3 of the area of the graphic, with the metadata taking up about 2/3. Those ratios should be reversed. Emw (talk) 01:04, 21 September 2012 (UTC)
Jesse, if you could point me to where this data is available, then I could probably fix the issues with this graphic this weekend. Emw (talk) 01:04, 21 September 2012 (UTC)
Back in July I emailed kaleb_zero, the uploader of File:FAH-tflops.PNG, and he said that he no longer had the data that he used to generate the image. I suppose a program could be made to parse the information from the pixels in the image if need be. To the best of my knowledge, there isn't a reliable up-to-date graphic. This is a foldingforum.org thread on this very subject, and the uploader of the image joined in the collaboration to make this Google Spreadsheet of F@h's stats graph over time. Although they're pulling screenshots and whatnot from around the Internet and using scripts to insert the official stats on a daily basis, it's not entirely complete and they haven't been double-checking everything. I agree with your points. How can I help? • Jesse V. (talk) 01:25, 21 September 2012 (UTC)
To clarify, I could write a program to read the pixels from the image if that needs to be done. Let me know. • Jesse V.(talk) 05:33, 21 September 2012 (UTC)
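For illustration only, the pixel-reading idea mentioned above could look roughly like the sketch below. It is not a program anyone in this discussion wrote. It assumes the chart has already been loaded into a list of rows of RGB tuples (an imaging library such as Pillow could do that step, which is left out here), that the plotted line has one known color, and that two y-axis gridlines with known values are available for calibration. All names (`color_close`, `row_to_value`, `extract_series`) and the tolerance value are hypothetical.

```python
from statistics import mean


def color_close(a, b, tol=30):
    """True when two RGB tuples differ by at most `tol` in every channel."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))


def row_to_value(row, calib):
    """Linearly map a pixel row to a data value using two known axis ticks."""
    (row0, val0), (row1, val1) = calib
    return val0 + (row - row0) * (val1 - val0) / (row1 - row0)


def extract_series(pixels, line_color, calib, tol=30):
    """Return one estimated value per pixel column (None where no line pixel is found)."""
    height, width = len(pixels), len(pixels[0])
    series = []
    for col in range(width):
        # Rows in this column whose color matches the plotted line.
        hits = [r for r in range(height) if color_close(pixels[r][col], line_color, tol)]
        series.append(row_to_value(mean(hits), calib) if hits else None)
    return series


# Tiny synthetic "chart": 5 rows x 3 columns, white background, red line.
W, R = (255, 255, 255), (255, 0, 0)
demo = [
    [W, W, R],
    [W, W, W],
    [W, R, W],
    [W, W, W],
    [R, W, W],
]

# Assumed calibration: row 4 is the 0 TFLOPS gridline, row 0 is the 400 TFLOPS gridline.
print(extract_series(demo, R, calib=((4, 0.0), (0, 400.0))))
```

On the synthetic grid this yields 0, 200 and 400 for the three columns. A real screenshot would add complications the sketch ignores, such as anti-aliased line edges, gridlines or legend text in a similar color, and mapping columns back to dates.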
The charts in the linked Google spreadsheet would probably be the best option. They're of drastically better production quality and have wider and more interesting data. I'd suggest uploading the 'Folding@home computational power per platform' and 'Participation' charts to Wikimedia Commons and using them in the article. Emw (talk) 02:14, 22 September 2012 (UTC)
I have done so. Please see the discussion below. • Jesse V.(talk) 04:49, 25 September 2012 (UTC)
  • I strongly suggest archiving your web references, perhaps through www.webcitation.org
  • The text about the FLOPS milestones reads like proseline.
  • "This generates a fair system of equal pay for equal work, and attempts to align credit with the value of the scientific results." - This is according to the project. What do neutral third parties have to say about this, for the lack of a better word, payment system?
  • Are the points exchanged for anything?
  • What does Anton's development team have to say about Folding@home? We're only getting one side of the story here.
I'll try and tackle a large section tomorrow. — Crisco 1492 (talk) 23:24, 20 September 2012 (UTC)
  • An excellent idea. I have archived the references into the deep web, though there are URL domains that I know to be very stable so I didn't see a need to bloat the citations by citing them. Also, a few URLs could not be archived.
  • I have improved some of the transitions, though I'm not sure how to improve it further. If you have an idea, please advise.
  • I have removed the first clause of this statement. It should be more objective now and is definitely true.
  • No, the points cannot be exchanged for anything. See this Yahoo Answer. In rare cases some teams do offer rewards for point productivity, such as here. The article describes the general implications of the points though.
  • I added a line from two publications co-authored by D. E. Shaw Research. • Jesse V.(talk) 04:54, 21 September 2012 (UTC)
  • I'd suggest archiving them anyways, as you never know when they can up and change. I'm glad I archived Tempo's stuff before they went paywall. — Crisco 1492 (talk) 05:02, 21 September 2012 (UTC)
True, so now I'm glad I archived those others! To clarify, the domains that are left are foldingforum.org, folding.typepad.com, and folding.stanford.edu. Those are very unlikely to be put behind a paywall since the Pande lab is a non-profit organization. Based on past experiences, there's generally some warning given before any significant changes to those sites; I'd very likely hear about it beforehand. I noticed that the source-code I get for this article's page is nearly 500KB. Do you still think I should add archive information? • Jesse V.(talk) 05:33, 21 September 2012 (UTC)
  • I'm getting 136,642 bytes in size at the article history page, which is well within the maximum. — Crisco 1492 (talk) 06:20, 21 September 2012 (UTC)
I meant the actual HTML source-code. After further thought, I realized that you're right. I wasn't thinking long-term enough. I have added archives to the remainder of the web citations, so now the ones under folding.stanford.edu, folding.typepad.com, and foldingforum.org are stable. That should resolve this. • Jesse V.(talk) 21:02, 21 September 2012 (UTC)
  • I started reading the diseases section but my eyes glazed over. Sorry.
  • I note a lot of seemingly promotional statements are only sourced to FAH, like "significant work into minimizing security issues". Any third-party verification?
  • You should switch from PlayStation 3 to PS3 earlier, will save bytes.
  • Any way that independent sources could be cited for some of this stuff? From my reading the article looks very dependent on the FAH sites. — Crisco 1492 (talk) 13:06, 22 September 2012 (UTC)
  • That is the second time that the technical language has been brought up. Is it too technical? If you'd like me to simplify it, please identify the most challenging statements and I'll see what I can do to make them easier to understand.
  • I added a third-party source to that one.
  • Yes, "PlayStation 3" was said numerous times, I've changed some of them to "PS3".
  • I'm sure that there are independent sources for some statements, but for other times the details are difficult or impossible to get anywhere else. The folding.stanford.edu sites are often authored by multiple scientists in the Pande lab, the folding.typepad.com posts are written by Dr. Vijay Pande and sometimes other scientists, and for foldingforum.org I've strived to focus on posts only from experts (scientists, admins, etc). I'm aware of policy and the advantages of third-party sources, which is why the Project Significance and Biomedical Research sections almost exclusively rely on scientific journals, many of which are third-party and/or reviews. • Jesse V.(talk) 15:55, 22 September 2012 (UTC)
  • Unlike Ryan above, I find myself bored to tears by medical text. I wouldn't have the motivation to provide specific examples, sorry. Based on what I've seen in the other sections, nothing too much to worry about.
  • I think for anything that could be construed promotional, a third-party source would be preferable. — Crisco 1492 (talk) 16:00, 22 September 2012 (UTC)
There are several statements that could be "construed promotional", but the ones I noticed had third-party/reliable sources. Can you please help me identify the others? I think you're in a better position to identify these statements than I am. • Jesse V.(talk) 16:10, 22 September 2012 (UTC)
  • "On September 16, 2007, due in large part to the participation of PS3s, the Folding@home project officially attained a sustained performance level higher than one native petaFLOP, becoming the first computing system of any kind in the world to do so.", for example, is sourced to a FAH source and PS source. No tech publications picked this up?
  • "This was the first time a distributed computing project had utilized MPI, as it had previously been reserved only for supercomputers" is another one, sourced only to FAH. — Crisco 1492 (talk) 16:24, 22 September 2012 (UTC)
  • Several publications picked that up. I used one of them instead of that PS3 press release.
  • I don't doubt the statement, though so far I haven't found a better source (plenty of Wikipedia mirrors though!). I'll keep looking around. If anyone finds something, please let me know. • Jesse V.(talk) 19:46, 22 September 2012 (UTC)
  • "This was the first demonstration that MSMs were capable of statistically capturing folding events that could not be seen by conventional simulation methods." - As this reference is peer reviewed it may be acceptable, but a secondary source would still be better.
  • Note the tag I've added. — Crisco 1492 (talk) 23:37, 22 September 2012 (UTC)
  • Regarding the MPI statement: I have been searching and so far I have not found any additional confirmation. Dr. Peter Kasson is the scientist behind the MPI implementation of GROMACS. The publication "Heterogeneity Even at the Speed Limit of Folding: Large-scale Molecular Dynamics Study of a Fast-folding Variant of the Villin Headpiece" (JMB, 2007) is the first paper produced using SMP1, but the paper doesn't confirm the statement. I have emailed Dr. Kasson about the claim, and yesterday he replied "The JMB 2007 paper is the manuscript of record; the SMP implementation detailed there used MPI." What do you recommend that I do?
  • MSMs: I used a different statement.
  • About the 2006 Irving Sigal Young Investigator Award, the citation now points to the Awards page because the links are now dead to the original source on proteinsociety.org, a change that was made by a passing editor in this diff. From [5] apparently another source from the Protein Society is [6], which is also broken. I have emailed the Protein Society about this. It took a while but using Google I found: [7], [8], and [9]. Which one should I use? Whichever one is more reliable, I think it'd be best if it supplemented the reference to the Awards page since it's a good-looking page even if it is a primary source. • Jesse V.(talk) 23:41, 23 September 2012 (UTC)
  • Regarding MPI, if the paper cannot confirm it but the current source is peer reviewed I think it may be acceptable. None of the links you suggested for the award look good, as they are forums. — Crisco 1492 (talk) 10:54, 24 September 2012 (UTC)
  • I think it's very reasonable to assume that he wrote the statement in the blog post following collaborative efforts with other researchers, including with Dr. Kasson. So while it's not peer reviewed in the normal sense, I'm convinced that he was speaking for the group as a whole since he used the word "we" in the phrase "We're the first distributed computing platform to roll out MPI calculations (typically reserved for supercomputers), so we are dealing with some major growing pains issues there."
  • This morning I received an email from Jody L. McGinness, managing director of the Protein Society. The email read: "In general we only keep the detailed recipient info live online for the most recent year’s recipients. I am not certain where the information for the 2006 winners “lived” on the web, but we did make a switch to a new website in the last couple of months so perhaps that is why the old detail page went away. In answer to your question, this is from the 2006 Awards press release: “The Irving Sigal Young Investigator Award, sponsored by Merck Research Laboratories, recognizes a significant contribution to the study of proteins by a scientist who is in the early stages of an independent career and, generally, not more than 40 years of age at the time of the award. The 2006 awardee is Dr. Vijay Pande (Stanford University) for his unique approach to employing advances in algorithms that make optimal use of distributed computing, which places his efforts at the cutting edge of simulations. The results have stimulated a re-examination of the meaning of both ensemble and single-molecule measurements, making Dr. Pande’s efforts pioneering contributions to simulation methodology.”" So this confirms the Award's statements, but since it's in an email I can't cite it. • Jesse V.(talk) 14:53, 24 September 2012 (UTC)
  • K, changed to "leaning support". I am waiting for the table to be worked out and an editor with more contextual knowledge to vet it. — Crisco 1492 (talk) 15:30, 24 September 2012 (UTC)
Wooo! I assume you are referring to File:FAH-tflops.PNG and not the TFLOPS table. I'll get on that; the Google Spreadsheet wasn't playing nice, so I'll have to do the graph in Excel instead. I'll upload it under the Public Domain license, because I believe that it falls under that per this section of policy. I did not initiate the collaboration, but I did contribute to it. • Jesse V.(talk) 15:53, 24 September 2012 (UTC)
  • I would suggest keeping the legend in the caption, it's indecipherable even at high resolutions. — Crisco 1492 (talk) 04:55, 25 September 2012 (UTC)
I will be sure to reiterate the legend in the caption when I add it to the article if that's what you're saying. • Jesse V.(talk) 05:12, 25 September 2012 (UTC)
  • Alright. Women in Peru has an example which doesn't use an in-graphic legend (for easier translation) if you want to look at one, but keeping the original may be okay. — Crisco 1492 (talk) 09:24, 25 September 2012 (UTC)
Thanks, but there the legend takes up a lot of space. I swapped out the image and just wrote the legend in the caption. See Folding@home#Participation. • Jesse V.(talk) 14:57, 25 September 2012 (UTC)
  • You could work such a legend into the caption. The current image still looks a little cramped, with the legend on the chart. — Crisco 1492 (talk) 15:23, 25 September 2012 (UTC)
You mean like this? Are you saying that the legend should have a wider margin in the chart? • Jesse V.(talk) 15:49, 25 September 2012 (UTC)
  • No, no... it's just that I don't think the legend looks good at thumbnail size when it's hardcoded into the chart. It makes it look cramped. It's not a sticking point, so don't worry too much about it. You have the updated data, which is the key thing I was looking for. — Crisco 1492 (talk) 16:08, 25 September 2012 (UTC)
For what it's worth, I think it's important to keep the legend within the chart. That way, if the chart is reused elsewhere, then one doesn't have to refer back to captions or description metadata to determine what the different lines represent. And really, the whole chart and not just the legend looks cramped at thumbnail size. Emw (talk) 16:58, 30 September 2012 (UTC)