Wikipedia talk:Wikipedia Signpost/2010-08-16/Spam attacks

Discuss this story

Has the university been contacted with ethics concerns? Has anyone verified that the supervisors have in fact supported this "research"? - David Gerard (talk) 08:52, 17 August 2010 (UTC)[reply]

ArbCom told me that they are not aware whether the university or any of the advisors have been informed. Regards, HaeB (talk) 09:08, 17 August 2010 (UTC)[reply]

That's a rather astounding display of unethical behaviour. Regardless of how good the tool is, I'd like to see some kind of permban enacted. This was unconscionable, and plain bizarre. — Huntster (t @ c) 10:38, 17 August 2010 (UTC)[reply]

I can't believe that ArbCom unbanned him! They just endorsed disruptive editing and completely ignored WP:POINT. Any sort of researcher in good conscience should have at the very least contacted somebody from Wikimedia before starting any experiments. bahamut0013^words _deeds 12:38, 17 August 2010 (UTC)[reply]

Agreed. This idiot should be treated like any other spammer: His account permanently blocked and his institution reported for the abuse of both our website and their resources. And, as noted, "W." might want to look into an ethics course in his next semester, as he is obviously lacking. Resolute 13:37, 17 August 2010 (UTC)[reply]

As ArbCom mention, universities have ethics committees to make decisions on whether this kind of research is acceptable. Either this researcher didn't get his research reviewed by the ethics committee or the ethics committee at his university is incompetent (he makes good arguments for why the users clicking on the spam needed to believe it was genuine, but he doesn't address the obvious question of why he didn't get permission from either the WMF or the community before doing anything). Either way, it's a problem that needs to be fixed. --Tango (talk) 12:43, 17 August 2010 (UTC)[reply]

Everything about this user's program, STiki (which really is excellent), his involvement with the global Wikipedia community, and his straightforward follow-up to questions indicates that this was legitimate research from a serious researcher at a serious university. There's certainly something squeamish about the 'I violated your sister so that I could report to you about her vulnerabilities' logic, but I don't believe the user had any intentions except to advance understanding and to engage with the complex issues of the site. It sucks that the test disrupted things. It's worth asking if it's possible to make similar findings without such disruption. But maybe down the road it will prevent a much more serious attack from someone who has less benign motivations. The reality is that the smartest people in the coding community have bigger questions than whether something is NPOV. They want to push the boundaries of open systems, understand our relationship to technology, and ultimately make the whole thing stronger. I'm not suggesting we accept that blindly or give anyone who puts Cialis ads on the main page a free pass. But part of this work might make Wikipedia better, and the involvement of technically sophisticated users definitely will. Ocaasi (talk) 13:15, 17 August 2010 (UTC)[reply]

This story was the first I've heard of this project, but A.W. is in fact a real researcher who I have met in person. See the STiki papers from this year's WikiSym. -- phoebe / (talk to me) 15:28, 17 August 2010 (UTC)[reply]

I normally take a hard line towards spam and vandalism. I am unimpressed with this incident (see my longer comments in a separate section below). Nevertheless, I think Ocaasi's comments above are right on the money.

What's best for Wikipedia and Wikimedia? We indulge in a momentary feel-good flash of retribution and embarrassment for the researcher and his university? Or we find ways going forward to harness the outputs of a talented guy at one of the world's top computer science departments -- someone who has already made valuable contributions before this boneheaded incident? --A. B. (talk • contribs) 15:32, 17 August 2010 (UTC)[reply]

Obviously, his contributions could be valuable, but the Q&A with him posted in this article isn't very reassuring. The apology doesn't seem very sincere; it has the distinct flavor of "you should be thanking me" rather than "I promise to collaborate better next time." That's not to say I'm suggesting retribution, but I'm not yet convinced he won't just do it again. Powers ^T 16:08, 17 August 2010 (UTC)[reply]

Overlooking the actions of an admitted spammer who disruptively abused multiple accounts on the unspoken promise of some nebulous benefit to the project is no better than "indulging in a momentary feel-good flash of retribution". I would like to see a definable example of some good that can come out of this that justifies letting W off the hook so easily. Resolute 16:46, 17 August 2010 (UTC)[reply]

Lt. Powers, I'm not sure how truly remorseful he is either, but I'm more interested in what he can do for us going forward than whether he's a nice guy I'd want dating a relative.

As for repeating his actions, he'd have to be an idiot. He just gets "one bite at the apple" [1][2] on this one. If he gets caught again, ArbCom and the Signpost might not be so gracious again in dealing with him discreetly. Even if they were, the community outcry would be so much greater as to inevitably draw unwanted media attention to him, his faculty and his school. No faculty adviser or future employer takes risks on people that'll embarrass them. Even if he covered his tracks better in the future, he'd still have to present his data and methods to his faculty -- they would not appreciate his having risked their reputations. They know any trail-masking scheme might fail and plenty of smart computer science experts also volunteer here. Too, major research universities have layers upon layers of oversight and disciplinary processes dealing with research integrity -- they'd not want the aggravation of tangling with them. (His school has a labyrinth of web pages on the topic; a small sample: [3][4][5][6][7][8][9])

So what might "assume good faith" mean in this case? When I get in deep trouble, I become guarded in my remarks, too. I'm embarrassed. I'm running scared. I'm reluctant to go on the record until I know what to say.

For all we know, he may not even have had a chance to talk to his faculty. Or they may be telling him what to say and what not to say.

The ball's in his court -- let's see what happens after the dust settles. --A. B. (talk • contribs) 17:28, 17 August 2010 (UTC)[reply]

Resolute, the best predictor of what he can do for us in the future is what he's done for us in the past before this incident; see the comments by Ocaasi and phoebe above about the sophisticated tools he's developed for Wikipedia (with support from his school) and his presentations to Wikimania and/or WikiSymposium (not sure which or if it was both). I think he's a safe bet not to cause problems again (see my comments above) and likely to continue doing good stuff for us going forward. --A. B. (talk • contribs) 17:35, 17 August 2010 (UTC)[reply]

The perpetrator's remarks are more apologia than apology. I don't believe he gets it, but it may still be hoped that rational self-interest will lead him to behave responsibly notwithstanding. ~ Ningauble (talk) 18:08, 17 August 2010 (UTC)[reply]

Anyone who thinks that it's a "vulnerability" that Wikipedia can be edited freely should be banned forever, no matter how many college degrees they have. This is like someone stealing all the change from the charity jar to demonstrate how "weak" the honor system is. I see nothing but a fundamental lack of clue, along with demonstrated malicious behavior. Gigs (talk) 19:11, 17 August 2010 (UTC)[reply]

If the University of Pennsylvania institutional review board approved this, then I say we should let it go. If it is exempt from IRB jurisdiction then were at least the two faculty advisers who were “aware of [your] motivations in these experiments” briefed in detail on what you were planning to do before you did it, and did they approve of the actions themselves (rather than merely the motivations)? Bwrs (talk) 05:10, 20 August 2010 (UTC)[reply]

University research that involves large-scale social provocation must be endorsed by the institution's human ethics committee. This requirement, and that such endorsement is necessary but not sufficient before conducting the research, should be made clear to the instigators. Tony (talk) 07:53, 22 August 2010 (UTC)[reply]

Redaction

Why redact the name of the guy who did this? It's easy enough to figure out from the links provided to diffs. Powers ^T 13:12, 17 August 2010 (UTC)[reply]

As I said earlier in the Signpost Newsroom: Basically because of Google. There is indeed little point in trying to prevent the readers of this article from finding out (it would be easy even without the diffs), but that still left the question whether one wanted the article to turn up in a search for his full name. Regards, HaeB (talk) 14:04, 17 August 2010 (UTC)[reply]

I think redaction because of Google was a good idea and I support it 100%. I think this person has made a mistake but has also generated some powerful tools for Wikipedia in the past. Let's just move on and put him to work in useful areas. We need his future contributions more than he needs our retribution. --A. B. (talk • contribs) 15:24, 17 August 2010 (UTC)[reply]

I don't understand this reasoning at all. A. W. obviously doesn't think he did anything wrong. If he is content with the quality of his work, why is The Signpost ashamed of it on his behalf? Also, assuming that he wants to be known for his work (which is a general tendency among academics), bumping a page discussing his research up the Google ranking would appear to be a favor, since he currently is not a very prominent "A W". - Banyan Tree 03:04, 18 August 2010 (UTC)[reply]

BanyanTree, I redacted the name in your comment. I see that logic, but if the signpost editors found it important to keep his name out of the article, you should establish consensus before correcting that problem yourself. (Note, if I'm missing something, and these comments aren't indexed by Google, please change it back). Ocaasi (talk) 04:15, 18 August 2010 (UTC)[reply]

These comments are indexed by Google: [10] Note that Google's current cached version has BanyanTree's unredacted comment with A.W.'s full name; hopefully that will clear in a day or two: [11] --A. B. (talk • contribs) 12:16, 18 August 2010 (UTC)[reply]

I endorse that action - I think this is preferable to us at The Signpost than receiving a request to courtesy blank this whole page when it otherwise consists of feedback. Also to clarify in response to BanyanTree, The Signpost is not "ashamed of it on his behalf"; instead, and contrary to some people's views, we're not trying to be vindictive or punish people. Ncmvocalist (talk) 04:29, 18 August 2010 (UTC)[reply]

University research that involves large-scale social provocation must be endorsed by the institution's human ethics committee. This requirement, and that such endorsement is necessary but not sufficient before conducting the research, should be made clear to the instigators. Tony (talk) 07:50, 22 August 2010 (UTC)[reply]

There should not be a coverup by Wikipedia or Signpost of misconduct by some graduate student who engaged in unethical conduct. Name the name, the institution, and the department. Let the chips fall where they may. It will be a lesson to him and his peers. ("I only hid in the washroom at the store and then climbed out the bathroom window with a bundle of cash because I was doing research on vulnerabilities of small retail establishments") Yeah, right. More accurately, someone had a gigantic ego and wanted to show off with the caper he could pull. Edison (talk) 22:26, 22 August 2010 (UTC)[reply]

Wikipedia:WikiProject Vandalism studies

It may be useful to note that the author of the study was a member of Wikipedia:WikiProject Vandalism studies and that this project intends to determine the scope and type of vandalism that occurs on wikipedia. Remember (talk) 13:46, 17 August 2010 (UTC)[reply]

I don't care if he has angel wings and a halo. Like the high school student who was caught shoplifting and told the judge he "was doing a research paper on shoplifting." The judge, the public defender and the prosecutor all had a big smile on their faces at hearing the explanation. In this case it was a demonstration of immaturity, unprofessional conduct and bad judgment. If getting the appropriate response to this stunt hampers his career, so be it. Otherwise we only encourage others to abuse the trust of the community. Edison (talk) 22:33, 22 August 2010 (UTC)[reply]

Re-inventing the wheel -- hubris or folly?

About 5 to 10 volunteers spend a lot of time dealing with spam problems here and on other Wikimedia projects. I may be wrong but I don't think A.W. communicated with any of us. It would have been helpful to us to have some input into A.W.'s research. We do not have the resources of an Ivy League university behind us and it might have been useful to point his research towards those particular challenges we've found especially vexing.

Conversely, A.W.'s research might have been more useful to him, his advisors and his university had he bothered to find out what these Wikipedia volunteers already know about spam. Collectively, we've spent 1000s of hours studying spammer patterns -- their motivations, their methods, etc. Some of us have spent time in black hat spammer forums --the spam world has its own little multinational ecosystem of players with various economic niches. Other volunteers have developed various tools and scripts for spotting, tracking and cleaning up spam. If nothing else, we're aware of many of our own vulnerabilities. For obvious reasons, we don't post everything we know online since we know some spammers read and disseminate stuff posted at WikiProject Spam, its very active talk page and Meta-wiki's Talk:Spam blacklist page.

Step One in academic research is to find out what's already been learned, then build on it rather than repeat it. Perhaps A.W.'s been communicating with others and I'm just not aware of it. Otherwise, he's wasted not just our time and resources but his own and that of his sponsors. The editors who involuntarily wasted their time on A.W.'s research probably don't appreciate it any more than his school's faculty would enjoy one of us periodically knocking over their desks in the name of science.

This has to be embarrassing for his school's computer science department. As annoyed as I am, at some level I feel sorry for this guy; I'm sure he has talent and promise and I hope for his sake this doesn't damage his career too much. --A. B. (talk • contribs) 14:08, 17 August 2010 (UTC)[reply]

I should point out that there's no central place he could have talked to all of us together. That said, I'm not aware he made any effort whatsoever to speak with any of us - truly disappointing. I'd encourage him to remedy that soon. – mike@enwiki:~$ 18:11, 17 August 2010 (UTC)[reply]

I've seen it too. I don't know if I had reverted the additions though. —I-20 the highway 00:24, 18 August 2010 (UTC)[reply]

Perhaps slow down use of write API

I am not a developer, but perhaps we could put a limit on autoconfirmed-and-below's use of the write API, perhaps to only, say, 20 times per minute (one every three seconds) per IP or per user, rather than allowing just one of these users to post the same link to 172 articles in the space of 3 minutes (average of 57.3 posts per minute or one post every 1.05 seconds)? Clearly, we don't want to limit approved bots in this manner, but it would cut down significantly on vandalbots and spambots. Bk314159 (talk) 16:35, 17 August 2010 (UTC)[reply]

Note, it doesn't make sense to just rate limit the api, since bot users can also use the normal interface used by humans. Approved bots (with bot flag) don't have rate limits applied to them AFAIK. Bawolff (talk) 00:11, 18 August 2010 (UTC)[reply]
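For illustration only, here is a minimal sketch of the kind of limit discussed above, assuming a sliding one-minute window of 20 edits keyed by user or IP, applied to the edit itself rather than to the API, with bot-flagged accounts exempt. The class and function names are hypothetical and this is not MediaWiki code; it only replays the burst described above (172 posts spread evenly over 3 minutes) to see how many would get through.

```python
from collections import defaultdict, deque


class EditRateLimiter:
    """Hypothetical sliding-window limiter: at most `limit` edits per
    `window` seconds for each key (a user name or an IP address)."""

    def __init__(self, limit=20, window=60.0):
        self.limit = limit
        self.window = window
        # Timestamps of accepted edits, oldest first, per key.
        self._accepted = defaultdict(deque)

    def allow(self, key, now, exempt=False):
        """Return True if an edit by `key` at time `now` (seconds) may proceed.

        `exempt` stands in for an approved bot flag; exempt accounts are
        never throttled. The check is interface-agnostic: it counts edits,
        not API calls.
        """
        if exempt:
            return True
        recent = self._accepted[key]
        # Forget accepted edits that have left the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


def simulate_burst(limiter, key, edits=172, duration=180.0, exempt=False):
    """Replay `edits` evenly spaced attempts over `duration` seconds and
    return how many the limiter lets through."""
    interval = duration / edits
    return sum(limiter.allow(key, i * interval, exempt) for i in range(edits))


if __name__ == "__main__":
    print(simulate_burst(EditRateLimiter(), "new-account"))
    print(simulate_burst(EditRateLimiter(), "approved-bot", exempt=True))
```

Because the limiter is keyed on the account or IP and not on the interface, it would apply equally to edits made through the normal web form, which is the point raised in the reply above. Someone spreading the same edits across many accounts or IPs would not be slowed by a per-key limit like this one.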

Researcher Response

Wikipedia community,

This is "A.W.", the researcher who led the aforementioned experiments. It is obvious this topic is the source of some controversy and for very good reasons. Since the publication of the Signpost article, I have been asked many questions; via (1) discussion pages, (2) my own talk page, and (3) privately via email.

I believe it to be in the best interest of all parties to not immediately address these queries. For the protection of WM/WP/WMF, the minutiae of my experiments should not be put into the public domain until the developers have protections ready. I'll note I have already provided my code to developers -- and asked ArbCom to put me in contact with a developer so I can cooperate with them beyond the terms of my unblocking.

Following this post, I will also contact ArbCom regarding what information I should share, with whom, and when. Pending that, I will in due time return to (1) engage those who have contacted me, (2) actively participate in discussions about what transpired, and (3) discuss how I plan to cooperate to create a more secure WP/WM. Until that time, please do not interpret my lack of communication as an act of bad faith. Thank you. -- A.W. 19:55, 17 August 2010 (UTC)

Spamming the project was highly untoward, A.W., a wanton tinkering with the time and good faith of volunteer editors. Please don't do it again. Gwen Gale (talk) 20:57, 17 August 2010 (UTC)[reply]

Although Gwen explained the matter quite succinctly & clearly, let me explain our concerns from another direction. Spamming Wikipedia for any reason is prima facie grounds for concluding one is acting in bad faith; testing Wikipedia's defenses in this manner & without warning anyone will only result in a vicious response, which will include targeting your reputation & the future of your career. I think a fair analogy to your actions would be testing the security of any major political figure, such as a head of state, without the knowledge of that government: at best, one might be let off with a warning, but more likely fined & imprisoned; at worst, the researcher would be killed. And in some countries, in a most horrible & painful manner. -- llywrch (talk) 15:56, 18 August 2010 (UTC)[reply]

A stark way of putting it, llywrch, but that's the pith. Yet another way is WP:Point, which is indeed blockable. Gwen Gale (talk) 19:33, 18 August 2010 (UTC)[reply]

This whole silly and unprofessional exercise was akin to a child shoplifting and when caught explaining that "He was doing a research project on shoplifting," or someone trying to carry a gun onto a plane to "expose shortcomings in airport security." If such actions are taken without advance agreement from an authority at the target (the Arb Com or some such body at Wikipedia) then there should be no more leniency than if it were more tomfoolery by Grawp. Shame on "W" and his ethics committee, if he even consulted them. Edison (talk) 02:59, 19 August 2010 (UTC)[reply]

I read this report feeling very angry about this experiment but, having read the other side of the argument, I am now more ambivalent. I'm pleased to see that some good should come out of it. Perhaps, as well as collaborating with AW, lots of thought should also go into how someone in future can effectively experiment in as near a real-world scenario as possible but without exposing our readers to the experiment. I can appreciate why the experiment involved real users because it was desirable to measure the click-thru and potential income to be made. However, I'm not sure it was necessary; surely it would have been enough to simply know that the spam would display? Our vulnerability to spam would need to be closed without knowing about potential profit. On that basis I'm not sure why real users needed to be exposed and wonder why the experiment couldn't have stopped at a test wiki with follow-up reporting and collaboration with those who could plug the weakness. --bodnotbod (talk) 10:11, 19 August 2010 (UTC)[reply]

An interesting question is: if A.W. HAD asked first, would ArbCom or the Foundation have permitted the experiment as it was undertaken on Wikipedia, and to what extent? -- œ ^™ 10:30, 19 August 2010 (UTC)[reply]

I would say anything like that, which gobbles up volunteer time and good faith, would need to be done through wide consensus. This kind of thing can quickly become a very slippery slope. There may now be a need for both arbcom and WmF to let editors know if they're aware of anything else like this going on. Gwen Gale (talk) 11:35, 19 August 2010 (UTC)[reply]

Agreed, and I would challenge any assertion that ArbCom has the authority to allow anything like this. I would say only the WMF or the community itself is capable of approving such experiments. Resolute 13:48, 19 August 2010 (UTC)[reply]

Yes and moreover, given it's WmF's privately owned website and they can do as they please with it, volunteers should be told if and when their time is being spent towards any ends other than those which are straightforwardly and unabashedly encyclopedia building. Gwen Gale (talk) 14:06, 19 August 2010 (UTC)[reply]

Speaking only for myself, and not the Committee as a whole, if we had an inkling of it beforehand, the Foundation would have been notified immediately (as things went, we worked with members of the WMF in an attempt to contain and determine the source of the issue) SirFozzie (talk) 15:44, 19 August 2010 (UTC)[reply]

The requirement to ask for permission to do an experiment like this is separate from whether the WMF or the ArbCom should have given permission. I have no problem with them approving experiments like this on a case-by-case basis, and not only does Wikipedia need to benefit from the proposed experiment but it should be done with sensitivity to the community's feelings. Even better would have been to somehow include the volunteers affected by this experiment; a lot of important work gets done maintaining & improving Wikipedia without any acknowledgment, let alone a sign of appreciation; getting their involvement in some manner would have made for a better situation all around. On a related note, doing something for the spam-fighters involved in this -- even just a number of written thank-you notes -- should have been one of the conditions A.W.'s advisor set for this experiment, & I would expect something far more expressive of not only A.W.'s but his university's thanks for participating in this. -- llywrch (talk) 18:42, 19 August 2010 (UTC)[reply]

Many editors who think they're helping build an encyclopedia will be unhappy to learn that they're being used like lab rats for a study which has more to do with open wiki anthropology than encyclopedia building. This said, if editors are told a study is being done, some might be happy to volunteer their good faith time to it while others will at least be able to stay away from making edits which don't match their own goals for how they spend their time. Gwen Gale (talk) 20:47, 19 August 2010 (UTC)[reply]

To be fair, the usual way to progress this sort of issue in the wider world would be to complain directly to the university concerned, noting any breach of normal research practice and any damage being done to the institutional reputation of the university and the supervisors concerned. There are some lists of webpages covering Penn's ethics instructions above by A.B., but there are various other contacts one could pursue a grievance with, including here; or one could go directly to his supervisors. Hchc2009 (talk) 08:44, 21 August 2010 (UTC)[reply]

Your actions were unethical and unprofessional. It is clear that you "just don't get it" that you violated both the expectations of Wikipedia and of the academic community. Edison (talk) 22:36, 22 August 2010 (UTC)[reply]

I'm with Edison on this one; based on what I know of the situation, it looks like Andrew got off way too easily. And how many weeks ("due time"??) should I wait for him to respond to the message I left on his talk-page? I'm not asking him for technical advice on how to spam Wikipedia; I want to figure out what motivates this guy. --Stepheng3 (talk) 17:39, 1 September 2010 (UTC)[reply]

No harm to participants

I'm sorry that the "researcher" has such a limited view of harm to participants. Sure, he apparently didn't take money from people, but where do we go to get our time and energy back? It's like saying that a noisy, all-night party next door "didn't harm the neighborhood", because all it did was temporarily disrupt everyone's sleep.

I like to believe that I'm a reasonable person: I think that for every editor-hour we spent responding to and cleaning up his "harmless" vandalism, this "researcher" owes the community an equivalent number of hours patrolling Special:RecentChanges. WhatamIdoing (talk) 23:07, 23 August 2010 (UTC)[reply]

This is simply exploitation. A.W. surreptitiously creates a situation where a whole load of volunteers spend their time dealing with his disruption, so that he can get academic brownie points. What is disturbing is the way ArbCom seem OK about this. On a transactional level A.W. gains both a data set and some cred, then trades the dataset with ArbCom, in order to cash in on more cred with little more than a slap on the wrist. This is not good! Harrypotter (talk) 09:33, 24 August 2010 (UTC)[reply]

Have you tried using his anti-vandalism program WP:STiki? It's probably served hundreds of hours already. Academic brownie points is a low blow, even if his research also serves his education, we can benefit from it. Many a hacker has ultimately improved the systems they infiltrate. I don't think ArbCom is okay with what happened--the terms of his unblocking were very specific and wouldn't permit a similar experiment. So it ultimately comes down to what is better for the community: making an example of someone who has made their academic quest understanding how to keep vandalism out of open systems or taking full advantage of his abilities with the caveat that he not ignore Wikipedia's guidelines. Ocaasi (talk) 19:15, 24 August 2010 (UTC)[reply]

That's nice, of course, but I'm feeling rather eye-for-an-eye over this, not eye-for-a-tooth. (It is far from my usual policy.) IMO if he's deliberately inflicted the need to revert X instances of vandalism on the community, or sucked up a given number of editor-hours, then he needs to pay that back by fixing an equivalent amount himself. NB that IMO this is his debt, and he needs to pay it himself in the coin that he used to create this debt. Making someone else more efficient is not the same thing as cleaning up the mess yourself. WhatamIdoing (talk) 23:47, 25 August 2010 (UTC)[reply]

Epilogue

A.W. has now published a related conference paper together with four other researchers, cf. Wikipedia:Wikipedia Signpost/2011-09-26/Recent research/m:Research:Newsletter/2011-09-26. Regards, HaeB (talk) 03:57, 29 September 2011 (UTC)[reply]