Wikipedia talk:Sockpuppet investigations

Triaging SPI

There is often an extensive backlog at SPI that can result in reports going unreviewed for months. For the most active socks, this can give them time to make thousands more edits and thus be significantly more disruptive.

Would clerks and check users be interested in prioritising such editors?

If they are, it might be helpful to include a column containing the number of edits made in the past 60 days - the easiest way to do this would probably be to update Mz7's bot, but if that isn’t an option for whatever reason we could just add an additional column that I could create a bot to fill. BilledMammal (talk) 01:08, 27 August 2024 (UTC)

I'm sure we could argue forever about exactly what metric to use, but I certainly support the general concept. RoySmith (talk) 10:35, 27 August 2024 (UTC)
I believe that any tool that could be brought to bear against prolific socks would be of benefit to the project. Regards, Aloha27 talk 12:14, 27 August 2024 (UTC)
I feel the same as RS. Firefangledfeathers (talk / contribs) 19:14, 30 August 2024 (UTC)
If CUs and clerks want a different metric I'm happy to try to make it work - just let me know what you would like to see and we can try it out. BilledMammal (talk) 10:35, 31 August 2024 (UTC)
The SPI table task is already relatively expensive, taking around a minute to go through all the SPI cases and pull the information it needs to construct the table. My first concern is that pulling the edit counts of all socks across all SPIs might be a heavy operation that would worsen the already suboptimal performance of the bot. I also don’t think it needs to be done every 10 minutes, which is the current frequency of the bot. If this is desired, I think it might be better to do it as a separate bot that updates a different page (at least to start) on a less frequent basis (e.g. maybe once per day rather than every 10 mins). Mz7 (talk) 00:11, 28 August 2024 (UTC)
@Mz7: I take it that your script parses the case pages of all active SPIs? BilledMammal (talk) 00:40, 30 August 2024 (UTC)

This is a great initiative. Does anyone know the main 2 or 3 causes of the backlog: not enough checkusers, not enough clerks, the inherent advantage of replicators in a community of agents, etc.? Sean.hoyland (talk) 08:24, 1 September 2024 (UTC)

I'd support any system that helps prioritize cases. I think a daily report would be sufficient. I've also wondered about the bottleneck: sometimes an SPI case I file gets reviewed the same day I post it, while others linger for days or weeks. Also, cases are not being archived after being closed, so I think you might need to make a pitch for more SPI clerks to come on board. Liz Read! Talk! 22:17, 1 September 2024 (UTC)
I can't speak for other SPI denizens, but from my perspective the best way to get a case handled quickly is to lay out the evidence clearly and simply, which almost always means pairs of specific diffs. The harder it is to understand the evidence, the more likely I'm going to skip over a case and move on to something else. RoySmith (talk) 22:44, 1 September 2024 (UTC)
I would agree with RoySmith here. Yesterday I went through most of the backlog, and was usually skipping cases where it was not laid out for me with diffs. In some cases I could see the similarity quickly, but I want to avoid doing a deep dive to not find enough evidence to run a check. Dreamy Jazz talk to me | my contributions 09:12, 2 September 2024 (UTC)
@Dreamy Jazz: Can you provide some examples of "good" SPI cases against prolific editors? BilledMammal (talk) 15:14, 2 September 2024 (UTC)
It's difficult to provide a "good" case for a prolific editor, but I would emphasise that diffs are important. Essentially, something like Editor A edited [diff 1] article 1 to add the same content as Editor B in [diff 2] is very helpful, as it establishes a common editing pattern and also the grounds for violating WP:SOCK (editing the same page using multiple accounts without disclosure). Even something like Editor A [diff 1] compared to Editor B [diff 2] is still good as long as it's clear why the diffs are similar. When it's an editor who has been around longer, more diffs help, because the chance of it happening without them being the same person drops as they make more edits. However, adding a boatload of diffs can be unhelpful. For example, Editor A [diff 1] [diff 2] [diff 3] [diff 4] [diff 5] [diff 6] [diff 7] [diff 8] [diff 9] [diff 10] and Editor B [diff 11] [diff 12] [diff 13] [diff 14] [diff 15] [diff 16] [diff 17] [diff 18] [diff 19] is not as good because I cannot immediately see pairs of diffs that establish a common editing pattern. Dreamy Jazz talk to me | my contributions 21:49, 4 September 2024 (UTC)

Prototype

@RoySmith, Aloha27, and Mz7: I've thrown together a prototype; see User:BilledMammal/SPI Edit Counts. There are currently six open SPIs where the accused accounts have made more than 1000 edits in the past month. It took a few minutes to compute, but that was because I ran it locally and had to use the API - it would be much faster on Toolforge, as it's possible to use SQL queries there. BilledMammal (talk) 18:11, 30 August 2024 (UTC)

If you know your way around SSH, you can set up a port tunnel and access the Toolforge SQL server from your local machine. RoySmith (talk) 13:18, 2 September 2024 (UTC)
Thank you, that is very helpful. BilledMammal (talk) 15:12, 2 September 2024 (UTC)
Maybe I should have been taking coding classes rather than art history in school. Liz Read! Talk! 20:05, 2 September 2024 (UTC)
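
For anyone curious how the API-based count described above might look, here is a minimal Python sketch. It is not the prototype's actual code: it assumes the public MediaWiki Action API's list=usercontribs module, and the function names, the user-agent string, the 30-day window and the 1000-edit threshold parameter are all illustrative choices. The counting and ranking functions are pure, so they can be tried on made-up data without touching the network.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone

API_URL = "https://en.wikipedia.org/w/api.php"  # public MediaWiki Action API
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # timestamp format returned by the API


def fetch_edit_timestamps(user, cutoff):
    """Fetch timestamps of `user`'s edits back to `cutoff` (hypothetical helper).

    Contributions are listed newest first, so `ucend` is the older bound.
    """
    params = {
        "action": "query",
        "format": "json",
        "list": "usercontribs",
        "ucuser": user,
        "ucprop": "timestamp",
        "uclimit": "max",
        "ucend": cutoff.strftime(TS_FORMAT),
    }
    timestamps = []
    while True:
        url = API_URL + "?" + urllib.parse.urlencode(params)
        # The user-agent string is a placeholder; use one that identifies you.
        request = urllib.request.Request(
            url, headers={"User-Agent": "spi-edit-count-sketch/0.1"}
        )
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        for contrib in data["query"]["usercontribs"]:
            parsed = datetime.strptime(contrib["timestamp"], TS_FORMAT)
            timestamps.append(parsed.replace(tzinfo=timezone.utc))
        if "continue" not in data:
            return timestamps
        params.update(data["continue"])  # follow API continuation


def count_recent_edits(timestamps, now, days):
    """Count the timestamps that fall within the last `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return sum(1 for ts in timestamps if cutoff <= ts <= now)


def rank_cases(case_edits, now, days=30, threshold=1000):
    """Return (case, total recent edits) pairs above `threshold`, busiest first.

    `case_edits` maps a case name to {account name: [edit timestamps]}.
    """
    totals = []
    for case, accounts in case_edits.items():
        total = sum(count_recent_edits(ts, now, days) for ts in accounts.values())
        if total > threshold:
            totals.append((case, total))
    return sorted(totals, key=lambda pair: pair[1], reverse=True)
```

In use, one would call fetch_edit_timestamps for each accused account in a case, collect the results into the nested dictionary, and pass that to rank_cases. As noted in the thread, SQL queries on Toolforge would likely be much faster than paging through the API this way.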