User:WP 1.0 bot/Second generation

This page discusses the new version of the WP 1.0 bot that is under development. This will replace the current version. For a brief introduction, consult the FAQ.

Background

WP 1.0 bot collates article rating data for Wikipedia:Version 1.0. The data is stored in subpages of Wikipedia:Version 1.0 Editorial Team/Index. The generated data consists of:

Summary tables (example)
Logs of the changes in ratings (example)
Lists of all rated articles in each project (example)

The current, first-generation code for WP 1.0 bot was written by Oleg Alexandrov and has performed extremely well. It currently handles data for over 1.7 million articles in over 1,300 WikiProjects. To accomplish its work, the bot has made over 1 million edits since February 2007.

The purpose of this page is to discuss a new, second-generation version of the code. The goal is to rewrite the present code, keeping the successful functions of the bot while addressing its shortcomings and possibly adding new functionality.

Motivation for an update

Issues encountered with the current code include:

The code stores all its data in long lists in wiki pages, using the wiki as a makeshift database. Updating this data requires an enormous number of page edits to complete a single run of the script. A full update now requires several days.
The code was originally written and maintained by a single individual, Oleg Alexandrov. More recently, CBM has helped with updates. Given the broad role of this bot, a slightly larger group of maintainers is desirable.
The code is not configurable on a per-project basis. Requests to add a special rating for a WikiProject are common, but the bot code was not written with that in mind.
Although the code generates a great deal of data, it isn't possible to use this data to make dynamic queries.
1. There is no easy way to generate a list of articles rated by both the Military History and Australia projects, although the data needed for this is already collected.
2. There is no easy way to get a log of all assessment changes for a particular article. When a log page gets too long, old information must be removed, leaving it only available in the log page's history. Rarely, logging data is lost when there are too many log entries in a single day to include on a project's log page.
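The two dynamic queries described above become straightforward once the ratings live in a relational database rather than in wiki pages. The following is only a minimal sketch, assuming a hypothetical schema: the table names `ratings` and `rating_log`, their columns, and the sample rows ("Article A", "Article B") are invented for illustration and are not the bot's actual design. Python's built-in `sqlite3` module stands in for whatever database the real bot uses.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the schema used by WP 1.0 bot.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ratings (
    project    TEXT NOT NULL,
    article    TEXT NOT NULL,
    quality    TEXT,
    importance TEXT,
    PRIMARY KEY (project, article)
);
CREATE TABLE rating_log (
    project     TEXT NOT NULL,
    article     TEXT NOT NULL,
    changed_on  TEXT NOT NULL,   -- ISO date, so text ordering is date ordering
    old_quality TEXT,
    new_quality TEXT
);
""")

# Made-up sample rows; "Article A" and "Article B" are placeholders.
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?, ?)",
    [
        ("Military history", "Article A", "B-Class", "High-Importance"),
        ("Australia", "Article A", "B-Class", "Mid-Importance"),
        ("Australia", "Article B", "GA-Class", "Top-Importance"),
        ("Military history", "Article C", "C-Class", "Top-Importance"),
    ],
)
conn.executemany(
    "INSERT INTO rating_log VALUES (?, ?, ?, ?, ?)",
    [
        ("Australia", "Article A", "2009-03-01", "Start-Class", "C-Class"),
        ("Military history", "Article A", "2009-06-15", "C-Class", "B-Class"),
        ("Australia", "Article B", "2009-07-02", "B-Class", "GA-Class"),
    ],
)


def articles_in_both(conn, project_a, project_b):
    """Return articles rated by both projects, via a self-join on the article name."""
    rows = conn.execute(
        """
        SELECT a.article
        FROM ratings AS a
        JOIN ratings AS b ON a.article = b.article
        WHERE a.project = ? AND b.project = ?
        ORDER BY a.article
        """,
        (project_a, project_b),
    )
    return [row[0] for row in rows]


def article_history(conn, article):
    """Return every logged assessment change for one article, oldest first."""
    rows = conn.execute(
        """
        SELECT changed_on, project, old_quality, new_quality
        FROM rating_log
        WHERE article = ?
        ORDER BY changed_on
        """,
        (article,),
    )
    return rows.fetchall()


both = articles_in_both(conn, "Military history", "Australia")
history = article_history(conn, "Article A")
```

Because log entries are stored as rows, nothing has to be trimmed when a project has a busy day, and the full history of an article stays available to a single query.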

Additional maintainers

A small group of 3-4 maintainers will spread the load of maintenance and prevent the departure of a single person from impacting the code. The existing code is in Perl, but many languages are supported on toolserver.

Volunteers wanted

Programmers of all experience levels are welcome to contribute code. This project would provide an excellent setting to familiarize yourself with the LAMP framework. Please contact User:Theopolisme, either on his talk page or by email, if you're interested in contributing code or becoming a maintainer.

Frequently asked questions

A list of frequently asked questions is available. Please ask new questions at User talk:WP 1.0 bot/Second generation.

Feature requests

The following table contains some early requests that were discussed in the alpha phase of development. Additional task requests are very welcome. They can be filed below or at this project's bug tracking page.

#	Request	Status
1	Category intersections in tables (e.g. GA-Class Top-Importance Foo articles)	Implemented in the alpha version
2	Support for WikiProject preferences (custom article ratings)	Implemented in the alpha version
3	Support for multiple threads of the bot (e.g. running WP:WPBIO on a separate instance)	Implemented in the alpha version
4	Allowing queries on the database: (1) a proper database, for that matter; (2) maybe allow machine-readable access, so some tasks can be delegated to other bots?	(1) is implemented in the alpha version. (2) can be implemented upon request
5	A log of an article's assessment evolution throughout time (similar to {{ArticleHistory}})	Implemented in the alpha version
6	Having the ability to continue if a bot run is stopped due to technical issues	Implemented in the alpha version
7	Better subproject / task force integration, e.g. (7.1) making task forces' assessment tables subpages of the main project's "assessment space" to shrink the main listing at WP:1.0/I; (7.2) maybe counting a task force's articles would allow the bot to skip the main project's articles, either partially or completely?	More discussion required. 7.2 is no longer necessary due to performance improvements. The index is dynamically generated, so it can be formatted differently, or several different indices can be generated.
8	"WikiProject News" feature, or at least build the database in a way that such a feature could be implemented eventually (by either this bot or another).	A separate task from the main WP bot. See User:B. Wolterding/Article alerts
9	Add an assessment completion summary for each project in the table, similar to what is done at fr:Projet:Wikipédia_1.0#Projets.	Implemented in the alpha version
10	To be discussed at least: Display the article's importance rank generated by the SelectionBot in the WP1.0 template. Can this be done? Currently we mainly display a ? for importance.	Implemented in the alpha version
11	The ability for a WikiProject to tag a particular version of an article as being the one they would like used for a release. This might be used in conjunction with WP:FLR.	More discussion required
12	Keep track of featured articles, good articles, and WP 0.5 articles using their dedicated categories. The first generation code does this for WP 0.5 articles only.	Implemented in the alpha version
13	Track assessed pages in Portal, Image, Template namespace.	Implemented in the alpha version - all namespaces except User/User talk are tracked
14	That the log is color coded and lists who made the new assessment, so we can see at a glance which articles have been raised, which have been lowered, and by whom (finding who assessed is not critical; it's just nice to see who made the assessment so projects can review whether it was made by IPs, bots, project members, etc.). Something like: Jimmy Hendrix has been re-assessed from C-Class (Top-Importance) to B-Class (Low-Importance) by Headbomb (talk · contribs)	The log is color coded, but at the moment there is no good (and robust) automated way to figure out who changed the rating of an article. I can think about how to achieve that for a later version, but it will have to wait for a little while.
15	Generate graphics which show the progression of |class= and |importance= assessments on a per project/per taskforce basis.	Implemented in the alpha version
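Request 1 in the table above (category intersections such as "GA-Class Top-Importance Foo articles") amounts to a cross-tabulation of quality against importance for one project. The following is a rough sketch of that counting step only, not the bot's actual code: the function name `summary_table`, the input format (a list of quality/importance pairs), and the particular class and importance lists are assumptions made for illustration, and a project with custom ratings would need its own lists.

```python
from collections import Counter

# Illustrative rating scales; a real project may configure different ones.
QUALITY = ["FA-Class", "A-Class", "GA-Class", "B-Class",
           "C-Class", "Start-Class", "Stub-Class"]
IMPORTANCE = ["Top-Importance", "High-Importance",
              "Mid-Importance", "Low-Importance"]


def summary_table(ratings):
    """Count articles in each (quality, importance) cell.

    `ratings` is a list of (quality, importance) pairs, one per article.
    Returns a nested dict: table[quality][importance] -> article count.
    """
    counts = Counter(ratings)
    return {
        q: {imp: counts[(q, imp)] for imp in IMPORTANCE}
        for q in QUALITY
    }


# Made-up sample data for a hypothetical "Foo" project.
sample = [
    ("GA-Class", "Top-Importance"),
    ("GA-Class", "Top-Importance"),
    ("B-Class", "Low-Importance"),
]
table = summary_table(sample)
ga_top = table["GA-Class"]["Top-Importance"]
```

Each cell of the resulting table corresponds to one category intersection, so a cell count can be rendered directly as a link to the matching article list.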

Activation

On Saturday, January 23, 2010, the old bot was turned off and the new bot (2G) began to update the tables and logs on the wiki.