User:HBC archive builderbot


This bot is pending approval and is thus inactive. Contact User:HighInBC for questions. HighInBC (Need help? Ask me) 21:23, 2 February 2007 (UTC)

Purpose

This bot is designed to go through the revision history of pages such as WP:RFCN, automatically detect the removal of sections, and add to an archive a link to the last occurrence of each removed section. This will provide an archive of all past and future names discussed on the board.

See /sandbox for an example of what my output will look like once approved.

Technical

Source code

This bot is written in Perl. It uses the Algorithm::Diff module to compare each revision with the next. If a diff shows that a header was removed and nothing was added, the bot treats the edit as the archiving of a discussion. It uses the revision number, the edit summary, the name of the user who made the edit, and the contents of the heading to build an archive entry.
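The detection rule above can be illustrated with a minimal sketch. This is not the bot's code: the bot is written in Perl with Algorithm::Diff, while this sketch uses Python's standard difflib module as a stand-in. The function names, the dictionary keys (id, user, summary, text), the restriction to level-two "== Heading ==" lines, and the choice to link the older revision as the "last occurrence" are all assumptions made for illustration.

```python
import difflib
import re

# Assumption: archived discussions start with a level-two wiki heading.
HEADING = re.compile(r"^==([^=].*?)==\s*$")


def removed_headings(old_text, new_text):
    """Return headings removed between two revisions.

    Returns an empty list if the edit added or changed any line,
    mirroring the "header removed and nothing added" rule.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)

    removed = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            return []  # something was added, so not a pure archiving edit
        if tag == "delete":
            removed.extend(old_lines[i1:i2])

    headings = []
    for line in removed:
        match = HEADING.match(line)
        if match:
            headings.append(match.group(1).strip())
    return headings


def archive_entries(old_rev, new_rev):
    """Build one archive entry per heading removed by new_rev.

    Both arguments are hypothetical dicts with the keys
    "id", "user", "summary" and "text".
    """
    return [
        {
            "heading": heading,
            "last_revision": old_rev["id"],  # last revision containing the section
            "archived_by": new_rev["user"],
            "summary": new_rev["summary"],
        }
        for heading in removed_headings(old_rev["text"], new_rev["text"])
    ]
```

Calling archive_entries on each consecutive pair of revisions would yield the raw material for the archive list. An edit that removes one section while adding a comment elsewhere is skipped under this rule.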

The revision history is gathered using Special:Export and a caching system I wrote that ensures only new revisions are downloaded. The first run will take 10-15 minutes to populate the cache; subsequent runs will take only moments, as they load only the new revisions.
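The caching idea can be sketched as follows. This is only an illustration in Python, not the bot's Perl implementation. The JSON file format and the function names are assumptions, and fetch_revision is a hypothetical callable standing in for whatever actually downloads a revision via Special:Export.

```python
import json
import os


def load_cache(path):
    """Load the revision cache from disk, or start empty on the first run."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def save_cache(path, cache):
    """Write the revision cache back to disk."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(cache, fh)


def update_cache(cache, revision_ids, fetch_revision):
    """Download only the revisions that are not cached yet.

    fetch_revision is a hypothetical function taking a revision id and
    returning that revision's data. Returns the list of ids fetched.
    """
    # JSON object keys are strings, so ids are stored as strings.
    new_ids = [rid for rid in revision_ids if str(rid) not in cache]
    for rid in new_ids:
        cache[str(rid)] = fetch_revision(rid)
    return new_ids
```

On the first run every revision id is missing from the cache, so everything is fetched. On later runs only ids added since the previous run trigger a download, which is why those runs finish quickly.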

In testing, I found the diff module could analyze over 2600 diffs in less than 3 seconds, which is very fast.

The program will most likely run twice daily.