User:DFRBot

From Wikipedia, the free encyclopedia

Hi, I'm DFRBot, a wiki bot run by DFRussia. I intend to become a multi-purpose bot that runs several automated and/or user-assisted algorithms for crawling and editing Wikipedia. If I am going crazy, please post on my talk page and I will terminate what I am doing. All my algorithms are open source, and DFRussia will request permission for each new algorithm as it is developed. Currently I am waiting for approval to run my first algorithm, listed below:

articleCheck

Current version: 0.1

Latest release date: November 1st, 2007

First release date: November 1st, 2007

Status: awaiting approval

This is a data mining algorithm that simply takes one or more files and reads them line by line, checking whether each line is the title of an article on Wikipedia. If it is, a link to the article is returned to the operator; if it is not, the operator is notified. This algorithm is meant for such simple (but sometimes annoying) tasks as checking for notable people in a long list of people (for instance, the faculty of a university).

This algorithm is written in Python, using the pywikipedia framework. The program can be run from the command line with any number of files given as arguments.

articleCheck.py

import sys
import wikipedia #pywikipedia framework

site = wikipedia.getSite()
for arg in sys.argv[1:]:
    try: #try to open the file
        f = open(arg)
    except IOError: #file can not be opened
        print "The file (" + arg + ") could not be opened\n"
    else: #file has been opened
        print "STARTING " + arg + "\n"
        existing = [] #reset for each file so results do not carry over
        try:
            for line in f:
                line = line.strip() #strip leading and trailing whitespace, including the "\n"
                if not line: #skip blank lines, which are not valid titles
                    continue
                if wikipedia.Page(site, line).exists():
                    existing.append("[[" + line + "]]")
                else:
                    print "!!" + line + " does not exist"
        finally:
            f.close() #always release the file handle
        print "\nEXISTING results:"
        for link in existing:
            print link
        print "\nFINISHED " + arg
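
The script above ties the sorting of titles to a live lookup, which makes it hard to try out without network access. The following is a minimal sketch, not part of the released bot, of how the same per-line check could be separated from the lookup. The names `check_titles`, `exists` and `known_titles` are hypothetical and used only for illustration; in the bot, `exists` would presumably wrap the pywikipedia page check.

```python
def check_titles(lines, exists):
    """Split candidate titles into wiki links for existing pages and missing titles.

    `exists` is a hypothetical callable that takes a title and returns
    True or False; it stands in for the live page lookup.
    """
    existing = []
    missing = []
    for line in lines:
        title = line.strip()  # drop surrounding whitespace and the newline
        if not title:
            continue  # blank lines are not titles
        if exists(title):
            existing.append("[[" + title + "]]")
        else:
            missing.append(title)
    return existing, missing


# Stand-in for the live lookup: a fixed set of titles assumed to exist.
known_titles = set(["Alan Turing", "Ada Lovelace"])
existing, missing = check_titles(
    ["Alan Turing\n", "  Ada Lovelace \n", "\n", "Jane Q. Placeholder\n"],
    lambda title: title in known_titles,
)
print(existing)
print(missing)
```

Passing the lookup in as an argument means the sorting logic can be exercised with a fixed set of titles, as here, before it is pointed at the real site.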