User:MER-C/Spamsearch

Finally, something that can search for spam across all 700 or so Wikimedia projects. It is licensed under the GNU General Public License version 3. A copy of the license is available at http://www.gnu.org/licenses/gpl-3.0.txt or in the JAR itself.

At the moment, 16 wikis are searched every 8.5 seconds. Hence, a spamsearch for one site across all wikis takes about 6.5 minutes (in comparison, the toolserver spamsearch searches 57 wikis in about 2.5 minutes). The rate is a hard-coded limit (the program tends to kill itself if the internet connection is overloaded), but you can tweak the values by editing the source code and recompiling.
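The arithmetic behind that estimate, and the general shape of a batch throttle, can be sketched as follows. This is only an illustration and not the actual Spamsearch code: the class name BatchThrottle, its method names and the wiki host names are all hypothetical, and only the figures 16 and 8.5 seconds come from the text above. For exactly 700 wikis it works out to 44 batches, or roughly 6.2 minutes, which is in line with the "about 6.5 minutes" quoted for "700 or so" projects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT the real Spamsearch source.
class BatchThrottle {
    static final int BATCH_SIZE = 16;          // wikis per batch (from the text)
    static final long INTERVAL_MILLIS = 8500;  // one batch every 8.5 s (from the text)

    /** Number of batches needed for wikiCount wikis, rounding up. */
    public static int batchCount(int wikiCount) {
        return (wikiCount + BATCH_SIZE - 1) / BATCH_SIZE;
    }

    /** Rough run time in minutes, assuming every batch costs one full interval. */
    public static double estimateMinutes(int wikiCount) {
        return batchCount(wikiCount) * INTERVAL_MILLIS / 60000.0;
    }

    /** Splits items into consecutive batches of at most size elements. */
    public static <T> List<List<T>> partition(List<T> items, int size) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += size) {
            int end = Math.min(i + size, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    /** Processes one batch at a time, pausing between batches. */
    public static void searchAll(List<String> wikis, long intervalMillis)
            throws InterruptedException {
        List<List<String>> batches = partition(wikis, BATCH_SIZE);
        for (int i = 0; i < batches.size(); i++) {
            // A real tool would query each wiki in the batch here.
            System.out.println("Batch " + (i + 1) + ": " + batches.get(i).size() + " wikis");
            if (i < batches.size() - 1) {
                Thread.sleep(intervalMillis); // throttle to spare the connection
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Estimated minutes for 700 wikis: " + estimateMinutes(700));
        List<String> wikis = new ArrayList<>();
        for (int i = 0; i < 40; i++) {
            wikis.add("wiki" + i + ".example.org"); // made-up host names
        }
        searchAll(wikis, 100); // short interval for the demo; the text uses 8500
    }
}
```

Raising BATCH_SIZE or lowering INTERVAL_MILLIS in a sketch like this shortens the estimate proportionally, at the cost of more simultaneous load on the connection.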

System requirements
Installation instructions

I need someone to host the JAR file. In the meantime, you can download the source (see below).

Running instructions

Open a command line interface and change to the directory where you saved the above JAR file. Enter the following command:

java Spamsearch example.com

... where example.com is the site being spammed. Beware of case sensitivity on non-Windows filesystems. Once the spamsearch is complete, the results will be in the same directory, in a file named results.txt.
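For readers unfamiliar with how a Java program of this shape handles its input, here is a minimal sketch of a command-line entry point that takes one site argument and writes to results.txt in the current directory. It is not the real Spamsearch class: the name RunSketch, its methods and the placeholder line it writes are assumptions made purely for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;

// Hypothetical stand-in, NOT the real Spamsearch class.
class RunSketch {
    // Output file name, taken from the text above.
    static final String RESULTS_FILE = "results.txt";

    /** Returns the single site argument, or throws if it is missing. */
    public static String siteFromArgs(String[] args) {
        if (args.length != 1 || args[0].trim().isEmpty()) {
            throw new IllegalArgumentException("usage: java RunSketch example.com");
        }
        return args[0].trim();
    }

    /** Location of the results file inside the given working directory. */
    public static Path resultsPath(Path workingDir) {
        return workingDir.resolve(RESULTS_FILE);
    }

    public static void main(String[] args) throws IOException {
        String site = siteFromArgs(args);
        // An empty relative path stands for the current working directory.
        Path out = resultsPath(Paths.get(""));
        // Placeholder content; a real search would list the links it found.
        Files.write(out, Collections.singletonList("Search target: " + site));
        System.out.println("Wrote " + out.toAbsolutePath());
    }
}
```

The case-sensitivity warning applies to the class name given on the command line: on a case-sensitive filesystem, java runsketch would not find a class compiled as RunSketch.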

Source code and internals

Spamsearch uses Wiki.java as the wiki interface. The other half is here.

Bugs and problems

As I have given you the code, {{sofixit}}: edit the source files on-wiki. If you can't (perhaps for lack of Java knowledge, a JDK or effort), then file bug reports at User talk:MER-C.