User talk:CorenSearchBot/Archive Dec 2007

From Wikipedia, the free encyclopedia

Win

Hopefully this will cut down on copyvios. It becomes much harder once Google runs its bots through Wikipedia. hbdragon88 06:28, 9 August 2007 (UTC)

Juan Pablo Aldasoro

I received a message about the copyright of the Juan Pablo Aldasoro page, saying that it appeared to be a breach of copyright. I created BOTH pages, the one on Wikipedia and the one on GeoCities; the latter is not copyrighted, so there is no problem to report. Juan Pablo Aldasoro was my grandfather, and all the pictures on both sites are mine; I have uploaded them under copyleft. —Preceding unsigned comment added by Creyes (talk • contribs) 12:42, 22 November 2007 (UTC)

Lookin' good!

The new bot appears to be working like a treat! Congratulations, it is a great asset to Wikipedia. WWGB 13:05, 9 August 2007 (UTC)

An excellent idea. ♦ Sir Blofeld ♦ "Expecting you?" Contribs 15:16, 9 August 2007 (UTC)

Yeah, absolutely fantastic bot. Well done and thanks for the time you'll save me! J Milburn 02:54, 22 August 2007 (UTC)
Yes indeed. I was just peeking at some changes via my Wikipedia Ticker and saw the page get created. I sighed and went to leave the creator a friendly note. Your robot minion had already beaten me to it. Great work! William Pietri 01:41, 28 August 2007 (UTC)

I'm impressed!

Feeling a bit lazy, I copied two lines from a resume for an author... and BANG! CorenSearchBot got me. I am impressed and a little mortified. No more sloppy article creation from me. Well done. This is a great service for WP. Gillyweed 08:08, 6 September 2007 (UTC)

Question About Removing Tag

What happens if an editor just removes the CorenSearchBot notice? Before the bot came online, I used to tag copyvio articles with a speedy delete which placed them in my watchlist. If the tag was removed without addressing the copyvio I could take appropriate action. --NeilN 04:41, 14 September 2007 (UTC)

CSBot also lists the pages at WP:SCV, where they remain until human oversight (which happens frequently, and to which you are, of course, welcome to give a hand). Two other bots with slightly different matching algorithms also log possible copyright violations there, giving better coverage than any one of them could. — Coren (talk) 16:23, 14 September 2007 (UTC)

Bravo!

Yay for another handy bot, making life easier for us human editors! -HamatoKameko 05:57, 9 October 2007 (UTC)

Coming to say something similar. Great idea for a bot, and doing great work. Thanks! —bbatsell ¿? 19:45, 11 October 2007 (UTC)
AGREED, love this bot! --Pmedema (talk) 04:06, 27 November 2007 (UTC)

Excellent

Can't imagine why you haven't received one yet.

moved barnstar

Actually, someone else was kind enough to give me a DaVinci Barnstar for CSBot in the past; I've moved yours to my user page. Thanks a lot. — Coren (talk) 22:48, 12 October 2007 (UTC)

Tagging me twice for copyvio this week within 1 minute of creating an article

I complained to ANI as I think this treatment is wrong and you seem to think it is O.K. And today your bot not only tagged me the same minute I created the article but was wrong. The site you accused me of copying from was actually itself a copyvio of the reference I cited in the article that you tagged. I have no hope that this will stop, as this is the way of Wikipedia. And you and your friends think this is an O.K. way to treat editors. But I do not. --Mattisse 01:13, 20 October 2007 (UTC)

Petruska Article

I did copy it from that website... I was going to put a link to the website and say "Text taken from..." followed by the web link. I have removed the text for now. Can I put it back if I add the "text taken from..." part? --Greenwood1010 23:58, 20 October 2007 (UTC)

As a rule, no. In all cases, certainly, attribution is required; but the site from which the text comes must also specifically allow redistribution either without conditions, or with a license compatible with the GFDL. Unless the site explicitly states so, or the content is known to be public domain, the presumption goes against permission and you have to request it. — Coren (talk) 03:46, 21 October 2007 (UTC)


Ann Hraychuck and Gordon Hintz

They are not copyrighted pieces; they are biographies dispensed by these two politicians. —Preceding unsigned comment added by Egerermg25 (talk • contribs) 03:57, 22 October 2007 (UTC)

Air Farce Live

I created new redirects as Air Farce Live is now about the TV series and the article about the comedy album has been moved to Air Farce Live (album). Steelbeard1 15:26, 23 October 2007 (UTC)

Sea Drift message etc.

Hi there, I am trying to tidy up the situation re Sea Drift or Sea-Drift. The existing article at Sea Drift (no hyphen) was formerly a classical music stub for a 1933 work by Carpenter. The title, however, is derived from a section of Walt Whitman's poems Leaves of Grass, where it is Sea-Drift (with a hyphen). In addition there is Delius's great work Sea Drift for baritone and chorus, also based on Whitman. I boldly decided (as the existing Carpenter article was a short stub) to create a new 'Sea-Drift' article as the primary, which describes the poetry section by Whitman. I then decided to disambiguate the two musical compositions based on it by calling them Sea Drift (Delius) (which I will start) and Sea Drift (Carpenter). I did this by creating the Sea Drift (Carpenter) article anew, and transferring the text there by copying, rather than by just 'move', because I then wanted to redirect Sea Drift (formerly the Carpenter article) as a search term to the primary site, Sea-Drift, and didn't know how to detach it from the 'move' redirect which would have been created if I had just pressed 'move'. The point is to get the search term to take one to the source of the title (Whitman), and thence onwards, not at first to a relatively unknown classical work which is only one of several derived from the Whitman title. I was in the middle of doing all this when the Bot struck! I will press on, and the end result will be sanity and clarity. Kretzsch 17:25, 24 October 2007 (UTC)

Yonezawa Toys

Hi! The article was my own and written on another GFDL-licensed Wikia project. Thanks! --PMDrive1061 23:11, 25 October 2007 (UTC)

Amazing!

This is an amazing bot! I can't believe you're giving up your bandwidth to do this! --wj32 talk | contribs 06:46, 29 October 2007 (UTC)

Tubious

Apparently, there is this website http://tubious.com/ that sucks down new Wikipedia entries, I guess as some sort of mirror. The bot caught one of the pages ([1]). The Tubious page pretty clearly indicates that this is a Wikipedia copy. Not sure if it's going to be a big problem, but you might want to investigate and teach the bot to ignore Tubious in the future. CosmicPenguin (Talk) 02:54, 30 October 2007 (UTC)

Like you may have been, I was confused as to how Google had indexed Tubious so quickly. The page logs indicate that Tubious had caught the page when it was created back on August 6 (and subsequently speedied). The user re-created the exact same article today, and triggered the bot. So, I'm not sure it's a huge issue, but probably something you'll want to look into anyway. CosmicPenguin (Talk) 02:56, 30 October 2007 (UTC)
Yeah, I was wondering that!  :-) But yes, I'll add tubious to the list of known mirrors; it's the first time it has popped up in a search result, but it can't hurt. — Coren (talk) 03:05, 30 October 2007 (UTC)
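
For readers wondering what a "list of known mirrors" check might look like, here is a minimal sketch. It is an illustration only and not CSBot's actual code: the names KNOWN_MIRRORS, is_known_mirror and drop_mirror_hits are hypothetical, and the only mirror taken from this thread is tubious.com.

```python
from urllib.parse import urlparse

# Hypothetical mirror list; only tubious.com comes from the thread above.
KNOWN_MIRRORS = {"tubious.com"}


def is_known_mirror(url, mirrors=KNOWN_MIRRORS):
    """Return True if the URL's host is a listed mirror or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == m or host.endswith("." + m) for m in mirrors)


def drop_mirror_hits(urls, mirrors=KNOWN_MIRRORS):
    """Keep only the search hits that are not hosted on a known mirror."""
    return [u for u in urls if not is_known_mirror(u, mirrors)]
```

Matching on the host name, rather than on a substring of the whole URL, avoids discarding unrelated sites whose address merely contains a mirror's name.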

Who is tubious.com? Within 3 hours of posting my user page, this website displayed a live link on Google with a redirect on the page: "To remove or edit the content of an article, click on the edit tab at the top of the page or you can submit a request for Article Removal." The "Article Removal" text is a live URL, and the text is "Copyright Tubious". How can tubious.com copyright Wikipedia submissions or articles? And furthermore, don't you think it odd to take a Wikipedia submission out of the domain to solicit article removals? --Ccpentecostal (talk) 06:55, 5 December 2007 (UTC)

With your own form of vandalism

Now we have a new form of vandalism; the false claim that this bot has identified a copyright violation. I'm sure this will make the bot operator feel "loved". I encountered it at Ome Henk, see the history and Talk:Ome Henk. GRBerry 18:37, 2 November 2007 (UTC)

Actually, I'd call this a good thing in a sort of twisted way. It does mean CSBot's tags are now so recognizable as to be used by humans intending to make a point.  :-) — Coren (talk) 02:11, 3 November 2007 (UTC)

KBXO

You might want to take a look at your bot; it's a bit oversensitive. It identified KBXO as a possible copyright violation based on its similarity to [2], which is a bit of a stretch. While both do mention the station's callsign and frequency, that's about it. --Rtphokie 20:15, 5 November 2007 (UTC)

Actually, that was serious brokenness. The bot has just been moved to the Wikimedia tool server and the transition lost some settings (which have since been fixed). Sorry for the trouble. — Coren (talk) 20:32, 5 November 2007 (UTC)

My recent edit

Hi bot. Good catch, but if your author looks at my change, I was moving a page. I may not know how best to do this, but I did my best. See Bass Trap and Bass trap. I like to saw logs! 10:58, 8 November 2007 (UTC)

LivLite

CorenSearchBot flagged it from http://wikicompany.org/wiki/Livlite which licenses the material under the GFDL so perhaps the site should be whitelisted? -- Whpq 16:42, 8 November 2007 (UTC)

Notification

I just want to notify you that there is an arbitration case pending, where you are involved as a possible witness. Please feel free to post a statement there as well. 217.233.211.230 22:29, 9 November 2007 (UTC)

I've removed this case from ArbCom as they will simply reject it. 217.233.211.230 - you do know that you're talking to a bot? Michaelbusch 22:31, 9 November 2007 (UTC)

On existing pages as well?

I wonder if there is any way to extend this to large text dumps added to or replacing existing pages as well. I've stumbled into a number of them lately, but I understand there are so many edits made that it may be difficult to follow. Rigadoun (talk) 05:41, 13 November 2007 (UTC)

That's already been considered, but the biggest problem is that after a short while Wikipedia articles end up being mirrored in dozens or hundreds of places all over the net, so almost any article will match one of those mirrors.

Checking large text additions or replacements might be a better idea, but that also has a number of problems. It's an idea that remains active, however, and a possible future addition. — Coren (talk) 13:29, 13 November 2007 (UTC)


Lee Donoghue

Sorry about the copyright; I clicked on the wrong stuff when I was editing the page. —Preceding unsigned comment added by SeanMorleyRoxs (talk • contribs) 20:47, 20 November 2007 (UTC)

Jewish encyclopedia

The bot is flagging items from the Jewish Encyclopedia (1906) as copyvio when they are taken indirectly from [3], a site licensed under the GNU Free Documentation License. Not just the Jewish Encyclopedia, but everything on the site is therefore an acceptable source for WP. DGG (talk) 23:40, 20 November 2007 (UTC)

Two easy fixes:
  • Attribute the Encyclopedia with a standard template. If that's the case, I can teach CSBot about the template.
  • Whitelist the secondhand source. I already did that. — Coren (talk) 03:25, 21 November 2007 (UTC)
I did the first one even before I posted, to take care of the present article, and your change will deal well with the future ones. DGG (talk) 08:09, 23 November 2007 (UTC)

CorenSearchBot is Broken

To whom it may concern (I use this language because I've never written to a bot before) -- Please be assured that my contribution of Gimbi is not a copyright violation. I am simply reusing the same formula for creating this article as I have for every article on an Ethiopian city, town or village. Moreover, you are mistaken in thinking that this page is being infringed on for two good reasons: (1) that article is about the Ethiopian city of Dembidolo while this is about the Ethiopian town of Gimbi, & probably the more important reason (2) that article is a mirror copy of this exact Wikipedia article -- which I started. I don't know how this bot screwed the pooch in this case (maybe it likes dogs -- I don't know, & except that it provides for a cheap but well-meaning joke, I don't wanna know) but you may want to have a look at the logic of the program & make some adjustments. But thank you for your efforts in keeping copyvios out of Wikipedia. -- llywrch (talk) 03:16, 22 November 2007 (UTC)


Anthemoessa

Hi! I created a page called Anthemoessa. It's about the island of the Sirens. http://en.wikipedia.org/wiki/Anthemoessa Would you mind editing it please? Thanks! Neptunekh (talk) 08:03, 22 November 2007 (UTC)

Mr Kane pt2

It's not copied text; it's an album I've created an article for. Any reason you've found it non-legal? Woop-Woop That's the sound of da Police 12:16, 25 November 2007 (UTC)


Norfolk Orbital Railway

This bot is rabid. The content on the Norfolk Orbital Railway was already on Wikipedia - muddled up with other entries, although I have edited it to remove confusion about the schemes. What is going on at Wikipedia? First images become unworkable, and now you cannot enter anything that is mentioned elsewhere on the net? Guess it's time for Wikipedia to fade out. User:RedCoat —Preceding comment was added at 00:17, 26 November 2007 (UTC)

Source code?

The source will be made available once the author is reasonably convinced it works well enough to break neither Wikipedia nor Google Yahoo (for now).

I am reasonably convinced. How about you? Hope you might post the source soon. Thanks. - Neparis 04:04, 1 November 2007 (UTC)

I'm in the middle of a code cleanup before CSBot moves to the toolserver. I'll post the result in about a week, I'd say. — Coren (talk) 02:07, 3 November 2007 (UTC)
Any news? Regards, Neparis (talk) 12:31, 22 November 2007 (UTC)
Yes, the bot is now on the toolserver. I'm not at home, so I don't remember the URL to access fisheye offhand, but I'll post it tonight when I get there. — Coren (talk) 19:50, 30 November 2007 (UTC)
OK, what is the URL please? A search on toolserver under "corensearchbot" and even just "coren" didn't return any URLs. I'm posting here in your (rather fresh) archive to keep the thread of conversation all in one place. - Neparis (talk) 13:48, 6 December 2007 (UTC)
[4]. Sorry it took so long; I forgot about posting the URL. — Coren (talk) 16:49, 8 December 2007 (UTC)

Please stop mis-reporting false positives for articles sourced from the NCI Cancer Dictionary

The hits your bot is finding are from a commercial site that mirrors the public domain NCI Cancer Dictionary. I've tagged these articles with {{NCI-cancer-dict}}: you can check their original public domain source by following the external link provided in each article. -- The Anome (talk) 14:51, 24 November 2007 (UTC)

An example is Biochemical recurrence. DGG (talk) 23:57, 25 November 2007 (UTC)
Done. That tag is now known by CSBot. A new version of the bot will be deployed this weekend which will allow administrators to add to the list of known tags on a local page, alleviating the need to make manual patches like this. — Coren (talk) 19:49, 30 November 2007 (UTC)
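
To make the "list of known tags" idea concrete, here is a rough sketch of how a new page's wikitext could be checked for an attribution template before it is flagged. This is an assumption about the general approach, not the bot's real implementation: KNOWN_TAGS, TEMPLATE_RE and attribution_tags are hypothetical names, and the only tag taken from this thread is NCI-cancer-dict.

```python
import re

# Hypothetical list of attribution templates; only NCI-cancer-dict is from the thread.
KNOWN_TAGS = {"nci-cancer-dict"}

# Matches simple, non-nested templates such as {{Name}} or {{Name|param=value}}.
TEMPLATE_RE = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\|[^{}]*)?\}\}")


def attribution_tags(wikitext, known=KNOWN_TAGS):
    """Return the known attribution templates that appear in the wikitext."""
    found = {m.group(1).strip().lower() for m in TEMPLATE_RE.finditer(wikitext)}
    return found & known
```

A page for which attribution_tags returns a non-empty set could then be skipped or listed for human review instead of being tagged as a suspected violation.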

Misfound error

Hey, I recently split an article in which the band's discography was placed along with general information. I put part of the discography information in a separate article (it is not completed yet). The information about the songs was copied from the original Wikipedia page; the bot, however, identified it as coming from a site I haven't used. ferrarius 19:22, 30 November 2007 (UTC)

You will notice that I made the latter into a redirect page to the former, in conformity with the convention about names of islands, Wikipedia:Naming conventions. NO conflict here. Peter Horn 20:13, 30 November 2007 (UTC)

Hello, not so fast. Egg Island, Bahamas is now a redirect page, there is no duplication!!! Peter Horn 03:10, 7 December 2007 (UTC)

Cornwallis Island, Nunavut is now a redirect page. Peter Horn 01:00, 8 December 2007 (UTC)

The above two from Peter Horn are referring to the bot tagging Wikipedia pages as copyvios of Wikipedia pages. Look at the deleted history at Cornwallis Island (Nunavut). Cheers. CambridgeBayWeather (Talk) 03:04, 8 December 2007 (UTC)

Please don't flag quotes from US law

Your bot erroneously flagged new article Conspiracy against rights because both it and some Web page contain quotes from the United States Code. Please keep an eye out for "USC" and references to U.S. Government web sites, and realize that these are not copyrighted. --FOo (talk) 18:03, 8 December 2007 (UTC)

Biker Mice

I am using the information from the original Biker Mice webpage and was not aware that the episode summaries were from www.bikermice.tv. I will remove them. Dwanyewest (talk) 00:02, 10 December 2007 (UTC)

Checking existing pages?

Hi. Would it be possible for "you" to check the existing pages in Category:Home and Away characters? A few of these have recently been shown up as being copyvios from backtothebay.net, and I was hoping it would be possible to get the rest checked in a (semi-)automated fashion? If it's not doable, no problem, and thanks --Pak21 (talk) 00:15, 10 December 2007 (UTC)

Simply add a wikilink to the page you want to check to the "unprocessed requests" section of User:CorenSearchBot/manual and CSBot will do the legwork when it gets a chance (usually 1-3 min delay). — Coren (talk) 00:27, 10 December 2007 (UTC)

The former was split off from the latter, and the latter was converted to a disambiguation page. You caught it between the time I posted the latter article and the changes to the former. — Bellhalla (talk) 14:49, 16 December 2007 (UTC)

This is an open source project, so is not copyrighted. It is an important collaboration, not a commercial venture, so this is not an advert. I will get permission of the site organisers to write about it (even though I don't think I need it). Some of the text is taken from various places inside the website, but I am just getting the article started. Please whitelist this web site. Mike Young (talk) 11:05, 19 December 2007 (UTC)

Act Further to Protect the Commerce of the United States

The text from the webpage is from an act of Congress and is over 200 years old and is to my knowledge in the public domain.--Cdogsimmons (talk) 18:32, 21 December 2007 (UTC)


www.encyclopediawiki.org

You need to tell your bot to ignore this site; it is a scientific research project that is analyzing Wikipedia, and is copying a lot of data from it. Your bot is flagging Wikipedia articles as copyright violations of their pages, when it is that site copying Wikipedia. -- VederJuda (talk) 00:42, 22 December 2007 (UTC)

Bot detects copies from one's own sandbox

I posted a new article copied from my own sandbox and the bot detected it. Perhaps the bot should ignore user space when comparing to main space articles?

--DP67 (talk/contribs) 12:54, 23 December 2007 (UTC)
Yes, it should. That's a known bug I hope to fix over the holidays. — Coren (talk) 16:14, 23 December 2007 (UTC)
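
As a rough illustration of the suggestion above, a candidate source URL could be tested for being a Wikipedia user-space page before it is counted as a match. This is only a sketch under stated assumptions: is_wikipedia_user_page is a hypothetical helper, it assumes the usual /wiki/Title URL layout, and it does not reflect how the bug was actually fixed.

```python
from urllib.parse import unquote, urlparse

# Namespaces treated as user space in this sketch (an assumption, not the bot's list).
USER_NAMESPACES = {"user", "user talk"}


def is_wikipedia_user_page(url):
    """Return True for URLs like http://en.wikipedia.org/wiki/User:Name/Sandbox."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if host != "wikipedia.org" and not host.endswith(".wikipedia.org"):
        return False
    path = unquote(parts.path)
    if not path.startswith("/wiki/"):
        return False
    title = path[len("/wiki/"):]
    if ":" not in title:
        return False  # no namespace prefix, so a main-space title
    namespace = title.split(":", 1)[0].replace("_", " ").lower()
    return namespace in USER_NAMESPACES
```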


Zăpodia River

They are two different rivers. Afil (talk) 21:55, 23 December 2007 (UTC)

Change the text and the bot thinks it's copied from some other website and flags it again. User:Waqas.usman (Talk) 05:16, 27 December 2007 (UTC)

Actually, only new articles can be tagged by CSBot; you have recreated the article that was deleted for copyright violation, but the slight paraphrase was insufficient. Please remember you must write the article in your own words, not simply paraphrase a source. — Coren (talk) 05:42, 27 December 2007 (UTC)

There's no mistake. I've combined the two articles with additional data, and have redirected them to this new one. --DanTD (talk) 16:32, 27 December 2007 (UTC)