User talk:GreenC/WaybackMedic 2.5

Updated link won't work

PLEASE HAVE ANOTHER LOOK AT THIS ONE—I find that link is still dead. Cheers, Bjenks (talk) 15:19, 28 August 2020 (UTC)

Bjenks. The purpose of {{dead link}} is to flag a link that does not have a web archive URL. Once it has a web archive URL, the {{dead link}} template is removed. It is redundant to have both {{dead link}} and an archive URL. -- GreenC 15:23, 28 August 2020 (UTC)

Look at this

Is this still a problem ( https://en.wikipedia.org/w/index.php?title=Bookmarklet&type=revision&diff=904183172&oldid=901585122 https://en.wikipedia.org/w/index.php?title=Date_format_by_country&diff=next&oldid=805456307 )? I have done a bunch of nobots removal (for dead bots, and issues that are now fixed), and this one stood out. AManWithNoPlan (talk) 20:28, 15 May 2021 (UTC)

Great, AManWithNoPlan, glad someone is checking these!

In Bookmarklet, javascript:location.href='https://web.archive.org/save/'+document.location.href; is causing trouble: the bot tries to convert the /save/ URL to a proper archive URL, which breaks with +document treated as a "path". Given the nature of the article I decided to bypass it entirely. There's no other way, such as {{cbignore}}. I don't keep a blacklist (skiplist) for my own bot, though that might be more polite than nobots.

For the other I can't tell at the moment why it is nobots, but I am seeing a lot of problems in the citations that Medic would normally fix. It's possible there are severe timeout delays at the remote sites, exceeding the ~4 hour limit to complete. Trying in expedited debug mode. -- GreenC 21:37, 15 May 2021 (UTC)

I guess just check https://en.wikipedia.org/wiki/User:AnomieBOT/Nobots_Hall_of_Shame from time to time. I found lots of UNreported bugs in a bot I work on that way. AManWithNoPlan (talk) 13:34, 16 May 2021 (UTC)
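For readers following the Bookmarklet issue above, here is a minimal sketch of how a tool might tell a Wayback /save/ request apart from a real snapshot URL before trying to convert it. This is not WaybackMedic's actual code; the function name and both regular expressions are hypothetical and only illustrate the idea.

```python
import re

# Assumption: a snapshot URL carries a 14-digit timestamp, optionally followed
# by a short modifier such as "id_" or "if_".
SNAPSHOT_RE = re.compile(r"^https?://web\.archive\.org/web/\d{14}[a-z_]*/.+$")

# Assumption: anything under /save/ is a capture request, not an archive copy.
SAVE_RE = re.compile(r"^https?://web\.archive\.org/save/")


def classify_wayback_url(url: str) -> str:
    """Return 'save', 'snapshot', or 'other' for a candidate URL (hypothetical helper)."""
    if SAVE_RE.match(url):
        return "save"
    if SNAPSHOT_RE.match(url):
        return "snapshot"
    return "other"


# The URL fragment embedded in the bookmarklet from the Bookmarklet article.
bookmarklet_fragment = "https://web.archive.org/save/'+document.location.href;"
print(classify_wayback_url(bookmarklet_fragment))
```

A caller could skip any URL classified as "save" instead of attempting a conversion, which would avoid needing nobots for this one case.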

Manual option

GreenC, is there an option or a Toolforge process to manually run WaybackMedic or a flag that can be placed within an article to invite the bot for a visit and be included in its next run? Thanks! — WILDSTAR^talk 16:19, 7 November 2021 (UTC)

@WildStar: There is not, sorry. If you want to save dead links you can run IABot on the page (history tab->fix dead links). If it is something you think WaybackMedic is best for, let me know the page name and I'll run it. -- GreenC 17:08, 7 November 2021 (UTC)

adding in protocol of pages

could this get moved to the 'cosmetic' section? https://web.archive.org/web/20110205011118/http://users.utu.fi/mjranta/reprints/1.%20Rantala1999.pdf and https://web.archive.org/web/20110205011118/users.utu.fi/mjranta/reprints/1.%20Rantala1999.pdf go to the same place. Arlo James Barnes 08:20, 19 May 2022 (UTC)

Hi User:Arlo James Barnes. I believe it exists, if this is what you mean: Special:Diff/1088342443/1088614489 -- GreenC 14:21, 19 May 2022 (UTC)
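To illustrate the edit discussed in this thread, the sketch below adds a missing protocol to the target part of a Wayback URL. It is only an illustration under stated assumptions, not the bot's implementation: the function name, the regular expression, and the choice of http as the default scheme are all hypothetical.

```python
import re

# Assumption: split a Wayback URL into its "web/<timestamp><modifier>/" prefix
# and the archived target that follows it.
WAYBACK_RE = re.compile(r"^(https?://web\.archive\.org/web/\d{1,14}[a-z_]*/)(.*)$")

# Matches a target that already starts with a scheme such as "http://".
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def add_scheme_to_wayback_target(url: str, default_scheme: str = "http") -> str:
    """Insert a scheme before the archived target if it lacks one (hypothetical helper)."""
    match = WAYBACK_RE.match(url)
    if not match:
        return url  # not a Wayback snapshot URL; leave it untouched
    prefix, target = match.groups()
    if SCHEME_RE.match(target):
        return url  # the target already has a protocol
    return f"{prefix}{default_scheme}://{target}"


short_form = "https://web.archive.org/web/20110205011118/users.utu.fi/mjranta/reprints/1.%20Rantala1999.pdf"
print(add_scheme_to_wayback_target(short_form))
```

Since both forms resolve to the same snapshot, a change like this only affects how the link looks in the wikitext, which is why it would count as cosmetic.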

https://en.wikipedia.org/w/index.php?title=Facebook&diff=1147940438&oldid=1147553700&variant=en

01760655558 119.30.39.122 (talk) 09:16, 5 April 2023 (UTC)

Source code for WaybackMedic 2.5

The GitHub repository only has source code for WaybackMedic versions 0, 1, 2, and 2.1. Where is the source code for WaybackMedic 2.5? Solomon Ucko (talk) 00:16, 22 September 2023 (UTC)

Curious about webcitation-to-archive.org conversion by bot

Regarding this diff, which is described in relevant part as "Rescued 1 archive link," I am curious why a live link to webcitation.org is considered to be in need of "rescue" and conversion to archive.org instead. Is there a WM policy I missed favoring the use of archive.org over other archivers? Is there some concern about the long-term viability or availability of webcitation.org I should know about, deprecating its use? Or maybe the original link was down when the bot checked it, though it was live when I made the edit and when I checked just now? Much obliged for any insight. —KGF0 ( T | C ) 19:23, 22 September 2023 (UTC)

WebCite was dead for nearly a year and a half with no indication it was ever coming back. There is also an RfC to deprecate it. -- GreenC 16:31, 23 September 2023 (UTC)

Soundtrack Geek

Hi @GreenC: I noticed that your bot was able to tag a website as "usurped". I was wondering if you could do the same for http://www.soundtrackgeek.com/, which formerly hosted film soundtrack reviews but is now a website for adult content (content advisory!). There aren't very many incoming links, but could you tag those as |url-status=usurped as well? Thanks! InfiniteNexus (talk) 21:08, 1 February 2024 (UTC)

User:InfiniteNexus: I added it to the queue: Special:Diff/1198014184/1202023308. It might take a few months, because I wait for domains to accumulate; processing them all at once is easier. I notice the link you gave shows a database error, but once usurped a site is at risk, so it will be good to do so. Thanks for the report. -- GreenC 21:59, 1 February 2024 (UTC)

Thanks. Interesting it now shows a database error; this wasn't the case a year ago. I don't know if the site will be back up. But old links, like http://www.soundtrackgeek.com/reviews/inception-soundtrack-review.php, still redirect to URLs with dirty words. InfiniteNexus (talk) 22:26, 1 February 2024 (UTC)

I'll make sure the archives are old since the newer archives appear infected. -- GreenC 01:46, 2 February 2024 (UTC)

Thanks. InfiniteNexus (talk) 18:18, 3 February 2024 (UTC)

IC Ronna lynn794 (talk) 19:05, 21 April 2024 (UTC)