Wikipedia talk:AutoEd/Archive 1

From Wikipedia, the free encyclopedia

Initial discussion

(copied from User talk:Drilnoth)

Hi, since we seem to be collaborating quite a bit on script writing, I was wondering if you might want to try to come up with a more systematic way to collaborate. For example, I noticed that you copied some stuff from my script, which is absolutely fine as I am very happy to see people reusing what I have written. However, when I find bugs in this code, I am not able to fix them in your version and I may not be aware of every place where this bug has propagated if other people are using it. One possible option would be to continue as we have been working, but split off some of our tools into one or a few separate files. For example, see User:Plastikspork/tools.js. This would allow us to include one another's tools, using an 'includescript' statement. The function names would be prefixed with either Drilnoth or Spork or Plastikspork or whatever to make it clear where the function lives. The names of the functions may be long, but descriptive. The other option would be to create a separate project space where we could both contribute these short common helper functions, but that might be more complicated. Of course, both options require a bit of trust, but so does copying and pasting one another's code. But this is why I put every script that I include on my watchlist. In either case, I feel small helper functions and short scripts with collections of related helper functions are extremely useful. The problem with code like 'Formatter' is primarily that I cannot include part of it without including everything. Let me know what you think or if you have any better ideas. I plan to continue to split my script into these smaller subscripts since there are several users of my scripts who have expressed an interest in using only part of its functionality. Thanks! Plastikspork (talk) 16:44, 27 April 2009 (UTC)

Another reason why this is a good idea. If you check no:Bruker:Helt/codefixer.js, you will see it has some of the bugs which you fixed in your version. This collaboration would allow Helt to use parts of your script and regionalize it for his/her own purposes. Plastikspork (talk) 17:54, 27 April 2009 (UTC)
Interesting ideas... so basically, you're suggesting that a single "main script" page contain the basic code needed to initiate the other scripts, and then the others can be imported separately onto that page, because then someone who wants to reuse the code with modifications could just copy the main script page and add/remove what they didn't want? And then some of the functions would be in my userspace, and some in yours? That would make sense to me. We'd need to work out some of the details (e.g., should it all be semi-automated? Use in-edit mode sidebar buttons for some things like your script does now [like the stuff like date reformatting and table conversion], and have things that don't need much human intervention work using a single button? I'd personally prefer the latter), but the overall concept could work pretty well. If you'd like, I can mock up a "main script" page at User:Drilnoth/codefixerworking.js and we can see how it goes... both scripts would need some reworking/reformatting to make this kind of a thing work, although if you didn't mind I could do most of that (for example, we'd want to unify whether we use "str" or "txt.value", and I could update pages in both userspaces so that we don't cross paths or anything). How does that all sound? –Drilnoth (T • C • L) 19:31, 27 April 2009 (UTC)
(copied from the section below where it was posted by accident) –Drilnoth (T • C • L) 20:15, 27 April 2009 (UTC)
Perhaps WP:FORMATTER and WP:ADVISOR could also just be merged in over time if we did this; I can edit the protected pages to make it work. –Drilnoth (T • C • L) 20:08, 27 April 2009 (UTC)
Well, Advisor would take some work but we can probably "steal" a lot of its functions. –Drilnoth (T • C • L) 20:08, 27 April 2009 (UTC)

Yes, I think you have the basic idea. In the first iteration you could start by doing the following:

  1. codefixer.js would include (1) the codeFixerMinor variable initialization, (2) one or more scriptinclude functions, (3) the codefixerstartinedit and codefixerplusstartinedit functions and everything below that point.
  2. tools.js or codefixertools.js or whatever you want to call it would include the codefixer, codefixerplus, and FirstToLower functions

As a second step, it would be great to see the large codefixer function broken down into subfunctions which do only a specific task

  1. codefixer_htmlchar2wiki could convert html characters to ascii
  2. codefixer_htmltable2wiki could convert html tables to wikitables
  3. codefixer_sectioncaps could convert capitalization of common section headings
  4. ... and so on ...

Of course, we will probably find that once this is performed that there are repeated subfunctions which can start to share. We might want to put these shared functions (once they have been reasonably debugged) in a common place outside of the user namespace which is then edit protected? Much in the same way that Formatter isn't in the user namespace and many people can reuse these functions by simply including them in their scripts.

As a programming note, it would be great if these smaller subfunctions could take strings as input and return strings as output, so that they can be used in a more general context. This is why I have been using str for subfunctions which do not load the editform and txt.value for the parent functions which do load the editform. I actually like the way that WP:FORMATTER is structured, if only it were (a) split into two parts so the subfunctions could be reused and (b) written with prefixed function names to prevent name collisions. Today, I have been splitting my code into smaller chunks with reusable subfunctions. I currently have User:Plastikspork/tools.js and User:Plastikspork/datetools.js. My plan is to split tools into linktools and perhaps html2wikitools. I believe we will find quite a bit of overlap at that point and we can look into merging my html2wikitools. Or if we can come up with a plan sooner, I can just remove the html2wikitools part of my code and merge it into a shared tool space. Plastikspork (talk) 23:45, 27 April 2009 (UTC)
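A minimal sketch of the string-in, string-out style described above. The function names (sketch_htmlchar2wiki, sketch_sectioncaps, sketch_runall) and the specific replacements are illustrative assumptions, not the actual codefixer code.

```javascript
// Hypothetical helpers: each takes a string and returns a string,
// so none of them needs to know about the edit form.
function sketch_htmlchar2wiki(str) {
    // Replace a couple of named HTML entities with the literal characters
    str = str.replace(/&middot;/g, "\u00B7");
    str = str.replace(/&deg;/g, "\u00B0");
    return str;
}

function sketch_sectioncaps(str) {
    // Fix the capitalization of one common section heading
    return str.replace(/^(=+)\s*External Links\s*\1$/gm, "$1 External links $1");
}

// A parent function simply chains the small helpers
function sketch_runall(str) {
    str = sketch_htmlchar2wiki(str);
    str = sketch_sectioncaps(str);
    return str;
}

var sketchInput = "20&deg;C\n== External Links ==";
var sketchOutput = sketch_runall(sketchInput);
```

Because each helper only sees a string, any of them could be included on its own or applied to part of a page.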

(not indented due to # of paragraphs) Great! I've been working today on creating a "core" framework at User:Drilnoth/EasyEd/core.js, although moving that into Wikipedia-space and protecting it would probably make more sense. My idea was that it could be structured as follows:

  1. EasyEd/core.js contains the "main" functions needed to make the scripts function, looking pretty much like what it is now.
  2. Each different type of fixer function would be on its own page, so that each one can be imported separately. E.g., Unicodifying would be in one, link simplification in another, HTML to wikitext in another, etc.
  3. EasyEd/main.js would import core.js and the "basic" selection of helper functions, customized appropriately, so that that page could be imported without further configuration.
  4. Because of how it is set up, users could create their own functions and import them into the script for use. For example Dinoguy's fullwidth replacer could be an additional helper script which could be added using a basic customization system.

The setup which I have in mind would allow for every function to be set as either automatic (done simultaneously with other edits by clicking a tab) or selectively activated (like how the various Sprk functions are now). It would be easy to maintain and customize, and we'd both be able to contribute "modules" from our own userspaces, as could others, to the "main" selection. Users who want to customize what their script replaces could configure it to do so, using a basic help guide that describes the script's layout and how it can be altered (maybe including a basic RegExp introduction for new scripters?).

I was trying to come up with a better name for it than "CodeFixer", since that's already kind of inaccurate and with this level of customizability it most certainly would be wrong, and "EasyEd" was the best that I could come up with... it's an editing tool that makes simple or common fixes easy to do. If you have a better idea for a name, I'd be happy to hear it.

I don't quite understand why "str" or something similar would make more sense than just using txt.value... what difference does it really make? I'm open to having it either way, I'm just curious.

I'm thinking that this could then be a real "community" script... the two of us might be the primary "maintainers", but everyone will be able to create new modules for it.

Does that make sense? Or am I going in completely the wrong direction? I can keep working on this... I think that working on the "core" or "main" functions at the same time could get confusing, so if it makes sense to you I could get those basic two set up and then we can work on converting our scripts to use the proper format/design. I'd just need to know whether str or txt.value should be used and I'd be ready to really start, if you think that this is a good idea. I think that it's pretty close to yours, except that everything would be set up in one step. –Drilnoth (T • C • L) 00:46, 28 April 2009 (UTC)

All this sounds very good and I think we are converging on a plan. The reason why there are so many 'Spork buttons' is actually due to requests. My script originally only had one button, but then someone said "Hey, could you split your one button into two because I don't want to remove whitespace, just fix links", then someone else said, "Hey, could you split ..." and the next thing you know, I have way too many buttons. This is one of the things that got me thinking about this idea of making smaller modules and allowing people to pick and choose. I think that having a community script and a community set of "smaller tools" is a great idea as soon as we have things reasonably debugged. Having the stable core set of tools admin protected would also be for the best for security.
The txt.value vs. str question is not so much a debate about the name, but a question about how to pass the string to a function and how best to access it repeatedly. There is a fairly informative article about forms here. When we write txt = document.editform.wpTextbox1;, the txt variable holds a reference to the textbox object from the first textbox on the page. When we write str = txt.value we now copy the contents of the textbox into a string. Now, the question is do we want these subfunctions to act on strings or act on textboxes? Certainly repeatedly writing 'str = str.replace(...)' is shorter than 'txt.value = txt.value.replace(...)'. The question is what does javascript do when you execute 'txt.value = txt.value.replace' vs 'str = str.replace' and is one faster than the other? I actually don't know and it probably doesn't make much of a difference. Note that 'str' and 'txt' are just names and I just as easily could have said 'monkey = document.editform.wpTextbox1', but the convention in most of these scripts seems to be to use the name 'str' for strings and 'txt' for textboxes.
My only request would be that when we call these string processing functions we actually pass them an argument rather than having every single one load the contents of the textbox. One reason for this is to try to keep the specific name for the textbox we are operating on (i.e., wpTextbox1) isolated to the same location as all the other Wikipedia-specific functions (e.g., addOnloadHook, ...).
So we can either call these functions like txt.value = sample_function(txt.value), which is how Formatter does it, or like sample_function(txt), which should work and wouldn't require a return statement at the end of sample_function. The only unmentioned advantage of using txt.value = sample_function(txt.value) would be that these could be thought of as string processing functions which could be used on subsections of a textbox form, making them more general since they don't need to know about the fact that this string came from a textbox. Computer Scientists usually try to have functions operate on a 'need to know basis' for security reasons, although that's probably not an issue here, but just a general programming style. Plastikspork (talk) 02:09, 28 April 2009 (UTC)
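A minimal sketch of the two calling conventions compared above. The plain object standing in for document.editform.wpTextbox1, the name sample_function_inplace, and the trailing-whitespace replacement are assumptions made so the example runs outside a browser.

```javascript
// Stand-in for document.editform.wpTextbox1 (assumption: only .value matters here)
var txt = { value: "first line   \nsecond line\t" };

// Style 1: string in, string out; the caller assigns the result
function sample_function(str) {
    return str.replace(/[ \t]+$/gm, "");
}

// Style 2: takes the textbox and modifies it in place, so no return is needed
function sample_function_inplace(box) {
    box.value = box.value.replace(/[ \t]+$/gm, "");
}

// Style 1 in use
txt.value = sample_function(txt.value);

// Style 2 in use
var otherBox = { value: "x  " };
sample_function_inplace(otherBox);

// Only style 1 can be reused on an arbitrary piece of text
var trimmedPiece = sample_function("just a substring   ");
```

Style 1 keeps the textbox name in one place: the helper never needs to know where the string came from.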
As for "Dinoguy's fullwidth replacer", yes that would be another script which could be split into a core plus a main script and could be added to the toolset if he/she is interested. Plastikspork (talk) 02:13, 28 April 2009 (UTC)
And, as for a name for the set of tools, I don't have a strong preference, but I agree that CodeFixer is probably not the best match. I was thinking that since it does automate things, it might be useful to put the word 'Auto' in the title. Something like 'AutoWikify' or 'AutoEdit' or 'AutoWikiClean' or 'AutoClean' or whatever. Just a thought. Thanks! Plastikspork (talk) 02:13, 28 April 2009 (UTC)
Okay; I'll just trust you on that whole thing (I'm a bit too tired right now to really be able to process all of the JavaScript stuff) and start working on the script tomorrow (I'll also move the script to project space so that you can work on it for now, too, although I'll need to protect it once it's ready to "go live" for security reasons). For the name, "WikiCleaner" was my first thought, but that's already taken. :( "AutoWikify" would work with the current selection of fixes, but modules could also be made to fix things like typos (as configuration, of course, not core) and then that wouldn't make sense. "AutoEdit" sounds good to me (probably shortened to "auto ed" for the tab and sidebar names).
I see your point about the buttons and all; that's why I think that this could be a really nice script because of customizability. Theoretically, any module could be used as either a sidebar button or a simultaneous fix with other modules, so that each user can tweak it to suit their personal preferences. –Drilnoth (T • C • L) 02:24, 28 April 2009 (UTC)

Program structure

Excellent. We now have a great start for the project. It really looks very nice so far. I have been a bit busy today, but I have started to look over the project and I have some initial thoughts:

  • If an average user wants to use the script, I am assuming they just include 'basic.js'?
  • It looks like there is a dependency loop between 'core.js' and 'basic.js', but I think I understand the logic. I believe you are planning to have an advanced version as well, and the 'advanced.js' or 'plus.js' would have more functions through an alternative redefinition of 'autoEdFunctions'.
  • I believe it is possible to do this without having this strict circular dependency, but I will have to think about it. I have an idea, but I will have to mock it up to show you what I am thinking.

Question: How should I go about helping edit the code? Is it possible to get an individual exception to the protection, or should I apply for admin? Thanks! Plastikspork (talk) 02:28, 29 April 2009 (UTC)

You are correct that, for the script's most basic use, all that needs to be imported is basic.js. I hope to start writing up some real documentation over the next few days that will clarify that kind of thing.
Both the "core.js" page and another function or set of functions will need to be imported so that the script works; "basic.js" is a predefined set so that the average user won't need to know anything about the coding. Having it set up this way should, however, allow someone who is interested in doing so to fully customize what the script does by importing core.js to a page and then defining autoEdFunctions on their own. My hope is that this can be a really customizable tool useful to build other scripts and to just allow personal customization without too much difficulty. The script will fail to work if the autoEdFunctions function isn't defined. The presets (basic.js, advanced.js, etc.) will allow users to import both the main script and a predefined set of modules without much difficulty. If someone wants to create their own module or remove one of the modules in basic.js, they can just copy the code and modify autoEdFunctions in their personal monobook.js page.
This kind of set up will also make creating other, similar scripts easier; all you'd need to do is import core.js and the individual modules, define autoEdFunctions, and add any other code which is needed. Things like Formatter could potentially be converted to use this script once some more custom variables are defined. At least, that's my feeling on its use.
Now I'll admit that I don't know anywhere near enough JavaScript to understand the circular dependency that you mention or how to remove it; I pretty much just learned the basics of the language from a book and have been experimenting and looking at examples to learn more. But the way I see it, the script works as-is and will be customizable without problems. Users who import the presets will never need to even worry about the circular dependency, and the documentation on customizing the script should make the dependency clear and straightforward for users who are customizing the script.
Does that all make sense to you? Does it sound reasonable?
Anyway, to answer your final question, there are a few things that we can do. First is that you can post any code changes here and I'll add them as soon as I see it here, although I understand that that could be a pain. You could try creating a sandbox version of the script in your userspace, and work on that (I don't plan to alter the main script much anytime soon; I hope to get advanced.js and the documentation created before playing more with user-defined variables). Then you could just let me know when you wanted it copied back over.
Now I haven't looked over many of your contribs, but based on my experience with you an RFA might also be a good idea. My experience with you would indicate that you are an understanding, logical editor who can be trusted with the tools. I'd need to look over your contributions some more to make sure, but I'd certainly consider nominating you based on what I've seen if you're interested. –Drilnoth (T • C • L) 15:44, 29 April 2009 (UTC)
I do like the idea of having a stable branch of the code that can be trusted. How about if we create a test or unstable branch of the project as a subdirectory. It would contain a copy of everything that you have right now, but would not be protected. Once the project has matured, we could use that area to discuss coding technicalities and leave this talk area open for bug reporting and feature requests? This is probably a good idea in general regardless of whether or not I become an admin. Or, if you think that's overkill, I can create a copy of the entire AutoEd tree in my own user space and edit there. Thanks! Plastikspork (talk) 16:39, 29 April 2009 (UTC)
By circular dependency, I mean that someone cannot include 'core.js' without including 'basic.js'. It's absolutely reasonable to have the dependency in the other direction. One way to make it so that 'core.js' could be reused without including 'basic.js' would be to have a function pointer to 'autoEdFunctions' passed to 'AutoEdFromEdit' and 'AutoEdFromView'. I can mock something up to show you what I mean. Plastikspork (talk) 16:43, 29 April 2009 (UTC)
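A minimal sketch of the function-pointer idea, assuming a hypothetical sketchAutoEdExecute that receives the fix function as an argument instead of looking up a global. The names and the string-based signature are assumptions, not the code in core.js.

```javascript
// Hypothetical core: it depends only on what the caller passes in,
// so it can be loaded without any particular preset.
function sketchAutoEdExecute(textbox, fixFunctions) {
    if (typeof fixFunctions !== "function") {
        throw new TypeError("sketchAutoEdExecute: no fix function supplied");
    }
    textbox.value = fixFunctions(textbox.value);
    return textbox;
}

// Hypothetical preset: depends on the core, never the other way around
function sketchBasicFunctions(str) {
    return str.replace(/[ \t]+$/gm, "");
}

// Stand-in for the edit textbox so the sketch runs outside a browser
var sketchBox = { value: "Some text   \nMore text" };
sketchAutoEdExecute(sketchBox, sketchBasicFunctions);
```

With this shape, a different preset is just a different function passed to the same core.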
Having a separate, unprotected tree would probably make sense, using names like Wikipedia:AutoEd/Working/core.js. I'm sorry that I had to protect them, but it's really just a security thing. I trust you, but not necessarily these guys. :)
I think that you should be able to use core.js without basic.js if you define the autoEdFunctions function and import the individual modules separately. That's how it can be customized: Rather than importing a preset, you can just import core.js and the modules that you want and define the function in your own monobook. I'm not opposed to having a better way to do it, but it should be simple so that you don't need to know much of any JavaScript to customize your personal use of AutoEd. –Drilnoth (T • C • L) 21:09, 29 April 2009 (UTC)

Changes to core.js

I decided to create my own user-fied development version (User:Plastikspork/AutoEd). It might make sense to have a development branch, but this should suffice for now as it will allow me to do testing without breaking the main code. As a result of some testing, I have my first request for changes to the code. I have implemented these changes in User:Plastikspork/AutoEd/core.js, and you should be able to just replace WP:AutoEd/core.js with the contents of that file. If you check the edit history, you can see the sequence of changes, with a few mistakes and second thoughts about design. However, it might be better to just summarize the basic changes:

  1. Created a new function, 'autoEdEditSummary', which adds a tag to the edit summary.
  2. Lowercased the function names to match the 'autoEdfoo' convention, and indented the code
  3. Prefixed the queryString function name with 'autoEd' to make it 'autoEdQueryString'
  4. Merged 'autoEdFromEdit' and 'autoEdFromView' into a single 'autoEdExecute' function
  5. Enlarged the set of configuration options which will allow significant customization (for example for non-English language users), but with sensible default values in case these options are not set.
    1. autoEdMinor = true (we already had this one)
    2. autoEdClick = true (click on 'Show changes' after edit)
    3. autoEdTag = "Cleaned up using AutoEd" (default edit summary tag)
    4. autoEdLinkHover = "Run AutoEd" (text when mouse hovers over the link)
    5. autoEdLinkName = "auto ed" (name to give to the link)
    6. autoEdLinkLocation = "p-cactions" (where to place the link, cactions or toolbox or ? )
  6. Added a default 'null' autoEdFunctions in case someone includes 'core.js' without 'basic.js', fixes the circular dependency.
    • The default "null 'autoEdFunctions'" issues an alert to let the user know that he/she forgot to define it.
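A minimal sketch of how options with sensible defaults and a fallback 'autoEdFunctions' might look. The default values are the ones listed above and the alert text is the one quoted later on this page; the helper names (sketchResolveConfig, sketchDefaultFunctions) and the settings-object approach are assumptions, since core.js itself reads global variables.

```javascript
// Defaults as listed above
var SKETCH_DEFAULTS = {
    autoEdMinor: true,
    autoEdClick: true,
    autoEdTag: "Cleaned up using AutoEd",
    autoEdLinkHover: "Run AutoEd",
    autoEdLinkName: "auto ed",
    autoEdLinkLocation: "p-cactions"
};

// Hypothetical helper: use the user's value when one is set, else the default
function sketchResolveConfig(userSettings) {
    var resolved = {};
    Object.keys(SKETCH_DEFAULTS).forEach(function (key) {
        var isSet = !!userSettings &&
            Object.prototype.hasOwnProperty.call(userSettings, key);
        resolved[key] = isSet ? userSettings[key] : SKETCH_DEFAULTS[key];
    });
    return resolved;
}

// Hypothetical fallback used when no preset defined the real function.
// "notify" would be window.alert in a browser.
function sketchDefaultFunctions(notify) {
    notify("AutoEd/core.js: autoEdFunctions is undefined");
}

var sketchConfig = sketchResolveConfig({ autoEdMinor: false, autoEdLinkName: "auto red" });
var sketchMessages = [];
sketchDefaultFunctions(function (msg) { sketchMessages.push(msg); });
```

Checking whether a key is set, rather than whether it is truthy, lets a user turn an option off with false.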

Let me know if you have any questions or suggestions. Thanks! Plastikspork (talk) 20:42, 29 April 2009 (UTC)

Awesome work! I'll add this shortly. The extra variables are really good; I was planning on making them at some point, but I hadn't gotten around to it. Some more could be added later, but that seems to be the most important set. I can write up documentation over the next few days; could you maybe work on making sure that your Sprk script is compatible? –Drilnoth (T • C • L) 21:14, 29 April 2009 (UTC)

htmltowikitext.js

I made some updates to htmltowikitext.js in my version (User:Plastikspork/AutoEd).

  1. Improved the <em>, <i>, <b>, <strong> conversion to allow it to deal with singly nested tags (e.g. <b>bold <i>italics</i></b> or with the order reversed).
  2. Improved the <br> match so that it may match <br> tags with newlines inside the tag.
  3. Simplified the <references> tag match logic and expanded it to include repeated <references> tags
  4. Added a loop to remove newlines from inside heading tag spans which increases the likelihood of a match
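A minimal sketch of how nested bold/italic tags and <br> tags containing newlines could be matched. These regular expressions and function names are assumptions for illustration, not the contents of htmltowikitext.js.

```javascript
// Convert attribute-free <b>/<strong> and <i>/<em> pairs, innermost first.
// The body may not contain "<", so an outer pair only matches once the
// inner pair has already been converted on an earlier pass.
function sketchInlineTagsToWiki(str) {
    var previous;
    do {
        previous = str;
        str = str.replace(/<(i|em)>([^<]+)<\/\1>/gi, "''$2''");
        str = str.replace(/<(b|strong)>([^<]+)<\/\1>/gi, "'''$2'''");
    } while (str !== previous);
    return str;
}

// \s also matches newlines, so a tag split across lines is still caught
function sketchNormalizeBr(str) {
    return str.replace(/<\s*br\s*\/?\s*>/gi, "<br />");
}

var sketchNested = sketchInlineTagsToWiki("<b>bold <i>italics</i></b>");
var sketchReversed = sketchInlineTagsToWiki("<i>it <b>bold</b></i>");
var sketchBr = sketchNormalizeBr("a<br\n/>b");
```

Tags that carry attributes, such as <b class="x">, do not match and are left alone.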

I tested it on a sandbox page and it appears to be working correctly.

It's actually interesting to see what WP does if you give it incorrectly formatted HTML tags. For example, if you try to close an <h3> tag with a </h5> tag, it appears that WP corrects this problem and closes the tag for you. In addition, if you try to nest an <h1> tag inside an <h3>, like say <h3>This is a <h1>big</h1> heading</h3>, it appears WP tries to split this into <h3>This is a </h3><h1>big</h1> heading. I was tempted to add some of this as well, but I will wait until we actually see it in practice (WP:KISS). Plastikspork (talk) 05:12, 2 May 2009 (UTC)

 Done; looks good, thanks. Also, I have a few requests and was wondering if you could code them (if they're possible), because I'm not experienced enough with RegExp yet.
1) Remove all bold text from section headings, rather than just from the beginning and end.
2) Replace HTML list elements with wikitext. This seems a bit trickier than some things because there are both ordered and unordered lists which would need to be converted differently.
3) Tweak the script so that it doesn't make any edits to external links.
Thanks. –Drilnoth (T • C • L) 14:17, 2 May 2009 (UTC)
In the case of HTML lists, it's important to note that some of them are explicitly used because they allow you to add attributes and styles to the list (most often by specifying start="#" for ordered lists); these lists should not be touched by the script. ダイノ ガイ ?!」(Dinoguy1000) 16:14, 2 May 2009 (UTC)
Good point; thanks for mentioning that. –Drilnoth (T • C • L) 16:21, 2 May 2009 (UTC)
  1. Removing bold from section headings is easy.
  2. List elements can be done. You are correct that the <li> tag is tricky since it can belong to either a <ol> or a <ul>. The answer to this problem is to match the <li> and <ol> (or <ul>) at the same time. If you can show me an example of what is safe and what isn't safe, I can limit how aggressive to make the match. Is it enough to just avoid any <ul>, <ol>, and <li> with class/style/other attributes? We already do this with <i> and <b>.
  3. Is there a particular part of the script that is causing problems with external links? I haven't had this problem with any parts of my script, but my script doesn't perform the same transformations.

I have some more improvements to some of the current functionality, but I am still merging and testing. Plastikspork (talk) 05:48, 3 May 2009 (UTC)

Thanks. For the lists, I'd think that most list conversions are safe unless one of the tags has something like CSS styles in it, as with what Dinoguy1000 mentioned. Also when working with lists, you'll probably want to match and fix them even if they don't have end tags, if that's possible... that's pretty common. For external links, sometimes unicodifying causes a problem (I removed the really common error-causer, &amp;, however), but there are also some websites which use ISBNs in the link name and sometimes AutoEd will try to "fix" them. That's not very common, though; just a thought. –Drilnoth (T • C • L) 15:01, 3 May 2009 (UTC)
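A minimal sketch of a conservative list conversion along the lines discussed above: it skips any list whose tags carry attributes, tolerates missing </li> end tags, and gives up entirely on nested lists. The function names and the exact rules are assumptions; items that span several lines are not handled.

```javascript
// Returns true if any <ul>/<ol> opens inside another one
function sketchHasNestedList(str) {
    var depth = 0;
    var tagRe = /<(\/?)(?:ul|ol)\b[^>]*>/gi;
    var m;
    while ((m = tagRe.exec(str)) !== null) {
        depth = m[1] ? Math.max(0, depth - 1) : depth + 1;
        if (depth > 1) { return true; }
    }
    return false;
}

function sketchListToWiki(str) {
    // Be conservative: nested lists are left for a human
    if (sketchHasNestedList(str)) { return str; }
    // Only attribute-free <ul> and <ol> match, so start="..." lists are skipped
    return str.replace(/<(ul|ol)>([\s\S]*?)<\/\1>/gi, function (whole, tag, body) {
        // Skip lists whose items carry attributes or styles
        if (/<li\s[^>]*>/i.test(body)) { return whole; }
        var chunks = body.split(/<li>/i);
        // Skip lists with no items or with stray text before the first item
        if (chunks.length < 2 || chunks[0].trim() !== "") { return whole; }
        var marker = tag.toLowerCase() === "ul" ? "*" : "#";
        return chunks.slice(1).map(function (item) {
            // </li> is optional, so strip it only if present
            return marker + " " + item.replace(/<\/li>/gi, "").trim();
        }).join("\n");
    });
}

var sketchUl = sketchListToWiki("<ul><li>one</li><li>two</ul>");
var sketchOl = sketchListToWiki("<ol>\n<li>a</li>\n<li>b</li>\n</ol>");
```

The <li> items are converted together with their enclosing list, which is how the marker (* or #) is chosen.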

Minor bug of AutoEd

The editor seems to replace &middot; (·) by the bolder dot •, which is a different symbol. (See this edit [1].) Whoever is involved in the program, it would be good to fix this. Thanks, Jakob.scholbach (talk) 10:48, 3 May 2009 (UTC)

Oops; my bad.  Fixed. Thanks for mentioning this; it seems to have been a copy-paste error while I was setting up the script. –Drilnoth (T • C • L) 14:47, 3 May 2009 (UTC)

Compatibility with wikEd?

Folks, do the new scripts play nice with wikEd or does wikEd have to be disabled before they will function? Thanks. – ukexpat (talk) 22:00, 4 May 2009 (UTC)

I haven't used wikEd, but I'll install it and give it a go later today. –Drilnoth (T • C • L) 22:24, 4 May 2009 (UTC)
Report: There is some incompatibility, but the script does still function to some extent. It seems that once you are in edit mode AutoEd won't work because wikEd has replaced the main text box area with its own. However, if you start AutoEd from outside of edit mode, it works because wikEd then hasn't been fully loaded. I only tested it a few times and I also tried an easy fix, but it didn't work. I'll keep this on my mental to-do list and will try to fix it in the future; for now the script is still usable, it just can't be run while you're already making an edit. –Drilnoth (T • C • L) 22:38, 4 May 2009 (UTC)
Excellent, thanks for looking into it so quickly. – ukexpat (talk) 00:00, 5 May 2009 (UTC)
Happy to help! –Drilnoth (T • C • L) 01:26, 5 May 2009 (UTC)

Bug with categories

When I used AutoEd here, it changed [[Category:Saturday morning television| ]] to [[Category:Saturday morning television|Saturday morning television]]. Because Saturday morning cartoon is the main article in Category:Saturday morning television, it should not make that change. Powergate92Talk 22:17, 4 May 2009 (UTC)

That would be a bug copied from WP:FORMATTER. I'll look into fixing it. Thanks! –Drilnoth (T • C • L) 22:26, 4 May 2009 (UTC)
The error is most likely caused by Wikipedia:AutoEd/links.js, although I can't quite figure out the specific part which I think is the culprit. Plastikspork, could you maybe take a look at this? –Drilnoth (T • C • L) 22:58, 4 May 2009 (UTC)

You should just disable Wikipedia:AutoEd/links.js until we have a chance to audit the code. I see some other sketchy stuff in there as well and I don't trust it. Plastikspork (talk) 00:30, 5 May 2009 (UTC)

I haven't had time to do a full code audit of Wikipedia:AutoEd/links.js, but I did find the culprit, and it appears to be a false positive. Open an edit window for Saturday morning cartoon and scroll down to the bottom of the page. Now remove the space from [[Category:Saturday morning television| ]] to change it to [[Category:Saturday morning television|]]. Now click on "show changes". It will indicate that you have changed it to [[Category:Saturday morning television|Saturday morning television]], but this is actually not the case; it is a problem with "show changes". Now, what do you think we should do about it? I can make the code not remove this whitespace, but then again, it's not really a problem, is it? FYI, the line that's doing it is the first line after the end of the if statement in Wikipedia:AutoEd/links.js. It removes all trailing spaces from inside wikilinks, which is actually not safe in general. The code I use for these transformations is more complicated for this reason. Thanks for the report. Plastikspork (talk) 01:01, 5 May 2009 (UTC)
Weird. Well, I've commented out that one replacement so that it can be fixed. The rest of that part of the script seemed to work well when it was used in Formatter, so I've left the rest "active" for now, although I can comment them out if you think that that would be better. –Drilnoth (T • C • L) 01:31, 5 May 2009 (UTC)
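A minimal sketch of a more cautious trailing-space removal, assuming the goal is to trim spaces after a visible label while leaving a whitespace-only label (such as the category sort key above) untouched. The function name and regular expression are assumptions, not the line in links.js.

```javascript
// Trim trailing spaces or tabs inside a piped wikilink, but only when the
// label ends in a visible character. A label made up only of whitespace,
// as in [[Category:Foo| ]], does not match and is left alone.
function sketchTrimLinkSpaces(str) {
    return str.replace(
        /\[\[([^\[\]|\n]+)\|([^\[\]|\n]*?[^\s\[\]|])[ \t]+\]\]/g,
        "[[$1|$2]]"
    );
}

var sketchTrimmed = sketchTrimLinkSpaces("See [[Apple|apples  ]] here.");
var sketchCategory = sketchTrimLinkSpaces("[[Category:Saturday morning television| ]]");
```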
That sounds reasonable to me. When I have some spare time, I plan to do a fairly complete code audit of each function. Most of these codes have very few comments, which is good since they load faster, but not so good since it's hard for the novice to know what each line does. I try to keep at least minimal comments in what I write, but it might be an even better idea to create more verbose comments in a '/doc' subfile for each module. At least this would be useful for the more complicated functions. If only I had more free time. Plastikspork (talk) 01:45, 5 May 2009 (UTC)
Sounds like a plan. I hope to go through and add comments to a lot of the modules once I'm done writing the information on how to customize the script. –Drilnoth (T • C • L) 01:51, 5 May 2009 (UTC)

Another bug (presumably with the same module): [2]. –Drilnoth (T • C • L) 02:50, 5 May 2009 (UTC)

I figured out what happened. If you look up a couple lines, you will see that the bracket for the external link on the prior line was not closed. The script looked past the newline, and past the following section heading and tried to make the brackets match up by removing one from the link to boxing. This should probably not be a multiline match, if you consider that

[http://en.wikipedia.org wikipedia]

does not render the same as if it were on a single line, without the newline in the middle of the link. Clearly it would be great if it could intelligently detect the difference, but for now it's probably better to be conservative. If you change the line after 'repair bad external links' to the following, it will stop that particular bad edit from happening.
 str = str.replace(/\[?\[http:\/\/([^\]\n]*?)\]\]?/gi, "[http://$1]");
I should have some time in a couple days to look through the rest of that code and test it more completely. My initial reading of that module gives me the feeling that it's pretty bold with its transformations. Plastikspork (talk) 15:29, 5 May 2009 (UTC)
Sounds good;  Done. –Drilnoth (T • C • L) 16:25, 5 May 2009 (UTC)
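To illustrate the difference discussed in this thread, here is a minimal sketch comparing the single-line pattern quoted above with a hypothetical multiline variant. The multiline variant and both function names are assumptions made for illustration; the module's original line is not quoted in this discussion.

```javascript
// Single-line version quoted above: [^\]\n] stops the match at a newline.
function repairExternalLinks(str) {
  return str.replace(/\[?\[http:\/\/([^\]\n]*?)\]\]?/gi, "[http://$1]");
}

// Hypothetical multiline variant (an assumption, not the module's original line):
// without \n in the character class, the match can run across lines.
function repairExternalLinksMultiline(str) {
  return str.replace(/\[?\[http:\/\/([^\]]*?)\]\]?/gi, "[http://$1]");
}

// A doubled-bracket external link, which the repair is meant to fix.
var doubled = "[[http://example.org Example]]";
// An unclosed external link followed by a heading and a wikilink.
var unclosed = "[http://example.org Example\n== Heading ==\n[[Boxing]]";

var fixedDoubled = repairExternalLinks(doubled);
var safeResult = repairExternalLinks(unclosed);          // left untouched
var boldResult = repairExternalLinksMultiline(unclosed); // eats a bracket from [[Boxing]]
```

With the newline excluded, the unclosed link is simply left alone for a human to fix, which is the conservative behaviour argued for above.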

What am I doing wrong?

Hello again

I have tried to use the AutoEd in English and Norwegian, and in both languages I keep getting the 'AutoEd/core.js: autoEdFunctions is undefined'-alert. I've installed all the scripts and subscripts to my userspace in Norwegian, and added wikichecker.js and core.js to my monobook. I'm sorry if I'm overlooking something, but I would really appreciate some assistance here. -Helt (talk) 09:19, 6 May 2009 (UTC) —Preceding unsigned comment added by Helt (talk • contribs)

Did you figure out the problem? You might have to include the full url path when using it across WPs? Sorry I didn't see your message earlier. Plastikspork (talk) 16:38, 11 May 2009 (UTC)
See also User talk:Drilnoth#CodeFixer this discussion. Maybe there's a browser issue? –Drilnoth (T • C • L) 19:25, 11 May 2009 (UTC)

Extrabreaks

I simplified and expanded the extra breaks code to just three lines User:Plastikspork/AutoEd/extrabreaks.js

  • Line 1: Removes br-tags before ]] OR }} or |.
  • Line 2: Removes br-tags before list items (newline-asterisk)
  • Line 3: Removes br-tags before one or more newlines.
  • Notes: All three will remove whitespace around the br-tag when the br-tag is removed. All three use a fairly general match for br-tags to include possible whitespace/slashes/periods inside the br-tag. All three should preserve newlines, even those which could be safely removed. Line 1 preserves some whitespace before the ']]/}}/|', just in case it's there for indentation purposes.
  • Comment: This routine could be made more aggressive in its matches/replacements, however, it's probably better to try to keep all the newline-removal to a single sub-function if possible. The reason for this is that removal of newlines tends to generate larger diffs, which makes it harder to see what was changed. I usually run newline removal as a second pass, after I check to make sure the first pass didn't do anything problematic. Plastikspork (talk) 17:10, 11 May 2009 (UTC)
    •  Done; looks good. –Drilnoth (T • C • L) 19:27, 11 May 2009 (UTC)
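A rough sketch of what three such lines could look like, written from the description in the list above. This is not the contents of extrabreaks.js; the `BR` pattern, the function name and the exact whitespace handling are assumptions.

```javascript
// Assumed general br-tag pattern: allows whitespace, slashes and periods inside the tag.
var BR = "<[\\s\\/\\.]*br[\\s\\/\\.]*>";

function extraBreaksSketch(str) {
  // Line 1: br-tag before ]] or }} or |, keeping any whitespace that follows the tag
  str = str.replace(new RegExp("[ \\t]*" + BR + "(\\s*)(\\]\\]|\\}\\}|\\|)", "gi"), "$1$2");
  // Line 2: br-tag before a list item (newline-asterisk)
  str = str.replace(new RegExp("[ \\t]*" + BR + "[ \\t]*(\\n\\*)", "gi"), "$1");
  // Line 3: br-tag before one or more newlines; the newlines themselves are kept
  str = str.replace(new RegExp("[ \\t]*" + BR + "[ \\t]*(\\n+)", "gi"), "$1");
  return str;
}

var caption = extraBreaksSketch("[[File:x.png|thumb|Caption<br />]]");
var template = extraBreaksSketch("{{Infobox\n| name = A<br>\n}}");
var listItem = extraBreaksSketch("a <br>\n* b");
var paragraph = extraBreaksSketch("line one<br/>\n\nline two");
var inline = extraBreaksSketch("one<br>two"); // a br-tag in running text is not touched
```

Note that in this sketch a br-tag directly before any | is removed, so table markup is affected as well as template parameters.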

I have been using <br> within sortable tables so that the sort icon is always in the same place in the heading, centered on a line by itself and thus ALSO not needlessly widening the column. The script is removing them.--JimWae (talk) 03:54, 21 June 2009 (UTC)

I can see how this could be undesired. We could add in a special case to catch it if it is a frequent problem. Could you provide a link to an example page for testing purposes? Thanks! Plastikspork (talk) 12:51, 21 June 2009 (UTC)
This is interesting... it should be fairly easy to fix, but having a sample of where this happens would help. –Drilnoth (T • C • L) 14:41, 21 June 2009 (UTC)

Here is a sample. There is no big rush, and thanks for your attention --JimWae (talk) 18:23, 21 June 2009 (UTC)

Hmm... I see. Plastikspork, could you fix up that code so that it doesn't do this? I'm not good enough with RegEx to feel confident in this case. –Drilnoth (T • C • L) 18:27, 21 June 2009 (UTC)
But this is going to change its behavior only with "sortable"s, right? Locos epraix ~ Beastepraix 18:32, 21 June 2009 (UTC)
Theoretically that could be done, but I think that <br>s located before any | should probably be kept (it's actually prone to false positives). –Drilnoth (T • C • L) 19:22, 21 June 2009 (UTC)

Sortable tables are all I have seen that needs changing --JimWae (talk) 18:53, 21 June 2009 (UTC)

I should be able to make it leave the br tags in the table headers only for sortable tables, using a multiline match. I should have time to work on this fairly soon, although my current jetlag might think otherwise. Plastikspork (talk) 14:02, 22 June 2009 (UTC)

headlines

I have created an updated version of User:Plastikspork/AutoEd/headlines.js.

  • Task 1: Removes up to 10 bold mark-ups from inside a section headline
  • Task 2: Removes any trailing colon from inside a section headline
  • Task 3: Corrects caps for see also sections
  • Task 4: Corrects common synonyms for "see also", but only if "see also" does not already exist.
  • Task 5: Corrects common synonyms for other section names
Comment: It should preserve whitespace as it appears that is currently handled by another sub-function. Plastikspork (talk) 17:46, 11 May 2009 (UTC)
Awesome!  Done. –Drilnoth (T • C • L) 19:28, 11 May 2009 (UTC)
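For readers who want to see the shape of Tasks 1 to 3, here is a minimal sketch. It is not the contents of headlines.js; the function name and patterns are assumptions, and Tasks 4 and 5 (synonym correction) are left out.

```javascript
function headlinesSketch(str) {
  // Task 1: remove up to 10 bold mark-ups (''') from inside a section headline;
  // each pass removes at most one ''' per headline.
  for (var i = 0; i < 10; i++) {
    str = str.replace(/^(=+)(.*?)'''(.*?)(=+[ \t]*)$/gm, "$1$2$3$4");
  }
  // Task 2: remove a trailing colon from inside a section headline,
  // keeping the whitespace between the title and the closing = signs.
  str = str.replace(/^(=+)(.*?):([ \t]*)(=+[ \t]*)$/gm, "$1$2$3$4");
  // Task 3: correct caps for "see also" sections.
  str = str.replace(/^(=+[ \t]*)see also([ \t]*=+[ \t]*)$/gim, "$1See also$2");
  return str;
}

var heading = headlinesSketch("== '''See Also:''' ==");
var level3 = headlinesSketch("=== Notes: ===");
var bodyText = headlinesSketch("Text with '''bold''': kept"); // not a headline, so unchanged
```

Anchoring every pattern to a line that starts and ends with = signs is what keeps bold text and colons in ordinary prose out of reach.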

[[a]][[b]] into [[a]] [[b]]

I am copying the autoEd check scripts to Chinese Wikipedia. I noticed that it will convert [[a]][[b]] into [[a]] [[b]], but this is not necessary for Chinese because in Chinese we don't separate words by spaces. Can anyone help me disable this function? --Ben.MQ (talk) 16:41, 14 May 2009 (UTC)

Yes, this is one of the lines of the code which I thought was suspect. It's the first line of 'links.js' that's making this change. Let me know if you need more help fixing this problem on your end. Thanks for the report! Plastikspork (talk) 16:46, 14 May 2009 (UTC)
Also, just to let you know, we do hope to create a more cross-language compatible version, with better documentation on how it could be used across languages without needing to copy the entire code each time that it is updated. –Drilnoth (T • C • L) 16:59, 14 May 2009 (UTC)
Looks pretty good. Thanks a lot for your help.--Ben.MQ (talk) 03:38, 15 May 2009 (UTC)

Changes to interwikis

Hello!, as a wikichecker I usually use this script and I have noted that it damages some Japanese interwikis (as you can see here), I have had to fix the interwiki by myself, is someone else having this trouble? Locos ~ epraix Beaste~praix 03:02, 16 May 2009 (UTC)

That is the FullWidthReplacer module. I'll remove it from AutoEd's complete version because it does cause more false positives than other modules; it can still be used by manual customization by following the instructions at Wikipedia:AutoEd/Customization. Thanks for letting us know! –Drilnoth (T • C • L) 03:06, 16 May 2009 (UTC)
Perhaps wikilinks could be excluded. Locos ~ epraix Beaste~praix 03:08, 16 May 2009 (UTC)
That sounds like a completely reasonable idea. It will require a rewrite of that module as it currently doesn't check the context of what it is modifying. I could write a smarter version if you think it would be useful. Plastikspork (talk) 03:14, 16 May 2009 (UTC)
If I might skip back to the original topic, I think it's probably worth noting here that I was the original author of FullwidthReplacer (my version can still be found at User:Dinoguy1000/scripts/fullwidth2ascii.js), and I never intended it to be a general cleanup tool - it was, in fact, created with the very specific purpose of helping with manga chapter lists, and I actually never once thought about potential effects it might have on wikilinks, interwiki links, and URLs. Drilnoth was a bit persistent in wanting to include it with his CodeFixer tool (and later AutoEd), so I eventually relented. =) ダイノ ガイ ?!」(Dinoguy1000) 05:02, 16 May 2009 (UTC)
Yah, sorry about that. Having it as an AutoEd module is pretty useful IMO, but I now see why it shouldn't be included by default. My apologies. –Drilnoth (T • C • L) 10:44, 16 May 2009 (UTC)
No worries. ;) ダイノ ガイ ?!」(Dinoguy1000) 17:25, 16 May 2009 (UTC)

<br> vs <br />

Wow, you are fast, anyway thanks for the script. I also have another suggestion: couldn't it replace every <br> it finds with a <br />? It seems safer for an XHTML website such as Wikipedia. Locos ~ epraix Beaste~praix 03:16, 16 May 2009 (UTC) PS: I know of the existence of HTML Tidy but I still consider this change useful.
There was some discussion of this before ([3]). The problem is that there is some disagreement on which form should be used as wikipedia isn't really XHTML, nor is it HTML. The decision was to convert to either <br> or <br />, depending on if there is a slash in the tag to start with. I personally like <br> as it shorter and backwards compatible, but others like <br />. Mediawiki converts these to whatever form it thinks is best. Plastikspork (talk) 03:23, 16 May 2009 (UTC)
Yes, and in the [[Wikipedia:Village pump (technical)/Archive 60#Changing <br> to <br />|village pump]] too, but should we let the servers do this change? As you probably know, they have no superpowers, and perhaps this minor change could take some processing load off them. Locos ~ epraix Beaste~praix 03:43, 16 May 2009 (UTC)
We don't need to worry about performance. –Drilnoth (T • C • L) 10:46, 16 May 2009 (UTC)

I would suggest to add a regular expression that detects a common error found in the checker: <br>> and <<br>, the script doesn't detect them and I have to fix them manually. Locos ~ epraix Beaste~praix 17:41, 17 May 2009 (UTC)

 Done. I think... –Drilnoth (T • C • L) 20:00, 17 May 2009 (UTC) Seems I thought wrong; still not good enough with RegExp. Plastikspork! Help! :) This seems like it should be pretty simple and I tried a couple of different things, but with no success. –Drilnoth (T • C • L) 20:05, 17 May 2009 (UTC)
Well, we will have to wait for him ;). Locos ~ epraix Beaste~praix 20:09, 17 May 2009 (UTC)
I checked your attempted changes, and they both looked quite reasonable. What was the problem? Plastikspork (talk) 22:15, 17 May 2009 (UTC)
Perhaps to make things easier to read, how about adding a new line for this new rule. For example, the following line could be added after the last 'br' line, but just before the 'hr' line:
str = str.replace(/<?(<br[\s\/]*>)>?/gim, '$1');
Should it also allow for spaces between the brackets? Plastikspork (talk) 22:25, 17 May 2009 (UTC)
Perhaps this is even better:
str = str.replace(/<[\s]*(<br[\s\/]*>)/gim, '$1');
str = str.replace(/(<br[\s\/]*>)[\s]*>/gim, '$1');
As it allows for spaces and only matches the ones which will be changed, with no "empty changes". Plastikspork (talk) 22:28, 17 May 2009 (UTC)
In response to your question: It just didn't seem to work properly in a userpage test (not saved, just tested)... it replaced <br /> with <br />, which doesn't really help (I did purge my cache). I'll update it with your code. Thanks! –Drilnoth (T • C • L) 22:30, 17 May 2009 (UTC)
Interesting, it must be that the match wasn't "greedy" with the question mark. If you had used a 'plus' sign instead, it probably would have worked.
This brings up an interesting question, which is are these changes completely safe? Is there ever a context in which a 'br' tag would be followed by a > or preceded by a < sign? It seems like this could legitimately happen? Plastikspork (talk) 22:33, 17 May 2009 (UTC)
I can't think of any. There could be </span><br>, but that wouldn't match. You'll never see anything like </span<br>. The only possible instance that I can think of where this change could be problematic is if the < or > is being used as an actual greater-than or less-than sign, but that should be rare enough right next to a br that it shouldn't matter enough to worry about, IMO... especially since these should all be getting human-checked. –Drilnoth (T • C • L) 22:56, 17 May 2009 (UTC)
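As a quick check of the two-line version suggested above, the following sketch wraps those two replacements in a function (the name `fixBrBrackets` is made up here) and runs them on the cases mentioned in this thread.

```javascript
function fixBrBrackets(str) {
  // stray < directly before a br-tag (optional whitespace in between)
  str = str.replace(/<[\s]*(<br[\s\/]*>)/gim, '$1');
  // stray > directly after a br-tag (optional whitespace in between)
  str = str.replace(/(<br[\s\/]*>)[\s]*>/gim, '$1');
  return str;
}

var trailing = fixBrBrackets("a<br>>b");
var leading = fixBrBrackets("a<<br />b");
var both = fixBrBrackets("<<br>>");
var spanCase = fixBrBrackets("</span><br>");   // no match, left as-is
var nextLine = fixBrBrackets("<br>\n> 5");     // \s also matches the newline
```

The last case shows the one risk raised above: because `\s` matches newlines, a literal greater-than sign at the start of the next line is removed together with the line break.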

cat sort

Hello! First, thanks for your work on developing these tools. I just switched from codefixer and codefixer+ to AutoEd -- very nice. I think this edit is a mistake - the space after the pipe is intentional to ensure that the article shows up at the beginning of the list in Category:Lists of environmental topics. Regards—G716 <T·C> 07:35, 26 May 2009 (UTC)

Hmm... I haven't seen that way of doing it before. It may make more sense to use either an asterisk or just a space (rather than the space followed by the article name). Does the name of the article really make any difference if the sort key is spaced? –Drilnoth (T • C • L) 13:48, 26 May 2009 (UTC)
I had not seen this before either. It seems somewhat obfuscated. Plastikspork (talk) 18:15, 26 May 2009 (UTC)
The space followed by text (in this case the page name, but could be any text) allows that articles appear in the category in some defined order. In this case, it happens to be redundant, as Index of environmental articles will sort before Lists of environmental topics. I'll try to find an example where it's not redundant. —G716 <T·C> 03:10, 27 May 2009 (UTC)
I commonly use this in template namespace, when cleaning up navboxes for various anime and manga series, though the form I use is typically | {{PAGENAME}}]] (and is actually more out of habit than necessity; the template would sort after the main article (assuming it has a proper sortkey itself) regardless of the presence or absence of {{PAGENAME}}). ···「ダイノガイ 千?!? Talk to Dinoguy1000 18:41, 27 May 2009 (UTC)
I had been under the impression that if two things in the same category had identical sort keys (like a single space) they would then be sorted by the article's actual title. So the Index would sort before the List regardless of whether or not the article name was repeated, since the Index comes first. The only exception that I can think of would be where the articles use the same first sort key but the remainder of the sort is different from the article name, but that would be kind of strange. –Drilnoth (T • C • L) 18:58, 27 May 2009 (UTC)

IE vs. Chrome vs. Firefox

After some correspondence with Symplectic Map, I believe we can soon officially support at least three browsers: IE, Chrome and Firefox. SM pointed out that the main problem is that the order in which statements are executed in JavaScript is not always that reliable. We noticed this before, when Chrome users were getting the 'function is undefined' alert message. I am in the process of testing a suggested fix (see User:Symplectic Map/autoedcore.js), and it appears to work. One thing that SM did, which I wanted to ask about, is a simplification of the namespace logic. In particular, SM's version just checks to see if 'wgIsArticle' is true, rather than checking all the various namespaces. Is there a particular namespace which we are trying to avoid, or is this check enough? It looks like it at least does the minimum, which is to avoid adding the tab to (most or all?) pages which don't have an edit tab. This is also what SM is doing with his autospeller script. Plastikspork (talk) 19:28, 29 May 2009 (UTC)

Sounds good. Let me know when you're both ready for this to go live and I can make the edit. –Drilnoth (T • C • L) 19:30, 29 May 2009 (UTC)
I put the updated code in User:Plastikspork/AutoEd/core.js. The basic changes are to (1) move the default settings into the various functions where they are used, as it appears we cannot assume these statements will be executed in the correct order otherwise, (2) simplify the namespace logic to only check for wgIsArticle and an element named 'ca-edit', if this is too liberal, we can scale it back, although I suppose we can trust users to only use autoEd in the correct places, (3) reduced the amount of indentation since WP seems to complain about excessive whitespace these days.
Note - It may be enough to just check for 'ca-edit', but checking for 'wgIsArticle' doesn't appear to hurt. I checked it on four different browsers (Firefox Beta-WindowsXP, Chrome-WindowsXP, Internet Explorer-Windows XP, Firefox Stable-Linux), and it seems to work on all of them. Twinkle on the other hand, is having problems on IE. Let me know if you see any problems. Plastikspork (talk) 20:40, 29 May 2009 (UTC)
Awesome;  Done. I also confirmed that it still works with WikiEd... it still needs the autoEdClick to be "false", but otherwise works fine. Thank you! –Drilnoth (T • C • L) 21:08, 29 May 2009 (UTC)
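The simplified check described in point (2) could be sketched roughly as follows. This is not the code in core.js: the function name is hypothetical, `isArticle` stands in for the `wgIsArticle` global, and the document is passed in as a parameter so that the logic can be tried outside a browser.

```javascript
// isArticle stands in for the wgIsArticle global; doc stands in for window.document.
function shouldAddAutoEdTab(isArticle, doc) {
  // Only add the tab when viewing page content and an edit tab ('ca-edit') exists.
  return Boolean(isArticle) && doc.getElementById('ca-edit') !== null;
}

// Tiny stand-in for a document, used only to exercise the check.
function fakeDocument(ids) {
  return {
    getElementById: function (id) {
      return ids.indexOf(id) !== -1 ? { id: id } : null;
    }
  };
}

var editablePage = shouldAddAutoEdTab(true, fakeDocument(['ca-edit', 'ca-history']));
var protectedPage = shouldAddAutoEdTab(true, fakeDocument(['ca-viewsource']));
var historyView = shouldAddAutoEdTab(false, fakeDocument(['ca-edit']));
```

In a user script the call would presumably be `shouldAddAutoEdTab(wgIsArticle, document)`, made inside the function that adds the tab so that it does not depend on statement order.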

Working in other Wikimedia projects?

I've been trying to make this script work on the Spanish Wikipedia, putting this code in my monobook.js:

document.write('<script src="'
+ 'http://en.wikipedia.org/w/index.php?title=Wikipedia:AutoEd/complete.js'
+ '&action=raw&ctype=text/javascript"></script>');

and it isn't working for me. Is there any other way to make this script work? Locos ~ epraix Beaste~praix 20:53, 31 May 2009 (UTC) I have already cleared my cache and all that stuff.

Hmm... we still haven't gotten to the cross-wiki thing, but I will certainly look into this some more when I have the time. –Drilnoth (T • C • L) 22:48, 31 May 2009 (UTC)
I have an idea on how to fix this and will try testing it soon... I'm not sure if all wikis have importScript(); defined, in which case this wouldn't work because importing other pages is required. Maybe changing those functions to use document.write instead will fix it. Just give me a few days. –Drilnoth (T • C • L) 02:47, 3 June 2009 (UTC)
At least, I know that es.wikipedia and es.wikinews have importScript(); defined just like en.wikipedia. Locos ~ epraix Beaste~praix 02:53, 3 June 2009 (UTC)
Hmm... well, I'll still try this out to see if it works and I'll take a look at the code on es.wikipedia a bit more. Is the tab not showing up at all, or is it just not doing anything when you click it? –Drilnoth (T • C • L) 02:56, 3 June 2009 (UTC)
Take a look at my monobook.js there; the tab is not showing, and even if I put the edit link code (&action=edit&AutoEd=true), it doesn't work. Locos ~ epraix Beaste~praix 03:15, 3 June 2009 (UTC)
Is your browser's Javascript console or error console reporting any errors there (I'm assuming you're not using IE)? ダイノガイ 千?!? · Talk⇒Dinoguy1000 20:44, 5 June 2009 (UTC)
It is not reporting any error, it just isn't working. Locos ~ epraix Beaste~praix 02:38, 6 June 2009 (UTC) Firefox 3.10 and 3.5 beta.

(undent) Okay, I think that I've figured this out. When you import Wikipedia:AutoEd/complete.js into another wiki, it then in turn tries to import Wikipedia:AutoEd/core.js and all of the modules using importScript(). The problem is, it is trying to import the local copy of the various modules and the core function, so when you use document.write in that context and use the JavaScript from en's complete.js page, it then tries to import es:Wikipedia:AutoEd/core.js, rather than en:Wikipedia:AutoEd/core.js. To try and fix this I have made this edit. Before changing all of the modules, could you confirm if the "auto ed" tab now appears on the Spanish Wikipedia? It wouldn't really do anything at this point, but I thought I'd ask before I went and converted everything. Thanks! –Drilnoth (T • C • L) 13:58, 12 June 2009 (UTC)

It does appear! Locos ~ epraix Beaste~praix 14:45, 12 June 2009 (UTC)
Also works in es.wikinews, pt.wikipedia and fr.wikipedia, so I expect it works for most Wikimedia projects. At the moment it doesn't do anything; I'm just waiting for the full script conversion. Locos ~ epraix Beaste~praix 14:51, 12 June 2009 (UTC)
Excellent! I will try to complete the conversion later today. Sorry about the delays! –Drilnoth (T • C • L) 15:16, 12 June 2009 (UTC)
Okay, it should be good to go. I'll update the documentation sometime in the next few days. –Drilnoth (T • C • L) 15:32, 12 June 2009 (UTC)
Cool! Working now. Locos ~ epraix Beaste~praix 15:47, 12 June 2009 (UTC)
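The gist of the fix described in this thread is to load each page by its full en.wikipedia URL instead of relying on importScript(), which resolves titles on the local wiki. A minimal sketch, assuming hypothetical helper names (`rawScriptUrl`, `crossWikiImport`) and reusing the URL format from the monobook.js snippet above:

```javascript
// Build the raw-JavaScript URL for a page on a given wiki host.
function rawScriptUrl(host, page) {
  return 'http://' + host + '/w/index.php?title=' + page +
    '&action=raw&ctype=text/javascript';
}

// Always load from en.wikipedia, whichever wiki the user is on.
// doc is passed in (normally window.document) so the sketch can be tried outside a browser.
function crossWikiImport(page, doc) {
  doc.write('<script src="' + rawScriptUrl('en.wikipedia.org', page) + '"></script>');
}

// Stand-in document that records what would be written.
var written = [];
crossWikiImport('Wikipedia:AutoEd/core.js', { write: function (html) { written.push(html); } });
```

This is only an illustration of the idea; the edit that was actually made to the modules is the one linked in the discussion.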

Order

I don't know where but I have read that the basic order of an article is: Content, navtemplate (if necessary), categories, stub-template (if necessary) and interwikis. See this change, or any other with a stub template misplaced, AutoEd is not correcting the place of the Stub template. Locos ~ epraix Beaste~praix 02:19, 3 June 2009 (UTC)

That's because it can't at the moment... this is certainly something that can and should be done, but I can't code it yet. –Drilnoth (T • C • L) 02:39, 3 June 2009 (UTC)

Something like this may be useful for numerical character refs, for example to convert annoying crap like "&#252;" or "&#xFC;" into "ü";

// symbols for which there may be a good reason to obfuscate/escape
var dont_repl = "|!{}[]=<>";
function unicodify2(str){
 // return the literal character for one entity, or the entity itself if it must be kept
 function repl(ent, digits, base){
  var num = parseInt(digits, base), chr;
  // see [[UTF-16]] for chars outside the BMP
  // try this with Gothic letters at full volume ^_^
  if (num > 0xFFFF) {
   num -= 0x10000;
   chr = String.fromCharCode(0xD800 + (num >> 10), 0xDC00 + (num & 0x3FF));
  }
  else chr = String.fromCharCode(num);
  return (dont_repl.indexOf(chr) == -1) ? chr : ent;
 }
 // decimal references, e.g. &#252; (a callback replaces every occurrence in any browser)
 str = str.replace(/&#(\d+);/g, function(ent, digits){ return repl(ent, digits, 10); });
 // hexadecimal references, e.g. &#xFC;
 str = str.replace(/&#x([\da-f]+);/gi, function(ent, digits){ return repl(ent, digits, 16); });
 return str;
}

This is close to what I've been using. Other people might want it as well, at least as an option. — CharlotteWebb 17:18, 6 June 2009 (UTC)

Could you clarify what this would do? I mean, would that exact code work for most of those characters? –Drilnoth (T • C • L) 03:01, 9 June 2009 (UTC)

It finds numerical character references and changes them to literal letters/numbers/symbols etc. It works for any character except those specified to be ignored. For example it would make changes like these:

from → to
  • &#76;if&#x0065; → Life
  • Z&#252;rich → Zürich
  • S&#xE3;o Paulo → São Paulo
  • &#x590F;&#x6D1B;&#x7279;·&#x97E6;&#x5E03; → 夏洛特·韦布
  • &#66352;&#x1033C;&#x10334;&#66365; → 𐌰𐌼𐌴𐌽

CharlotteWebb 10:31, 13 June 2009 (UTC)
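The last row of the examples relies on the UTF-16 surrogate-pair arithmetic in the script. As a worked example (the helper name `toSurrogatePair` is made up here), this is the calculation for &#66352;, which is U+10330:

```javascript
// Split a code point above 0xFFFF into its UTF-16 high and low surrogates.
function toSurrogatePair(codePoint) {
  var offset = codePoint - 0x10000;       // 0x10330 - 0x10000 = 0x330
  return [
    0xD800 + (offset >> 10),              // top 10 bits    -> high surrogate
    0xDC00 + (offset & 0x3FF)             // bottom 10 bits -> low surrogate
  ];
}

var pair = toSurrogatePair(66352);                    // &#66352; is U+10330
var letter = String.fromCharCode(pair[0], pair[1]);   // the Gothic letter from the example
```

In newer JavaScript engines `String.fromCodePoint(66352)` gives the same string directly; the manual split is what works in browsers that only have `String.fromCharCode`.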

Ah, thanks for the clarification. The code's a bit complicated... I'll implement it later today or tomorrow. Plastikspork, could you take a look before I add it in? (I do trust you CharlotteWebb, I just think its good to have two pairs of eyes look at the code before implementing it). –Drilnoth (T • C • L) 14:52, 13 June 2009 (UTC)

Actually something like this falls in a category of its own. I think there are certain users who use entity-refs in the 0x80–0x9F (128–159) range based on the unwarranted assumption that everyone's browser will fall back on Windows-1252 encoding when unprintable control characters are found. However I doubt this works well on non-Windows systems, and I know it doesn't work if these are changed to literal control-char code points. Personally I'd follow the above with replacements like this:

 fail = "<!-- rm unicode ctrl char w/no win-1252 mapping, intent unknown -->";
 return str
 .replace(/\u0080/g, "\u20AC") // euro
 .replace(/\u0081/g, fail) // none
 .replace(/\u0082/g, "\u201A") // sbquo
 .replace(/\u0083/g, "\u0192") // florin (italic f)
 .replace(/\u0084/g, "\u201E") // bdquo
 .replace(/\u0085/g, "\u2026") // ellipsis
 .replace(/\u0086/g, "\u2020") // dagger
 .replace(/\u0087/g, "\u2021") // double dagger
 .replace(/\u0088/g, "\u02c6") // circumflex
 .replace(/\u0089/g, "\u2030") // per mil "0/00"
 .replace(/\u008a/g, "\u0160") // capital S with caron (hacek)
 .replace(/\u008b/g, "\u2039") // lsaquo
 .replace(/\u008c/g, "\u0152") // OElig
 .replace(/\u008d/g, fail) // none
 .replace(/\u008e/g, "\u017D") // Z with caron (hacek)
 .replace(/\u008f/g, fail) // none
 .replace(/\u0090/g, fail) // none
 .replace(/\u0091/g, "\u2018") // lsquo
 .replace(/\u0092/g, "\u2019") // rsquo
 .replace(/\u0093/g, "\u201C") // ldquo
 .replace(/\u0094/g, "\u201D") // rdquo
 .replace(/\u0095/g, "\u2022") // bullet
 .replace(/\u0096/g, "\u2013") // ndash
 .replace(/\u0097/g, "\u2014") // mdash
 .replace(/\u0098/g, "\u02DC") // small tilde
 .replace(/\u0099/g, "\u2122") // trademark (tm)
 .replace(/\u009a/g, "\u0161") // lowercase s with caron (hacek)
 .replace(/\u009b/g, "\u203A") // rsaquo
 .replace(/\u009c/g, "\u0153") // oelig
 .replace(/\u009d/g, fail) // none
 .replace(/\u009e/g, "\u017e") // lowercase z with caron (hacek)
 .replace(/\u009f/g, "\u0178") // capital Y with umlaut
 ;

CharlotteWebb 23:07, 13 June 2009 (UTC)
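The same mapping can also be written as a lookup table with a single pass over the string, which may be easier to audit against a Windows-1252 chart. This is only an alternative sketch of the chain above, using the same target code points; the names `WIN1252`, `FAIL` and `fixC1Controls` are made up for illustration.

```javascript
var FAIL = "<!-- rm unicode ctrl char w/no win-1252 mapping, intent unknown -->";

// Index = code unit minus 0x80; null marks the positions with no Windows-1252 mapping.
var WIN1252 = [
  "\u20AC", null,     "\u201A", "\u0192", "\u201E", "\u2026", "\u2020", "\u2021", // 0x80-0x87
  "\u02C6", "\u2030", "\u0160", "\u2039", "\u0152", null,     "\u017D", null,     // 0x88-0x8F
  null,     "\u2018", "\u2019", "\u201C", "\u201D", "\u2022", "\u2013", "\u2014", // 0x90-0x97
  "\u02DC", "\u2122", "\u0161", "\u203A", "\u0153", null,     "\u017E", "\u0178"  // 0x98-0x9F
];

function fixC1Controls(str) {
  // Replace every C1 control character in one pass.
  return str.replace(/[\u0080-\u009F]/g, function (chr) {
    var mapped = WIN1252[chr.charCodeAt(0) - 0x80];
    return mapped === null ? FAIL : mapped;
  });
}

var euro = fixC1Controls("\u0080 5");
var unmapped = fixC1Controls("\u0081");
var ellipsis = fixC1Controls("wait\u0085");
```

Because none of the replacement characters fall in the 0x80 to 0x9F range, the single pass gives the same result as applying the replacements one after another.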

Nice work. Thanks for your impressive contribution. I have created a new version of 'unicodify.js' in the usual place. This new version will look like a big change in terms of a big diff, but it's actually not that big of a change.
1) Merged CharlotteWebb's suggested additions, labeled as 'Task 2' and 'Task 3' at the end of the script. This is nearly a verbatim copy of what CharlotteWebb provided above, with minor tweaks mostly to make it easier for me to read the code (variable names, indentation, ...). It appears to still function as intended as far as I can tell.
2) Grouped the other transformations into sections, with an initial test to check if the pattern '&foo;' exists before going through a long list of replacements. This should, hopefully, make the code run a bit faster.
3) Simplified the 'new RegEx' to a simpler replace pattern.
4) Corrected &Ioeta; -> &Iota;, &ioeta; -> &iota;, and &Eth; -> &ETH;.
That's about it. If you would rather have this new stuff in a new function, you can safely split the function right before the 'Task 2' comment. On a related note, I have been meaning to merge my 'spork_unicode_wikilinks' found in User:Plastikspork/tools.js, which does a similar thing for hex encoded wikilinks. The question is if it would make sense to merge it with another existing function, or create a new one. Plastikspork (talk) 19:30, 17 June 2009 (UTC)
 Done, looks good. Thank you both! –Drilnoth (T • C • L) 16:03, 18 June 2009 (UTC)

In response to the comment on my talk page, the best way to test this would be to create a test page full of them:

{| class="wikitable" style="font-family:monospace;"
! !! 0 !! 1 !! 2 !! 3 !! 4 !! 5 !! 6 !! 7 !! 8 !! 9 !! A !! B !! C !! D !! E !! F
|-
! 0x8…
| &#128; &#x80; || &#129; &#x81; || &#130; &#x82; || &#131; &#x83;
| &#132; &#x84; || &#133; &#x85; || &#134; &#x86; || &#135; &#x87;
| &#136; &#x88; || &#137; &#x89; || &#138; &#x8A; || &#139; &#x8B;
| &#140; &#x8C; || &#141; &#x8D; || &#142; &#x8E; || &#143; &#x8F;
|-
! 0x9…
| &#144; &#x90; || &#145; &#x91; || &#146; &#x92; || &#147; &#x93;
| &#148; &#x94; || &#149; &#x95; || &#150; &#x96; || &#151; &#x97;
| &#152; &#x98; || &#153; &#x99; || &#154; &#x9A; || &#155; &#x9B;
| &#156; &#x9C; || &#157; &#x9D; || &#158; &#x9E; || &#159; &#x9F;
|}

Then run the script and confirm that the result matches that shown in rows 8 and 9 of this table. — CharlotteWebb 19:26, 18 June 2009 (UTC)

Great. Thank you. I would like to create a set of unit tests for the script, and this will be on the list. Thanks again. Plastikspork (talk) 23:24, 18 June 2009 (UTC)

url-encoded wiki links

Plastikspork, I looked at your other script and if you really want to clean up percent-encoding, anchor-encoding, etc. from wiki-links (once again, to make the text more readable in the edit window) you should use something like this:

function normalize_wikilinks(txt){
 // to keep things simple we'll ignore all image links. because some people prefer
 // underscores in the file name and the caption can contain god-knows-what.
 // one easy way is to flag them with a character which should never be used,
 // but if it is already present we have a problem, so let's just quit.
 if(txt.match(/\uE000/)) return(txt); // see [[Private Use Area]]
 txt = txt.replace(/(\[\[[\:\s*]*(?:Image|File|Media)\s*\:)/gi, "$1\uE000");
 // every remaining (unflagged) wikilink; empty list when there are none
 var m = txt.match(/\[\[[^\[\]\n\uE000]+\]\]/g) || [];
 for(var i = 0; i < m.length; i++){
  var parts = m[i].split("|"), link = parts[0];
  var a = link.split("#"), title = a[0], section = a[1];
  try {
   link = decodeURIComponent(title
     // explained below
     .replace(/\%(.[^0-9A-F]|[^0-9A-F].|$)/gi, "%25$1")
    ) +
    ( section ? (
     "#" + decodeURIComponent(section
      // change "." to "%" when followed by valid hex
      .replace(/\.([0-9A-F]{2})/gi, "%$1")
      // explained below
      .replace(/\%(.[^0-9A-F]|[^0-9A-F].|$)/gi, "%25$1")
     )
    ) : ""
    );
  }
  catch(e) { } // just do no decoding
  parts[0] = link.replace(/[\s_]+/g, " ").replace(/\s*#\s*/, "#");
  txt = txt.replace(m[i], parts //cleanup some spaces
   .join("|").replace(/\s*\|\s*/, "|").replace(/\s*(\[|\])\s*/g, "$1")
  );
 }
 return(txt.replace(/\uE000/g, ""));
}

Here's a good example of what it does: [4]. The reason for changing literal "%" signs to "%25" first is because decodeURIComponent() will choke otherwise because it expects the next two characters to always be valid hex digits. Wiki-links are more flexible than that. A page title can contain a literal percent sign as long as it is not mistakable for url-encoding. [[70%WATER]] is ok (the url is "/wiki/70%25WATER") but [[10%FAT]] just links to [[10úT]] (though it doesn't display properly). Trying to create a page with url "/wiki/10%25FAT" will give you a Bad Title error. On the other hand [[I%U]] and [[I%25U]] produce identical html.

While /wiki/%FA is accepted as a valid alternative to the properly UTF-8 encoded /wiki/%C3%BA—both of them link to ú (which I wrote, just the other day by some coincidence)—decodeURIComponent() will still reject it. I'm wondering whether it would be worthwhile for the script to try and pick up the slack in this respect (perhaps violate RFC 3629), or at least produce a meaningful error message. Guess that would depend on how commonly it becomes an issue. — CharlotteWebb 19:26, 18 June 2009 (UTC)
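The percent-sign handling can be tried in isolation with a small sketch. The helper name `safeDecodeTitle` is made up here; the regular expression is the one from the function above, and falling back to the undecoded title on error mirrors the empty catch block there.

```javascript
function safeDecodeTitle(title) {
  try {
    return decodeURIComponent(
      // escape a literal "%" that is not followed by two hex digits
      title.replace(/%(.[^0-9A-F]|[^0-9A-F].|$)/gi, "%25$1")
    );
  } catch (e) {
    // decodeURIComponent throws URIError on invalid sequences; keep the title as written
    return title;
  }
}

var encoded = safeDecodeTitle("Z%C3%BCrich");   // properly UTF-8 encoded
var literal = safeDecodeTitle("70%WATER");      // "%WA" is not hex, so the % is escaped first
var trailing = safeDecodeTitle("100%");         // "%" at the end of the title
var latin1 = safeDecodeTitle("10%FAT");         // "%FA" looks like hex but is not valid UTF-8
```

The last case is the situation described above: `%FA` passes the hex test but is rejected by decodeURIComponent(), so the sketch leaves the title untouched.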

Thank you yet again. This is really a much more compact (and correct) solution to what I had. In fact, I just recently noticed some quirky behavior in my version, which I was just preparing to debug. Now, I don't have to debug it, which leaves more time for other stuff. I will work on merging it into AutoEd when I have a chance. Plastikspork (talk) 23:22, 18 June 2009 (UTC)

Refs cleanup module

A module for cleaning up references would be quite useful. Basics would be checking for quotes on ref names (e.g. changing <ref name=name> to <ref name="name">), and misspelled parameters for the various cite templates (stuff like accesdate instead of accessdate). Thoughts? ダイノガイ 千?!? · Talk⇒Dinoguy1000 02:40, 9 June 2009 (UTC)

I'll try to give this a go within a week or so. Some of the regex should be pretty simple. –Drilnoth (T • C • L) 03:00, 9 June 2009 (UTC)
I had started working on some ref tools, but I was mostly targeting the reduction of duplicate references, and cleaning citation templates. Plastikspork (talk) 07:21, 9 June 2009 (UTC)
Both of which are also good tasks for a ref cleaner to do. You can see a lot of possible cleanup tasks such a script might do in this edit (obviously, spacing probably shouldn't be touched, except maybe to standardise it within a ref; it needs a bit of discussion, I think). ダイノガイ 千?!? · Talk⇒Dinoguy1000 09:59, 9 June 2009 (UTC)
I can do the more simple stuff like adding quotes around names, fixing common parameter typos (and, maybe, removing empty parameters which aren't used much). If Plastikspork could come up with code for merging duplicate references, that would be great... it's a bit out of my reach yet, I think. –Drilnoth (T • C • L) 13:55, 9 June 2009 (UTC)
One more minor point to get out of the way really quick: should there be a space before the closing slash in <ref name="name" />? Do any components currently touch the space (or lack thereof) in <references />, and if so, what do they do with it (since consistency is good)? ダイノガイ 千?!? · Talk⇒Dinoguy1000 18:15, 9 June 2009 (UTC)
Currently, a <references /> tag that is incorrectly formatted or lacks the space gets a space and slash added if needed. For the ref tags themselves, I'll code something up (to my knowledge, yes, there should be a space before the slash). –Drilnoth (T • C • L) 21:52, 9 June 2009 (UTC)
Okay, cool. ダイノガイ 千?!? · Talk⇒Dinoguy1000 18:18, 10 June 2009 (UTC)

~poke~ Any attention on this, either? ダイノガイ 千?!? · Talk⇒Dinoguy1000 19:12, 14 September 2009 (UTC)

Ah, dang. I've been working on Wikipedia:Dazzle! so much that this had completely fallen off of my list. I might be able to look into it, but I might not have the time. There's just so many different things to do here! –Drilnoth (T • C • L) 19:21, 14 September 2009 (UTC)
Aah, so that's what's been keeping you busy! Have you considered advertising it yet, or are you waiting until the beta version is sufficiently developed or something? ダイノガイ 千?!? · Talk⇒Dinoguy1000 21:45, 14 September 2009 (UTC)
I'm waiting until it is more developed with 5-6 functions; right now it can't do much of anything. –Drilnoth (T • C • L) 21:51, 14 September 2009 (UTC)
That explains it. I thought you had some sort of Dr. Jekyll and Mr. Hyde thing going on, with as many times as you keep reverting your own edits to your talk page. Plastikspork ―Œ(talk) 22:42, 14 September 2009 (UTC)
As far as the refs module goes, help fill out the desired features list, and I can work on it. Plastikspork ―Œ(talk) 22:42, 14 September 2009 (UTC)
I've added a few items; some of them (such as ref reordering and dupe ref checking) are probably lower-priority since AWB also covers them. There are probably other issues that could be corrected as well. ダイノガイ 千?!? · Talk⇒Dinoguy1000 17:41, 15 September 2009 (UTC)
  1. Change <ref name=foo> to <ref name="foo">
  2. Correct <ref name="foo> and <ref name=foo"> to <ref name="foo">
  3. Change <ref name="foo"></ref> (empty or whitespace ref) to <ref name="foo" />
  4. Clean up commonly misspelled parameters in {{cite}} templates
  5. Reorder refs from earliest used ref to latest (e.g. "A statement.[5] [2]" becomes "A statement.[2] [5]")
  6. Check for duplicate refs - if unnamed, assign a name (or notify the user); if named, use the first name found
  7. Fix borked accessdate and date parameters, when possible, (e.g., 02-19-2005 to 2005-02-19)
  • All above issues relating to name=... should also apply to group=...
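A minimal sketch of how items 1 to 3 and item 7 of the list above might look in code. The function names cleanRefTags and fixUsDate are hypothetical and are not part of any AutoEd module; the date helper assumes the input is in month-day-year order and only converts values where the day is greater than 12, so that the order cannot be misread.

```javascript
// Sketch only: cleanRefTags and fixUsDate are hypothetical names, not AutoEd functions.

// Items 1-3 (plus the group=... note): normalise <ref> attributes and collapse empty refs.
function cleanRefTags(str) {
  // Visit each opening <ref ...> tag; <references /> is skipped because of \b.
  str = str.replace(/<ref\b[^<>]*>/gi, function (tag) {
    // Leave properly quoted values alone; quote bare or half-quoted ones.
    // Half-quoted values that contain spaces are not handled here.
    return tag.replace(
      /\b(name|group)\s*=\s*(?:"([^">]*)"|"?([^"\s\/>]+)"?)/gi,
      function (m, attr, quoted, bare) {
        return attr + '="' + (quoted !== undefined ? quoted : bare) + '"';
      }
    );
  });
  // A named ref with nothing but whitespace inside becomes self-closing.
  return str.replace(/<ref\b([^<>\/]*[^<>\/\s])\s*>\s*<\/ref>/gi, "<ref$1 />");
}

// Item 7: turn MM-DD-YYYY into YYYY-MM-DD, but only when the day is above 12,
// so the value cannot be a DD-MM-YYYY date in disguise.
function fixUsDate(value) {
  var m = /^(\d{1,2})-(\d{1,2})-(\d{4})$/.exec(value.trim());
  if (!m) return value;
  var month = Number(m[1]);
  var day = Number(m[2]);
  if (month < 1 || month > 12 || day <= 12 || day > 31) return value;
  function pad(n) { return (n < 10 ? "0" : "") + n; }
  return m[3] + "-" + pad(month) + "-" + pad(day);
}
```

Reordering refs and merging duplicates (items 5 and 6) need knowledge of the whole page and are left out of this sketch.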
Question

Should we just do this instead? It appears there is now support to move all the named references to the references section. It should make the text much more readable. Plastikspork ―Œ(talk) 21:37, 23 September 2009 (UTC)

It sounds good, but first we'd have to be sure there's consensus for general cleanup tools to perform this change. If not, it could still be provided as a separate module or something. Personally, though, I have rather mixed feelings on deploying this generally. ダイノガイ 千?!? · Talk⇒Dinoguy1000 17:14, 24 September 2009 (UTC)

Whitespace

Currently, the Formatter and Complete presets collapse double (and more) spaces into single spaces (via the Whitespace module). This is generally good, but there are some places (especially in template code) where this extra whitespace is used for readability purposes:

with extra whitespace
{{Infobox animanga/Header
| name      = Fist of the Blue Sky
| image     = [[Image:FistoftheBlueSky1.jpg|230px]]
| caption   = Volume 1 cover
| ja_kanji  = 蒼天の拳
| ja_romaji = Sōten no Ken
| genre     = [[Historical fiction]], [[Action genre|Action]], [[Drama]]
}}
{{Infobox animanga/Manga
| title           =
| author          = [[Tetsuo Hara]], [[Nobuhiko Horie]]
| publisher       = {{flagicon|Japan}} [[Shinchosha]]
| publisher_other = {{flagicon|Italy}} [[Panini Comics]]
| demographic     = [[Seinen]]
| magazine        = {{flagicon|Japan}} [[Comic Bunch]]<br>{{flagicon|USA}} [[Raijin Comics]] (2003-2004)
| first           = May 2001
| last            =
| volumes         = 20
}}
{{Infobox animanga/Anime
| title    =
| director = Yoshihiro Yamaguchi
| studio   = Souten Studio
| network  = {{flagicon|Japan}} [[TV Asahi]], [[Animax]]
| first    = [[October 4]], [[2006]]
| last     = [[March 14]], [[2007]]
| episodes = 26
}}
{{Infobox animanga/Footer}}
without extra whitespace
{{Infobox animanga/Header
| name = Fist of the Blue Sky
| image = [[Image:FistoftheBlueSky1.jpg|230px]]
| caption = Volume 1 cover
| ja_kanji = 蒼天の拳
| ja_romaji = Sōten no Ken
| genre = [[Historical fiction]], [[Action genre|Action]], [[Drama]]
}}
{{Infobox animanga/Manga
| title =
| author = [[Tetsuo Hara]], [[Nobuhiko Horie]]
| publisher = {{flagicon|Japan}} [[Shinchosha]]
| publisher_other = {{flagicon|Italy}} [[Panini Comics]]
| demographic = [[Seinen]]
| magazine = {{flagicon|Japan}} [[Comic Bunch]]<br>{{flagicon|USA}} [[Raijin Comics]] (2003-2004)
| first = May 2001
| last =
| volumes = 20
}}
{{Infobox animanga/Anime
| title =
| director = Yoshihiro Yamaguchi
| studio = Souten Studio
| network = {{flagicon|Japan}} [[TV Asahi]], [[Animax]]
| first = [[October 4]], [[2006]]
| last = [[March 14]], [[2007]]
| episodes = 26
}}
{{Infobox animanga/Footer}}

Two other issues: First it seems that AutoEd will also remove trailing whitespace from empty template parameters (as in the above example, even though you can't see it) - once again, this isn't desirable either. And second, it currently adds whitespace between list markup and the list element's content (e.g. *Some text to * Some text); however, it currently doesn't touch definition lists (ones using semicolons, ; ). This last point is the easiest of these to address, I think (the others require some memory on the part of the script, to know when it's in template code and (in the first case) how much extra whitespace to add, by way of remembering how long the longest parameter name is). Thoughts? ダイノガイ 千?!? · Talk⇒Dinoguy1000 18:29, 10 June 2009 (UTC)

Good point about the definition lists. Why is the removal of trailing whitespace problematic? Does this cause a problem? As for the parameter setup, I personally think that the infoboxes look better without all of the additional whitespace... that's how most templates are formatted, and it also reduces the page's size. That said, if it is controversial that could be commented out. –Drilnoth (T • C • L) 17:25, 11 June 2009 (UTC)
I think the key here is to try to make a clear separation between whitespace removal, newline removal, and other tasks. This exact same issue came up when I was writing my spork-script, which is why I created a second WS function, and the rest of the functions should not change whitespace. In the context of AutoEd, I would think that it would be advantageous to have all/most of the whitespace removal in one module, and the newline removal in another module. These modules would be included in 'complete' but not in others. One of the reasons why I like the idea of separating newline removal is that it creates larger diffs that are much harder to read. The 'show changes' function does not appear to be smart enough to understand combinations of newline removal with other neighboring changes. I can help code all this stuff up in a few days, but I am boarding an international flight in about three hours. Plastikspork (talk) 08:44, 12 June 2009 (UTC)
Trailing whitespace removal is generally not a problem except in such places as after equals signs (when they are part of a template's parameter) and after empty list items, if the rest of that list has a space between the list indicator and the list contents (yep, more consistency). As for collapsing whitespace in templates, I think AutoEd should probably just generally leave it alone, except for making sure everything is lined up right if multiple consecutive spaces are used. And whitespace in templates is generally used to enhance readability - I regularly work with pages whose source would be nigh incomprehensible without such spacing, and this keeps me from using AutoEd on them to see what sort of issues it might pick up. ダイノガイ 千?!? · Talk⇒Dinoguy1000 08:48, 13 June 2009 (UTC)
Okay... I personally prefer not having all the extra whitespace, even in templates, but if AutoEd shouldn't be doing it I'll comment out the code. I'm guessing that it's str = str.replace(/[ \t][ \t]+/g, " ");, but I'm not real familiar with that module (just stole it from Formatter with a handful of tweaks). –Drilnoth (T • C • L) 14:55, 13 June 2009 (UTC)

~poke~ Can any of you guys have another look at this? It's been a few months, but hasn't been touched... =/ ダイノガイ 千?!? · Talk⇒Dinoguy1000 19:08, 14 September 2009 (UTC)

I will have a look. If there is a pattern to recognize these cases, I can pretag those with a special character, then remove whitespace, then remove the tags. A similar feature could be used to ignore spacing in comments, and pre formatted text. For the infobox, is it enough to ignore whitespace before the equal-sign in lines that start with | foo =? Plastikspork ―Œ(talk) 22:55, 14 September 2009 (UTC)
Yes, that should be fine. Ideally, the script would see if a "significant portion" (doesn't have to be as much as 1/2) of the parameters use extra whitespace like that, and automatically insert proper amounts of whitespace to line everything up, but that is perhaps best reserved for a different tool or module. ダイノガイ 千?!? · Talk⇒Dinoguy1000 17:31, 15 September 2009 (UTC)
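The pretag, collapse, untag idea described above could be sketched roughly as follows. This is an illustration under stated assumptions rather than the code that went into AutoEd: the function name is hypothetical, the marker is a private-use character assumed not to occur in page text, and only the padding before the equals sign on lines of the form | foo = is protected.

```javascript
// Sketch only: collapseSpacesOutsideParams is a hypothetical name.
function collapseSpacesOutsideParams(str) {
  var MARK = "\uE000"; // private-use character, assumed absent from the page text

  // 1. Pretag: on lines like "| foo    = bar", swap each padding space
  //    before the "=" for a marker so the collapse step cannot see it.
  str = str.replace(
    /^([ \t]*\|[ \t]*[\w-]+(?: [\w-]+)*)([ \t]{2,})(?==)/gm,
    function (m, head, pad) {
      return head + pad.replace(/[ \t]/g, MARK);
    }
  );

  // 2. Collapse every remaining run of spaces or tabs into one space.
  str = str.replace(/[ \t][ \t]+/g, " ");

  // 3. Untag: turn the markers back into spaces (tabs come back as spaces).
  return str.replace(new RegExp(MARK, "g"), " ");
}
```

The same three steps could be reused to shield comments or preformatted text by changing only the pattern in step 1. Trailing whitespace after the equals sign of an empty parameter is not covered here.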

Strange bug

There seems to be a strange new bug, see discussion here: User talk:Aldaron#Script problems?. My guess is an incompatibility with another script, but I'm not sure exactly what is going on at this point. The strange thing is that the text being pasted is exactly the 'document.write' commands which were recently added. Plastikspork (talk) 20:08, 17 June 2009 (UTC)

I'm getting the same bug in Google Chrome: here. I haven't been using AutoEd at all today, but the list of scripts is being inserted into my edits. Symplectic Map (talk) 03:27, 18 June 2009 (UTC)
I am guessing this has to do with the 'document.write' command being executed at some random inopportune time while the page is loading. I am wondering if this wouldn't be fixed by changing the 'document.write' commands to 'importScriptURI'? I could help debug on Chrome if you want. I just had to go back and correct about 15 of my last 150 edits, so it appears to happen about once every ten times. Symplectic Map (talk) 04:01, 18 June 2009 (UTC)
Okay, I just changed all the 'document.write' commands to 'importScriptURI' in my version, which should still allow for use with the foreign wikipedia, but avoid the dangerous document.write command. My preliminary tests show this works, but as SymplecticMap said, it could require more testing to see the bug appear. You can have a look at mine, User:Plastikspork/AutoEd/complete.js, but don't just copy it as mine imports my development version of the scripts. Could we get Drilnoth or Dinoguy to make this change? I would do it but I don't have admin privileges ... yet ... Plastikspork (talk) 14:54, 18 June 2009 (UTC)
Doing... –Drilnoth (T • C • L) 14:59, 18 June 2009 (UTC)
 Done; my apologies about this, 'twas my bad. I'd seen document.write used for a lot of script-importing before so I thought that it would work fine here too... I actually hadn't known about importScriptURI() before; thanks. It should all be working good now. –Drilnoth (T • C • L) 15:08, 18 June 2009 (UTC)
Use of document.write had bothered me before, but I forgot to ever bring it up. Nice to see it changed to something less random, though! ^_^ ダイノガイ 千?!? · Talk⇒Dinoguy1000 18:10, 18 June 2009 (UTC)
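For readers unfamiliar with the difference: document.write injects markup into the document stream, and if it runs after the page has finished loading it replaces the document, whereas a loader in the style of importScriptURI adds a script element to the page head. The sketch below shows that second approach in plain DOM calls. It is a stand-in written for illustration, not MediaWiki's implementation; rawScriptUrl and loadScript are hypothetical names, and the URL layout is copied from the raw-script URL quoted elsewhere on this page.

```javascript
// Sketch only: rawScriptUrl and loadScript are hypothetical helpers.

// Build the raw-JavaScript URL for a wiki page title.
function rawScriptUrl(title) {
  var encoded = encodeURIComponent(title)
    .replace(/%3A/g, ":")   // keep namespace colons readable
    .replace(/%2F/g, "/");  // keep subpage slashes readable
  return "/w/index.php?title=" + encoded + "&action=raw&ctype=text/javascript";
}

// Append a <script> element to the head instead of calling document.write.
// The document is passed in so the function can be exercised outside a browser.
function loadScript(doc, url) {
  var s = doc.createElement("script");
  s.setAttribute("src", url);
  s.setAttribute("type", "text/javascript");
  doc.getElementsByTagName("head")[0].appendChild(s);
  return s;
}
```

In a browser this would be called as loadScript(document, rawScriptUrl("Wikipedia:AutoEd/complete.js")).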

mojibake corrector (don't try this at home)

I've got an experimental new one for fixing mojibake, something like this:

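function fixmojibake(txt) {
 // Latin-1 characters that look like a UTF-8 lead byte plus its continuation bytes
 var r = /[\u00C2-\u00DF][\u0080-\u00BF]|[\u00E0-\u00EF][\u0080-\u00BF]{2}|[\u00F0-\u00F4][\u0080-\u00BF]{3}/g;
 return txt.replace(r, function (m) {
  // escape() percent-encodes the characters as bytes; decodeURIComponent() reads them back as UTF-8
  try { return decodeURIComponent(escape(m)); }
  catch (e) { return m; } // not valid UTF-8 after all: leave the text untouched
 });
}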

Something like this would catch encoding errors such as those introduced by the utf-impaired Polbot in late 2008 and which I corrected just now (see "FidÃ¨le" ↔ "Fidèle" in the last line).

It uses sort of a reverse type punning to ASCII-encode Ã and ¨ as %C3 and %A8, then UTF-decode %C3%A8 as è. This should be used manually and with caution due to the potential for false positives (which I have yet to estimate).

For most users mojibake is an "I know it when I see it, but not what it means or how to fix it" sort of thing, whereas a script like this knows what it might mean, and how to fix it, but not whether the status quo isn't already correct. I'm not sure it could be trained for the latter, but feel free to try that. — CharlotteWebb 21:05, 27 June 2009 (UTC)
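The two-step round trip described above can be followed on a single character. In this sketch the helper name decodeLatin1AsUtf8 is hypothetical; escape and decodeURIComponent are the standard JavaScript globals the original function relies on. The helper returns null when the characters do not form valid UTF-8, which is one cheap guard against some false positives, though not against text that happens to be valid both ways.

```javascript
// Sketch only: decodeLatin1AsUtf8 is a hypothetical helper name.

// Treat each character (code point below 256) as one byte and decode the
// byte sequence as UTF-8. Returns null if the bytes are not valid UTF-8.
function decodeLatin1AsUtf8(s) {
  try {
    return decodeURIComponent(escape(s));
  } catch (e) {
    return null;
  }
}

var garbled = "\u00C3\u00A8";            // the two characters Ã and ¨
var percent = escape(garbled);           // step 1: "%C3%A8"
var fixed = decodeURIComponent(percent); // step 2: "\u00E8", the character è

// A lone Ã followed by an ordinary letter is not valid UTF-8, so it is rejected.
var rejected = decodeLatin1AsUtf8("\u00C3x"); // null
```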

Note that Twinkle (and possibly Friendly) will generate mojibake when performing certain operations on pages with special characters in their names. This isn't really a problem, but can be annoying when you get notified and have to guess at the correct page title or look through your contributions/watchlist to find the correct page. I really should do some experiments, find out just what causes problems, and file a bug or two... ダイノガイ 千?!? · Talk⇒Dinoguy1000 19:11, 14 September 2009 (UTC)

User:SUL messed up the formatting of Acid dissociation constant under the self-ionization section when he used Auto Ed here.--Jorfer (talk) 01:25, 25 August 2009 (UTC)

That's not good. It appears there is some confusion in thinking that [H_2O] is a wikilink. I will have a look soon. Plastikspork ―Œ(talk) 04:48, 25 August 2009 (UTC)
I fixed it by making a small change to links.js, to force it to match double brackets before removing underscores. My fix makes the module less aggressive in its transformations, which is good, but it probably also disabled one of its functions, which is not optimal. However, I plan to rewrite that module fairly soon as it has caused more than one problem. Plastikspork ―Œ(talk) 05:23, 25 August 2009 (UTC)
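A rough illustration of the "match double brackets first" idea: underscores are only touched inside the target part of a [[...]] link, so single-bracket text such as [H_2O] is never seen by the replacement. This is a sketch under that assumption, with a hypothetical function name, and is not the actual change made to links.js.

```javascript
// Sketch only: underscoresToSpacesInLinks is a hypothetical name.
function underscoresToSpacesInLinks(str) {
  // Match [[target]] or [[target|label]]; nested or single brackets do not match.
  return str.replace(
    /\[\[([^\[\]|\n]+)((?:\|[^\[\]\n]*)?)\]\]/g,
    function (m, target, label) {
      // Only the link target is changed; the displayed label is left as written.
      return "[[" + target.replace(/_/g, " ") + label + "]]";
    }
  );
}
```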

AutoEd Tab?

I installed the autoed script, purged the page, cleared the cache but cannot find the autoed tab or button outside of the edit box. Where is the autoed tab? warrior 4321 14:17, 12 September 2009 (UTC)

It should be at the top of the page, underneath "my talk", "my preferences", ... and next to "edit this page", "history", "move", ... Let me know if it's still not showing up. Plastikspork ―Œ(talk) 14:35, 12 September 2009 (UTC)
It looks like you are using 'vector.js'. I am personally using 'monobook.js', so that could potentially be an issue. I don't know if we have ever performed any testing with vector. Plastikspork ―Œ(talk) 14:37, 12 September 2009 (UTC)
I can't use the autoed then? Vector is the new beta, and if Vector performs well, it will eventually become the default skin. I think it'd be smart to bring up a compatibility version of vector. warrior 4321 14:40, 12 September 2009 (UTC)
I think it's a vector issue, I have not seen any script capable of adding new tabs to vector skin. Locos epraix ~ Beastepraix 14:43, 12 September 2009 (UTC)
Is it not possible to make AutoEd like WikiEd? When you edit a page, there is a toolbar? warrior 4321 14:53, 12 September 2009 (UTC)
You can use the advisor until there is vector compatibility. Locos epraix ~ Beastepraix 14:56, 12 September 2009 (UTC)
I will look into making AutoEd work with Vector, but for now, it appears it's a "no go". Thanks. Plastikspork ―Œ(talk) 15:13, 12 September 2009 (UTC)
AutoEd works fine in Vector using FireFox 3.5... that's what I'm using, and everything seems OK. However, with Vector, buttons other than "read", "edit", "new section", and "view history" are instead collapsed under the arrow next to the search box. Mouse over that arrow and you should see the button, along with things like watch/unwatch and move. –Drilnoth (T • C • L) 22:04, 12 September 2009 (UTC)
Yes I did, but it seems I had a script that was blocking all the other scripts that I had installed, except the ones from the gadgets. I removed all the garbage from my vector and re-installed Auto-Ed. Now, all my scripts work again, thank you for all your help. warrior 4321 23:48, 12 September 2009 (UTC)

My problem with the auto-ed tab is that it has been replaced with "check". This happened a few days ago soon after I had installed auto-ed. The check tab works perfectly, so I thought check was the new default tab and changed the text in the project page. Obviously I was wrong. But in my browsers, both firefox and IE, the check tab is still there. So what's the deal? --Hans (talk) 08:15, 18 September 2009 (UTC)

Aah, you appear to be using the WikiChecker preset, which changes the Auto Ed tab to read "Check". Interestingly, you are also using two other presets; I'm surprised they don't conflict with each other... In any case, you may want to replace all three with the complete preset, since it has all the same modules as those three, in one package (I can change it for you if you'd like). ダイノガイ 千?!? · Talk⇒Dinoguy1000 19:30, 18 September 2009 (UTC)
Please do so and let me know what the advantage of the change will be? I am an absolute greenhorn, but I appreciate the possibilities of Auto Ed very much. Thank you.--Hans (talk) 07:14, 19 September 2009 (UTC)
 Done, please clear your cache to see the change. As for the advantage, you get the old "auto ed" tab text back, and you eliminate the possibility of any conflicts or other problems arising from calling Wikipedia:AutoEd/core.js three times. ダイノガイ 千?!? · Talk⇒Dinoguy1000 16:53, 19 September 2009 (UTC)

Broken on Chrome 3?

AutoEd doesn't seem to work on Google Chrome 3. Error console gives the error "Uncaught ReferenceError: autoEdUnicodify is not defined" with URL "/w/index.php?title=Wikipedia:AutoEd/complete.js&action=raw&ctype=text/javascript:20".

On a semi-not-really-related note, is there any difference between document.editform and document.forms.editform? ダイノガイ 千?!? · Talk⇒Dinoguy1000 19:26, 24 September 2009 (UTC)

I have no idea as to the answer for your first question; I don't see what would be wrong from looking at the code. For the second, I think (no guarantees) that the first one is deprecated... editform would there refer to the editform's ID, which now should be done via document.getElementById(), whereas the latter is more correct form-access syntax, using the name editform. –Drilnoth (T • C • L) 19:35, 24 September 2009 (UTC)
Yeah, I gave the code a brief glance myself, and couldn't see anything that would cause the error either. For the second, would you mind me running through and updating usages in the various script files, then (and if not, do you want forms.editform or getElementById())? ダイノガイ 千?!? · Talk⇒Dinoguy1000 20:25, 24 September 2009 (UTC)

Please allow double spaces after periods

The AutoEd whitespace cleaner seems to be collapsing double spaces after a period (full stop) into a single space. An example is this edit. However, double spaces after periods are allowed by MOS:FULLSTOP, and many editors (including myself) use them intentionally, to make where sentences end easier to find in the edit view and because it's a well-ingrained habit from the typographic convention of double spacing at the end of sentences. So I would request that this case of whitespace editing be turned off in the script. Thanks. Wasted Time R (talk) 23:58, 28 September 2009 (UTC)

Thanks for the request. I'm in the process of creating a more fine-grained whitespace algorithm. I will see if this is feasible. Plastikspork ―Œ(talk) 00:03, 29 September 2009 (UTC)
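One way such an exception could be expressed, as a sketch with a hypothetical function name and not the algorithm that was eventually written: runs of two or more spaces are collapsed to one, except directly after a period, where they are normalised to exactly two.

```javascript
// Sketch only: collapseSpacesKeepSentenceGaps is a hypothetical name.
function collapseSpacesKeepSentenceGaps(str) {
  // (\.?) captures a period immediately before the run of spaces, if any.
  return str.replace(/(\.?)[ \t]{2,}/g, function (m, dot) {
    return dot ? ".  " : " ";
  });
}
```

Note that this also keeps two spaces after periods that do not end a sentence (abbreviations, for example) and after a period at the end of a line, so a real module would probably need further conditions.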

dashes.js

I created module Wikipedia:AutoEd/dashes.js for converting hyphens to dashes. I've extensively tested it and false positives are fairly rare. Some specific tests are in my sandbox. —GregU (talk) 13:22, 2 October 2009 (UTC)

What a nice tool, a great replacement for my own script I've been using so far :) As a suggestion: maybe check for incorrect use of minus signs in place of en dashes as well? —Quibik (talk) 17:56, 2 October 2009 (UTC)
Also, in cases like "word — anotherword" I think the em dash should be replaced with en dash or possibly the spaces be removed. I believe the en dash is closer to what the editor really intended to use. —Quibik (talk) 19:16, 2 October 2009 (UTC)
Neat! Any objections to adding this to the core modules? –Drilnoth (T • C • L) 17:58, 2 October 2009 (UTC)
If you're asking me, no objections. Based on testing many random pages, I'd estimate that it falsely converts a hyphen to a dash (e.g., thinks M–N is a range when it's really a two-part number) on about 1 out of 50 pages. So maybe that will help you determine which grouping to add it to.
Quibik, I'll look at these ideas this week. So have you seen this problem with the minus signs on a number of pages? I wasn't looking for this problem in my testing. On the em dash suggestion, would like to hear more input on what should be done here. Or I could make it a user preference. So there are no cases you can think of where a spaced em dash should stay as-is? Any other ideas welcome. Maybe note on User talk:GregU/dashes.js any pages you run across where you think it could be smarter. I do plan to keep on tuning this. —GregU (talk) 04:59, 3 October 2009 (UTC)

I implemented Quibik's suggestion to check for incorrect use of minus signs, where possible. It can't really determine if a spaced minus sign is incorrect in many cases—at least not without adding a "math" heuristic. I tested on quite a few math articles to ensure no false positives were added by this. I'm not comfortable with automatically changing a spaced em dash, as I think this needs human judgement on a case-by-case basis. And this is one of the easiest things to catch by eye. I moved the script back under my user space so that I can continue to maintain it, since I'm not an admin. If using it already, please update your monobook.js to read it from User:GregU/dashes.js. Please go ahead and add it to the appropriate presets, if no objections from anyone. —GregU (talk) 00:00, 13 October 2009 (UTC)
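To give a feel for the kind of rule involved, here is a deliberately narrow sketch that converts a hyphen to an en dash only between two four-digit numbers where the second is larger, as in a year range. It is an assumption-laden illustration with a hypothetical function name and does not reproduce the heuristics in dashes.js, which are not described on this page.

```javascript
// Sketch only: yearRangeToEnDash is a hypothetical name.
function yearRangeToEnDash(str) {
  // The pair must not touch other digits or hyphens, which keeps
  // ISO dates (2005-02-19) and longer hyphenated numbers untouched.
  return str.replace(
    /(^|[^\d-])(\d{4})-(\d{4})(?![\d-])/g,
    function (m, pre, from, to) {
      // Treat it as a range only if the second number is the larger one.
      return Number(to) > Number(from) ? pre + from + "\u2013" + to : m;
    }
  );
}
```

Even a rule this small can misfire on two-part numbers that happen to look like year ranges, which is the false-positive case mentioned in the thread.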

dashes.js status?

Hello, I'm still waiting for the dashes.js module to be added to the presets, so I can begin advertising use of AutoEd to fix hyphens/dashes. I've continued testing on a daily basis and it is still working well, finding corrections in most featured articles. —GregU (talk) 03:04, 27 October 2009 (UTC)

GregU. Sorry, I think we all have been a bit busy lately. I moved your module down to the user contributed section. If you want me to add it to the main code, I can do so as well, but our general policy is to put those modules under full protection. Would that be okay? Of course, you can keep your own personal version as well! Otherwise, we can leave it in the user module section. Thanks! Plastikspork ―Œ(talk) 23:07, 19 November 2009 (UTC)

Bug/Error?

Please explain why all wikilinks to American Psychological Association were removed from the page: http://en.wikipedia.org/w/index.php?title=Homosexuality&action=historysubmit&diff=330641150&oldid=330639500 I don't see any reason why other associations are wikilinked so that readers can click on them (which is useful) and this one is not. --Destinero (talk) 10:07, 9 December 2009 (UTC)

You are probably in the wrong place as this edit was not based on a feature of AutoEd per se and so not a bug. The diff shows that there were 3 links to the American Psychological Association and the latter two were removed leaving the first link. Considering the guidance of WP:OVERLINK you might consider where a link to the association would be most appropriate rather than turning every reference into a link. If you would like to discuss further, Talk:Homosexuality would be a more relevant location.—Ash (talk) 10:19, 9 December 2009 (UTC)