User talk:Robert.Allen/Draft

From Wikipedia, the free encyclopedia

The first part of the following section was copied from [1]

Unhyphenated ISBNs

Hi Michael, Many editors prefer the hyphenated form of the ISBNs, but they do not show up as targets when searching for pages citing particular ISBNs when using the unhyphenated form as the search string. I have been unable to persuade other editors that the unhyphenated form might be preferable. Do you know of a way to add the unhyphenated form to pages so it will be found by search engines, but not be visible to readers? Then in my view it would not matter as much, if we use the hyphens.

I tried adding an anchor with the unhyphenated form (see User:Robert.Allen/Busoni revamp), but a Google site search did not find it. The Google site search with the hyphenated form found the displayed hyphenated target, but with the unhyphenated string it did not find the hidden one. Using the Wikipedia Help search box, I was able to find both, and the hyphenated form seemed to work better in quotes, i.e., it did not pick up irrelevant results.

Some sample searches:

  • Wikipedia Help search of User space:
    • 9780828825863 [2]
    • 978-0-8288-2586-3 [3]
    • "978-0-8288-2586-3" [4]
  • Google searches of en.wikipedia.org:
    • 9780828825863 [5]
    • 978-0-8288-2586-3 [6]
    • "978-0-8288-2586-3" [7]

I would like for it also to work with Google, Yahoo, etc. --Robert.Allen (talk) 00:53, 19 February 2012 (UTC)

It can be concluded from your experiments that Google doesn't find HTML code created by {{Anchor}}, which in this case is <span id="9780828825863"></span>. I don't find that surprising. You could instead try hidden text generated by {{Hid}}, which is subtly different and should produce <span style="display:none">9780828825863</span>. I have no idea whether this will work, but it's a different approach. -- Michael Bednarek (talk) 06:23, 19 February 2012 (UTC)
So this way it is not a parameter value. Thanks for the suggestion. I'll try to test it soon. (This is like a citation index issue. It may not be particularly important. I just feel that if we can, we ought to try to make it a bit easier, by reducing the complexity of the searches needed to find them. Unhyphenated ISBNs are all over the place. The hyphenated ones are quite a bit more difficult to come across unless you know where to look.) --Robert.Allen (talk) 08:47, 20 February 2012 (UTC)
Hello Michael, I made the change more than 24 hrs ago, and now Google seems to be finding the hidden ISBN. Just what we were looking for. Thanks! I will try to test some other search engines to see if this works generally. I'm going to assume it will, so the question arises as to how this approach might be implemented. My inclination would be to try to get the people who write the Wikisoftware that recognizes ISBNs to insert the unhyphenated hidden one when the page gets saved as HTML. Then I assume it would not appear in the editable text of the page, and these additional search targets would be completely automatic and hidden from editors. The question then also arises whether pages with displayed ISBN-10s could easily acquire hidden unhyphenated ISBN-13s as well as hidden unhyphenated ISBN-10s. Or vice versa for that matter: could pages with displayed ISBN-13s easily acquire both hidden unhyphenated ISBN-13s and, if relevant, hidden unhyphenated ISBN-10s? (It is my impression that there is a simple calculation to interconvert the unhyphenated forms from 10 to 13 and from 13 to 10.) What do you think? I don't even know how one proposes such a thing, or whether it would get serious consideration or not. If not, then a bot which inserts these hidden ISBNs into the editable text may be the only approach. The drawback with this second option is that it might encounter a reasonable amount of opposition. The hidden ISBNs might regularly get deleted by editors who find them a nuisance, and I would not necessarily disagree, because all this extra text would clutter up the edit window and make editing more difficult. You are probably way more knowledgeable about this sort of thing than I am. Does any of this make sense to you? Does this seem like a good idea even? --Robert.Allen (talk) 09:24, 23 February 2012 (UTC)
I think having the Wikisoftware insert unhyphenated ISBNs as hidden text into its HTML output would be a great idea, much preferable to the other solutions for the reasons you point out. The area where such a proposal could initially be raised is probably Wikipedia:Village pump (technical), although I suspect it will eventually have to go to WP:BUGZILLA/bugzilla. As for the algorithms involved in ISBNs, I'm completely ignorant. -- Michael Bednarek (talk) 10:26, 23 February 2012 (UTC)
If a proposal is to be made, maybe we can suggest preventing wrapping of the hyphenated ISBNs at the same time (i.e., the HTML equivalent of "{{Nowrap|ISBN 978-0-8288-2586-3}}"). I will think about this some more and try to write a proposal on one of my user pages. Perhaps when I get it worked up you will be willing to read it and make suggestions. BTW, one drawback I have noticed is that my browser (Safari) does not find the hidden text when I search the page. I'm wondering whether this is the case for most browsers. Is there any way around this problem? Or can we just ignore it? --Robert.Allen (talk) 19:03, 23 February 2012 (UTC)
I'm pretty sure every browser will find only visible text, and sometimes not even that (if it's in different frames). -- Michael Bednarek (talk) 07:28, 24 February 2012 (UTC)

Example of how it might work

This ISBN-13:

  • ISBN 978-0-8288-2586-3

might produce the following page source code:

  • <span class="nowrap">978-0-8288-2586-3</span><span style="display:none">9780828825863 0828825866</span>

Would this work? --Robert.Allen (talk) 20:17, 19 March 2012 (UTC)

I see I left out the link to Special:Book sources. I need to add that. --Robert.Allen (talk) 20:25, 19 March 2012 (UTC)

Is this better?

  • <a href="/wiki/Special:BookSources/9780828825863" class="internal mw-magiclink-isbn"><span class="nowrap">978-0-8288-2586-3</span><span style="display:none">9780828825863 0828825866</span></a>

--Robert.Allen (talk) 20:32, 19 March 2012 (UTC)
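
For illustration only, here is a minimal Python sketch that assembles the markup proposed above from a displayed ISBN and a list of hidden search targets. The function name `render_isbn_link` is hypothetical, and this is not how MediaWiki itself generates ISBN links (that code is PHP); the sketch only shows the intended shape of the output.

```python
from html import escape


def render_isbn_link(displayed: str, hidden_forms: list[str]) -> str:
    """Build the proposed link: a visible no-wrap ISBN plus hidden search targets.

    `displayed` is the hyphenated ISBN shown to readers; `hidden_forms` are
    the extra strings (for example the unhyphenated ISBN-13 and ISBN-10)
    that should be findable by search engines but not visible on the page.
    """
    # The Special:BookSources target uses the unhyphenated form.
    target = displayed.replace("-", "")
    hidden = " ".join(hidden_forms)
    return (
        f'<a href="/wiki/Special:BookSources/{escape(target)}" '
        'class="internal mw-magiclink-isbn">'
        f'<span class="nowrap">{escape(displayed)}</span>'
        f'<span style="display:none">{escape(hidden)}</span></a>'
    )


print(render_isbn_link("978-0-8288-2586-3", ["9780828825863", "0828825866"]))
```

The printed string should match the second example above character for character.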

This is the trick I have used successfully at Learning Perl, for the search issue. As for the no-wrap, I have never seen an ISBN break on hyphens, and my testing failed to reproduce it. The class used is like this: <a href="/wiki/Special:BookSources/9780828825863" class="internal mw-magiclink-isbn">ISBN 978-0-8288-2586-3</a>. Rich Farmbrough, 02:09, 20 March 2012 (UTC).
Ah, you had already found the class. If you can find the style sheet, I suspect it includes some no-wrap functionality. Rich Farmbrough, 02:34, 20 March 2012 (UTC).

Example with ISBN-10

This ISBN-10:

  • ISBN 0-8288-2586-6

would produce the following page source code:

  • <a href="/wiki/Special:BookSources/0828825866" class="internal mw-magiclink-isbn"><span class="nowrap">0-8288-2586-6</span><span style="display:none">9780828825863 0828825866</span></a>

--Robert.Allen (talk) 20:42, 19 March 2012 (UTC)

Converting ISBN-10 to ISBN-13

This ISBN-10:

  • ISBN 0-8288-2586-6

can be converted to an ISBN-13 using the following steps:

  • remove the final check digit
  • add the "978-" prefix
  • calculate the new check digit using this algorithm:

The calculation of an ISBN-13 check digit begins with the first 12 digits of the thirteen-digit ISBN (thus excluding the check digit itself). Each digit, from left to right, is alternately multiplied by 1 or 3, then those products are summed modulo 10 to give a value ranging from 0 to 9. Subtracted from 10, that leaves a result from 1 to 10. A zero (0) replaces a ten (10), so, in all cases, a single check digit results.

s = 9×1 + 7×3 + 8×1 + 0×3 + 8×1 + 2×3 + 8×1 + 8×3 + 2×1 + 5×3 + 8×1 + 6×3
  =   9 +  21 +   8 +   0 +   8 +   6 +   8 +  24 +   2 +  15 +   8 +  18
  = 127
127 / 10 = 12 remainder 7
10 –  7 = 3

--Robert.Allen (talk) 21:08, 19 March 2012 (UTC)
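
The steps and the worked calculation above can be sketched in Python roughly as follows. The function names are hypothetical and chosen only for this illustration; the sketch assumes the input is a well-formed ISBN-10 and does not verify its original check digit.

```python
def isbn13_check_digit(first12: str) -> str:
    """Return the ISBN-13 check digit for the first 12 digits."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Digits are alternately multiplied by 1 and 3, from left to right.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    # Subtract the remainder from 10; a result of 10 becomes 0.
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 (hyphens optional) to an unhyphenated ISBN-13."""
    chars = isbn10.replace("-", "").upper()
    if len(chars) != 10:
        raise ValueError("expected 10 characters after removing hyphens")
    # Remove the final check digit and add the 978 prefix.
    body = "978" + chars[:9]
    return body + isbn13_check_digit(body)


print(isbn10_to_isbn13("0-8288-2586-6"))  # 9780828825863
```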

I wrote a template to do this back in the day, you might want to search for it. Rich Farmbrough, 02:11, 20 March 2012 (UTC).
On the other hand it's probably not much use for this. Rich Farmbrough, 02:14, 20 March 2012 (UTC).

Converting ISBN-13 to ISBN-10

To be added

Can't be done if the 13 digit number starts with 979. If it starts with 978, take the 9 digits that follow the 978 prefix and calculate the ISBN-10 check digit from them. Rich Farmbrough, 02:13, 20 March 2012 (UTC).
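
A possible Python sketch of the reverse conversion described above. It assumes the standard ISBN-10 check digit rule, which is not spelled out on this page: the nine digits are weighted 10 down to 2, the sum is taken modulo 11, and a check value of 10 is written as "X". The function names are hypothetical.

```python
def isbn10_check_digit(first9: str) -> str:
    """Return the ISBN-10 check character for the first 9 digits."""
    if len(first9) != 9 or not (first9.isascii() and first9.isdigit()):
        raise ValueError("expected exactly 9 digits")
    # Weights run 10, 9, ..., 2 from left to right.
    total = sum(int(d) * w for d, w in zip(first9, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_to_isbn10(isbn13: str) -> str:
    """Convert a 978-prefixed ISBN-13 (hyphens optional) to an unhyphenated ISBN-10."""
    digits = isbn13.replace("-", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected 13 digits after removing hyphens")
    if not digits.startswith("978"):
        # 979-prefixed numbers have no ISBN-10 equivalent.
        raise ValueError("only 978-prefixed ISBN-13s can be converted")
    body = digits[3:12]  # the 9 digits between the prefix and the check digit
    return body + isbn10_check_digit(body)


print(isbn13_to_isbn10("9780828825863"))      # 0828825866
print(isbn13_to_isbn10("978-0-19-518954-4"))  # 019518954X
```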

Creating the missing hyphenated form

The (right aligned) hyphenation is consistent between the two formats, 13 and 10 digit. Therefore adding the 13 digit or 10 digit hyphenation is trivial.

Rich Farmbrough, 17:37, 20 March 2012 (UTC).

Hidden text
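
As a rough sketch of why this is trivial: once one hyphenated form is known, the other can be produced by keeping the middle groups, adding or dropping the 978 group, and recomputing the check digit. The Python below assumes a fully hyphenated input (five groups for ISBN-13, four for ISBN-10); all function names are hypothetical.

```python
def _check13(first12: str) -> str:
    # ISBN-13 check digit: weights alternate 1, 3; result taken modulo 10.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def _check10(first9: str) -> str:
    # ISBN-10 check character: weights 10 down to 2; modulo 11; 10 becomes "X".
    total = sum(int(d) * w for d, w in zip(first9, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def hyphenated_13_to_10(isbn13: str) -> str:
    """Turn e.g. 978-0-8288-2586-3 into 0-8288-2586-6, keeping the hyphenation."""
    groups = isbn13.split("-")
    if len(groups) != 5 or groups[0] != "978":
        raise ValueError("expected a five-group ISBN-13 starting with 978")
    middle = groups[1:4]  # registration group, registrant, publication
    if len("".join(middle)) != 9:
        raise ValueError("expected 9 digits between prefix and check digit")
    return "-".join(middle + [_check10("".join(middle))])


def hyphenated_10_to_13(isbn10: str) -> str:
    """Turn e.g. 0-19-518954-X into 978-0-19-518954-4, keeping the hyphenation."""
    groups = isbn10.split("-")
    if len(groups) != 4:
        raise ValueError("expected a four-group ISBN-10")
    middle = groups[:3]
    if len("".join(middle)) != 9:
        raise ValueError("expected 9 digits before the check digit")
    return "-".join(["978"] + middle + [_check13("978" + "".join(middle))])


print(hyphenated_13_to_10("978-0-8288-2586-3"))  # 0-8288-2586-6
print(hyphenated_10_to_13("0-19-518954-X"))      # 978-0-19-518954-4
```

No hyphenation table is consulted here; the group boundaries are simply copied from the form that is already hyphenated.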

Possibly should be:

ISBN 1000000000 ISBN 1-00-000000-0 ISBN 9781000000000 ISBN 978-1-00-000000-0

for simplicity

or

ISBN 1000000000 ISBN 9781000000000 ISBN 978-1-00-000000-0 (if we use the 10 digit in the text)

ISBN 1000000000 ISBN 1-00-000000-0 ISBN 9781000000000 (if we use the 13 digit in the text)


This provides both the human-readable (hyphenated) and the machine-readable (unhyphenated) versions, in both 10 and 13 digits, as well as the "ISBN" prefix for more accurate searching.

Rich Farmbrough, 17:37, 20 March 2012 (UTC).

I was under the impression that adding hyphens to unhyphenated ISBNs required a web lookup that would significantly slow execution. However, if we assume that a bot has already added hyphens to the ISBN in the edit window text, then you are saying that adding the hyphenated 10 or 13 alternative format is trivial. We could certainly do that. --Robert.Allen (talk) 20:12, 20 March 2012 (UTC)
Yes, that's what I'm saying. In fact ISBN recommend storing the number unhyphenated and hyphenating at presentation, but this is inefficient for Wikipedia, and would require maintenance of the hyphenation table (or tree). Rich Farmbrough, 08:25, 21 March 2012 (UTC).


Code

Code needs to go in includes/parser/Parser.php. Rich Farmbrough, 08:27, 21 March 2012 (UTC).

Sorry, not sure what you mean by this. --Robert.Allen (talk) 08:54, 21 March 2012 (UTC)

Wrapping is browser specific

Wrapping appears to be browser dependent. On the page Samson and Delilah (opera), I'm not seeing wrapping in Firefox, but I do see it in Safari. Here's the source code from Firefox:

  • <a href="/wiki/International_Standard_Book_Number" title="International Standard Book Number">ISBN</a> <a href="/wiki/Special:BookSources/978-0-19-518954-4" title="Special:BookSources/978-0-19-518954-4">978-0-19-518954-4</a><span class="printonly">

Here it is from Safari:

  • <a href="/wiki/International_Standard_Book_Number" title="International Standard Book Number">ISBN</a> <a href="/wiki/Special:BookSources/978-0-19-518954-4" title="Special:BookSources/978-0-19-518954-4">978-0-19-518954-4</a><span class="printonly">

It looks identical to me, but behaves differently in the two browsers. --Robert.Allen (talk) 21:05, 24 March 2012 (UTC)

How to deal with book templates

An alternative would be to add the hidden ISBNs so they are visible in the edit window:

  • This does not work:
    • {{cite book |title=French Opera at the Fin de Siecle: Samson and Delilah |url=http://books.google.com/books?id=KSQGZOTQKmwC&pg=PA206&dq=%22Saint-Saens+on+the+Cusp%22 |last=Huebner |first=Steven |year=2006 |publisher=Oxford Univ. Press, US |isbn=978-0-19-518954-4{{Hid|9780195189544 0-19-518954-X 019518954X}}}}
    • Huebner, Steven (2006). French Opera at the Fin de Siecle: Samson and Delilah. Oxford Univ. Press, US. ISBN [[Special:BookSources/978-0-19-518954-4|978-0-19-518954-4<span data-sort-value="9780195189544 0-19-518954-X 019518954X"></span>]]. {{cite book}}: Check |isbn= value: invalid character (help)
  • But this does:
    • {{cite book |title=French Opera at the Fin de Siecle: Samson and Delilah |url=http://books.google.com/books?id=KSQGZOTQKmwC&pg=PA206&dq=%22Saint-Saens+on+the+Cusp%22 |last=Huebner |first=Steven |year=2006 |publisher=Oxford Univ. Press, US |isbn=978-0-19-518954-4}}{{Hid|9780195189544 0-19-518954-X 019518954X}}
    • Huebner, Steven (2006). French Opera at the Fin de Siecle: Samson and Delilah. Oxford Univ. Press, US. ISBN 978-0-19-518954-4.

--Robert.Allen (talk) 19:39, 24 March 2012 (UTC)

  • Of course, this does not solve the wrapping problem in Safari. Maybe it would be possible to change Template:Cite book so that it would be compatible with the Wiki ISBN Magic. --Robert.Allen (talk) 19:53, 24 March 2012 (UTC)
  • Another thought that occurs to me: a parameter called "alt_isbn" could be added to Template:Cite book, which your bot could use to add the alternative ISBN 10 or 13. Then the template could be redesigned to add the hidden unhyphenated and alternate ISBNs and also wrap the displayed ISBN in <span class="nowrap">...</span>. --Robert.Allen (talk) 20:09, 24 March 2012 (UTC)
  • Are there any other templates which add ISBNs? --Robert.Allen (talk) 20:28, 24 March 2012 (UTC)
  • There are, but they all render the ISBN using wikimagic. And there is consensus that this is the best way to do it. Certainly having ISBN link to the article is not following principle of least surprise. So as long as they do this, the proposal deals with all templated ISBNs too. Rich Farmbrough, 23:26, 24 March 2012 (UTC).

At least three of these templates do not use wikimagic

I tested three of the above templates using fake ISBNs which Wikimagic ignores (note that they do not turn blue until they are added to the template)

  • Results with Template:Cite book using fake ISBN XXXXXXXXXX
  • Results with Template:Citation using fake ISBN YYYYYYYYYY
    • Turner, Orsamus (1851), History of the pioneer settlement of Phelps and Gorham's purchase, and Morris' reserve, Rochester, New York: William Alling, ISBN YYYYYYYYYY
    • I should reproduce the entire HTML code for this one, but can't because this page will get too wide, but here is a small part which displays the ISBN (you can view the entire code in the HTML source code window of the browser):
      • <a href="/wiki/International_Standard_Book_Number" title="International Standard Book Number">ISBN</a> <a href="/wiki/Special:BookSources/YYYYYYYYYY" title="Special:BookSources/YYYYYYYYYY">YYYYYYYYYY</a>
  • Results with Template:ISBNT using fake ISBN ZZZZZZZZZZ
    • ZZZZZZZZZZ
    • HTML code is:
      • <a href="/wiki/Special:BookSources/ZZZZZZZZZZ" title="Special:BookSources/ZZZZZZZZZZ">ZZZZZZZZZZ</a>

Conclusion: I think we could safely ignore template ISBNT, but in the long run, "Cite book" should probably not be ignored, and probably also template "Citation". I feel that persuading the authors of these two templates to change them so that the hidden targets can be added may be a difficult goal. We could focus on ISBNs that are formatted by Wikimagic first, and if that can be accomplished, then the authors of the templates may possibly decide to modify their code somehow to replicate it. --Robert.Allen (talk) 09:59, 25 March 2012 (UTC)

Yes, you are quite right, ISBN 10 and ISBN 13 should not really be used. The Cite family should be fixed... Rich Farmbrough, 05:49, 26 March 2012 (UTC).