User talk:Ark25/RefScript

From Wikipedia, the free encyclopedia

More about the script:

If the script doesn't work with a particular link from a website it knows, that doesn't mean the script fails for the whole website. The problem is that big newspapers and websites have more than one way of building their webpages, and the script doesn't know all of those formats. From what I have noticed, big websites like the BBC or the New York Times have at least three or four different ways of formatting their news. If a link fails, report it here so I can improve the script.

Reference names

If you want to personalize the reference names, so they don't clash with references added by someone else using this script in the same article on the same day, change the second line in the script from var User_Prefix = ""; to var User_Prefix = "MyPrefix";

Where MyPrefix can be your user name for example, or some other distinct signature of yours.

And then, the reference will look like this:

<ref name="MyPrefix_BBC_2013-09-19c">{{cite web |url=http://www.bbc.co.uk/news/science-environment-23814524 |title=Sea otter return boosts ailing seagrass in California |newspaper=BBC |date= 26 August 2013 |last=Suzi Gage |accessdate=2013-09-19}}</ref>

instead of looking like this:

<ref name="BBC_2013-09-19c">{{cite web |url=http://www.bbc.co.uk/news/science-environment-23814524 |title=Sea otter return boosts ailing seagrass in California |newspaper=BBC |date= 26 August 2013 |last=Suzi Gage |accessdate=2013-09-19}}</ref>

Date formatting

If you need the "accessdate" to look like "September 19, 2013" (US style) instead of "2013-09-19", then you have to change the third line in the script from var Date_Format = ""; into var Date_Format = "US";

And then, the reference will look like this:

<ref name="BBC_2013-09-19c">{{cite web |url=http://www.bbc.co.uk/news/science-environment-23814524 |title=Sea otter return boosts ailing seagrass in California |newspaper=BBC |date= 26 August 2013 |last=Suzi Gage |accessdate=September 19, 2013}}</ref>


If you need the "accessdate" to look like "19 September 2013" (UK style) instead of "2013-09-19", then you have to change the third line in the script from var Date_Format = ""; into var Date_Format = "UK";

And then, the reference will look like this:

<ref name="BBC_2013-09-19c">{{cite web |url=http://www.bbc.co.uk/news/science-environment-23814524 |title=Sea otter return boosts ailing seagrass in California |newspaper=BBC |date= 26 August 2013 |last=Suzi Gage |accessdate=19 September 2013}}</ref>
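The three date styles described above can be sketched as a small helper. This is only an illustration, not code taken from the script: the function name formatAccessDate and the MONTHS array are hypothetical, and the sketch assumes that an empty Date_Format means YYYY-MM-DD, as the text above says.

```javascript
// Hypothetical helper, not part of RefScript: formats a Date according to
// the Date_Format setting described above ("US", "UK", or "" for YYYY-MM-DD).
var MONTHS = ["January", "February", "March", "April", "May", "June",
              "July", "August", "September", "October", "November", "December"];

function formatAccessDate(date, dateFormat) {
  // UTC getters are used so the result does not depend on the local time zone.
  var year = date.getUTCFullYear();
  var month = date.getUTCMonth(); // 0-based: 0 = January
  var day = date.getUTCDate();

  if (dateFormat === "US") {
    return MONTHS[month] + " " + day + ", " + year;
  }
  if (dateFormat === "UK") {
    return day + " " + MONTHS[month] + " " + year;
  }
  // Any other value falls back to YYYY-MM-DD with zero padding.
  function pad(n) { return (n < 10 ? "0" : "") + n; }
  return year + "-" + pad(month + 1) + "-" + pad(day);
}
```

A real bookmarklet would probably use the reader's local date instead of UTC; UTC is chosen here only to keep the sketch deterministic.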

Test links

BBC

no author:

one author:

others:

comma in date:

Daily Mail

no author:

one author:

two authors:

four authors:

wrong formatting, missing fields:

Daily Mirror

no author:

one author:

Daily Telegraph

no author:

one author:

videos:

The Guardian

One author:

Two authors:

missing fields:

The Independent

No author:

One author:

The Register

One author:

ZDNet

No author:

One author:

Huffington Post

one author:


Huffington Post Canada

no author:

one author:

New York Times

no author:

one author:

The Washington Post

one author:

to work at:

The Boston Globe

one author:

two authors:

The Times of India

no author:

one author:

Business Week

one author:

two authors:

Financial Times

On the Financial Times website you can read only a few articles for free, then you must pay.

one author:

three authors:

The Economist

On The Economist website you can read only a few articles for free, then you must pay.

Series of articles: http://www.economist.com/topics/aquatic-animals?page=4

Wall Street Journal

blogs.wsj.com
online.wsj.com
Video
Images
pages with errors

Ars Technica

one author

two authors:

three authors:

TG Daily

YouTube

Other samples

posted, updated:

First Posted: 09/16/11 03:51 PM ET - Updated: 11/16/11 05:12 AM ET
Author name in capital letters

Not working links

Other bugs

Author name mangled by the Title Case function:

Sample references generated

Here are a few examples of generated references:

  • Reference generated with default settings:

<ref name="MyUser_The_Washington_Post_May_12_2014c">{{cite web |url=http://www.washingtonpost.com/wp-dyn/content/article/2008/07/18/AR2008071803258.html |title=Whale Advocates Gain Victory |newspaper=The Washington Post |date= July 19, 2008 |author=David A. Fahrenthold |accessdate=May 12, 2014}}</ref>

David A. Fahrenthold (July 19, 2008). "Whale Advocates Gain Victory". The Washington Post. Retrieved May 12, 2014.
  • UK-style date: - (var Date_Format = "UK";)

<ref name="MyUser_The_Washington_Post_12_May_2014c">{{cite web |url=http://www.washingtonpost.com/wp-dyn/content/article/2008/07/18/AR2008071803258.html |title=Whale Advocates Gain Victory |newspaper=The Washington Post |date= 19 July 2008 |author=David A. Fahrenthold |accessdate=12 May 2014}}</ref>

David A. Fahrenthold (19 July 2008). "Whale Advocates Gain Victory". The Washington Post. Retrieved 12 May 2014.
  • Date in YYYY-MM-DD format: - (var Date_Format = "YMD";)

<ref name="MyUser_The_Washington_Post_2014-05-12c">{{cite web |url=http://www.washingtonpost.com/wp-dyn/content/article/2008/07/18/AR2008071803258.html |title=Whale Advocates Gain Victory |newspaper=The Washington Post |date= 2008-07-19 |author=David A. Fahrenthold |accessdate=2014-05-12}}</ref>

David A. Fahrenthold (2008-07-19). "Whale Advocates Gain Victory". The Washington Post. Retrieved 2014-05-12.
Article publication date in the reference name - (var Ref_Name_Date_Publication_Date="Yes";)
  • US-style date: - (var Date_Format = "US";)

<ref name="MyUser_The_Washington_Post_July_19_2008c">{{cite web |url=http://www.washingtonpost.com/wp-dyn/content/article/2008/07/18/AR2008071803258.html |title=Whale Advocates Gain Victory |newspaper=The Washington Post |date= July 19, 2008 |author=David A. Fahrenthold |accessdate=May 12, 2014}}</ref>

David A. Fahrenthold (July 19, 2008). "Whale Advocates Gain Victory". The Washington Post. Retrieved May 12, 2014.
  • UK-style date: - (var Date_Format = "UK";)

<ref name="MyUser_The_Washington_Post_19_July_2008c">{{cite web |url=http://www.washingtonpost.com/wp-dyn/content/article/2008/07/18/AR2008071803258.html |title=Whale Advocates Gain Victory |newspaper=The Washington Post |date= 19 July 2008 |author=David A. Fahrenthold |accessdate=12 May 2014}}</ref>

David A. Fahrenthold (19 July 2008). "Whale Advocates Gain Victory". The Washington Post. Retrieved 12 May 2014.
  • Date in YYYY-MM-DD format: - (var Date_Format = "YMD";)

<ref name="MyUser_The_Washington_Post_2008-07-19c">{{cite web |url=http://www.washingtonpost.com/wp-dyn/content/article/2008/07/18/AR2008071803258.html |title=Whale Advocates Gain Victory |newspaper=The Washington Post |date= 2008-07-19 |author=David A. Fahrenthold |accessdate=2014-05-12}}</ref>

David A. Fahrenthold (2008-07-19). "Whale Advocates Gain Victory". The Washington Post. Retrieved 2014-05-12.
Short reference name - (var Ref_Name_Short = "Yes";)

<ref name="MyUser_TWP_May_12_2014c">{{cite web |url=http://www.washingtonpost.com/wp-dyn/content/article/2008/07/18/AR2008071803258.html |title=Whale Advocates Gain Victory |newspaper=The Washington Post |date= July 19, 2008 |author=David A. Fahrenthold |accessdate=May 12, 2014}}</ref>

David A. Fahrenthold (July 19, 2008). "Whale Advocates Gain Victory". The Washington Post. Retrieved May 12, 2014.
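The reference names in the samples above seem to follow one pattern: optional user prefix, publication name (or its initials when Ref_Name_Short is "Yes"), then the date, joined with underscores. A minimal sketch of that pattern follows. The function names buildRefName and initials are hypothetical and this is not the script's actual code. The trailing "c" appears in every sample, but its meaning is not stated here, so the sketch simply copies it.

```javascript
// Hypothetical sketch of how the sample reference names could be assembled.

// "The Washington Post" -> "TWP"
function initials(publication) {
  return publication.split(/\s+/).map(function (word) {
    return word.charAt(0).toUpperCase();
  }).join("");
}

function buildRefName(userPrefix, publication, dateText, shortName) {
  var parts = [];
  if (userPrefix) {
    parts.push(userPrefix); // an empty prefix adds nothing, not even "_"
  }
  parts.push(shortName ? initials(publication) : publication);
  parts.push(dateText);
  // Commas are dropped and whitespace becomes underscores, which turns
  // "May 12, 2014" into "May_12_2014".
  var name = parts.join("_").replace(/,/g, "").replace(/\s+/g, "_");
  // Assumption: the trailing "c" is copied from the samples as-is.
  return name + "c";
}
```

For example, buildRefName("MyUser", "The Washington Post", "May 12, 2014", true) yields the short name shown in the last sample, MyUser_TWP_May_12_2014c.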

Date contained in "<meta>" tags

<meta  name="DISPLAYDATE" content="May 31, 2007">

Nice tags

<span class='byline-text'>July 12, 2012</span>
<span class="time">May 19, 2010 7:10 pm</span>
<span class="date" data-time="1334669580">Apr 17, 2012 1:33 pm UTC</span>
<span class="meta-date">Posted <time datetime="2011-07-12T14:43Z">July 12, 2011 - 10:43</time></span>
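Two of the tags above carry a machine-readable value next to the visible text: the datetime attribute of the time element and the data-time attribute of the span. A sketch of how such values could be read is below. The function name dateFromTagAttributes is hypothetical, and treating data-time as a Unix timestamp in seconds is an assumption, although it agrees with the visible text of that sample (1334669580 corresponds to Apr 17, 2012, 13:33 UTC).

```javascript
// Hypothetical helper: turns the machine-readable attributes of a date tag
// into a Date object. "attrs" is a plain object of attribute name -> value.
function dateFromTagAttributes(attrs) {
  // Prefer an ISO-style "datetime" attribute, as on <time datetime="...">.
  if (attrs.datetime) {
    var parsed = new Date(attrs.datetime);
    if (!isNaN(parsed.getTime())) {
      return parsed;
    }
  }
  // Assumption: "data-time" holds a Unix timestamp in seconds.
  var stamp = attrs["data-time"];
  if (stamp && /^\d+$/.test(stamp)) {
    return new Date(Number(stamp) * 1000);
  }
  // Nothing machine-readable found; the visible text must be parsed instead.
  return null;
}
```

Tags that only contain visible text, such as the first two samples, would still need per-site text parsing.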

Reference Standard

In order to make it possible to generate references for newspaper articles in one click, we need a tiny standard to be defined and then used by the newspapers.

Let's analyze the following newspaper article:

In order to generate the reference in one click, we need a bookmarklet (a JavaScript program) that is capable of capturing the following fields:

  • Title: Elephants recognise human voices
  • Publication date: 10 March 2014
  • Author(s) name(s): Victoria Gill
  • Publication name: BBC

We already have such a script at User:Ark25/RefScript, but because the standard doesn't exist, maintaining the script is a mess.

Now, let's take a look at how the fields are encoded in the webpage:

  • <title>BBC News - Elephants recognise human voices</title>
  • <span class="date">10 March 2014</span>
  • <span class="byline-name">By Victoria Gill</span>
  • the name of the publication is not encoded in the webpage, because it can be deduced from the URL (http://www.bbc.com/news/..)


Now let's go to another publication:

The fields:

  • <title>Speak Whale to Me - New York Times</title>
  • <div class="timestamp">Published: May 31, 2007</div>
  • <div class="byline">By DAVID ROTHENBERG</div>


As you can see, two different publications have two different ways of specifying those fields (title, publication date, author name). That means our script has to learn how each site encodes those fields. What makes things worse is that the same publication often uses five or six different ways to encode them. Even worse, the newspapers constantly change the way they encode those fields, every three years or so. That means the script always has to be updated. It also means the script has to be enormous in order to handle, let's say, the 20 biggest newspapers from every country.
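To illustrate why the script grows with every site, here is a minimal sketch of per-site rules for the two examples above. It is not the script's actual code: SITE_RULES, titleCase and extractFields are hypothetical names, the hostname www.nytimes.com is an assumption, and the raw strings are assumed to have been read from the page already.

```javascript
// Hypothetical per-site cleaning rules for the two pages shown above.
function titleCase(text) {
  // Naive: lowercases everything, then capitalizes each word's first letter.
  return text.toLowerCase().replace(/\b[a-z]/g, function (c) {
    return c.toUpperCase();
  });
}

var SITE_RULES = {
  "www.bbc.com": {
    publication: "BBC",
    cleanTitle: function (t) { return t.replace(/^BBC News - /, ""); },
    cleanDate: function (d) { return d; },
    cleanAuthor: function (a) { return a.replace(/^By\s+/, ""); }
  },
  "www.nytimes.com": { // hostname assumed for illustration
    publication: "New York Times",
    cleanTitle: function (t) { return t.replace(/ - New York Times$/, ""); },
    cleanDate: function (d) { return d.replace(/^Published:\s*/, ""); },
    cleanAuthor: function (a) { return titleCase(a.replace(/^By\s+/, "")); }
  }
};

// raw = { title, date, author } exactly as found in the page markup.
function extractFields(hostname, raw) {
  var rule = SITE_RULES[hostname];
  if (!rule) {
    return null; // unknown site: a new rule would have to be written
  }
  return {
    title: rule.cleanTitle(raw.title),
    date: rule.cleanDate(raw.date),
    author: rule.cleanAuthor(raw.author),
    publication: rule.publication
  };
}
```

Every additional site, and every alternative layout of a known site, needs another entry like these. The naive titleCase also shows how author names can get mangled: a name such as "MCDONALD" would come out as "Mcdonald".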


Now, if there were a tiny standard for encoding those fields, the script would be very short and it would work with any site that implements the standard. The standard, let's call it "Ref Standard", defines the fields like this:

  • <span class="Article title">Elephants recognise human voices</span>
  • <span class="Publication date">10 March 2014</span>
  • <span class="Date style">UK</span>
  • <span class="Authors names">Victoria Gill</span>
  • <span class="Publication name">BBC News</span>

As far as I can tell, using <span> tags is much better than using <meta>, because the bookmarklets are not capable of capturing the <meta> tags.

This is "Ref Standard A". The standard can have more sophisticated versions, such as "Ref Standard B", in which many other fields can be defined. But for us to be able to generate references in one click, "Ref Standard A" is more than enough.

Once a few newspapers and websites implement the "Ref Standard A", the script will work without any need to update it or to teach it how to treat every such individual website.
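With "Ref Standard A" in place, the whole extraction step could shrink to something like the sketch below. It is an illustration only: readRefStandard, buildCiteWeb and REF_STANDARD_CLASSES are hypothetical names, and the |author= parameter is borrowed from the sample references on this page. The attribute selector span[class="..."] is used because HTML treats a class value containing a space as several separate classes, so an ordinary class selector would not match the full name.

```javascript
// Hypothetical reader for the "Ref Standard A" fields described above.
var REF_STANDARD_CLASSES = {
  title: "Article title",
  date: "Publication date",
  dateStyle: "Date style",
  authors: "Authors names",
  publication: "Publication name"
};

// "doc" is the page's document object (or anything with querySelector).
function readRefStandard(doc) {
  var fields = {};
  Object.keys(REF_STANDARD_CLASSES).forEach(function (key) {
    var selector = 'span[class="' + REF_STANDARD_CLASSES[key] + '"]';
    var element = doc.querySelector(selector);
    fields[key] = element ? element.textContent.trim() : "";
  });
  return fields;
}

// Builds the wikitext in the same shape as the sample references.
function buildCiteWeb(fields, url, accessDate) {
  return "{{cite web |url=" + url +
         " |title=" + fields.title +
         " |newspaper=" + fields.publication +
         " |date=" + fields.date +
         " |author=" + fields.authors +
         " |accessdate=" + accessDate + "}}";
}
```

In a bookmarklet this would be called as buildCiteWeb(readRefStandard(document), location.href, someAccessDate), where someAccessDate is whatever date string the user's settings produce. No per-site knowledge is needed.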

The standard already exists (above). If needed, it can be improved quickly and easily. Now we only need the WMF board to convince a few newspapers to implement the standard, and then off we go. —  Ark25  (talk) 00:40, 18 September 2014 (UTC)

See also

User feedback

To Benefac and to all others: If you have any suggestions to improve the script, please let me know. Or if you don't like something. Or if you have any other kind of comment. The best way to do that is to leave a comment on my talk page. Thanks. —  Ark25  (talk) 17:04, 4 August 2014 (UTC)

Discussions about the script

Wikipedia articles with poorly formatted references

Test

[1]

  1. ^ "Seal filmed swimming in flooded Cambridgeshire fens". BBC. {{subst:date|2013-01-02}}. Retrieved {{subst:date|2014-10-15}}. {{cite web}}: Check date values in: |accessdate= and |date= (help)