Module talk:PopulationFromWikidata

WikiProject Australia (Template-class)
PopulationFromWikidata is within the scope of WikiProject Australia, which aims to improve Wikipedia's coverage of Australia and Australia-related topics. If you would like to participate, visit the project page.
This module does not require a rating on Wikipedia's content assessment scale.

Several population figures

The module (which is fantastic) seems to remove the visibility of Population 2 in the infobox. For example: Warragul. Poketama (talk) 12:28, 30 June 2022 (UTC)

@Poketama: Glad you like it! The idea is that |pop2= will also be automatically filled. The module does handle outputting a list of populations when there are multiple available in Wikidata. There is some complexity to this, however; have a read of this bit of the documentation. Roughly, if an infobox type matches one of the ABS geography types (according to the mapping in the module) then only one value is shown, but if it doesn't match then more will be. This is definitely up for discussion! It's a balance between showing all available information, and only the most useful.
In the case of Warragul, it looks like you want to display the UCL and SUA populations. There is no new data for the SUA or UCL population in Wikidata yet. It will be imported later this year for the 2021 census, or you can add it now manually for the 2016 census.
Sam Wilson 12:57, 30 June 2022 (UTC)
Fantastic! Poketama (talk) 12:59, 30 June 2022 (UTC)
If not manually supplied, I'd like to see pop2 used for the SSC data for towns which are also suburbs/localities. If there is both a UCL and an SSC, it would be useful to display both. — Preceding unsigned comment added by Kerry Raymond (talk • contribs) 02:01, 1 July 2022 (UTC)
There are some situations where there are three or more relevant geographies, so it's good to list them all I think. Although it sounds like there's also a bunch of items that are confused about what geography they represent. We'll make a list of these issues soon and try to figure out the best system (if possible, it'd be nice to not have to do things manually; perhaps the rules aren't that hard and fast though). —Sam Wilson 04:11, 1 July 2022 (UTC)
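For anyone following the "one value versus a list" rule described earlier in this thread, here is a rough Python sketch of the idea. It is not the module's Lua code. The mapping entries, the function name and the fallback behaviour (listing everything when no matching figure exists) are assumptions made for illustration; only the town/UCL and suburb/SSC pairing comes from this page.

```python
# Hypothetical mapping from infobox |type= values to ABS geography codes.
# The live module keeps its own mapping; these two entries are illustrative only.
INFOBOX_TYPE_TO_GEOGRAPHY = {
    "town": "UCL",
    "suburb": "SSC",
}


def populations_to_show(infobox_type, available):
    """Return the (geography_code, population) pairs to display.

    available: list of (geography_code, population) tuples found for the item.
    """
    wanted = INFOBOX_TYPE_TO_GEOGRAPHY.get(infobox_type.strip().lower())
    matches = [pair for pair in available if pair[0] == wanted]
    # Assumption: a recognised type with matching data shows a single figure;
    # anything else falls back to listing every available figure.
    return matches[:1] if matches else list(available)
```

With both a UCL and an SSC figure available, a town infobox would show only the UCL figure under this sketch, while an unmapped type would list both.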
Is there any sign of the 2021 UCL data being imported to Wikidata? It seems to be available on the ABS website, but maybe it's not in an importable format yet. Innesw (talk) 12:01, 27 June 2023 (UTC)
You're right, the data is available and can be uploaded in bulk to Wikidata. It's on my radar and thanks for the reminder! There are a few stages to the process.
1. The first step is to collate the correspondence table of item versus 2016 UCL ID versus 2021 UCL ID. I have done this.
2. The next step is to use that table to add the 2021 UCL IDs to Wikidata items using QuickStatements. I'll do this in the next week.
3. Then the reference data needs to be added to the 2021 UCL population data, and all of this uploaded to Wikidata using the 2021 UCL IDs to match. I'll aim to do this soon!
I'll let you know when the UCL data is up on Wikidata. MaiaCWilliams (talk) 06:58, 28 June 2023 (UTC)

Jimbour East and 2 or more infoboxes

As one infobox is type town and the other a suburb, I think following the current strategy will work just fine in this case (UCL for town, if available, and SSC for the suburb); I personally think the problem there is due to a poorly-considered merge and not a general problem. In my experience double Australian place infoboxes are quite rare; I would suggest only worrying about the first Australian place infobox, and using AWB to seek out those articles with more than one and deal with them manually. Double infoboxes on Australian place articles are more often a case of one being an Australian place infobox (as it is the name of a town/suburb/locality) and the other being a different type of infobox, usually for the significant geographic feature from which the town or whatever takes its name and which is within that town, mountains and islands being the most common scenario I see. We do sometimes have a 2nd Australian place infobox for a protected area, but we don't have population data for these, so this is a non-problem (and can be dealt with by the AWB and manual decision process I suggest above). Kerry (talk) 02:01, 1 July 2022 (UTC)

Non-census estimates and the larger statistical areas

I don't think anything needs to be done with these estimates. Firstly they are estimates and, while it's fine to include them in article content, IMHO they should not replace the infobox data which is drawn from actual census data (a more reliable source). Only a small number of larger towns have these estimates calculated and, in my observation, these get added manually quite quickly as most towns have a contributor with an obsession with "look how big my town is". The same comments apply to the larger statistical areas, which mostly do not have a natural mapping to the named places we have articles for, but again get seized upon by the "look how big" crowd and manually added (mostly inappropriately). Or to put it another way, the small number of larger urban areas will get manually updated population data without any automated/semi-automated assistance. The need for automation/semi-automation is to update the population of smaller towns and the suburbs/localities that don't get the same level of contributor interest. And these have UCL and SSC data. Kerry (talk) 02:54, 1 July 2022 (UTC)

I think I have to agree. Using GCCSA as the primary population count never looked right to me. A table referencing these bigger statistical areas might be useful but not in the infobox population count. Or add non-mandatory metro and SUA pop option boxes. Sdinesh2222 (talk) 02:15, 25 August 2022 (UTC)

Typo in mapping to geography types?

In the code mapping Infobox place type to census place type (at the top of function ListForInfobox), the tests for suburb are "suburb" and "suburb", neither with an initial uppercase 'S', which the other tests all have. I presume this could be stopping the module finding any data for some Infoboxes. Innesw (talk) 04:46, 8 July 2022 (UTC)

In fact, wouldn't it be better to have articleplacetype = string.lower(frame.args.type), and just test for the all-lowercase values? Then user case typos in the infobox type are irrelevant. Innesw (talk) 05:03, 8 July 2022 (UTC)
@Innesw: Great idea! This is now done (along with a few other fixes). Sam Wilson 01:22, 13 July 2022 (UTC)
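As an illustration of the lower-casing suggestion above, here is a small Python sketch (the module itself is Lua and uses string.lower). The function names and the set of known types are hypothetical, and trimming surrounding whitespace is an extra assumption that was not part of the suggestion.

```python
# Illustrative subset of infobox types; not the module's real list.
KNOWN_TYPES = {"town", "suburb"}


def normalise_place_type(raw_type):
    """Lower-case and trim an infobox |type= value.

    'Suburb', 'SUBURB' and ' suburb ' all become 'suburb'.
    A missing value (None) becomes the empty string.
    """
    return (raw_type or "").strip().lower()


def is_known_type(raw_type):
    """Compare against all-lowercase values only, so letter case never matters."""
    return normalise_place_type(raw_type) in KNOWN_TYPES
```

Normalising once at the point of input means every later comparison can be written against a single lowercase spelling.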

Update from sandbox

This version of the sandbox (diff) is ready for release as the new version of this module. It resolves a few of the issues that have been reported or discovered. Could it please be copied over to the live module? Thanks! — Sam Wilson 09:38, 12 July 2022 (UTC)

 Done * Pppery * it has begun... 13:41, 12 July 2022 (UTC)
@Pppery: Thanks! Sam Wilson 01:20, 13 July 2022 (UTC)
  • I *love* this. Well done!!! Can the coord be done next? I’ve been experimenting by using inline wikidata calls (see Willetton and Derby – note wikidata calls for pop 😅), but it is horrible and this is a much better approach. Betterkeks (talk) 08:10, 16 July 2022 (UTC)
    Glad you like the module!
    I didn't know about the Coord template taking a population argument! Seems a good idea to bring in place coordinates (and set appropriate map scales) using Wikidata values. I've just added a description of your problem to the project to-do list here. Feel free to edit if I haven't got it right! And I'd love to talk more about what you think the best approach would be. I'm hoping to organise a workshop (online? and/or at the community conference in November?) for anyone who's interested in brainstorming best ways to use values (particularly population) from Wikidata. We've got quite a list to tackle. I'll post a notice here if we organise something. MaiaCWilliams (talk) 03:18, 18 July 2022 (UTC)

Ref name example?

What is the ref name for the 2021 census link generated by this module at Mount Lawley, Western Australia? An IP tried to update the article for the new census without changing the ref and I couldn't figure out the ref name, so I had to revert their edit. An example in the documentation would be very helpful here. Thanks! Graham87 13:01, 3 September 2022 (UTC)

@Graham87: The <ref name=Census /> in use in that article is not actually coming from the Infobox, but is defined within the article (in the Public transport section). So your reversion was quite right, but the other fix would've been to update the reference to match the source of the new numbers. That said, doing so would mean that there would then be multiple references for the 2021 Census, and that's not ideal — at the moment, however, it's the only solution. The issue is that it's not possible to name a reference within this module and then have that name be used outside it. This is a known issue, and we've got a few ideas for how to resolve it, the most likely being to create a template e.g. {{PopulationFromWikidata}} that will output just the current population along with a reference (which would then not be a duplicate to the Infobox reference). However (hehe), this still wouldn't help with sentences such as "while the median family income was $3,117 per week. The median age of Mount Lawley residents was 38 years.", where it's not worth having an auto-updating reference paired with manually-updated in-text figures. That's a whole other story. Hmm… I've blathered on a bit long here — the short of it is that to fix Mount Lawley, the citation needs to be updated along with the numbers (and the resulting dupe reference put up with for the time being, I think). :-) Sam Wilson 08:16, 7 September 2022 (UTC)

Re-using the reference

A few people (including @Steelkamp) have asked about how to avoid duplicating the reference if a population is also given within an article's text. I'm trying to figure out the best way to do this. I thought that perhaps something like "Hamersley has a population of {{PopulationFromWikidata}}." would be good, in that it'd always show the latest population with the correct reference and that reference would exactly match the one for the infobox and so wouldn't be duplicated (Cite removes dupes, but they do have to be exact). The problem is that generally there are other figures given in the text; the example above is actually "Hamersley had a population of 5,209 at the 2021 census. This was an increase on the 4,982 recorded…" and so there's no point in making the first figure automatic, because the second figure is dependent on it and if one is updated they all have to be. So, my next thought is that we should be able to make a generic reference template, e.g.: "Hamersley had a population of 5,209 at the 2021 census.{{PopulationFromWikidata|refonly|claim=210a80be3f80b58f394ade59f08b8aaae9f0b06e}}" That would work, and would output whatever references are used on that claim, but it's pretty clunky having to know the hash of the claim (not to mention that that can change if the claim is removed and recreated). It's not enough to say "just give me the 2021 figure" because the population figure that gets used is based on the combination of determination method (P459), point in time (P585), and applies to part (P518). So I'm not really sure what to do! (There's a separate discussion about making a tabular output of all past population figures, and I think that's much easier.) — Sam Wilson 10:22, 3 December 2022 (UTC)

Would it be possible to enable using the pop_footnotes= parameter to insert a named reference instead of using the default reference generated by the module? Steelkamp (talk) 18:34, 3 December 2022 (UTC)
Or does that encounter the problem of the reference not being updated when the data updates? Steelkamp (talk) 18:34, 3 December 2022 (UTC)
@Steelkamp: Yeah that's the big issue with all of this, really. The fact that there's a duplicate citation is only an artifact of the fact that the article text and the infobox are in sync at the moment. As soon as the next census comes out, they'll be out of sync again — and the manual update of 12,000 articles every 5 years is what this is all trying to avoid! :-) For now, the best we can do I think is {{PopulationFromWikidata|wikidata=Q5644515}} which gives: 5,209 (SAL 2021)[1], but it might not really be worth it. Sam Wilson 01:45, 5 December 2022 (UTC)
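To make the point about qualifiers concrete, here is a hedged Python sketch of choosing a population claim by determination method (P459) and applies to part (P518), then taking the latest point in time (P585). The class, field and function names are hypothetical, the qualifier values are simplified to plain strings, and this is not claimed to be the module's actual selection logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PopulationClaim:
    """Simplified stand-in for one Wikidata population statement."""
    value: int
    point_in_time: str         # P585, as an ISO date string (assumed format)
    determination_method: str  # P459, simplified to a label such as "census"
    applies_to_part: str       # P518, simplified to a code such as "SAL"


def latest_claim(claims, determination_method, applies_to_part):
    """Return the newest claim matching both qualifiers, or None if none match."""
    candidates = [
        c for c in claims
        if c.determination_method == determination_method
        and c.applies_to_part == applies_to_part
    ]
    if not candidates:
        return None
    # ISO date strings in the same format sort chronologically as plain strings.
    return max(candidates, key=lambda c: c.point_in_time)
```

This shows why a bare year is not enough to identify a figure: two claims can share a point in time and still differ in determination method or geography.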

References

  1. ^ Australian Bureau of Statistics (28 June 2022). "Hamersley (suburb and locality)". Australian Census 2021 QuickStats. Retrieved 28 June 2022.

Error

There is an error showing on Hills beach, New South Wales. Can someone fix it please? — Martin (MSGJ · talk) 12:24, 23 December 2022 (UTC)

@Samwilson — Martin (MSGJ · talk) 12:25, 23 December 2022 (UTC)
I fixed this. The specific problem is that the module produces a Lua error when an article is connected to a Wikidata item with no claims at all. Probably not worth fixing given its rarity and that it highlights an underlying issue. * Pppery * it has begun... 18:23, 23 December 2022 (UTC)

Thanks — Martin (MSGJ · talk) 18:38, 23 December 2022 (UTC)

I fixed the error because it is irritating when it happens as it shows a big red message with no clue about the problem. I noticed Municipality of Scone which was created on 5 July 2023. The article was fine, but today a bot created an empty Wikidata item which generated the confusing error. Johnuniq (talk) 02:03, 20 July 2023 (UTC)
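For reference, the kind of guard involved can be sketched in Python as below. This is only an illustration of handling an item with no claims. The function name and the dictionary layout (claims keyed by property ID, with P1082 for population) are assumptions modelled on Wikidata's entity JSON, not a copy of the fix made in the Lua module.

```python
def population_claims(entity):
    """Return the population (P1082) statements of an entity, or [] if there are none.

    Handles a missing entity, a missing 'claims' key, and an empty claims
    container, so callers never index into something that is not there.
    """
    if not entity:
        return []
    claims = entity.get("claims") or {}
    return claims.get("P1082") or []
```

Returning an empty list lets the calling code fall through to "no population available" instead of raising an error on a newly created, empty item.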

Coolabunia isn't getting an automatic census population?

I removed the 2016 population from the infobox but the 2021 population did not magically appear, yet there is 2021 SAL data for Coolabunia. What's gone wrong? I tested on a neighbouring locality and that one worked as expected. Kerry (talk) 23:29, 8 January 2023 (UTC)

@Kerry Raymond: It looks like Coolabunia (Q30763775) and Coolabunia (Q48808424) need to be merged. The article here links to the latter, which doesn't have any population data. I'm not sure if this is another example of a locality/area and town that are separate things, but I suspect not. Sam Wilson 00:20, 9 January 2023 (UTC)
Qld places on Wikidata are full of rubbish data, so nothing new there. Not being a Wikidatian, I tried to merge the two but it failed. I used https://www.wikidata.org/wiki/Special:MergeItems. It initially refused to let me merge them because their en descriptions are different. Since neither description was correct (rubbish data!), I tried to change them both to be the same correct description "locality in the South Burnett Region, Queensland, Australia" but it won't let me change it on Q30763775, saying "Item Q48808424 already has label "Coolabunia" associated with language code en, using the same description text." So it seems they can't be merged. Kerry (talk) 10:23, 9 January 2023 (UTC)
Solved. Remove all the statements and it lets you merge them. Kerry (talk) 10:42, 9 January 2023 (UTC)
@Kerry Raymond: Sorry, I should've just merged them this morning! But I wasn't 100% certain, so I'm glad you looked too. And yeah, merging is a bit annoying, but there's a gadget that makes it quite simple! All looks sorted now I think. Ping me about any others you find, if you want! :-) Sam Wilson 12:36, 9 January 2023 (UTC)

Non-numeric populations

I have raised this issue in Template talk:Infobox Australian place#Non-numeric values for pop and pop2 (population_data) but I want to also discuss the implications for this Module. To save you reading the above, I'll repeat it here.

In the 2021 census, not all population counts are numeric. Probably for privacy reasons, they are reluctant to provide data for places with little or no population, describing it as "No information can be provided because the area selected had no people or a very low population in the 2021 Census.", for example, Garrawalt QLD [1]. So I went to put "no or low population" in the "pop" field for that locality article and of course it does a dummy spit when it tries to calculate the density because the value in "pop" is not numeric. However, that is the value provided by the ABS so we need to accommodate it. I can see two approaches:

  1. allow non-numeric values in the "pop" field but make the code that calculates the density smarter to not calculate it when non-numeric data is present
  2. have an alternative field (say) "pop-text" which is used for non-numeric data and display pop or pop-text (whichever is provided) and density is calculated only if there is a value in "pop". Note a "pop2-text" would also be needed.

At the moment, if you leave the population fields empty for Garrawalt's infobox, this Module returns the 2016 census figure, as if there were no 2021 Census result for Garrawalt. But there is one, so I think a 2021 census result should be returned from this module with a "no or very low" value (once we sort out how the template will handle non-numeric data). I don't know whether any particular solution to resolving the non-numeric data makes life easier or worse for this module, so I am flagging it now as an issue here. Kerry (talk) 02:06, 6 February 2023 (UTC)
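The first approach listed above (accept non-numeric text in "pop" and skip the density calculation) could look roughly like this Python sketch. The function names are hypothetical and the infobox's real template logic is not shown here; the sketch only illustrates the "calculate density only when the value is numeric" idea.

```python
def parse_population(raw):
    """Return an int for numeric input such as '5,209', or None otherwise."""
    text = (raw or "").replace(",", "").strip()
    return int(text) if text.isdecimal() else None


def density(raw_population, area_km2):
    """Return people per square kilometre, or None when it cannot be calculated.

    Non-numeric text such as 'no or low population' yields None rather than
    an error, so the caller can simply omit the density row.
    """
    population = parse_population(raw_population)
    if population is None or not area_km2:
        return None
    return population / area_km2
```

Under this sketch the text value would still be displayed as given, and only the density would be suppressed.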

How to implement it for other censuses?

Hi - thanks for your great work. Have you considered how this module or similar ones could be implemented for other census data, e.g. the 2021 Maltese census? I would like to do it, but I lack the technical skills... --Dans (talk) 12:47, 19 August 2023 (UTC)

The Abbreviation Label

I wonder about the usefulness of the abbreviation label (the first part of what is in the brackets after the population figure) to normal WP users, and my suggestion is that it be removed. The abbreviation (UCL / SAL etc.) is familiar to those who understand the Australian Statistical Geography Standard, but of little use to normal readers. The tooltip doesn't really help, as the meaning of e.g. 'Suburb and Locality' is still pretty obscure, and in that particular case especially so. The use of the tooltip is also not really in accord with the guidelines for Abbr, as the abbreviation has not been previously expanded in the body of the text, and this would be unlikely, as the only 'previous' place would be the lede of the article.

At the moment if the infobox |type=town but the population figure is for an SAL (because the UCL data is not in Wikidata yet) the population figure will possibly not be for the geographic object the article is about. E.g. Eildon, Victoria, Eildon UCL and Eildon SAL. The abbreviation label is I suppose useful to those who know about the issue, but (a) it is very obscure, and (b) it should be overcome in most cases once the UCL data is in Wikidata. Then maybe a maintenance category would allow tracking of articles where there is no UCL data for |type=town? Innesw (talk) 06:42, 27 September 2023 (UTC)