Wikipedia talk:No original research/Archive 63

From Wikipedia, the free encyclopedia

Birthdate and Birthplace for BIO

If I am unable to source birthdate and birthplace of a biographical subject who I know personally, can I ask directly for the birthdate and birthplace? If they reply with the information in text form, is it OK? What about a scan of a government document?--TonyTheTiger (T / C / WP:FOUR / WP:CHICAGO / WP:WAWARD) 15:20, 2 February 2022 (UTC)

No… just leave the birth date blank. Blueboar (talk) 15:24, 2 February 2022 (UTC).
But not the birthplace? You are aware that made-up birthplaces are a long-term, ongoing problem, aren't you? "We take BLP seriously" doesn't have a whole lot of credibility when the prevailing attitude is actually "fill any holes in this infobox parameter no matter what". RadioKAOS / Talk to me, Billy / Transmissions 15:12, 18 February 2022 (UTC)
@TonyTheTiger: if they publish their name and birth place somewhere, we can use it per WP:ABOUTSELF, also mentioned at WP:BLPPRIVACY. Posting it on an established personal website or verified twitter account would work. Firefangledfeathers 15:29, 18 February 2022 (UTC)
Sources must be WP:Published. Also, if it's someone you know personally, you might also ask them to consider whether they want that information to be (extremely) public. WhatamIdoing (talk) 23:06, 21 February 2022 (UTC)

Can we have an article on some topic if no RS exists that defines it?

It seems that NOR's spirit prohibits us from having an article about some topic that is not defined in some generally accepted RS (even if many RS exist that discuss different aspects of this topic). That source cannot be a minority or fringe source, because that would violate another policy, NPOV. Is my understanding correct?--Paul Siebert (talk) 20:54, 18 January 2022 (UTC)

That's basically correct. If you invent a topic out of thin air, it leads to a host of other policy violations. Under the OR policy, if the topic isn't verified in independent sources, then you're creating an original essay or compilation about your original idea for a topic. You are violating NPOV by using your personal opinion to give a topic more weight than is justified in secondary sources. You're also violating what Wikipedia is not, which says that Wikipedia is not a complete exposition of all details, but a summary of accepted knowledge on different subjects. "Accepted knowledge" is what is covered in multiple, independent, reliable sources. If those sources haven't made something a topic, then we shouldn't either. All that said, it's within the spirit of this guideline to add more detail to an article about an accepted topic area. But we also have other guidelines to keep that from getting out of hand. Shooterwalker (talk) 22:00, 18 January 2022 (UTC)
At the risk of getting shouted down by people who want to apply rules in some completely arbitrary manner, I'm going to make a somewhat vociferous objection. On the one hand, if you claim there's some topic that's widely discussed in reliable sources, but there was no particular source that provided general information about that topic, I would want to verify that this was actually true. But on the other hand, definitions are not facts, hence they are neither true nor false, they are just definitions. Definitions are mere conventions, i.e. a convenient way of referring to something. As a third point, reliable sources are not constrained to those found on the Perennial sources page. Fabrickator (talk) 22:49, 18 January 2022 (UTC)
I don't understand the premise, Paul Siebert. What do you mean by many RS discuss different aspects of this topic but no RS defines it? Schazjmd (talk) 23:09, 18 January 2022 (UTC)
The OP needs to say what the actual case is - it makes little sense as an abstract question. It sounds like he’s saying RS describe sub-topic X and RS describe sub-topic Y but no RS links the two to make topic XY. If it’s that then it basically amounts to WP:SYNTH to have an article on XY. DeCausa (talk) 23:19, 18 January 2022 (UTC)
I believe that this is the elephant in the room. Davide King (talk) 14:38, 20 January 2022 (UTC)
"Intersection" articles (e.g., Transport and the environment, Circumcision and HIV) are usually scoped to include the overlap of the two separate subjects. With the elephantine subject matter, you would find a workable concept of Mass killing and a workable concept of communist regime and then look for sources that cover both of those subjects. For the second, probably anything listed in Communist state#List of communist states (during the relevant years) would count. However, for that specific subject, I believe there are some sources explicitly about that intersection, which should make it easier than usual. WhatamIdoing (talk) 17:53, 20 January 2022 (UTC)
Can we? Yes. Otherwise, we delete all the biographies, because there are basically no sources that "define" any individual humans.
I wonder whether the question that is intended sounds more like "How the heck do we have an article about _____ when we can't even figure out what _____ is (and isn't)?" WhatamIdoing (talk) 02:09, 20 January 2022 (UTC)
100% agree with @WhatamIdoing. My experience has been that "spiritual" interpretations of our guidance usually fail to correlate with the actual "reality" that was intended by them a fair majority of the time. Huggums537 (talk) 08:10, 20 January 2022 (UTC)
Actually, that’s not quite true on biographies. They are “defined” - by their identity which all WP:RS on a person will (normally) demonstrate in some form. An RS on the childhood of X can be used with an RS on X’s later life to support an article - provided they both identify they are talking about the same person (that’s the “definition” aspect). Where I’ve seen difficulty with that is where a figure is less well known and has a common name: there can be a question over whether you have got the same person in the sources. DeCausa (talk) 08:23, 20 January 2022 (UTC)
I think it is very debatable that an "individual human" (a person on Wikipedia) is "defined" by the fact RS identify them, but even if this were the case, then you have just answered the OP's question because if simply identifying a topic is enough to define it, then any RS discussing different aspects of a topic that also identify it have defined it according to your argument. However, there is nothing in NOR or RS that suggests this is true. The simple fact is that there is no requirement for our topics to be "defined" in RS. Huggums537 (talk) 09:06, 20 January 2022 (UTC)
No, they’re defined by their identity. I don’t see what you are getting at. The point made earlier was that ‘there are basically no sources that "define" any individual humans’. My point was that the definition of a bio i.e. the scope within which a bio article should fit, is the collective RS about the individual. And for that to work, the RS in question have to identify that specific individual - that’s the bridge that links potentially disparate pieces of information to create a defined topic for an article. DeCausa (talk) 10:26, 20 January 2022 (UTC)
To make my point more simple and easier to understand, the obvious problem with your theory is that it stretches the normal definition of "defined" beyond reasonable limitations. It's really not possible for an individual source to "define" the subject of a biography in the sense that the OP mentioned. And, even though our guidance recommends that we define our subjects, there is really no way to do this with biographies. What we are doing is describing them, not "defining" them. Huggums537 (talk) 13:59, 20 January 2022 (UTC)
DeCausa, I believe that people are defined by their characters, and not by what others say about them. The most we can hope for from other people is a description of a subject's actions, not the definition of their existence. WhatamIdoing (talk) 17:45, 20 January 2022 (UTC)
I wouldn't want to speak for the OP, but I wouldn't want to mince words. I think they just mean – has this topic been clearly identified and described in reliable sources? With more biographical articles about notable public figures, that would usually be yes. That would be different from cobbling together a biography about a celebrity's sibling based on several mentions in sources that really aren't about them. Shooterwalker (talk) 18:53, 20 January 2022 (UTC)
Even so, and even without mincing any words, if the topic is "identified" and "described" in any RS, then it has satisfied whatever imaginary requirement was thought that it must be "defined". It should also be noted that ...cobbling together a biography about a celebrity's sibling based on several mentions in sources that really aren't about them is different from creating an "intersection article" as well. Huggums537 (talk) 15:58, 21 January 2022 (UTC)

I think that there are a lot of questions bundled together here. One is a general question, and if the answer isn't already defined in a guideline or policy, no categorical answer is possible. And there is discussion of a particular article, where other variables would also affect any answer. "Intersection" articles can be a lot of different situations and Wikipedia gives some but not much guidance on them and their titles. But there are so many cases where the title/ scope isn't specifically defined by sources that it would be impossible for such a rule as in the OP to exist. Doubly so for making "agreement of those sources" an additional condition on that. For example, a wiki-division just to cut down the size of an oversize article, covering an area where there are unknowns or conflicting opinions on what falls under the terms. North8000 (talk) 17:09, 21 January 2022 (UTC)

  • Yeah, I'll admit I'm responding in the abstract. It's not totally clear what the OP is getting at, and it becomes more practical if there's a real example. Shooterwalker (talk) 17:18, 21 January 2022 (UTC)
    @Shooterwalker: I would prefer not to present the real example, because that discussion is currently in progress, and if I present the real example here it may be seen as forum shopping. Paul Siebert (talk) 21:15, 21 January 2022 (UTC)
I agree with DeCausa that biographies are not a good example. Contrary to what WhatamIdoing says, biographies and similar topics do not need explicit definitions, because the topic is automatically defined.
In addition, when I wrote "that is not defined in some generally accepted RS", I didn't necessarily mean some formal definition. It may be sufficient if some reputable RS outlined the scope.
WRT "intersection" articles, they are a totally different thing, for intersection of a topic A and a topic B is not a discussion of A, and discussion of B, and discussion of their linkage, but the discussion of only those aspects of A that are linked to B. The scope of an intersection article is narrower than the scope of each separate topic. Paul Siebert (talk) 21:11, 21 January 2022 (UTC)
It sounds like we've made progress. Your opening sentence was "It seems that NOR's spirit prohibits us from having an article about some topic that is not defined in some generally accepted RS"; now you agree that NOR does not require definitions in some cases, because "biographies and similar topics do not need an explicit definitions".
I suggest that no topic is required by policy to have an explicit definition. Consider, e.g., List of Harvard University people. I don't expect to find a source that says "Harvard University people are people who...". Ditto for business organizations like Apple Inc. Also for creative works such as the Mona Lisa.
In other cases, such as Blond hair, I expect to find definitions in dictionaries, and for these definitions to be unimportant to the process of writing an encyclopedia article. For other subjects, such as Mother or Chronic fatigue syndrome and Ketogenic diet and Borders of India, I expect to find different and contested definitions. Having one agreed-upon definition is not necessary for writing any of those articles.
What you need for an article is for editors to have a reasonable shared understanding of what the article is supposed to be about. That is, our article at Ketogenic diet is about a stringent, temporary, growth-stunting diet for kids who might otherwise die from epilepsy. It doesn't really matter what the definition of the term ketogenic diet is, because the subject of the article is a last-chance diet for a life-threatening problem, and not any of the multiple things that get called by that name. In this sense, encyclopedias are different from dictionaries, in that we start with the subject and then give it a title, but dictionaries start with the title, and then find out what the word means. WhatamIdoing (talk) 21:32, 21 January 2022 (UTC)
I already explained that by "define a topic" I meant not a formal definition. I am sure that the majority of general sources about, e.g. WWII (Churchill, Overy, Beevor, etc) contain no formal definition of WWII. However, these sources provide some implicit or explicit overview of the topic, and every reasonable person with no preliminary knowledge on the subject can get an impression about the topic, and it becomes intuitively clear which events (like Pearl Harbor or El Alamein) were a part of WWII, and which events (like Peruvian-Ecuadorian war) were not.
My question was mostly about a situation when no sources are provided that discuss some topic as a whole: they either discuss some bigger or smaller topic, and then Wikipedians assemble these sources as a mosaic, and a new article is created from sources that discuss different aspects of the topic, but do not discuss it as a whole. Paul Siebert (talk) 02:15, 22 January 2022 (UTC)
We've got a title, but on Wikipedia it is some kind of reverse quantum title. Just by observing it you are somehow entangled with a big chunk of last century's history and visions of the next. fiveby(zero) 06:27, 22 January 2022 (UTC)
The relevant policy point is SYNTH is not mere juxtaposition. There is no policy, nor do I think there should be a policy, that prevents us from assembling facts from disparate places and making them into an article. We only cross the line when we start to draw conclusions from the assemblage that are not drawn by the sources. Articles will disappear at AfD if they are just collections of totally unrelated facts, so an article should have a well-defined (by us) topic and sources that are relevant to that topic. It isn't necessary that the topic be previously treated as a topic somewhere else. As an example, the history of important locations is often broken into periods; there is no need that the same periodisation has been used before and we are free to use whatever works for us. Zerotalk 06:40, 22 January 2022 (UTC)
That doesn't make much sense to me. I would think we would re-structure the articles into a periodization that can be sourced to reliable sources. For example, Paleogene vs Cretaceous, or Bronze Age vs Iron Age. If there was evidence that an editor had invented their own idea for a period of history, we would likely expect them to source it. I admit this is sort of moot, because it's easy to find sources that break history into logical periods. Shooterwalker (talk) 04:28, 31 January 2022 (UTC)
For some of them, we might (using our best judgment) want to align with existing systems, but for others, we might not. Should we have an article on Tudor period or 15th century in England? We could have either, or both, without violating NOR. WhatamIdoing (talk) 23:05, 21 February 2022 (UTC)

Taking the OP question literally and narrowly. IMO there is no such categorical requirement. And there are many many articles on topics where no such definition-situation exists. This does not mean that lack of such a definition-situation can't be taken into consideration in discussions. North8000 (talk) 18:25, 26 February 2022 (UTC)

Primary, secondary and tertiary sources

In the section 'Primary, secondary and tertiary sources', it is stated that 'Any interpretation of primary source material requires a reliable secondary source for that interpretation.' Does this include the authors' own interpretation of the results? Username142857 (talk) 13:31, 18 February 2022 (UTC)

  • We can mention the conclusions and interpretations that are directly stated in our source material (using in-line attribution so the reader knows who is making those interpretations or conclusions … and provided that doing so does not give those interpretations or conclusions UNDUE WEIGHT)… we can not mention our own conclusions or interpretations based on that source material. Blueboar (talk) 20:45, 26 February 2022 (UTC)

You are invited to join the discussion at Wikipedia talk:Verifiability § Verifiability of animal habitat maps. Sdkb talk 00:27, 9 March 2022 (UTC)

"Anywhere in the world, in any language" and other matters with the new table

This and especially this seems to imply that editors should be expected to somehow check other-language sources, even every other language, before something can be determined to definitely be original research (rather than just, say, failed verification). And do they have to scour 'the whole world' too? Ultimately, the WP:BURDEN of supporting a claim belongs to those adding it. If the cited sources don't support it, then it can be presumed to be original research by synthesis and removed.

Why would or did we ever define OR as based on 'all the RS on Earth' rather than whatever is being used to purportedly support a claim? If someone inserts material that is a synthesis of their sources, why shouldn't this be reverted per WP:SYNTH? It isn't my job, or even possible, to first prove that no sources make the point they are trying to make - I'm aware the new text doesn't specifically say I must, but why can't I use this policy to remove it?

Additionally, even if some RS does support whatever claim the person added, it can be removed per WP:UNDUE or WP:FRINGE if applicable (not just if no source exists for it). Crossroads -talk- 03:58, 17 March 2022 (UTC)

I don't interpret it as particularly radical. OR is just one of many tests we apply to content -- fringe that survives OR may still be removable under WEIGHT. But maybe I'm missing some subtle point of policy. Feoffer (talk) 05:50, 17 March 2022 (UTC)
The addition here of the phrase "anywhere in the world, in any language," is an example of WP:DUH, "pedestrian details the reader likely knows or would already assume". I'd remove that particular phrase. 13:46, 17 March 2022 (UTC)
I lean this way, it's overly specific in a way that's not particularly helpful. The WP:BURDEN is on demonstrating WP:V, not those challenging it. Perhaps a rewrite to make it clearer that the challenge can be made with any reasonable doubt that the claim has been published, and that the 'anywhere in the world, in any language' is the scope that the editor who may have added OR is permitted to search for a reliable source. Bakkster Man (talk) 14:13, 17 March 2022 (UTC)
I also disagree with adding the phrase. While basically true, it's stated out of context and without explanation, and I can see how other editors would find that phrasing misleading. There is a decent enough footnote in this policy already, which says "By "exists", the community means that the reliable source must have been published and still exist — somewhere in the world, in any language, whether or not it is reachable online — even if no source is currently named in the article. Articles that currently name zero references of any type may be fully compliant with this policy — so long as there is a reasonable expectation that every bit of material is supported by a published, reliable source." This is at least more clear. Shooterwalker (talk) 15:32, 17 March 2022 (UTC)
  • We aren't talking about writing styles for articles here. We are talking about policies and guidelines. P&G need to be very clear, and precise; not concise. Huggums537 (talk) 11:22, 20 March 2022 (UTC)
  • If it also appears in the footnote, then it is consistent with the whole of the policy. I agree with it. Huggums537 (talk) 11:29, 20 March 2022 (UTC)
    • "ever published" is still unnecessary and a pedestrian detail. I've removed it, and if I had half a mind more I'd probably remove the whole table until a version which pleases everyone can be agreed upon, because policy is the one place where it is almost always better to be safe than sorry (hence, no reason to be really bold about it), especially something as fundamental as WP:OR. RandomCanadian (talk / contribs) 00:38, 23 March 2022 (UTC)
@Crossroads, about Why would or did we ever define OR as based on 'all the RS on Earth' rather than whatever is being used to purportedly support a claim?: That's actually the point. The definition of OR was originally about a claim posted by an editor that had never been published anywhere (not even in unreliable sources). Early editors were trying to use Wikipedia as a publishing opportunity for "original research", with that name meaning approximately the same thing as a "research paper containing a completely original idea that really ought to have been submitted to an academic journal instead of being posted at Wikipedia".
Under that model, we had these options:
  • I create cold fusion in my kitchen, and Wikipedia is the first place that I write about how I did it: OR violation (no source)
  • I read on social media about my neighbor creating cold fusion in her kitchen, and I write on Wikipedia about how she did it: WP:V violation (unreliable source)
Some years later, editors decided that if you copied something off an unreliable source (e.g., social media posts), then that should be considered the same as one that the editors made up themselves.
Under this newer model, we have these options:
  • I create cold fusion in my kitchen, and Wikipedia is the first place that I write about how I did it: OR violation (no source)
  • I read on social media about my neighbor creating cold fusion in her kitchen, and I write on Wikipedia about how she did it: WP:OR and WP:V violation (unreliable source)
There has never been a time in which OR had any connection to which sources happen to be cited in the article at any point. OR has always been about whether the idea originated from the Wikipedia editor or from external (reliable) sources.
I grant that it's more convenient for the patrollers when editors cite sources. It's much easier to spot OR violations if you give an edit summary of "I just proved this experimentally in my kitchen", and it's much easier for them to avoid the embarrassment of falsely claiming an OR violation if you cite sources that they are familiar with and believe to be credible. But the presence or absence of citations doesn't actually change the relevant facts about whether the Wikipedia editor is posting research that originated from the Wikipedia editor's own experience, and it is those facts alone that define OR. WhatamIdoing (talk) 21:34, 29 March 2022 (UTC)
As for why it's helpful to specify that this applies to any language, any time, and any place: Wikipedia editors are not born knowing that WP:NONENG, WP:PAYWALL, and similar statements are part of the core policies. When we make these things v-e-r-y clear, then diligent editors are less likely to find themselves making embarrassed apologies about their false assertions (e.g., demanding that articles only cite English-language sources). WhatamIdoing (talk) 21:37, 29 March 2022 (UTC)
Do not reinsert this when it clearly lacks consensus. While the historical background may be interesting, nowadays, our concern is not how exactly the editor got some idea. It doesn't matter. It makes zero material difference if the claim they are inserting was made up by them or was read by them elsewhere and they are merely propagating it - it looks the same when being added. And if I revert some junk per OR, and they come back with proper sourcing, I am not "embarrassed" in the least. They should be embarrassed from not properly citing sources to begin with.
This is not the place to add undue-weighted caveats about NONENG and the like. If something is unsourced or synthesizes their claimed sources, it can be presumed OR. In no way should we imply that it is a patroller's job to scour all the world before claiming OR. Crossroads -talk- 01:55, 30 March 2022 (UTC)
@WhatamIdoing has already pointed out that it doesn't make any difference where the OR comes from in both the old, as well as the new model, so your remark about the historical background, and that whole first paragraph is kind of pointless.
If there is any place where important caveats need to be clarified, it would be the P&G. It is absolutely false that just because something is unsourced means it can be presumed OR. It is flat out against policy to do that. It is also false that the insertion implies patrollers must scour the world before claiming OR. You have proven that by your own admission in your original post. If you suspect OR, all you have to do is remove per WP:UNDUE or WP:FRINGE, and leave the WP:BURDEN on the editor who inserted the OR. So, you don't need this bit taken out of the P&G. OTOH, WhatamIdoing has shown a practical reason why the bit is helpful to be inserted. Your complaint about what it implies is what really actually doesn't make any difference... Huggums537 (talk) 02:43, 30 March 2022 (UTC)
Against what policy? The WP:BURDEN is on the editor making the claim. And just because I know that I don't have to scour the world first, that doesn't mean newbie editors do, and this page should not put undue emphasis on that matter in such a misleading way. If I suspect OR, I can cite such in my edit summary or on the talk page, not just FRINGE or UNDUE (which don't always apply to bad text). It's much less useful otherwise. Also, I'm not taking anything "out" of the page; it's WhatamIdoing who inserted it both times recently - and neither time did it stick. Crossroads -talk- 04:24, 30 March 2022 (UTC)
What policy? WP:V has a section that links to several pages with more information about the WP:V policies, where you find a certain supplement to the WP:V policy called When to cite. In the When to cite policy supplement, there is a section called When a source or citation may not be needed. It also links to WP:BLUE at the top of that section. These supplements to the policy tell us beyond doubt that there is certain content you simply are not allowed to presume as OR just because it is unsourced. Nothing about the insertion is preventing you, or any newbies from citing OR, FRINGE, and UNDUE other than these imaginary implications that don't make any difference. Huggums537 (talk) 05:14, 30 March 2022 (UTC)
Crossroads, "The WP:BURDEN is on the editor making the claim" is kind of the point here. You're saying that something is a violation of NOR if it's a violation of some other policy. (Also, you can't really "violate" BURDEN anyway; BURDEN is the editing equivalent of "If you want to eat dinner today, you have to make dinner yourself, because I'm not going to do it for you". Nobody's "violating" any rules if they do, or don't, choose to eat dinner today.) WhatamIdoing (talk) 16:23, 30 March 2022 (UTC)
Something can be in violation of multiple policies (and many things are - eg. WP:SYNTH is generally a subset of WP:OR.) A statement can both be OR and have problems with unreliable sourcing; which aspect of it to follow up on when challenging it and discussing it depends on context. For example, if there is disagreement over whether the source is reliable, that should be resolved first; if there's agreement that it's unreliable, it might make sense to search for a better source or to just remove the text, depending. There are also degrees of reliability and contexts in which something is reliable or unreliable (eg. sometimes it might be fixable just by attributing it.) But if it's completely impossible to repair because no reliable published sources exist, then OR is part of the underlying issue. --Aquillion (talk) 18:41, 31 March 2022 (UTC)
  • There has never been a time in which OR had any connection to which sources happen to be cited in the article at any point. OR has always been about whether the idea originated from the Wikipedia editor or from external (reliable) sources. The very first paragraph of WP:OR says To demonstrate that you are not adding original research, you must be able to cite reliable, published sources that are directly related to the topic of the article, and directly support the material being presented, which reflects both WP:V and WP:BURDEN. I also strenuously disagree with your assertion that whether the editor themselves or a buddy on Twitter came up with something makes a difference for whether something is OR or not - anything that is implied by a combination of sources and is not citeable to reliable, published sources is OR. If someone cites their buddy on Twitter for it, then it is both OR and poorly-sourced - it does not have to be one or the other; it can reasonably be templated either way and approached from either angle depending on which discussion seems more likely to be productive. This has always been the case and has always been central to how OR is defined, and your history implying otherwise is flatly untrue. Looking back slightly, I also take issue with the giant template you added recently, partially because it appears to be trying to force through this divide you're creating here, though also partially because it is huge and clunky and doesn't really seem helpful enough to justify the massive presence in the policy page. Since I feel your interpretation of WP:OR is a massive shift from the status quo, please do not add anything implying it is the case to any policy pages without an unambiguous consensus supporting it. At the very least, it's clear we need more discussion before such a massive addition to a longstanding policy page, especially if there's such a fundamental dispute over the policy's core purpose. 
--Aquillion (talk) 04:47, 30 March 2022 (UTC)
    There is an important difference between "you must be able to cite" and "you must have already cited sources, and if I judge the currently cited sources as being inadequate, then you have violated this policy". This policy says "Articles that currently name zero references of any type may be fully compliant with this policy". I therefore think that this policy says that, well, that articles that currently name zero references of any type may be fully compliant with this policy. If it's possible – and the policy says that it is – to have a zero-sources article that fully complies with this policy, then this policy does not care about which sources happen to be cited in the article at any point in time. It cares about the sources that you are "able to cite", not about the sources you "already cited". WhatamIdoing (talk) 16:44, 30 March 2022 (UTC)
No, I feel like you are not understanding at all. Once a statement has been challenged, the only thing that matters for WP:OR is "are we citing reliable sources that specifically support the statement being cited?" If the answer is no then it is OR. There is, yes, one catch, which it feels like you are vaguely mixing up with your suggestion that we change policy to let unreliable sources render something not OR - it is true that not everything requires a source in the article voice, only things that are "challenged or likely to be challenged." But that leeway obviously goes away the moment something is challenged; at that point you must produce a reliable source supporting the statement. If you cannot, it is OR. This is how WP:OR has always worked. --Aquillion (talk) 18:32, 31 March 2022 (UTC)
I don't agree. The only thing that matters for OR is whether published reliable sources exist, and not whether we are citing them. This is the second sentence of the policy: "The phrase "original research" (OR) is used on Wikipedia to refer to material—such as facts, allegations, and ideas—for which no reliable, published sources exist." NB the footnote, that carefully spells out that what mattes for this policy is whether those sources exist in the real world and has absolutely nothing to do with whether any sources have been cited in the article.
Consider, e.g., this sentence: "The population of Paris is about 25,000 people." There is no citation.
  • Is it OR? No. Why? Because sources exist in the real world (e.g., this one) that could be cited to support this number.
  • Is it a WP:V problem? Yes. Why? Because statistics are the kind of content that is Wikipedia:Likely to be challenged, and therefore WP:V requires an inline citation.
According to what you wrote, this is a NOR violation because nothing is cited. That is not what NOR says. WhatamIdoing (talk) 20:44, 31 March 2022 (UTC)
But the moment it is challenged, the burden is on you to produce a source. If you can't do so, then the presumption is that it is WP:OR. --Aquillion (talk) 09:04, 1 April 2022 (UTC)
No, not really. If I can't (or merely don't) produce a source to support it, the presumption is that it fails WP:V. That's why BURDEN is in WP:V and not in NOR. WhatamIdoing (talk) 16:58, 1 April 2022 (UTC)
  • "Anywhere in the world, in any language" is unnecessary verbiage. Headbomb {t · c · p · b} 06:53, 30 March 2022 (UTC)
    "Anywhere in the world" is probably unnecessary, but "in any language" is not. We have too many editors who think "not in English, not reliable" to omit that. WhatamIdoing (talk) 16:23, 30 March 2022 (UTC)
This is not the place for it, and frankly citing non-English sources in some contexts (ones where we would expect the best sources to be in English) is actually suspicious. Such has been used to perpetrate hoaxes or in the promotion of fringe theories. Anyway, we certainly should not be implying that editors engaged in verification should first somehow survey sources in languages they don't understand. Crossroads -talk- 03:07, 31 March 2022 (UTC)
This policy is one of the places where this specific point has been spelled out for years, so I think that it is appropriate here. WhatamIdoing (talk) 20:46, 31 March 2022 (UTC)

The table

How to fix problems with verifiability and original research
When the problem is... you can... optionally with this template:
The content is uncited.
  • Find and cite a reliable source yourself.
  • Tag the content as needing an inline citation.
  • WP:CHALLENGE the content, if you believe that no reliable source exists to support that claim.
{{citation needed}}
The cited source is not reliable for this content.
  • Replace the unreliable source with a reliable source yourself.
  • Tag the statement to indicate that it needs a different source.
  • WP:CHALLENGE the content, if you believe that no reliable source exists to support that claim.
{{unreliable source?}}
The cited source does not support the content.
  • Find and cite a different reliable source that supports the content.
  • Re-write it to match the cited source.
  • Tag the statement to indicate that it needs a different source.
  • WP:CHALLENGE the content, if you believe that no reliable source exists to support that claim.
{{failed verification}}
The content combines information from multiple (cited) sources to claim something that no single source says.
  • Find and cite a different reliable source that supports the content.
  • Re-write it to match the cited source.
  • Tag the statement to indicate that it needs a different source.
  • WP:CHALLENGE the content, if you believe that no reliable source exists to support that claim.
{{synthesis inline}}
No published source anywhere in the world, in any language, supports this content.
  • Re-write the content to match reliable sources.
  • Tag the statement as being original research.
  • WP:CHALLENGE the content, if you believe that no reliable source exists to support that claim.
{{original research inline}}

Crossroads removed the last sentence, and replaced it with "No reliable source supports this content." My problem with this is that "no reliable source supports it" can be confused with the very first item in the table, namely that it's not already cited. The difference between the two is that the first says only "that lazy editor didn't cite this" and the second says "I don't care how much effort you put into this, it is impossible to find any source that supports this claim". These are really very different situations, and they need to have obviously different descriptions.

As for whether this description is supported by the policy (a concern raised by Anachronist here), it is taken almost word-for-word from this policy, which says: the reliable source must have been published and still exist—somewhere in the world, in any language, whether or not it is reachable online—even if no source is currently named in the article. Articles that currently name zero references of any type may be fully compliant with this policy—so long as there is a reasonable expectation that every bit of material is supported by a published, reliable source. This may not be what some editors expect if they haven't read the policy closely, or if they're used to people claiming OR when they ought to be issuing a WP:CHALLENGE to uncited material, but this is what the policy says. WhatamIdoing (talk) 16:45, 30 March 2022 (UTC)

I like the table for the most part, and I actually take the minority viewpoint to agree with WhatamIdoing on the language we have been debating thus far, but per an argument I was making with Crossroads earlier, I must protest the advice that one option is to tag a statement as OR if no source supports it; unless we also make it clear in the language that this means the content being challenged is not common knowledge, and does not need to be sourced. I mean I guess anything can be challenged, but this suggests all unsourced content is OR, and can be tagged as such, but my argument to Crossroads proves this simply is not true. I hadn't noticed this about the table before. Also, it is kind of big and clunky so maybe we could fit it somewhere else? Huggums537 (talk) 19:31, 30 March 2022 (UTC)
"Common knowledge" is subjective and frankly leaves the door too far open to personal ideas. WP:NOTBLUE I say. If the knowledge is so "common" then citing a source to demonstrate its (1) accuracy and (2) relevance to the topic should be trivial. Crossroads -talk- 02:59, 31 March 2022 (UTC)
You are entitled to your opinion, and can edit any way you want. However, WP:NOTBLUE is not what the policy points to. As I previously pointed out carefully step by step, the policy points to, and supports the ideas presented in, WP:BLUE. Huggums537 (talk) 12:09, 31 March 2022 (UTC)
I will say NOTBLUE is offered as an alternative in the "See Also" section of one of the pages policy is pointing to, but that can often be similar to offering minority coverage of a fringe section in a reputable article. Huggums537 (talk) 12:31, 31 March 2022 (UTC)
Claiming OR is a way of issuing a CHALLENGE, as I see it. If I stumble across text where multiple sources are being combined to reach a conclusion not found in any of them, I should be able to cite WP:SYNTH rather than a less helpful or relevant link. Frankly, I think the matter of whether sources merely "exist" regardless of whether the person adding the text is aware of them is completely irrelevant and a historical artifact. In the meantime, the BURDEN is on the editor making a claim, and I oppose any text that will drag this policy backwards to an earlier era of Wikipedia that was cavalier about text actually being properly cited, and/or that will discourage editors from challenging and removing bad material. Crossroads -talk- 03:17, 31 March 2022 (UTC)
You're right about being more specific when we can, and I have just found and am substituting the more specific inline template for SYNTH.
As for the rest, I think you'd have been happier if the proposal to merge OR and WP:V had succeeded. But since it didn't, this is what we've got: OR is when nobody's published the source that supports what you write, and WP:V problems are all the rest. WhatamIdoing (talk) 20:51, 31 March 2022 (UTC)
A particular piece of disputed text can (and often will) have multiple policy issues with it. In particular, not only are most WP:OR issues also WP:V issues, but almost all issues are WP:V issues to one extent or another (in the sense that producing more or better sources, if they exist, can resolve most of them.) It doesn't make sense to suggest that something cannot be an WP:OR issue simply because it falls under WP:V - even under your proposal, OR issues would still also be V issues in the sense that an editor could say "oh, no, I didn't invent this, here is a source", and they could be approached as OR or V issues depending on how someone feels it is most productive to challenge them. --Aquillion (talk) 09:24, 1 April 2022 (UTC)
  • I think many of the naysayers are missing the point… if text can be directly supported by a published source (any source - reliable or not, in English or not), it isn’t original research. The text may have other problems that require us to challenge and remove it (the source may be unreliable, the text may be worded non-neutrally, etc) but we have other policies and guidelines that cover those problems. Blueboar (talk) 12:01, 31 March 2022 (UTC)
    No, I don't think that that's true at all (and I feel it changes the definition of OR we have traditionally used.) Suppose we have two sources: An opinion piece saying "President Smith broke the law when he did X" and a source to a specific law, intended to convince the reader that President Smith broke it. Citing these together for an article-voice statement that the law was broken is a textbook example of WP:OR (as well as possibly synthesis, but that's a form of OR) - the editor is performing their own original research to back up the opinion-piece. There may be other problems (a piece of text can and often does have multiple issues), but OR is a central problem there. WP:OR says repeatedly that a reliable source is needed to avoid something being OR; I disagree specifically with your statement of any source - reliable or not, which contradicts the definition of OR at the top of the page. A challenged statement with no reliable sources is, obviously, OR, and this has always been the case. What the best approach to address such a problem is depends on context (hence, people will sometimes focus on the problems with the source if that is the part most likely to be fixable or the part that is most clearly in dispute), but changing policy to allow an unreliable source to render something not-OR is a frankly shocking policy suggestion that would defang OR to the point of making it completely useless. Obviously presenting a blog-post or opinion piece to support an in-text bit of WP:OR does not change the fact that it is OR. --Aquillion (talk) 18:28, 31 March 2022 (UTC)
Nope... if there is a published source that directly says "Smith broke the law", then the claim that he did does not ORIGINATE on WP, and it cannot be a NOR violation for WP to say it. There are half a dozen other policies and guidelines that we can use to challenge the text... but NOR isn't one of them.
What would be OR would be citing a news source that says “Smith did X”, the law saying “X is illegal” and a Wikipedian drawing the conclusion to say “Smith broke the law”. This is OR because (unlike the op-ed scenario) no source puts all the pieces together, linking Smith, X, and the law. Blueboar (talk) 23:42, 31 March 2022 (UTC)
If there is a published reliable source, then it is not OR (assuming someone can produce that source when challenged, of course, as they are required to do.) But the example I gave is a textbook example of original research - an editor is trying to use Wikipedia to argue an idea not supported by reliable sources - and your proposal to remove the longstanding reliability requirement is absurd. --Aquillion (talk) 09:08, 1 April 2022 (UTC)
Before 2010, Blueboar was absolutely correct. NOR has always required WP:Published sources; it's only since SlimVirgin re-wrote it in 2010 (several years after her proposal to merge WP:V and NOR failed) that information an editor took from a published-but-unreliable source was considered a NOR problem rather than purely an WP:V problem. I don't necessarily agree with this change in theory (there is more logical consistency to the old approach than to the new one), but in terms of what you can put in an article, it's not terribly different. WhatamIdoing (talk) 17:02, 1 April 2022 (UTC)
  • Fun fact: On the day you created your account, the word reliable did not appear anywhere in this policy. WhatamIdoing (talk) 20:59, 31 March 2022 (UTC)
    I think there was a misunderstanding, and the idea @Blueboar was trying to get across is that a source isn't really needed on Wikipedia, or in English, and this would still apply no matter if you want to argue about reliability since arguing about if it is reliable or not is another matter. He did say, directly supported by a published source (as opposed to an unpublished, unreliable source), and he even said, The text may have other problems that require us to challenge and remove it (the source may be unreliable, the text may be worded non-neutrally, etc) but we have other policies and guidelines that cover those problems. So I think your interpretation of him suggesting we use unreliable sources is very highly improper.
    Your other interpretation of policy is; WP:OR says repeatedly that a reliable source is needed to avoid something being OR; I disagree specifically with your statement of any source - reliable or not, which contradicts the definition of OR at the top of the page. The policy doesn't say anywhere at all a reliable source "is needed to avoid something being OR". The whole second paragraph of the OR policy disagrees with you on that. Here is a portion: For example, the statement "the capital of France is Paris" needs no source, nor is it original research, because it's not something you thought up and is easily verifiable; therefore, no one is likely to object to it and we know that sources exist for it even if they are not cited. The statement is attributable, even if not attributed. Even the definition at the top of the page links to the footnote that says, Articles that currently name zero references of any type may be fully compliant with this policy—so long as there is a reasonable expectation that every bit of material is supported by a published, reliable source. So, my specific issue with your whole argument is that you are claiming; A challenged statement with no reliable sources is, obviously, OR, and this has always been the case. when none of that is backed up by any policy whatsoever. Also, nobody has proposed, or suggested any changes to allow unreliable sources. Please WP:Assume good faith. Huggums537 (talk) 21:01, 31 March 2022 (UTC)
um… I was not trying to address the issue of uncited, “BLUESKY” info at all… one way or the other. Blueboar (talk) 00:18, 1 April 2022 (UTC)
Yeah, I get that now since your recent replies explain more about what you meant regarding sources directly supporting content. However, my arguments about uncited BLUESKY content still stand. Thanks for clarifying, and I agree with your assessment of OR by the way. Huggums537 (talk) 02:55, 1 April 2022 (UTC)
  • The "definition of OR we have traditionally used" is this one:

    Wikipedia does not publish original research or original thought. This includes unpublished facts, arguments, speculation, and ideas; and any unpublished analysis or synthesis of published material that serves to advance a position. This means that Wikipedia is not the place to publish your own opinions, experiences, arguments, or conclusions.

    That used to be the first paragraph. It's rather different from what some newer editors have picked up in our Telephone game of policy. The modern version, in which material taken from unreliable sources could be characterized as original research, appeared in April 2010. WhatamIdoing (talk) 21:16, 31 March 2022 (UTC)
    2010 is 12 years ago, long before I and many others ever edited here and plenty long enough to be "tradition". That's not "telephone", it's literally longstanding policy. Frankly, the old version is absurd and useless. Almost any idea, no matter what it is, has been said by someone, somewhere on social media. A version of OR that requires an idea be new to the world rather than new to reliable sources is of no utility whatsoever. It makes zero difference whether some idea existed previously on Twitter or whatever, or was invented by the editor. What unreliable sources say is simply irrelevant. Crossroads -talk- 03:57, 1 April 2022 (UTC)
    By "telephone game", I mean that Wikipedia:Nobody reads the directions. We see other editors claiming that ____ violates NOR, and we copy their behavior, without double-checking that they've named the relevant policy. We see other editors claiming that WP:BRD is required, and we copy their behavior, without reading so much as the first sentence of that page (which contains the word "optional").
    The old version is not absurd or useless. It's just less relevant now. When this policy was created, people had a much vaguer notion of what belonged on Wikipedia. There's a reason that the first version began this way: "Wikipedia is not the place for original research such as "new" scientific theories." People were posting the equivalent of scientific journal articles as encyclopedia articles. Now that most internet users have a general idea of what to expect from Wikipedia, we see less of this. WhatamIdoing (talk) 17:12, 1 April 2022 (UTC)
    "Reliable" was added to the first paragraph in February 2006: Citing sources and avoiding original research are inextricably linked: the only way to verifiably demonstrate that you are not doing original research is to cite reliable sources which provide information that is directly related to the article, and to adhere to what those sources say. In March 2005, "original research" was declared to refer to untested theories; data, statements, concepts and ideas that have not been published in a reputable publication (these include peer-reviewed journals, books published by a known academic publishing house or university press, and divisions of a general publisher which have a good reputation for scholarly publiciations). The very first standard, from the page's creation in December 2003, was commonly accepted reference texts. (The link in that article version seems to be wrong; I think it was supposed to point here.) Jimbo's statement in 2004 was that The phrase orginated primarily as a practical means to deal with physics cranks, of which of course there are a number on the web. Physics cranks "published" their ideas by creating websites, printing newsletters, spamming Usenet groups, etc. So, all along, ideas that appeared somewhere else before Wikipedia would still have been designated "original research" if they weren't published reliably (e.g., surviving formal peer review). XOR'easter (talk) 04:32, 1 April 2022 (UTC)
    This anecdotal commentary about a statement Jimbo made in 2004 is interesting, but I think proponents are saying the sources don't need to be cited on Wikipedia, not that they don't need to be reliable. Huggums537 (talk) 11:07, 1 April 2022 (UTC)
    No, the wording all along has been that the sources need to be reliable: the only way to verifiably demonstrate that you are not doing original research is to cite reliable sources (emphasis added). If you cite an unreliable source, then according to the NOR page in 2006, you're still doing OR. This makes sense, because plenty of garbage exists out there as Internet flotsam before it arrives here. Both the intent of NOR to begin with and the way it has been used ever since is to exclude that, along with excluding things which happen to have been made up here first. If sources weren't required to be reliable, then the whole policy would become a useless dead letter. Anybody whose crackpot theory was rejected could just turn around, make a Facebook post about it, and come back saying that it's no longer "original". XOR'easter (talk) 17:03, 1 April 2022 (UTC)
    First, please mind the gap between "the only way to verifiably demonstrate that you are not doing original research" and "not doing original research". Imagine that I write "The population of Paris, TX is about 25K people." No cited source – is that original research? Nope. I just haven't yet demonstrated that it's not OR.
    Let's imagine next that I write the same sentence and spam a URL after it as the source. You decide my cited source is unreliable. Is that original research? Nope. Why? Because there are reliable sources (e.g., the US Census) that could have been cited to support this statement.
    As for "the intent of NOR to begin with", I wonder how many of the earliest versions you've read. You are absolutely right that adding the intervening step of requiring the crackpot to post on Facebook first would have bypassed this. OTOH, when this policy was originally written, Facebook didn't exist. Also, it would no longer be "original", but it still would have been rejected per WP:V (unreliable and SPS) and NPOV (undue emphasis on crackpot's own posts). WhatamIdoing (talk) 17:20, 1 April 2022 (UTC)
    When the policy was originally written, Facebook didn't exist, but personal webpages, Usenet, zines, etc., all did. Same garbage, different day. (The Crackpot index is thirty years old.) If NOR did not stipulate reliable sources, then other policies might still keep out the nonsense, but NOR itself would still be useless. Why not eliminate it entirely if its job can be done by V and NPOV?
    Let's imagine next that I write the same sentence and spam a URL after it as the source. You decide my cited source is unreliable. Is that original research? Nope. I'd say that it actually is. In that example, the onus would be on the editor who wishes to add the content to demonstrate that the content, having been challenged, deserves to be restored. And the way to do that would be to find a source that is actually reliable. In your example, it would be easy to guess where to look for a reliable source, but that's not the case all the time. What if, instead of the population of Paris, TX, the statement was about the total number of people still trying to make cold fusion work? It's not obvious that any document would give a trustworthy figure for that, as people outside the niche don't care, and people inside it have a motivation to inflate their numbers. Once a claim has been challenged as OR, it needs reliable sources in order to survive. XOR'easter (talk) 17:55, 1 April 2022 (UTC)
    @XOR'easter, please look at the definition in this policy, along with the footnote:
    The phrase "original research" (OR) is used on Wikipedia to refer to material—such as facts, allegations, and ideas—for which no reliable, published sources exist.
    1. By "exists", the community means that the reliable source must have been published and still exist—somewhere in the world, in any language, whether or not it is reachable online—even if no source is currently named in the article. Articles that currently name zero references of any type may be fully compliant with this policy—so long as there is a reasonable expectation that every bit of material is supported by a published, reliable source.
    Now, using this official definition, do you think that "The population of Paris, TX is about 25K people" is more likely to be "material—such as facts, allegations, and ideas—for which no reliable, published sources exist"? Or would you say that, in this specific instance, this is at least probably material for which we have a reasonable expectation that at least one reliable, published source (i.e., the US Census) has been published and still exists? WhatamIdoing (talk) 18:13, 1 April 2022 (UTC)
    In the specific instance of Paris, TX, we might have a reasonable expectation that a published, reliable source exists. In many, many other cases, we don't. We might not even have that reasonable expectation for the population of all inhabited locales. For example, I wouldn't trust an uncited population figure for any ancient city; such factoids can get copied from trivia book to website to YouTube clickbait, becoming completely divorced from actual historical scholarship.
    The way I see it, the existence of a well-reasoned challenge automatically means that a reasonable expectation is dispelled. XOR'easter (talk) 18:30, 1 April 2022 (UTC)
    Other cases don't matter. In this case, we have a reasonable expectation that a published, reliable source exists. Therefore, can we agree that this case does not fit the official definition of original research, even if this statement doesn't (currently) contain an inline citation? WhatamIdoing (talk) 18:50, 1 April 2022 (UTC)
    In this case, we have a reasonable expectation that a published, reliable source exists, but without that source in hand, we have no way to tell whether the text in question reports it accurately. Populations change from one census to the next, perhaps by enough that "about 25K" is not a fair approximation. The population of Flint, MI dropped from 102K to 81K in between 2010 and 2020. Maybe Paris was more stable than that, maybe not; I hesitate to say that the only reasonable expectation is that it was. What if the editor who added the text was going off old memories or working from an old newspaper story? If you asked me to estimate the population of the town I grew up in, I'd be off by ... checking ... about 50%. So, yes, in this specific case, we can have a reasonable expectation that a reliable source exists, but we still have room to doubt that such a source supports the claim. XOR'easter (talk) 19:11, 1 April 2022 (UTC)
    Have we now agreed that the mere fact of being uncited is not the same as OR?
    You might have guessed from my harping on this that this question is really important to me. This kind of problem – the problem of mashing up all the policies into one interchangeable blob, on the premise that what matters is only that "I say you're wrong" and not "exactly what's wrong" – does not optimally help willing editors solve problems (because you might try to solve the problem I incorrectly claim the article has, rather than the problem that the article actually has) and it especially does not help editors who try to improve and clarify our policies, guidelines, and procedures (because confusion breeds confusion, and Wikipedia:Policy writing is hard enough anyway even if you aren't starting from a point of confusion).
    So it's really important, from this sort of theoretical/philosophical/policy writing viewpoint that editors on this page be able to say:
    • about 25K, uncited: not OR; may or may not be a WP:V problem
    • about 25K, cited to poor source: still not OR, but a WP:V problem
    • about 25K, cited to a good source, but that's not what the source says: not (primarily) OR, but (definitely) a WP:V problem (of {{Failed verification}} type).
    • about 25K, cited to a good source, and that's what the source says: not OR, not WP:V – but still maybe NPOV (The population was about 25K in 1980; is the 1980 census DUE now that it's 2022?) or other problems (e.g., the MOS probably bans the "25K" style).
    Of course, there are other possible problems; if you see "about 25K" and have good reason to suspect that the population for that town is nothing remotely like that, then the presence or absence of citations is really not the most salient problem.
    My point is that these are all different, separate problems. I had hoped that this table would help experienced editors figure out that, e.g., merely being uncited is a WP:V problem and not OR. Perhaps if we provided some sort of table or decision tree, it would help editors use these terms in a fairly consistent fashion. I think that using terms consistently and precisely would improve our communication with other editors. WhatamIdoing (talk) 00:01, 2 April 2022 (UTC)
    See my question below. Crossroads -talk- 04:06, 3 April 2022 (UTC)
    As I said specifically, sourcing is not required for statements that are not challenged and unlikely to be challenged; but once you are challenged that disappears. Policy is clear that it is not sufficient that the sources exist in some vague nebulous sense - you must be able to cite reliable, published sources that are directly related to the topic of the article, and directly support the material being presented. Again, this is the very top of the page and has been for years; and it is quite unequivocal. Whether you agree or not, removing the reliability requirement (as some people are proposing here, and as some of the changes people have recently tried to make to the article would try and do) is a drastic change from existing policy and practice. Under current policy, and current practice, going back years, something is OR if there is no reliable sourcing to back it up; and if you assert that sourcing exists, you must be able to produce that sourcing on request (per the requirement that you "be able to cite reliable, published sources" - "argue that reliable, published sources exist somewhere" obviously does not reflect that!) These are the core longstanding policies. If you believe (somehow) that that is not the case and that the existing text of the article somehow supports your position, then I suppose we can drop the issue, but I am obviously a hard no on any change that would even obliquely suggest that an unreliable source could render something not-OR; and the very fact that people are trying to make such changes seems to indicate a tacit acceptance that the current policy does not support your position. If you want to change longstanding policy, start an RFC. Arguing "oh it was really this way all along" when the text clearly says otherwise is a waste of everyone's time. --Aquillion (talk) 09:16, 1 April 2022 (UTC)
    Your whole argument seems to depend on whether content is challenged or not, but every bit of existing text is unchallenged by default so it is implied this is what the policy is talking about, not some "nebulous" challenged text that a user might or might not encounter. Just take a closer look at what you keep talking about: must be able to cite reliable, published sources that are directly related to the topic of the article, and directly support the material being presented. Do you not see the difference between "must cite", and "must be able to cite"? Huggums537 (talk) 10:27, 1 April 2022 (UTC)
    I think this is the most important question: Do you not see the difference between "must cite", and "must be able to cite"? WhatamIdoing (talk) 17:23, 1 April 2022 (UTC)
    From one perspective, it's hard to see how there is a difference: in practice, how does anyone know if another editor is able to cite a reliable, published source if they don't go ahead and cite it? What other demonstration of the ability to cite a source is there than pointing to it?
    Now, there can be a distinction between what's appropriate when writing text in the first place and what's appropriate when that text is challenged. I've been doing a fair bit of work on FA and GA efforts for high-profile science topics, and when it comes to a scientific subject, what's obvious to an insider isn't necessarily so to every middle-school student who might be reading the page. It may happen that the person who first wrote a paragraph back in 2007 was just expressing what everyone in their field knows and what every textbook on the topic covers in its first chapter. Seeing that kind of material tagged with a {{citation needed}} is, I won't lie, a bit irksome. But the appropriate response is, pretty often, just to find a citation (say, in one or more of those standard textbooks).
    In other words, not including citations when you genuinely don't think the material is likely to be challenged? Eh, fine. Not digging up those citations when the material is challenged after all? That's bad. XOR'easter (talk) 18:22, 1 April 2022 (UTC)
    Sure, let the Wookiee win. It's almost always faster to find that textbook than to educate the Wookiee enough that he'll realize this really was basic information that anyone in the field would recognize. But the question remains: are some editors treating "must do it" and "must be able to do it" as having exactly the same meaning? WhatamIdoing (talk) 18:27, 1 April 2022 (UTC)
    It's not just a matter of satisfying the wiki-Wookiee. Citations can be legitimately helpful even for basic information that anyone in the field would recognize. What if I know the material in my field, but I want a particularly good explanation that I can recommend to a student? What if I understand the concepts, because everyone in my field does, but I want to confirm that a particular paper was published in 1992 instead of 1994? We're supposed to write technical articles one level down from the level at which the topic is normally introduced. That means, among other things, adopting different citation habits than one uses when communicating with fellow specialists.
    As English phrases, "must do it" and "must be able to do it" have different meanings; when it comes to supporting claims in Wikipedia articles by providing references, the easiest way to achieve the latter is to do the former, particularly when the expectation that reliable sources exist turns out not so indisputably reasonable after all. XOR'easter (talk) 18:44, 1 April 2022 (UTC)
    Not really. The easiest way to be able to cite information is to not make stuff up. Writing down basic information that everyone in your field already knows, plus also pulling out a couple of textbooks and figuring out which bits can be supported by which sources, is not easier than just writing down that basic information.
    For example, I happen to know that all-purpose flour usually has a protein content in the range of 10–12%. I also happen to know that unbleached all-purpose flour from King Arthur Flour happens to have a protein content of 11.7%. Think about it: Would it be easier for me to just write this down, or is it easier for me to write this down plus find a source for it?
    I grant that it's easier for page patrollers if I pre-supply a source (which they probably won't bother to check), but your assertion is that the easiest way for me to be able to cite a source if it's ever challenged is to already go to the extra trouble of finding and citing the source. I don't think that's true. We could agree to disagree about whether it's the best way, but I think your assertion that it's the easiest way is wrong. WhatamIdoing (talk) 00:13, 2 April 2022 (UTC)
    Too bad for them. Writing unsourced text is laziness and should not be endorsed in 2022. I understand that in its early years Wikipedia was much more tolerant of this, simply needing content and figuring it will be improved soon enough. That's not good enough now. Unsourced text is almost always reverted and would never be accepted at AfC. People should not be adding any text because "everyone familiar with the topic knows it" according to them, and such a justification is a massive red flag. The easiest way to demonstrate not just that the information is sourceable but that they are indeed able to source it is simply to source it. Crossroads -talk- 02:48, 3 April 2022 (UTC)
    I think we might need to stop and talk about what the word "easy" means. WhatamIdoing (talk) 17:22, 3 April 2022 (UTC)
    Sure, let the Wookiee win. This really made me laugh out loud! I guess you saw my edit summary about not wanting my source published for OR... Hee, hee ;) Huggums537 (talk) 21:20, 1 April 2022 (UTC)
    I hadn't then, actually, but now I have! WP:Let the Wookiee win is an essay that @Blueboar once threatened to write, about the stupidity of arguing with people who requested a citation. It's faster to give them what they allegedly want than to convince them to quit asking for it.
    Although I personally draw the line at the fact-tag spammed onto the claim that the human hand normally has five digits. Occasionally, probably rarely, a fact-tag is not meant to improve an article. WhatamIdoing (talk) 00:18, 2 April 2022 (UTC)
    An interesting, and happy coincidence! Huggums537 (talk) 01:30, 2 April 2022 (UTC)

I'm against that addition/changes, and it is problematic for many reasons. The first few relate to it being a large addition of essay-like prescriptive advisory type material to a core policy page. What it says or implies is inevitably several major policy changes, which would need a consensus on a very substantial and widely advertised RFC to change. Second, it is far too prescriptive, with all of the problems that go along with that. Also, what it implies is in direct conflict with WP:Verifiability. Sincerely, North8000 (talk) 17:14, 1 April 2022 (UTC)

Since "you can" is not merely its literal meaning (that you are not prohibited from doing so), its obvious implication is you "should". North8000 (talk) 17:21, 1 April 2022 (UTC)
I wonder what makes you say that it's prescriptive (demanding that a specific thing be done). I thought it was more informative (providing information about optional approaches). WhatamIdoing (talk) 17:22, 1 April 2022 (UTC)
I do recognize that what you describe is the intent, and in the context of an essay it would be just as you describe. While my most recent post did talk about it implying "you should" out of it, when I said "prescriptive" I was more referring to an attempt to be very detailed ("If A, then B" or in a flow chart type way) which rarely works in Wikipedia. Sincerely, North8000 (talk) 17:32, 1 April 2022 (UTC)
I see. So not really prescriptive, but something that might (e.g., by providing clarity) have an effect on editors' behavior (e.g., helping them use the most relevant template). WhatamIdoing (talk) 18:15, 1 April 2022 (UTC)
I don't understand but I think we're good. In short, I think it's nice and useful work, but not in the context of putting it into this policy. Sincerely, North8000 (talk) 23:46, 1 April 2022 (UTC)

Question

Do the editors here that are making such a fuss about this understand that content does not NEED to be original research in order for it to be challenged and removed if it is unverified and unsourced, per WP:V? Crossroads and a few others seem to think that if they can't label content as OR, they can't challenge and remove it pending verification, and that's absurd! It is like arguing that the definition of "running a red light" needs to also include in its scope "driving over the speed limit", because the latter is dangerous, and cops must be able to ticket it, therefore it has to be one of the definitions of "running a red light", when, in reality, it doesn't, because speeding is also ticketable. Likewise, if unsourced material is challenged, then it fails WP:V - also a core policy - and it can be removed. But being unsourced does not automatically make a thing original research - that's ludicrous. Crossroads asked "why then does OR need to exist?" WhatamIdoing explained the historical background fairly well. It isn't common practice anywhere to remove rules that exist just because someone thinks they aren't likely to be relevant anymore (e.g., there's still a law on California's books that says Indians are allowed to gather acorns on public lands). I think the most applicable part of WP:NOR in the present day is probably SYNTH, though; editors still attempt to (and do) add their own novel conclusions to WP article space that are not stated by any of their individual sources all the time, and WP:V doesn't necessarily cover those cases. 2600:1702:4960:1DE0:4D69:F0F8:A0CA:5BC (talk) 00:01, 2 April 2022 (UTC)

IMO this policy and wp:ver are 90% duplications of each other, with the other 10% being the expansion on synthesis. To the point where there have been some large scale attempts to merge them. So, in your analogy, if the red light law is in both policies, they are related. Sincerely, North8000 (talk) 00:49, 2 April 2022 (UTC)
We should try the merge proposal again. Splitting V and NOR is confusing. Levivich 16:24, 3 April 2022 (UTC)
Based on what happened last time, I think that the first step would be to get PSTS split from NOR (into its own page, still a policy, still mentioned here, etc., but not part of NOR itself). WhatamIdoing (talk) 17:24, 3 April 2022 (UTC)

Two scenarios

Scenario 1: An editor adds a paragraph of unsourced material, claiming X because Y and Z. I revert this claim and say it fails WP:BURDEN, which I cite. (Many might instead point to RS or the like with a new editor, but for the purposes of this comparison, suppose that more firmness is needed for whatever reason.) Unbeknownst to me, RS supporting "X because Y and Z" do exist. Nevertheless, as far as I can tell, nobody in this discussion would consider me incorrect in applying the policy that way.

Scenario 2: An editor adds a paragraph of material, claiming X because Y and Z. This time, Y and Z are sourced, but neither of them support X. I revert this claim and say it is WP:SYNTH, part of OR. I do so because SYNTH does the best job of explaining why their sources are not sufficient for the claim they are making. Unbeknownst to me, RS specifically supporting "X because Y and Z" do exist.

Why should these situations be any different? Why is there all this hand-wringing about how it's a big difference if sources "exist" 'in any language, anywhere in the world' for Scenario 2? Why exactly does any of that matter in practice?

Another concern with the table is its repeatedly saying to CHALLENGE it if you believe that no reliable source exists to support that claim. But often, when such text is reverted rather than tagged, it's not that we actively believe that no RS exist, but rather that without sources, in many cases our presumption is that potentially unsourceable text is best left out of Wikipedia. Crossroads -talk- 05:03, 3 April 2022 (UTC)

I agree with your logic, and also that it is an important point. As a sidebar, it's interesting that we're discussing/debating about a very high threshold for reversions of added material (that the reverter has to say or believe that no sources exist), when in reality the current standard is down in the toilet at zero... the reverter does not have to state even a brief perfunctory statement of concern about the veracity or sourceability of the material, thus making it prone to wikilawyering POV'ing use. North8000 (talk) 13:05, 3 April 2022 (UTC)
I disagree with this so-called "logic" completely. The current guidance does not exist solely for the benefit of editors to have an excuse to remove content. I see this as a very partisan way of viewing the guidance, as it fails to take any consideration for the fact that the current guidance is intentionally written to account for special circumstances, such as if an editor adding content claiming X because Y and Z is aware of sources supporting X, but they are unable to afford providing them because of being behind a paywall. Or, maybe the editor adding the content is a specialist in their field who is privy to knowledge about a source that could be used, but because of some censorship issue in China, it will have to go through red tape before it can be available, and then it will also be in a different language. Any child can read the guidance to see the intention of the guidance is to cover all contingencies, not serving a single purpose agenda. Huggums537 (talk) 15:57, 3 April 2022 (UTC)
If they aren't willing to cite it because of a paywall, how do they even know it supports their claim? Either they somehow read it and can cite it, or they didn't get the idea from an RS and shouldn't be adding it. And if the only source for a claim is being restricted by the Communist Party of China, then the claim simply does not belong in Wikipedia. We cannot take claimed experts at their word. Anyone can edit and anyone can claim to be an expert. As untrustworthy as the CCP is, "the government is blocking the evidence" is also untrustworthy. That's just Verifiability, let alone NOR. Crossroads -talk- 01:28, 4 April 2022 (UTC)
It seems like you are just arguing over the fact that I didn't come up with perfect examples to avoid the point of my argument, which you have not addressed. However, to address your concern, did you ever think that perhaps someone read it from a source they are unable to cite for whatever reason, but the source they read it from points to a reliable paywall they don't have access to? You can't really cite something you don't have access to even though you have a reasonable assurance of reliability from somebody else citing it, which is why Wikipedia can't be used as a source. Also, speaking about Verifiability in regard to; "the government is blocking the evidence" WP:V says, Do not reject reliable sources just because they are difficult or costly to access., and If you have trouble accessing a source, others may be able to do so on your behalf. So, while the person being censored might not have access, somebody else might. But, all that is unimportant quibbling. The main point is how the focus of the current guidance is covering multiple contingencies not being an ends justifying some specific means. Huggums537 (talk) 06:28, 4 April 2022 (UTC)
"Oh, let me fix that, because I just read about that the other day, but I can't remember the book's title right now" is something that happens. WhatamIdoing (talk) 15:46, 4 April 2022 (UTC)
That's another example, and I will tell another one in more detail below... Huggums537 (talk) 17:07, 4 April 2022 (UTC)
Heck, I bet even Crossroads could come up with better examples than I did since he is a much more experienced editor than me, but I doubt he would name them for fear of losing support in the debate. Huggums537 (talk) 17:25, 4 April 2022 (UTC)
Crossroads, I think the difference for the specific and unusual scenario you present is fairly minor. You genuinely believe the editor has combined Y and Z to produce X, and if you were correct, then it really would be a SYNTH (and therefore OR) problem.
The much more relevant problem is:
Scenario 3: An editor adds a statement claiming a simple fact, like "The population of Paris is about 25K" without a citation. I revert this claim and tell the editor that Wikipedia accepts Wikipedia:No original research. Unbeknownst to me, all of the reliable sources published during the last 40 years support this statement (the city's population has been fairly stable).
Now: Was my action good? Reverting the addition is acceptable, although WP:PRESERVE suggests that we shouldn't revert things when, as in this case, a few seconds with a web search would produce the missing source. But we don't enforce the rule that nominally requires would-be reverters to be collaborative, and we don't have a rule that requires editors to know everything or to make only perfect edits. So that part is okay; I removed accurate content, but it's not the end of the world.
Was my explanation good? Telling the other editor not to engage in OR is basically wrong. If it's a newbie, the newbie will think "Wikipedia" (as opposed to some random person on the internet who happens to have figured out where the undo button is) has accused the newbie of lining up everyone in town and counting them off to come up with the number. If it's not a newbie, the editor will just think I'm an idiot, but the harm probably won't go beyond lowering my own reputation (and maybe the occasional drama board discussion).
Does this matter outside of the specific people involved in the transaction? I think so, for two reasons:
  • Less experienced editors might believe my written excuse for reverting the addition. This results in drift between what the editors believe the pages say and what the pages actually say. This is the telephone game problem, and it has resulted in problems like editors believing that BRD is a mandatory policy, or that the word secondary is just wikijargon for a good source.
  • When we are writing the policies (the purpose of this particular talk page), we need to know exactly what we're talking about. It might feel pedantic to insist upon the distinction in your scenarios 1 and 2 (both of which appear to fail both WP:V and NOR), but in discussions about the policies, it's helpful if everyone involved actually knows the exact meanings. You don't want your doctor using words like flu and Covid interchangeably; we don't want editors using words like unverifiable and original research interchangeably.
For newbies, this last point is really not important. But for the people who gather on this page, to see how the policies can be improved, it is very helpful if they can use exactly the word(s) that describe a situation precisely. I hope that this table will help the experienced editors like us more easily understand the relationship between the sourcing policies (assuming, of course, that Levivich doesn't manage to get NOR and WP:V merged). WhatamIdoing (talk) 18:54, 3 April 2022 (UTC)
Yeah, and everything she said too... Huggums537 (talk) 00:06, 4 April 2022 (UTC)
You claim that my scenarios, stated in very general terms, are specific and unusual. They are not. I have dealt with exactly those situations numerous times. Your scenario, which you claim is much more relevant, is actually extremely unrepresentative. It skews the discussion because we all know that city populations are counted and recorded in censuses, and hence that sources exist for Paris' population. For the overwhelming majority of claims that someone might write, a reviewing editor will not know whether or not sources exist. If they were to write pretty much any other claim about Paris, I would have no idea if a source exists.
Citing NOR when removing a statement which has no sources is uncommon because it makes much more sense to cite V or RS in such cases. Either way, it is still completely irrelevant whether sources "exist" somewhere out there. The currently-in-text but IMO outdated distinction where one of the two policies cares about whether sources "exist" before it can be truly considered a violation, regardless of the edit itself, should not be more emphasized. It will lead to much worse confusion, seeming to raise the bar on when OR/SYNTH can be pointed to and when text can be removed.
If rejigging/merging of NOR and V were done, here's what I would suggest: V is about when no sources are cited, or the sources cited are unreliable. NOR (whether it remains a separate page or not) is about when reliable sources are cited, but they are being synthesized, or extrapolated or expounded upon, to reach or imply conclusions that are not in those sources. The stuff about how it's not OR if sources exist anywhere on Earth in any of numerous languages that our editors don't even understand would be fully deprecated and deleted as an outdated legacy irrelevant to determining misuse of sources. I'd argue that we are already there in practice - editors usually cite OR if text is citing reliable sources but is misusing them. Crossroads -talk- 01:18, 4 April 2022 (UTC)
Your scenario, which you claim is much more relevant, is actually extremely unrepresentative. Of what? Not policy, that's for sure. Also, patently untrue statement because simple statement of fact scenarios are hundreds, perhaps thousands of times more common than your specific SYNTH statement scenarios making them exponentially more relevant. For the overwhelming majority of claims that someone might write, a reviewing editor will not know whether or not sources exist. Are you serious?!? I think the exact opposite is true. I think the overwhelming majority of editors here have some kind of clue if a source exists for a particular piece of content they are reviewing. It will lead to much worse confusion, seeming to raise the bar on when OR/SYNTH can be pointed to and when text can be removed. I see no explanation of alleged confusion, but I do see this pattern of being obsessed over the removal of text. The stuff about how it's not OR if sources exist anywhere on Earth in any of numerous languages that our editors don't even understand would be fully deprecated and deleted as an outdated legacy irrelevant to determining misuse of sources. In the modern age of Google Translate, your idea that editors can't understand sources in other languages is actually the "outdated legacy". Also, The stuff about how it's not OR if sources exist anywhere on Earth in any of numerous languages never was about misuse of sources in the first place, so it is kind of a moot point in suggesting they are irrelevant to determining misuse of sources. Huggums537 (talk) 07:29, 4 April 2022 (UTC)
About this:
  • Citing NOR when removing a statement which has no sources is uncommon because it makes much more sense to cite V or RS in such cases.: My goal in creating this table is to make it happen even less often, because, as you say, it makes much more sense to cite WP:V in such cases.
  • it is still completely irrelevant whether sources "exist" somewhere out there: Not according to the written policy, as you say.
Emphasizing this distinction does not "raise the bar on...when text can be removed". Emphasizing this distinction primarily has an effect on the edit summary you type when you remove it, not on whether you remove it.
I wonder whether your reaction to the long-standing "sources exist" is a worry that editors will expect you to know about the millions of sources. There has never been any expectation along those lines. The goal is:
  • Bob: This smells like original research.
  • Alice: Here's a source to prove that this idea was published elsewhere/not made up by a Wikipedia editor.
  • Bob: Oh, thanks. I guess it's not original research after all.
The goal is not:
  • Bob: This smells like original research.
  • Alice: Here's a source to prove that this idea was published elsewhere/not made up by a Wikipedia editor.
  • Bob: I still say that it's original research, because you didn't cite the source in the first edit and say Mother May I? when you were adding it.
The goal is also not:
  • Bob: This smells like original research.
  • Carol: It's your job to prove that no source in the world has ever published this. I hope you read every language in the world and can prove a negative.
Your suggestion for rejigging the policies has the following problems that you'll want to address:
  • It doesn't ban actual original research ("I proved cold fusion in my kitchen, and I want to share my results on Wikipedia instead of on Facebook or in a scientific journal"). This should be trivial to fix; just add to your proposal a brief section on ==No original research== that says "No, really, we don't publish your original scientific research results, or anything similar. If it wasn't already published somewhere else first, then go away".
  • It narrows the concept of verifiability to use only cited sources. This will be a harder lift. Traditionally – though I suspect that newer editors like yourself have a different idea about this – a fact was verifiable if you were "able" to verify it, using all the sources available to you, not merely if it happened to be in a source cited at the end of that sentence. This would be redefining editors' ability to verify information as being "unable" unless the source is presented to them. The narrower model leads to two problems (all uncited material, including Sky-is-blue and other non-MINREF material being declared unverifiable + need for multiple citations in each sentence of the WP:REPCITE type) but might solve another debate (whether material cited elsewhere in the article needs a citation in every location vs assuming we're smart enough to remember that information cited in paragraph #3 is still verifiable by the time we get to paragraph #6 – this approach would likely require repeating the citations at every mention).
I've no objection in principle to rejigging the policies. I do strongly believe that the first step needs to be splitting PSTS out into its own policy. WhatamIdoing (talk) 16:59, 4 April 2022 (UTC)
I would like to add a good example to reinforce what @WhatamIdoing is saying. Suppose I add a line of text that says, "There is the shape of an arrow hidden within the FedEx logo." without adding a source. Is it original research? No. Why? Because sources exist for it. As a matter of fact, you don't even need a source for it at all because common knowledge of the logo has made it so stinking easily verifiable that even the primary source is worldwide available, and you can verify it next time you see the logo out in the world, or on a google search. Not even a secondary source is needed. Textbook BLUESKY example. Would I source it though? Probably, because some jackass might demand me to if I did not. Furthermore, if I added this line of text to my draft article about the FedEx logo, nobody would ever dream of going to my unfinished draft, and removing the text based on OR. The people suggesting that unsourced text is equal to OR are also essentially saying that we have two sets of rules for OR, where we quite happily allow what we would normally call OR in our articles into our unfinished drafts as long as it is in private, but what we might call "not yet sourced draft text" for unfinished articles is not allowed for public view. This is hypocritical. My view on OR is consistent, and uniform with policy, saying OR is not allowed anywhere on Wikipedia, it's just that we have very different views about what that might look like in real terms. Huggums537 (talk) 17:53, 4 April 2022 (UTC)
I think that we're sort of dealing with many different topics at once here:
  • In these central areas that we are discussing (I.e. not counting the PSTS and synth sections), wp:nor and wp:ver are basically duplication of each other and so there is no fundamental difference between OR and violating wp:ver.
  • The rules only apply in certain places, so there is a big difference between a draft and an article in article space.
  • We do need to allow the rules to get invoked for challenged BLUESKY text, otherwise we weaken wp:ver too much. But I advocate requiring a perfunctory statement of concern about veracity or sourceability with the challenge. Even wikilawyer POV warriors trying to knock out material would be deterred if forced to look stupid by having to say "I have concerns that it might not be sourceable that the sky is blue".
Sincerely, North8000 (talk) 18:28, 4 April 2022 (UTC)
The rules only apply in certain places, so there is a big difference between a draft and an article in article space. Realistically speaking, there really isn't any difference between a draft, and an unfinished published article other than the fact that articles get indexed by Google, and are searchable/viewable to the general public on the internet, and draft articles do not. Other than that, draft articles have the exact same rules for review that any other article would have no matter which "certain places" may be reviewing them. The truth is any of these "certain places" handle the rules very much differently when handling, and reviewing drafts. In essence, pretending like there is a big difference between a draft, and an unfinished article, when there really isn't that much. That is part of the problem. Huggums537 (talk) 03:32, 5 April 2022 (UTC)
The searchable/viewable to the general public on the internet is a massive and fundamental difference. Draftspace's purpose is to keep crap out of article space and exercise some quality control. We're not able to catch all the crap in article space either, but we do need to do our best to fight it. Crossroads -talk- 03:47, 5 April 2022 (UTC)
But my point is that the rules are being treated differently, not the articles. It's as if the community is more lenient with the rules on a draft because this is the productive work of some Wikipedian, but after it has been approved and/or released "out into the wild" of public article space, it is now "fair game" to be preyed upon by any lurking predators. So, if there were any big difference between drafts and articles that are creating a problem, that would be it. Huggums537 (talk) 03:59, 5 April 2022 (UTC)
Thinking of other reviewing editors as "predators" is a problem. The more eyes on these articles, the better. The community is more lenient with drafts, but only because they are not actually articles. Many articles, also, have never been drafts and never got review at AfC. They either existed before draftspace was created, or were created or moved directly by an autoconfirmed editor. Crossroads -talk- 04:25, 5 April 2022 (UTC)
Thinking of other reviewing editors as "predators" is a problem. How else should you think of people if, The community is more lenient with drafts, but only because they are not actually articles. and, when it comes to "real" articles they are not lenient with the rules, but use more hunt and destroy predatory methods? Many articles, also, have never been drafts and never got review at AfC. They either existed before draftspace was created, or were created or moved directly by an autoconfirmed editor. This exactly proves my point. There is precious little difference between a draft, and an article, yet the rules are not being treated the same. Huggums537 (talk) 05:11, 5 April 2022 (UTC)
This is not true: OR is not allowed anywhere on Wikipedia. The very first sentence of this policy limits NOR's concerns to "articles". WhatamIdoing (talk) 20:48, 5 April 2022 (UTC)
WhatamIdoing, regarding your last two bullet points: (1) I doubt anyone has honestly tried to use Wikipedia as their publisher for novel experimental findings since, like, 2005, but such a case could be reverted with WP:V, as no reliable sources for such a thing exist. Synthesizing sources into a new thesis or literature review is also a type of research, and original research of that sort is also forbidden yet is by far the most common form of it today. WP:Student editors, for example, do it all the time. (2) While I do think we should upgrade WP:V to require people to actually cite sources, doing so isn't essential to my proposal. The main point is that OR would be about misuse of reliable sources while everything else is covered by V and RS. Crossroads -talk- 03:47, 5 April 2022 (UTC)
I think Crossroads is right about the original problem of novel experimental findings being a fairly antiquated problem, and modern day SYNTH being more of a concern. However, I remain unconvinced about his proposal mainly because this discussion has dragged out so long that I actually forget what it was... Huggums537 (talk) 04:13, 5 April 2022 (UTC)
(1) Yes, novel experimental findings would be a violation of both NOR and WP:V. This fact is one of the reasons that the merge was proposed.
(2) The point is that OR isn't the misuse of reliable sources. Original research = material—such as facts, allegations, and ideas—for which no reliable, published sources exist, anywhere in the world, in any language, whether known or unknown to the editors, etc. Misuse of reliable sources, with the exception of misuse in the form of SYNTH, is a WP:V problem, and always has been.
This problem is the purpose of the table. Look at it:
  • No source: a WP:V problem.
  • Unreliable source:  a WP:V problem. 
  • Misused source:  a WP:V problem. 
  • Misused sources in combination:  a SYNTH problem. 
  • Ain't nobody never said that nowhere: actually OR
The "pure" OR problems (i.e., those that aren't SYNTH) may be less common than they once were. In that case, editors should only rarely have any reason to complain about OR in articles.[1] What I'm feeling in this discussion is a story that says "Well, since people almost never post real OR any longer, then I should get to use the term original research to mean whatever I want, such as when I'm talking about problems technically covered by a completely separate policy". That makes as much sense as saying that you'll be driving a chariot to the store when you go shopping – maybe a little funny, but not helpful if someone looks up Chariot when they ought to be looking up Ford F-Series.
  1. ^ Although, unfortunately, that's not true. With the 2010 change to include information published only in unreliable sources, all of those soon-blocked editors writing about COVID misinformation because they read The Truth™ on some internet forum run by conspiracy theorists are engaging in pure OR. Under the original definition, those people would have been violating WP:V but not WP:OR.
WhatamIdoing (talk) 21:02, 5 April 2022 (UTC)
Please stop repeating the erroneous claim that OR didn't require reliable sources until 2010. This was thoroughly debunked by XOR'easter above.
I am well aware of how OR is defined presently, but I described this as a proposal because my point is that that definition is outdated and redundant and should be changed.
I believe that what is a confusing use of "OR" is not what you think. If I remove some text saying that it is OR, I believe that most editors will understand that as "oh, the sources are being combined or extrapolated to support ideas not found in them", not "oh, he's saying none of the world's sources in any language support that claim". I've been editing since 2018, and regularly since 2019, and that's how I always understood it until recently when you and like one other editor started to say it only applies if no sources exist anywhere. If typical editorial practice has diverged from the strict text of the rule, then in this case I would suggest it is more sensible to update the latter, not the former. After all, stuff not published in any source is already 100% blocked by WP:V, so by your interpretation "pure OR" is a completely redundant and useless concept. Crossroads -talk- 05:11, 6 April 2022 (UTC)
It was not debunked by XOR'easter. Until 2010, the definition of OR was about content that had not been published anywhere, including in unreliable sources. The method for proving verifiability (=what XOR'easter quoted) is not the same as the definition.
Do you think that OR==SYNTH now? The case you mention ("sources are being combined or extrapolated to support ideas not found in them") is OR,[1] but SYNTH is only one subtype of editors adding content that doesn't appear in sources.
[1] Always assuming the idea isn't present in some other, uncited reliable source. We sometimes see scientists, in particular, citing the seminal papers that led to a discovery but not the boring sources, like textbooks, that spell out the end result. If I were to cite Watson and Crick's famous paper on DNA, and I write that the discovery of DNA's helical structure led to a fundamental shift in our understanding of genetics, then it's not OR (because the idea is in every biology textbook published since the 1970s), but it did fail verification (because it's not in the specific source I cited). WhatamIdoing (talk) 17:13, 6 April 2022 (UTC)
NOR policy, 2005: The phrase "original research" in this context refers to untested theories; data, statements, concepts and ideas that have not been published in a reputable publication (emphasis added).
SYNTH is a type of OR, although the text of SYNTH seems to concern itself with the sources actually being cited; e.g. do not combine different parts of one source to reach or imply a conclusion not explicitly stated by the source. If one reliable source says A and another reliable source says B, do not join A and B together to imply a conclusion C not mentioned by either of the sources. (Emphasis added.) This makes far more sense than any alternative.
If a reliable source exists in a forest, and no editor knows about it or reads it, does it make something not SYNTH? Crossroads -talk- 06:19, 8 April 2022 (UTC) revised Crossroads -talk- 07:15, 8 April 2022 (UTC)
I said I wasn't going to participate anymore, but this is getting ridiculous; If a reliable source exists in a forest, and no one knows about it or reads it, does it make something not SYNTH? Well, if a source exists in the forest, that means someone had to write it and put it there, so somebody knows about it, and has read it by virtue of the fact it was written by someone in any language, and it exists anywhere in the world. It doesn't make something not SYNTH, but it is enough to make something not OR. Please stop confusing the two. Huggums537 (talk) 06:46, 8 April 2022 (UTC)
I revised it for clarity, but it's a play off a saying. Crossroads -talk- 07:15, 8 April 2022 (UTC)
NOR policy, 2005, exactly the same diff, previous sentence: "Original research" refers to original research by editors of Wikipedia; it does not refer to original research that is published or available elsewhere." No mention of reliability or reputability. Direct and explicit mention of "by editors of Wikipedia", i.e., not original research done by non-editors that the editors of Wikipedia happened to hear about.
Your choice of aphorism is reasonably apt. When the tree falls in a forest, but no one is around, it makes a sound (sound waves are produced) but it does not make a sound (there is no perception of those sound waves). In this case, if a source exists for the content, but the editor does not cite it (or even know that it exists), the content looks like a SYNTH/OR violation, but actually isn't. WhatamIdoing (talk) 19:52, 8 April 2022 (UTC)
You can't quote that sentence shorn of the clarifying context in the sentence immediately after which I quoted, which specifies a reputable publication. As if anyone in 2005 would buy the argument, "it's not OR because the same ideas exist at crankphysicsblog.com".
And the point of my aphorism is that in practice it does not matter whether sources exist unbeknownst to editors. I have yet to hear a good reason why it actually matters whether unknown sources exist, except for the circular argument that 'it's what the policy says' - in some places yes, but my whole point is that this is irrelevant and pointlessly confusing. In other places it says things like, conclusion C not mentioned by either of the sources. This would be improper editorial synthesis of published material to imply a new conclusion, which is original research. This is much more sensible. Crossroads -talk- 04:33, 10 April 2022 (UTC)
Actually, editors did say that things were not OR but still failed WP:V and NPOV. See, e.g.:
I agree that SYNTH has drifted over the years to focus more on the currently cited sources than it probably should. The fact remains that if you can fix an alleged OR (including SYNTH) problem by adding a citation to a reliable source, then it wasn't an OR (including SYNTH) problem in the first place. It was a missing-citation problem, not a made-up-by-a-Wikipedia-editor problem.
As for why it matters: This policy was created to explicitly say that editors shouldn't use Wikipedia as their first place to publish a new idea. Yes, I know: almost nobody tries to do that now, and the ones who try are a lot more subtle and more SYNTHy about it than they used to be.
But it's not irrelevant, and it's not pointlessly confusing. What's pointlessly confusing is claiming a NOR violation when the problem is fully covered by WP:V. If your concern is a missing citation, then just say that it needs a citation. Don't bother accusing the editor of adding information that can't be sourced properly when your practical concern is only that the editor didn't source it (yet). WhatamIdoing (talk) 20:11, 10 April 2022 (UTC)
well, *shrug* the 2005 policy says what it says. One of your cited arguments is from 2013, well after the supposed change in 2010. It's evident that people will make mistaken arguments no matter the time period.
We have vastly more need for a policy that says "even if you're citing reliable sources, don't use them to reach conclusions not found in those sources" than something about "don't use Wikipedia to publish a new idea". I've never seen any confusion from using OR in this way except for one or two long-time editors policing the meaning of OR. And to be clear, I'm saying SYNTH should stay focused or become more focused on the currently cited sources. Crossroads -talk- 01:19, 12 April 2022 (UTC)
We have a policy to deal with "ghost references" (sources cited in support of a claim, only nothing like that claim can be found in the cited sources). Those are WP:V problems; they have {{failed verification}}. A simple and common example of that is "This was first described by Alice Expert in 1895", citing Alice's 1895 scientific paper on the subject. The citation demonstrates that Alice described it, but it does not prove that she was the first.
If it is additionally true that nobody in the world has ever written about who published the first paper (or if, e.g., the reliable sources claim that Bob did), then it is a WP:OR problem. But when the problem can be solved by adding "Big Blue Science Textbook, ISBN 978-123456780. p. 234" in the refs, then it's not "material—such as facts, allegations, and ideas—for which no reliable, published sources exist." It is, in that instance, only "material—such as facts, allegations, and ideas—for which the appropriate and relevant reliable, published sources have not yet been cited in the article", because, as you say, the policy says what the policy says, no matter if some editors have gotten into the habit of incorrectly crying "original research" when the problem is "someone cited a source that doesn't 'directly support the material'". WhatamIdoing (talk) 13:58, 12 April 2022 (UTC)
  • I'm going to respectfully withdraw from the conversation. I recognize when I'm getting overcharged about something that really isn't that important. I'm going to go to my mellow place now. Thank you all for having me. Huggums537 (talk) 19:30, 5 April 2022 (UTC)
Maybe overcharged regarding your own level of concern, but here all is fine. North8000 (talk) 19:55, 5 April 2022 (UTC)
  • In the above table I think there might be scope to elaborate on what we mean by "content"? Partly informed by my various trials and tribulations communicating with QuackGuru, and partly by the Rlevse incident which happened in Wikipedia's equivalent of the Late Jurassic, I've been groping towards an idea about verifiability as it applies at different levels of abstraction. There's an unfinished essay on this at User:S_Marshall/Essay but the Cliff Notes are, verifiability applies at the level of thoughts, concepts and ideas, not word choices. The fact that we don't use a source's exact phrasing does not mean we're engaging in OR. Is there a pithy, succinct way to say this?—S Marshall T/C 23:22, 5 April 2022 (UTC)
You're tackling a big topic there. I guess the indirect way is the current one....that we're supposed to summarize, not copy. And use quote marks and say who said it when we do copy. North8000 (talk) 23:31, 5 April 2022 (UTC)
Why's it a big topic? To me it looks like a need for an extra sentence or two in NOR.—S Marshall T/C 07:52, 6 April 2022 (UTC)
I think you'll find that it's more complicated than that. Word choice matters in some situations, and those situations aren't always clear. Consider, e.g., "people infected by a rhinovirus". Is that the same as "patients with the common cold"? Maybe. It depends on what the source is talking about (e.g., the word patients might exclude people who didn't seek professional medical care, which is most people infected by a rhinovirus; people who are infected but asymptomatic don't have the common cold; rhinoviruses are only one virus that causes the common cold) and what you're writing (e.g., the level of generality).
As for the "extra sentence or two", it's already there in the lead: "Rewriting source material in your own words while retaining the substance is not considered original research." WhatamIdoing (talk) 17:28, 6 April 2022 (UTC)
It is already there in the lead, and it's old, old wording. It said exactly those words when I found someone with 300k edits tagging individual words with {{fv}} here, and it said exactly those words when arbitrator, checkuser, bureaucrat, prolific writer of featured content and DYKs, and all round Pretty Important Wikipedian Rlevse wrote this parting statement. If those people didn't get it, then maybe the point could use some more emphasis?—S Marshall T/C 22:21, 6 April 2022 (UTC)
It's also been in Wikipedia:These are not original research#Paraphrasing for years. OTOH, WP:Nobody reads the directions, and a sizeable number of editors won't admit to reading the directions if the directions don't support their view. See, e.g., the discussion at WP:WTW in which editors say that WP:INTEXT attribution is not appropriate for some WP:LABELs, but they still want LABEL to demand that in-text attribution be used every single time. The fear appears to be that if we admit that it isn't required always, then other editors will refuse to use it when (I think) they really should. WhatamIdoing (talk) 23:44, 6 April 2022 (UTC)
Could something to that effect somehow go into your proposed table? If nobody reads it then at least I'd have a bigger stick with which to beat the next idiot in the talk page discussion.—S Marshall T/C 07:57, 7 April 2022 (UTC)
This table deals with things that are violations of something. Maybe we need a new subsection under Wikipedia:No original research#What is not original research? Or perhaps two: One for ===Writing in your own words=== and another for ===Source-based research===, which I believe used to be positively encouraged in this policy (source-based research is the kind that you do to determine whether a source is reputable, and thus that you should prioritize information from Journal of Serious Academics over crackpotsRus.com). WhatamIdoing (talk) 22:59, 7 April 2022 (UTC)

I'd say that evaluating sources is the key task of the encyclopaedist, but I would hesitate to call source-evaluation "research" (particularly on Wikipedia where that language has connotations).—S Marshall T/C 00:27, 8 April 2022 (UTC)

I think the idea was that you shouldn't do "original" research, but you should do "unoriginal" research (i.e., writing what the sources say). WhatamIdoing (talk) 03:22, 8 April 2022 (UTC)
Is reading the sources "research" though? Maybe to an economist but in proper, rigorous fields, "research" involves experiments or observations followed by statistics.—S Marshall T/C 09:03, 8 April 2022 (UTC)
There are at least two varieties of research. Source-based research is learning about a topic by reading sources. Original research is learning about a topic by observing the world, doing experiments, or thinking new thoughts.
Evaluating sources is an activity that falls within research. Either source-based research or original research can be applied to source evaluation. For example, if a source says the power supplied by a USB cable is 12 volts, and I pull out my voltmeter and find it's really 5 volts, I have done original research to discredit the source. I could use this result in a Wikipedia talk page to suggest the source not be cited in the Wikipedia article, but couldn't use the result in the Wikipedia article. Jc3s5h (talk) 12:21, 8 April 2022 (UTC)
I have always thought that we made a mistake when we entitled this policy “No original research”… it really should have been called: “No original analysis or conclusions”. It doesn’t really matter whether your research is source-based or observational… what matters is what you DO with that research. If it leads to an original analysis or conclusion, we don’t want it. This is why SYNTH is part of the policy… you are taking what two sources say and combining them to reach an original conclusion. Blueboar (talk) 12:42, 8 April 2022 (UTC)
Well, we could always look it up on Wikipedia. Research, basic research, applied research, all pretty clear about what research is. It's the analysis and interpretation of data. It follows that reading other people's research is not research. I think the correct word for the reading and critical evaluation of other people's research is education, and that good Wikipedians are autodidacts: we self-educate before and during the article-writing process.—S Marshall T/C 13:14, 8 April 2022 (UTC)
@S Marshall I think an even better word than education is "evaluation". It would make a great substitute to avoid the confusion. What do you think? Huggums537 (talk) 10:17, 27 April 2022 (UTC)
And yet I'm pretty sure that teenagers all over the English-speaking world are regularly told to write "research papers", and that the instructions amount to writing a five-paragraph essay about any topic, supporting each statement of fact with a citation to a book or other reputable source. In that case, reading other people's research is research. WhatamIdoing (talk) 20:20, 8 April 2022 (UTC)
Interesting. I wasn't able to reconcile this with my lived experience. I was educated in the English-speaking world, and my education didn't include that exercise at any point. I wasn't asked to write research papers until I got to university. I've just asked my son, who was a teenager relatively recently, and he doesn't recall it either. Neither of us have ever written a five-paragraph essay.—S Marshall T/C 23:12, 8 April 2022 (UTC)
Did you write anything longer than a paragraph or two in school? Did your teacher ever tell the class to go to the library and look up information? WhatamIdoing (talk) 03:57, 9 April 2022 (UTC)
Yes to both, but going to the library and looking up information was never called "research". I'm interested that to you it was; that helps me understand how Americans can call reading or watching TV or YouTube "doing their own research". I live and learn!—S Marshall T/C 08:13, 9 April 2022 (UTC)
But how do they (the authorities) know what went on in Florence, or what Florence was like in 1672, without research that involves reading, or how do they know what Faraday was thinking without research that involves reading, or how do they postulate what events affected Shakespeare in writing Lear, without research that involves reading?
Not all information is containable in a spreadsheet, nor is all "data", so to speak, numbers.
Is not a very large part of the idea of producing "original" knowledge, having a metaphorical conversation with, or building upon, what came before, and what came before is often written, to research it (literally re (again) searching it), you must read it, otherwise you can neither converse, nor build.--Alanscottwalker (talk) 11:47, 9 April 2022 (UTC)
(Tangent: I heard about a new book, Index, A History of the. Apparently, when Subject indexing was invented, scholars thought it was a tool for lazy men. I imagine that there will be a disproportionately high number of Wikipedians reading this book.) WhatamIdoing (talk) 20:17, 10 April 2022 (UTC)
So what was I doing at the various universities I attended when I used their research libraries to write papers? Doug Weller talk 10:54, 12 April 2022 (UTC)
(edit conflict) Well, that is what it is uniquely about. The other thing that it uniquely covers is Primary/Secondary/Tertiary which seems to have randomly mis-landed here. The rest is a duplication and restatement of wp:VER. Which does create messes. If we ever want to again try to tidy it up, rather than some massive merge effort which might ignore what needs to be here, and which would again die under its own weight, we could instead migrate PSTS and the duplicated topics to wp:VER, leaving this policy focused on synth. Possibly then rename to your idea. North8000 (talk) 13:20, 8 April 2022 (UTC)
Oh gods… Please don’t get me started on PSTS… it started as a simple statement: “WP should not be a primary source for information” (which was fully in line with the idea behind this policy)… but unfortunately grew into a much more complicated section that, while not wrong, lost that original emphasis. Oh well… battles fought and lost long ago.
In any case, I think I have given all the “two cents” I can to this discussion. I will leave with a plea: when it comes to policy, saying less (but clearly) is often better than saying more (but confusingly). Have fun. Blueboar (talk) 13:49, 8 April 2022 (UTC)
I agree with saying less. OR is generally well-understood and we don't need to overthink this. How do we translate the words and ideas in a reliable source into our own words without adding original or unintended ideas? The word you're looking for is Wikipedia:Summary. And there's a Wikipedia guideline for that too. Shooterwalker (talk) 18:19, 8 April 2022 (UTC)

Should we work to reduce overlap / duplication between WP:No original research and WP:Verifiability?

One thing that came up in various places above is that there is substantial overlap / duplication between WP:No original research and WP:Verifiability. One might say that WP:No original research uniquely covers Synthesis, which is its rightful main topic, and uniquely covers Primary/Secondary/Tertiary which probably doesn't belong here, and that everything else is a duplication of WP:Verifiability, which causes problems and complexity. Some big attempts were made in the past to merge them. If we wanted to tidy this up, a more incremental way would be to gradually migrate the duplicated and misplaced stuff over to WP:Verifiability, leaving this policy to concentrate on Synthesis type areas. If there were interest in discussing this further, one proposal would be a "soft" decision to slowly give that migration a try. North8000 (talk) 14:06, 8 April 2022 (UTC)

  • Wikipedia rightly has separate articles on arachnid, spider, and tarantula. Arachnid says that arachnids have eight legs. Spider says that spiders are a kind of arachnid that has eight legs. Tarantula says tarantulas' eight legs are attached to the prosoma. We do this because we need to say the same thing in different places, each time it comes up, because the information is fundamental to our reader's understanding of the topic. We do that a lot in NOR and V because NOR is basically a case of V.—S Marshall T/C 16:46, 8 April 2022 (UTC)
Agree that NOR is a case of wp:V. But the problem is that it is a top level policy which includes a parallel duplication of top level policy wording in wp:ver. And such duplication creates many issues, including making it overly complex to tweak or make changes to either one. Sincerely, North8000 (talk) 21:19, 8 April 2022 (UTC)
  • I agree that PSTS doesn't really belong here, and I support splitting PSTS to its own policy. Once that's done, it might be easier to figure out the rest. There is quite a lot of duplication, and I think some of it is intentional. (Also, does the ==See also== section really need to be so long?) WhatamIdoing (talk) 20:36, 8 April 2022 (UTC)
Agree that a baby step would be good. One might view PSTS as two things: the official wiki-definitions of those three source types, and a little bit on what to do / not do with certain types (basically, favoring the use of secondary sources and restricting the uses of primary and tertiary sources). One concern might be that taking such a small thing and making it a separate policy might invite making it too huge / an open door to wp:creep whereas being a mere section in wp:ver would tend to keep it at its current size. Also, it seems to be closely related to wp:ver. Sincerely, North8000 (talk) 21:10, 8 April 2022 (UTC)
I think it's more closely related to NPOV, but that's probably because of the subject matter I think of first. PSTS is almost irrelevant if you're writing about which celebrity/politician said something recently. In that case, provenance (e.g., is this an independent source? the authoritative original? an outfit known for twisting people's words?) matters, but a transformative analysis is not usually the point. PSTS is fairly important if you're writing about medical subjects. In that case, "lookit this cool new primary study" (or patent application, or whatever) is usually a terrible source, but a secondary source saying "we looked at all the research on this, and among all the dross, we concluded that these two studies are basically decent and represent mainstream medical views" should be given much more weight than almost any primary study. WhatamIdoing (talk) 04:05, 9 April 2022 (UTC)
That does not seem quite right. People saying something often needs a context analysis for its meaning: "I like girls.", could be something, from the trite and anodyne, to the personal revelation, to the hint of illegal conduct, so how it is contextually analyzed and by what authority matters.
On the larger issues, in summary, V is concerned with the piece of information, NOR is concerned with the analyses of pieces of information, and NPOV is concerned with the balancing and coherence of all the information. -- Alanscottwalker (talk) 11:23, 9 April 2022 (UTC)
You might, in some instances, want to seek a secondary source for who-said-what content. But nearly all such content about BLPs is cited to WP:PRIMARYNEWS sources. Look at an article like Donald Trump. There has been no shortage of effort put into the article. It cites about 800 sources. Six of them are books. Far more of them are news articles about what happened today. Those are primary sources. WhatamIdoing (talk) 21:02, 9 April 2022 (UTC)
  • If you don’t like the NOR VER overlap, resume the work mostly done at WP:A. Just, don’t try to make big changes without bringing the community with you.
    Disagree that WP:PSTS doesn’t belong or that NOR is subservient to WP:VER. They address different concerns, both absolutely right. WP:NOR was an early afterthought, added when the extreme permissiveness of WP:VER with regard to the endlessly possible idiosyncratic verifiable stuff was realised. —SmokeyJoe (talk) 04:23, 9 April 2022 (UTC)
WP:A is probably the best solution but it has been / would be near-impossible to do in Wikipedia. North8000 (talk) 12:17, 9 April 2022 (UTC)
  • I don’t have a problem with policies that overlap… addressing similar ideas from multiple angles can be helpful. The key is to make sure that the overlapping policies don’t contradict each other. That can happen (quite unintentionally) as instruction creep sets in. I don’t think the solution to contradiction is to merge everything into one giant catch all policy (as WP:A attempted) but to occasionally do a review of how all our policies are interacting with each other, and resist the instruction creep urge. Blueboar (talk) 13:00, 9 April 2022 (UTC)
    IMO the thing that hasn't worked with the overlapping sections is that we start with "Hey, remember that this other rule exists and might apply, too" and end up with accusations that the same mistake violates every single policy. For example, Wikipedia:Verifiability#Verifiability and other principles mentions that notability rules exist, and that is sometimes quoted as evidence that WP:V (not just the notability guidelines) is setting rules about notability. See also Wikipedia:Consensus#No consensus, which has always been intended to document what other policies/guidelines/procedures say, and which now gets cited as "the Consensus policy says...". It'd be far more appropriate for these editors to say "I found this link at the Consensus policy, and the relevant guideline says..." WhatamIdoing (talk) 21:15, 9 April 2022 (UTC)
  • It might be interesting, as a thought experiment, to imagine starting Wikipedia from scratch. If we had the benefit of hindsight about human behavior and the practical challenges of building an encyclopedia, but none of the jargon and none of the text of any policies or guidelines, what foundational document would we write? XOR'easter (talk) 21:11, 9 April 2022 (UTC)
    Probably a central WP:Sourcing policy. WhatamIdoing (talk) 21:16, 9 April 2022 (UTC)
    Yes, that sounds plausible. XOR'easter (talk) 21:20, 9 April 2022 (UTC)
  • PSTS seems to be here because the problem with overuse of primary sources is that they can lead to Wikipedia seeming to draw new conclusions based on which of the sea of primary sources are picked. As WP:MEDRS says, "cite reviews, don't write them." Too much use of primary sources leads to writing a review or secondary source, thus emphasizing them in a likely original way. Crossroads -talk- 04:22, 10 April 2022 (UTC)
Time for a history lesson - PSTS is here because we originally said (to paraphrase): “Wikipedia should not be a primary source for information”. Then someone clarified by adding “Wikipedia is a tertiary source”. Then someone said: “hey, we should explain what these terms mean”… so far so good.
Unfortunately, what happened next was a major re-write that removed this original statement, but kept the explanation. Divorced from its original context, the explanation took on a life of its own. The focus of the section shifted… what was originally a more philosophical statement about Wikipedia (and the type of information we should include) became a procedural statement about what kinds of sources we should use.
Now… This all happened years ago, and I am not saying we should try to roll it back… just explaining how we got to where we are. Blueboar (talk) 13:14, 10 April 2022 (UTC)
And it all happened while some of the key editors believed that if you were quoted in a newspaper article about your personal experience, that was a secondary source, because the reporter got the information from you, which means two people were involved (you and the reporter), so the information is now "second hand".
The first mention of primary source was added to NOR in February 2004. Colin started writing MEDRS two and a half years after that, by splitting off content from MEDMOS that had previously been split from RS, which didn't exist in 2004. The first version of RS, in February 2005, contained PSTS-related definitions. By that date, NOR already had hundreds of words about PSTS. WhatamIdoing (talk) 20:38, 10 April 2022 (UTC)

My idea was in essence to just move material and eliminate duplication and make no substantive changes. Any substantive changes should be proposed and handled separately. Sincerely, North8000 (talk) 13:26, 10 April 2022 (UTC)

That seems reasonable. I'm ok with some overlap. Duplication can often be reinforcement. The problem we have to avoid is conflict and contradiction. Huggums537 (talk) 10:40, 27 April 2022 (UTC)

Paraphrasing

@S Marshall, I think we should split off your idea above about writing in your own words.

Wikipedia:These are not original research currently says


==Paraphrasing==

  • Accurate paraphrasing of reliable sources is not considered original research. In fact, in most cases you are actually required by policy to write in your own words rather than plagiarizing the source's wording. This includes:
    • using synonyms rather than quotations;
    • using plain English rather than jargon from a technical source; and
    • summarizing whole pages, chapters, or books in one or two sentences.

Do you think it would help to move this directly into NOR, under Wikipedia:No original research#What is not original research? It's often easier to move long-standing text than to create something new, although I have no objections to writing something new, if you think that would be better.

WhatamIdoing (talk) 03:36, 20 April 2022 (UTC)

All this does is add length without saying anything new. In fact it repeats what is already there but worse, as it is missing the important emphasis on the meaning being the same. Here is what the page already says:
Despite the need for reliable sources, you must not plagiarize them or violate their copyrights. Rewriting source material in your own words while retaining the substance is not considered original research....The best practice is to research the most reliable sources on the topic and summarize what they say in your own words, with each statement in the article attributable to a source that makes that statement explicitly. Source material should be carefully summarized or rephrased without changing its meaning or implication. Take care not to go beyond what the sources express or to use them in ways inconsistent with the intention of the source.
What this will enable is people claiming they are merely "paraphrasing" when they are actually diverging from the meaning of the source material. Crossroads -talk- 03:51, 20 April 2022 (UTC)
I just now noticed this, but by definition paraphrasing requires the original meaning to remain intact, so if anyone diverges from the original meaning, then they are no longer paraphrasing, but just putting in content not supported by the source, and nothing about this enables anybody to do that because we have plenty to prevent it. Huggums537 (talk) 08:05, 30 April 2022 (UTC)
It's important that both rules say the same thing. I too have sometimes thought that paraphrasing should be its own subheading in NOR. I wonder whether we should invite Maggie Dennis to opine on the phrasing.—S Marshall T/C 08:39, 20 April 2022 (UTC)
I'm not opposed in principle to adding another subsection to Wikipedia:No original research#What is not original research. However, I think that in order to be worthwhile, an addition should elaborate more on why paraphrasing is not OR, and why it is important. I don't really like the phrasing of the bullet points above, as it seems to imply that policy requires a certain specific set of writing techniques. Policy does not mandate that we "in most cases" summarize a whole book in two sentences! The awkward construction makes it sound like the bullet points might be defining what it means to "write in your own words", rather than giving examples of how to go about writing in your own words. If we're adding anything, I'd rather it be a clean new paragraph, rather than something brought over from an essay that hasn't even reached the guideline status of consensus, and the general vetting of the prose that goes along with that level of scrutiny. XOR'easter (talk) 00:59, 21 April 2022 (UTC)
I would reduce the proposed section to “Paraphrasing is not original research”.
If more is needed, start with wikilinking Paraphrase. SmokeyJoe (talk) 01:36, 21 April 2022 (UTC)

Structurally, paraphrasing or summarization IS OR in many, many ways. The de facto rule is that if it's reasonably safe summarization it's OK unless somebody wants to knock the material out, in which case it's not OK. I applaud your efforts to provide some extra guidance in this area, but it won't be easy. North8000 (talk) 02:01, 21 April 2022 (UTC)

It’s not OR in the ways that count, which means, the creation of new information. SmokeyJoe (talk) 03:05, 21 April 2022 (UTC)

Note of caution, for editors "paraphrasing" is a bit tricky, because it cannot be close paraphrasing. I agree though that whether "close" or "loose", it generally is not OR. --Alanscottwalker (talk) 13:57, 21 April 2022 (UTC)

I think you were referring to Wikipedia:Close paraphrasing, but I also agree that valid paraphrasing is for sure not OR. Huggums537 (talk) 11:10, 27 April 2022 (UTC)

For example:

  • If three sources report a ship's length as 567', 568' and 569' and you put in the article "the ship is approximately 568' long" that is technically synthesis, but it will stick. If you extract and summarize the key facts out of a lengthy paragraph about a hurricane and paraphrase, then via omissions you are shifting emphasis and meaning on what the longer paragraph said, but it will stay in Wikipedia.
  • If you did the same thing with John Smith (a political figure)'s approx $568 million in donations to charity, and summarize from a paragraph on their other charitable works, a Wikilawyer editor who has opposite politics to John Smith and doesn't want anything good sounding about John Smith in the article will successfully knock the material out on policy grounds.

North8000 (talk) 12:53, 27 April 2022 (UTC)

I'm not sure that your examples are correct. The information (e.g., "approximately 568' long") actually is in all of those sources, even though the sources don't use the exact words. That isn't synthesis, "technically" or otherwise. All three of those sources claimed that the ship was approximately that long; none of them said that it was a materially different size. You are not therefore combining sources to produce something that isn't in any of them. You are merely summarizing, in a slightly vaguer way, what is given more specifically in all of them. WhatamIdoing (talk) 13:35, 27 April 2022 (UTC)
I can see how one might compare paraphrasing as possibly being structurally similar to OR in the off chance event that someone challenges the paraphrasing, or on the other off chance event that someone uses the paraphrasing in abnormal ways such as the examples you gave, but I think normal valid use of paraphrasing would have no meaningful relation to OR in the ways that count such as what Smokey Joe was saying. Huggums537 (talk) 13:43, 27 April 2022 (UTC)
@WhatamIdoing: Yeah, you're right about my example. I hadn't thought about that angle. North8000 (talk) 14:56, 27 April 2022 (UTC)
I would say only the one example about ships is really "incorrect" about being synth, while the other examples are not so much incorrect as they are "irrelevant", because they are really examples of abnormal or malformed paraphrasing. Neither would be allowed, or would stick, so comparing them to OR is kind of a moot point that is really an exercise in futility since neither OR nor abnormally malformed paraphrasing would be living in articles. It doesn't really matter if abnormal paraphrasing has anything to do with OR since they both already have the same boot attached to them anyway. The point of the discussion is determining if normal valid paraphrasing is OR, and I think everyone agrees it isn't. Huggums537 (talk) 02:07, 28 April 2022 (UTC)
In other words, the answer to the question about whether illegitimate paraphrasing is OR would be sometimes it is and sometimes it isn't, but that is not the question we are asking here. The question we are asking here is whether legitimate paraphrasing is OR. We don't need any citations to any reliable sources to know that is the question we are asking because it is obvious by context that the question being asked is about legitimate paraphrasing, especially when the passage describes it as "accurate" paraphrasing. Huggums537 (talk) 05:50, 28 April 2022 (UTC)
Actually, now that I've been giving it some more thought, I think it is probably dangerous to be saying that even illegitimate paraphrasing is OR because that would be like saying quotations are OR simply because somebody can use quotations to insert OR into articles. Huggums537 (talk) 09:52, 28 April 2022 (UTC)
Paraphrasing is not OR. Illegitimate paraphrasing is not paraphrasing. SmokeyJoe (talk) 10:32, 28 April 2022 (UTC)
That's a good point. There really is no such thing as illegitimate paraphrasing. Paraphrasing is simply a tool just like quotations. People can use them properly or improperly just like a garden tool. We would never say garden tools are illegitimate murder weapons just because people sometimes use them to murder. Not only would it be morally wrong giving the OR policy that much power and authority for banning a useful tool like paraphrasing, but it is also a legal imperative that the tool be free from any encumbrances because of its vital role in copyright violation prevention. Alanscottwalker warned about close paraphrasing earlier, but many times the only thing standing between you and a plagiarism (copy vio) violation is good (or improved) paraphrasing. Huggums537 (talk) 19:40, 28 April 2022 (UTC)

As encyclopaedists our job is abridgement (or what my English teacher in the 1980s used to call précis). We work out what the sources say and then we summarize their conclusions more succinctly. It's not possible to do this properly without changing the language the sources use. I'm not clear what you lot mean by "illegitimate" paraphrasing. Is it where we write a summary that doesn't accurately reflect the sources?—S Marshall T/C 23:45, 29 April 2022 (UTC)

That was my fault. I should have said illegitimate OR, not illegitimate paraphrasing. As Smokey Joe, and yourself just pointed out, there is no such thing as "illegitimate paraphrasing". We must use the tool of paraphrasing, abridgement, précis, or whatever you want to call it, and it is always a legitimate tool to use. If some content happens to be illegitimate OR, that doesn't mean the tool used to put it there is OR. Just like citing sources is not inherently unreliable just because some people cite unreliable sources, or just like using quotations is not OR simply because some people use quotations to insert OR content. Huggums537 (talk) 06:09, 30 April 2022 (UTC)
  • All this talk about paraphrasing has got me thinking about WP:Close paraphrasing though. This title seems to give all paraphrasing a bad rep because by using this phrasing for the title, the guidance could theoretically be cited against virtually any paraphrasing since it is inherent in any paraphrasing to stay "close" to the original meaning as possible. Also, the guidance appears to be more about the difference between correct, and incorrect use of paraphrasing than it does "close" paraphrasing, so I think the misleading title, and even the misleading portions of the guidance that say, "close" should be changed to "correct", or "incorrect" respectively. "Close paraphrasing", as it is written, is nothing more than a subtle suggestion that could be weaponized to say any paraphrasing is bad. It's just wrapped in shiny copyright paper with a Wikipedia bow on top to make it appear like something much more than just correct and incorrect paraphrasing. Huggums537 (talk) 06:46, 30 April 2022 (UTC)
    I agree that Wikipedia:Close paraphrasing is a bit of a worry. It appears to be a Wikipedian neologism of 2009, and not well defined.
    Close paraphrasing is a redirect to Paraphrasing of copyrighted material, where "close paraphrasing" is mentioned but not defined.
    There is no wikt:close paraphrasing, nor an entry in other dictionaries I've looked at.
    I think the invention of terms like "close paraphrasing" and "illegitimate paraphrasing" is not valid unless the author defines what they mean by the new term.
    Wikipedia:Close paraphrasing has laudable intentions, but a poor foundation.
    I submit that all paraphrasing is "close". Paraphrasing of the source material results in new material that is close, in content, meaning, intention, tone, etc., to the source material. Show me some paraphrasing that is not close and I will explain why it is not "paraphrasing", but something different.
    I think it is unexpected that "paraphrasing" is getting attention here at WT:NOR. Paraphrasing is much closer to "copying" than to "original research", or synthesis. SmokeyJoe (talk) 07:28, 30 April 2022 (UTC)
    Totally agree any paraphrasing is "close", and glad you brought up the redirect because "Paraphrasing of copyrighted material" might be a good title for the WP guidance too, but I'm not sure how easily it would replace "close paraphrasing" being mentioned in the body several times. Huggums537 (talk) 07:58, 30 April 2022 (UTC)
    No, gentlemen, not all paraphrasing is close and not all of it is a copyright concern. There are online paraphrasing tools which we need to be aware of, mostly designed to circumvent plagiarism-detection tools, but paraphrasing can also mean --- and means here --- giving the sense of what the sources say without using the source's words. This is not a copyright issue as long as you're summarising at least two sources in one article. Paraphrasing a single source is more problematic.—S Marshall T/C 09:24, 30 April 2022 (UTC)
    Can you give an example of paraphrasing that is not close? SmokeyJoe (talk) 09:42, 30 April 2022 (UTC)
    What I would really like to see is that example from a single source that isn't "close"... Huggums537 (talk) 11:47, 30 April 2022 (UTC)
    An example? There is more than one way to do it, but: take a published article; change every word you can find a synonym for; even drop some of the original or intersperse it; publish it as your work, with or without citation. -- Alanscottwalker (talk) 10:49, 30 April 2022 (UTC)
    You want a way to close paraphrase? Several ways are possible, eg: take something someone else has written; alter every phrase you find with same meaning; even sprinkle it in among other work or excise some of the base work; claim it as your work, with or without providing the source. Alanscottwalker (talk) 11:10, 30 April 2022 (UTC) (to close paraphrase myself :))
    I think it would be debatable if that falls under the category of not being close, but even if it did that is exactly the reason why I think the title should change - because of the possibility of this debate. Maybe it isn't close according to the guidance, but maybe it is in the sense Smokey Joe was talking about. Since there is this confusion, and debate, a rename is in order. Huggums537 (talk) 10:58, 30 April 2022 (UTC)
    I think it would be far more beneficial, far easier, and promote less debate, and confusion if we simply described paraphrasing as correct/incorrect, and the examples you just gave are both incorrect (or what the guidance would call "close"), but the question asked to provide an example that is not close. Huggums537 (talk) 11:31, 30 April 2022 (UTC)

Distant paraphrasing example

Using source text I can freely copy-paste
Source text: In the beginning when God created the heavens and the earth, the earth was a formless void and darkness covered the face of the deep, while a wind from God swept over the face of the waters. Then God said, "Let there be light"; and there was light. And God saw that the light was good; and God separated the light from the darkness. God called the light Day, and the darkness he called Night. And there was evening and there was morning, the first day.

Close paraphrase: The earth was a formless emptiness in the beginning when God created the heavens and the earth, and darkness covered the face of the deep, while a wind from God swept across the face of the oceans. "Let there be light," God said, and there was light. God saw that the light was good, and he divided the light from the darkness. The light was given the name Day, and the darkness was given the name Night. There was an evening and a dawn on the first day.

Distant paraphrase: The Genesis creation narrative comprises two different stories; the first two chapters roughly correspond to these. In the first, Elohim, the generic Hebrew word for God, creates the heavens and the earth including humankind, in six days, and rests on the seventh.

Does this help you?—S Marshall T/C 12:28, 30 April 2022 (UTC)

And perhaps something like, Too-Close Paraphrase, will help. Alanscottwalker (talk) 12:47, 30 April 2022 (UTC)
Yes, that looks like a very good source for close paraphrasing. SmokeyJoe (talk) 12:58, 30 April 2022 (UTC)
Yes, that helps me understand what you mean. SmokeyJoe (talk) 12:57, 30 April 2022 (UTC)
Not me. Not at all. I hope you did not copy this from any of our guidance. Nothing in the so called distant paraphrasing is supported by what the source says because things being stated in the distant paraphrasing are not being stated in the source text. It comes off to me as making stuff up, and not paraphrasing at all. I see nothing at all in the source text suggesting the given name of creation is "Genesis", or there are two stories, two chapters, Elohim is a generic word for God, or that God takes six days, and a seventh day break. The only thing that even remotely resembles anything from the source is that God creates the heavens and earth. I would have it removed as unsupported text. Huggums537 (talk) 13:30, 30 April 2022 (UTC)
Well I am pretty sure the original source was truncated because it's asking too much to present it all here, and all one would need is an assumed ellipsis at the end for two chapters and for the published annotations that are common in bibles. Alanscottwalker (talk) 13:58, 30 April 2022 (UTC)
None of that was indicated here. If it had been, maybe I would agree with everyone. All we were presented with was some source text, and some distant paraphrasing that failed to retain any of the original meaning other than a small portion of the first sentence in the original text. Huggums537 (talk) 14:19, 30 April 2022 (UTC)
How much are you demanding of others, here? 'Close paraphrasing' is a thing, not invented by Wikipedians, and it is a way of writing that has ethical and sometimes legal implications. So, look elsewhere for understanding if you so object to Smarshall's however imperfect attempt to be helpful. Alanscottwalker (talk) 14:29, 30 April 2022 (UTC)
I mean I'm not stupid, or being obstinate. Of course I know from my Sunday school lessons the facts in the distant paraphrasing are easily attributable to most common bibles. I'm just saying this would not be a great example if you imagine somebody who never heard of a bible, or doesn't know who God is. And, I think the only reason it barely passes as a good example is purely for that notoriety. That is exactly why I made my critical analysis of it the way I did. I'm not demanding anything of anyone, only trying to prove the point that distant paraphrasing is easier said than done, and the finished product might still not be acceptable. It makes a big difference if a single passage, chapter, two chapters, annotations, footnotes, or whole bible is being sourced if you are paraphrasing, and we were given a passage. That matters, and you are pretending as if it doesn't. Huggums537 (talk) 14:49, 30 April 2022 (UTC)
BTW, I'm not knocking S Marshall's help either. I really do get the idea he is trying to get across based on this example. I just think there could be a better example of it. Huggums537 (talk) 14:56, 30 April 2022 (UTC)
Thank you for agreeing that you understand what I'm trying to convey. I apologise for omitting the ellipsis.—S Marshall T/C 16:27, 30 April 2022 (UTC)
Sure. No problem. It's kind of hard to explain what I'm trying to say about my point because if this example were in an article, it probably wouldn't even need an ellipsis (although it should) since it would be obvious from the context of the article that the (whole) bible is the primary source. However, in this case it's not obvious the (whole) bible is the primary source at all until the distant paraphrasing makes references to other parts of it not mentioned in the source text. It's pretty clear we are allowed to summarize whole chapters or books into one or two sentences, but with this example, what it looks like you have done is something more like taking a paragraph and using it to summarize two chapters and part of a concordance. This might maybe fly in an article where the context/primary source is obvious, but it is kinda bad as an example of the sort of summary we should be doing for paraphrasing even though it does get the idea across that it is possible to summarize without it being too close. Huggums537 (talk) 01:08, 1 May 2022 (UTC)
I dunno. Maybe I'm just an uneducated doofus, but it makes common sense to me this is a wrongful use of paraphrasing. I guess the argument could be made that most people recognize the passage as being from the bible so the context isn't needed since the source is obvious, but then again I never knew what a Quran was, or who Allah is until I was in my late teens, and I might not recognize a passage from it to save my life, sooo... Huggums537 (talk) 04:05, 1 May 2022 (UTC)
It helps to understand what S Marshall means. However, he is conflating perspective of writing, close vs distant, with a measure of alignment to textbook paraphrasing. Instead of “close” and “distant”, the words “tight” and “loose” could be used.
Paraphrase means to say the same thing differently. The degree of difference could be called “close”, where “close” means the difference is small, and paraphrasing becomes copying. In the example, the distant paraphrasing is less close because the writer has changed perspective from close perspective to distant perspective. But I understand what S Marshall means.
The problem here is that “close” is a simple common word that defies hard definition.
Paraphrase means to say the same thing differently. In public, political discourse, paraphrase often accompanies a change in purpose, keeping the facts the same but changing the POV. In Wikipedia, “POV” is often a bad word, but it shouldn’t be, not necessarily. POV is inherent if there is any motivation to communicate. I think in Wikipedia, paraphrasing to alter the POV to a more distant perspective is a good thing. This includes discarding unimportant detail, and adding contextualisation, and consistency in style to the adjoining prose.
At no point do I see paraphrasing as touching WP:OR; there is quite a gulf between. SmokeyJoe (talk) 04:50, 1 May 2022 (UTC)
  • That's another great idea, "tight" vs "loose". Anything but "close". It just confuses everything. Huggums537 (talk) 05:24, 1 May 2022 (UTC)
    Encyclopaedias should be using what SJ calls "distant perspective" most of the time. An encyclopaedia gives the view from 30,000 feet high; that's how we produce the abridgement I referred to earlier. There are hard cases where we have to go closer to the sources but "distant perspective" is the default.—S Marshall T/C 08:44, 1 May 2022 (UTC)
    I guess I'm sort of okay with "distant paraphrasing" terminology for correct paraphrasing, but I really hate using the "close" terminology for when we are really talking about incorrect, or copyrighted paraphrasing. I took a closer look at the link for "Too close paraphrasing" from Yale provided by Asw, and it doesn't seem to be supporting the "close" naming convention as much as I thought it did because it appears to have been published by an undergrad student in 2020 so they may very well have got their bad ideas for naming conventions from Wikipedia in the first place. Huggums537 (talk) 09:50, 1 May 2022 (UTC)
    Close paraphrasing is what universities and colleges call it.—S Marshall T/C 10:07, 1 May 2022 (UTC)
    That's fine for them, but they don't have to contend with confusing editing disputes or constantly changing and evolving content like we do. Huggums537 (talk) 11:55, 1 May 2022 (UTC)
    Also, I think doing something just for the sake of doing it simply because everyone else is doing it at a university level so therefore it "must be good" is a particularly bad reason for doing something, especially if you can see with your own common sense that doing that thing will be problematic for you in your own forum. We engage in a completely different kind of intercourse here on Wikipedia than they do at University, but most of our best editors here are too indoctrinated with academic education to see that. I would gladly reject what all the universities are doing for the benefit of Wikipedia, but I realize some other editors are not willing to do the same because of their indoctrination. I think it's a form of academic bias myself, but that's just my own opinion. Huggums537 (talk) 12:49, 1 May 2022 (UTC)
    It helps if we use the words at least some of our editors know.—S Marshall T/C 16:49, 1 May 2022 (UTC)
    I don't buy that much either. Doctors and lawyers use plenty of words we don't use, yet they are somehow able to adjust them enough to communicate with us, and each other just fine. It would be pretty darned funny if my doctor or lawyer told me that it would help them a lot if I start using the words they know. I mean it probably would help, but if I knew those words, I really wouldn't need my doctor or lawyer anymore... Huggums537 (talk) 17:14, 1 May 2022 (UTC)

Another paraphrasing example

Using source text I can freely copy-paste
Source text: In the beginning when God created the heavens and the earth, the earth was a formless void and darkness covered the face of the deep, while a wind from God swept over the face of the waters. Then God said, "Let there be light"; and there was light. And God saw that the light was good; and God separated the light from the darkness. God called the light Day, and the darkness he called Night. And there was evening and there was morning, the first day.

Close paraphrase: The earth was a formless emptiness in the beginning when God created the heavens and the earth, and darkness covered the face of the deep, while a wind from God swept across the face of the oceans. "Let there be light," God said, and there was light. God saw that the light was good, and he divided the light from the darkness. The light was given the name Day, and the darkness was given the name Night. There was an evening and a dawn on the first day.

Not so distant, but otherwise correct paraphrase: The creation story begins on the first day with God creating the heavens, earth, and light (called Day), which was divided from darkness (called Night).

I think this would be a little more like what the paraphrasing example should look like, but even then you could still say it is somewhat "close" to the original. Huggums537 (talk) 01:37, 1 May 2022 (UTC)

An apparent contradiction

In discussing the need for citable sources, the current article includes the following: "... but a source must exist even for material that is never challenged. For example, the statement "the capital of France is Paris" needs no source, nor is it original research, because it's not something you thought up and is easily verifiable; therefore, no one is likely to object to it and we know that sources exist for it even if they are not cited." What this is intending to communicate is that some sources are so widely-known that they do not need to be made explicit, such as the fact that Paris is the capital of France.

So, it might be clearer to have the text read as follows: "... but a source must exist even for material that is never challenged. In cases where a source can be assumed to be widely known, such as Paris being the capital of France, this source does not have to be explicitly cited. As it was not something you thought up, it is not original research, it is easily verifiable and we know that sources exist, and no one is likely to object."

I'm new to suggesting changes; is this something that I can do if the consensus is positive? Soulfulpsy (talk) 02:40, 26 April 2022 (UTC)

One could simply replace "needs no source" with "does not require an inline citation". Your proposal is rather longer and redefines original research (probably inadvertently). The former version requires a fact be verifiable (which elsewhere is made clear to require reliable sources) whereas the latter simply requires that it not be invented by the editor themselves. As the policy makes clear, however, reliable sources are required for something not to be OR even if someone is repeating something they heard somewhere else.
Even better, we could replace "needs no source..." with "even if lacking a citation, is not original research" as I don't think we should ever be saying not to cite a fact. The fact it's easily verifiable means it's easy to cite, and whether something is really so obvious gets real fuzzy, real fast. For example, the capital of South Africa isn't so simple. I follow the WP:NOTBLUESKY school of thought. Crossroads -talk- 03:57, 26 April 2022 (UTC)
It's fine if you support NOTBLUESKY, but policy supports WP:WTC, by pointing to it in WP:V, and therefore by extension also supports WP:BLUE. One very important reason this is supported by policy, and NOTBLUESKY isn't, is because NOTBLUESKY completely ignores the fact that primary sources can verify themselves. In other words, the subject of the article itself is often obviously the source so a citation is not needed. See the plot section of Wikipedia:When to cite#When a source or citation may not be needed. The current version of OR doesn't just say "needs no source", it also says, "...even if they are not cited", and "...even if not attributed". The exact opposite of "we should not ever tell editors to stop citing obvious facts", is the absolutely absurd notion that every single little thing needs a citation even if it is already supported by the primary source (AKA the subject of the article). The proponents of NOTBLUESKY always say that if something is easily verifiable, then it should be easy to cite, but as @WhatamIdoing has pointed out before, what could be easier than not citing it, and just letting the reader check the primary source themselves? Huggums537 (talk) 08:28, 26 April 2022 (UTC)
I'd suggest changing the "attributed" language. Wikipedia:Attribution involving a little blue clicky number is meant, but it tends to be mistaken for WP:INTEXT attribution.
I'll have a go at fixing this. @Soulfulpsy is correct that this is unnecessarily confusing. WhatamIdoing (talk) 15:46, 26 April 2022 (UTC)
See this change. I hope that solves the "needs no source" concern.
I would particularly appreciate hearing your views about the new footnote, which says of the "The capital of France is Paris" example that This statement is verifiable because you are "able" to "verify" it, e.g., by typing "capital of France" in your favorite search engine and seeing what the reliable sources tell you. Although inline citations are encouraged, verifiability does not depend exclusively upon the sources already cited in the article. This is obviously true (else the example is utterly nonsensical), but I suspect that it will be objected to by the occasional POV pusher or ruleslawyer, since claiming that all unsourced content is a policy violation is a simple way to get the wrong POV out of an article.
The expected drama will run:
  • The policy says your content is bad
  • Oh, oops, the policy says that it's possible for your content to be verifiable even if it isn't already cited
  • Well, let me discredit the policy by saying that there wasn't an overwhelmingly positive, community-initiated, CENT-listed RFC on whether or not to state the obvious, so let's claim that the policy hasn't always said that basic facts don't violate policy merely because they aren't already cited.
Your comments on this footnote could prevent that last bit of drama. WhatamIdoing (talk) 16:08, 26 April 2022 (UTC)
Your changes eliminate the seeming contradiction. Your footnote, for me, works as well. I lack the history to anticipate pushback. The only other time I attempted to offer a correction was to a discussion of A.N. Whitehead, with which I have some expertise based upon my Ph.D., but it was quickly eliminated for reasons of improper approach which was expressed with more than a hint of arrogance. So, proper approach ranked higher than accuracy; and dismissal was preferable to assistance and guidance. I was not impressed and I realized I'm not presently suited for extensive, nuanced, and detailed exchanges of this sort. Soulfulpsy (talk) 04:19, 27 April 2022 (UTC)
Welcome to Wikipedia my friend... Huggums537 (talk) 09:53, 27 April 2022 (UTC)
Also, I agree with these changes. Huggums537 (talk) 10:10, 27 April 2022 (UTC)
You seem to assume bad faith when people object by saying it's POV pushing. Like it or not, the WP:BURDEN of supporting a claim lies with inclusionists. Why do you emphasize so much that it doesn't matter if a source is actually cited? Why do you seemingly wish to make it easier for people to add material without citing sources?
We should not be endorsing an approach that implies it doesn't need to be cited if one can just Google it. Google results are not always correct or giving good sources. Why bother with citations at all then? Crossroads -talk- 22:38, 27 April 2022 (UTC)
No bad faith intended friend, and I apologize for coming off that way. Like Waid said, I understand you are simply just trying to improve Wikipedia according to your own different ideas about what it should look like. That's ok. It's not bad at all since I very often have my own different ideas myself. I would be very hypocritical if I tried to make you feel bad for it. Please forgive me if I made you feel that way. Huggums537 (talk) 01:45, 28 April 2022 (UTC)
POV pushers are not usually bad-faith editors. They are almost always editors who are trying very hard, to the best of their biased ability, to improve Wikipedia – according to their idea of what "improved" looks like.
I believe that if you read the text of the footnote you removed, you will find the words inline citations are encouraged in it. As for whether it "implies" that some content is verifiable even if it is uncited, this policy has said that directly for years. This is the policy, even if you don't think it's the best policy. Wikipedia has long had a small group of editors advocating for every single sentence being followed by an inline citation. They have not yet managed to convince the rest of the editors that this would be preferable.
Your other objection is to me expanding the description from:
but a source must exist even for material that is never challenged.
to:
but a source must exist even for material that is never challenged or that does not require an inline citation.
A source must exist (somewhere in the world, in any language) for all material. It is not clear to me why someone who supposedly wishes to have everything supported by inline citations would wish for this sentence to apply to a seemingly narrower set of content.
Your stated reason for removing this is that WP:MINREF, which is part of the Wikipedia:Inline citation page, is not a policy or a guideline. There is no rule against linking to WP:INFOPAGES, nor even to essays. This policy links to many non-policy and non-guideline pages. In fact, the first sentence of the second paragraph links to the top of WP:Inline citation.
As for this reversion, do you realize that this entire footnote was taken, word for word, in its entirety, from WP:V? And that by removing it, you are causing these two policies not to match as closely as they could and should?
The idea is to tell people that when we require sources that "directly support", we refer to the content of the sources, not to the location of the footnote. If you write Alice Expert wrote a book[1] and the cited source does not say that Alice wrote a book, then the cited source does not "directly support" that sentence, even though the source is placed at the end of the sentence. OTOH, if you write that same sentence based on a reliable source that is all about Alice writing books, even if the citation is formatted as Wikipedia:General references at the end of the page, instead of the more desirable inline citation (perhaps it is added by a new editor who doesn't know how to make the little blue clicky numbers), then that source actually does "directly support" the content in question. (Ideally, someone will re-format the citation into the preferable form, which the lead of WP:CITE encourages experienced editors to do.) The reason we have this explanation about what it means to "directly support" content is because we have altogether too many stupid disputes that amount to one editor deciding that WP:OVERKILL is barely enough citations, and claiming that the "directly supports" language is about the number of linear inches between the material and the little blue clicky number. It isn't. "Directly supports" is about whether the source actually says the thing someone claims it says. WhatamIdoing (talk) 00:06, 28 April 2022 (UTC)
Although a citation for every sentence is overkill - at the end of a series of clearly related sentences should be fine - I certainly empathize with that view and oppose efforts to move the policy away from that and towards 'write whatever as long as sources somewhere say it'. When was the last time that was discussed anyway? Is it really a minority? All my experience nowadays finds people being quite strict with citations, and for good reason. There's enough misinformation as there is.
Regarding this, the first part, I don't agree with adding further emphasis to the idea that some claims "do not need" inline citation. With the second, my main objection is to endorsing a sort of "just Google it" mentality where lazy Google searches have any bearing on content or on what needs citation. The point about verifiability not needing sources necessarily being in the article already is already in the paragraph. Saying it over and over is unwarranted emphasis.
Regarding this, it doesn't necessarily make sense in this context to quote WP:V verbatim. Yet again saying that a citation doesn't have to be in the article, aside from being undue, makes no sense when it just got done saying that one has to cite sources to demonstrate they are not adding OR. And of course, we want them to add those sources to the respective sentences it supports rather than lumped together at the end or something. Crossroads -talk- 02:02, 28 April 2022 (UTC)
  • The most recent significant discussion about whether everything needs an inline citation was earlier this month; you can read it here.
  • There are two changes in that diff. The first adds a link to MINREF (=a page that tells people that policies require them to cite more than they see in many articles). The second explains why/how this policy says that "The capital of France is Paris", when uncited, is still "easily verifiable". I don't think that either of these changes have the effect of emphasizing the idea that some claims do not need inline citation. I think the first tells people to cite more, and the second reduces confusion.
  • It may not be necessary to quote WP:V verbatim, but it is normally a bad idea for two closely related policies to have slightly different wording on the same subject. Note, too, that the text you removed (" The location of any citation—including whether one is present in the article at all—is unrelated to whether a source directly supports the material.") does not say that citations don't have to be in the article. What it says is that the location is irrelevant to the question of whether the source supports the material. That is, the US census (=a reliable source) "directly supports" the claim that the population of Paris, TX is about 25,000 people, regardless of whether the source is placed as an inline citation, typed among the WP:General references at the end of the article, or incorrectly omitted entirely. By contrast, Darwin's On the Origin of Species does not "directly support" any claims about the population of Paris, TX, even if you put it in the most beautifully formatted citation template as an inline citation at the end of that very sentence. That's what "directly supports" means. It's about the Wikipedia article content matching the reliable source's content. It is not about where the source is typed on the page. The source can "directly support" the material even if you screw up everything about formatting the citation.
WhatamIdoing (talk) 02:23, 28 April 2022 (UTC)
That discussion is an MfD for the BLUESKY essay; of course people will oppose deletion of an essay that has any level of community support. When was the last time the community actually endorsed the idea that citations are unnecessary?
Copying a bunch of text verbatim from a different policy without regard for context doesn't necessarily make sense. It was you who added a bunch of extra text that has little to do with the specific issue from the start of this section. In total, this added two extra references to the idea that citations are not necessary, making the lead say it over and over again. This is undue emphasis by any measure. Crossroads -talk- 12:08, 29 April 2022 (UTC)
If editors support an essay that says some (small) number of citations are unnecessary, then that is also support for the idea that some citations are unnecessary, no? WhatamIdoing (talk) 16:16, 29 April 2022 (UTC)
One cannot infer support for an essay's content from opposition to an essay's deletion. One can disagree with an essay while understanding that there are only rarely grounds to delete them. Extrapolating from an MfD to the community in general is also sketchy. Crossroads -talk- 04:49, 1 May 2022 (UTC)

More troubles from this being mostly duplication of wp:Ver. Structurally it says a source is required if it is challenged or likely to be challenged. The latter is good guidance but not directly enforceable, and it becomes moot when a challenge occurs, in which case it becomes simply "challenged". So structurally the core policy says it must be sourced if challenged.

"Not needed" on stuff you consider to be "sky is blue" is an ambiguous statement. It can mean either:

  1. Only: Feel no pressure / guidance that you have to cite it when putting material in. So it will still need to be cited if challenged.
  2. A policy statement that it is never required for those cases. So if you feel or assert that it is sky-is-blue, this policy overrides the requirement to cite it if challenged.

Number 2 is ludicrous, so "not needed" means simply #1. So there is no policy / structural need to define sky is blue. If somebody challenges a sky is blue statement, you can call them a jerk (or a wikilawyer who wants "sky is blue" removed) but it needs a cite. BTW I advocate requiring a perfunctory statement of concern with a challenge so then the wikilawyer would need to say "I have a concern that "sky is blue" is not source-able" and will probably skip the challenge rather than wanting to look that stupid. North8000 (talk) 20:59, 28 April 2022 (UTC)

I think #2 isn't really all that ludicrous because you don't have to just "feel or assert" it in cases where it is obvious from the context or the primary source (subject of the article) supports it. Why is our only recourse from wikilawyers or POV pushers calling them a jerk? Huggums537 (talk) 00:51, 29 April 2022 (UTC)
Btw, I agree with North's idea for a perfunctory statement regarding challengers. Huggums537 (talk) 00:57, 29 April 2022 (UTC)
I have a pragmatic approach to BLUESKY challenges: It takes far less time (and causes a LOT less aggravation) to simply slap in a citation that isn’t needed than it does to explain why a citation isn’t needed. So just “let the wookie win”. Slap in a citation and be done with it. It isn’t worth arguing about. Blueboar (talk) 12:10, 29 April 2022 (UTC)
I agree with sticking with the rule that if challenged, a cite must be provided, period. But I also think that a perfunctory statement of concern/question about the veracity or sourcability of the statement should be required with the challenge. The reason is to avoid some more common Wikilawyering situations, usually done by using wp:ver/wp:nor in tandem with some other rule. For example, take a hit-piece article which says "John Smith has been accused of kicking dogs". Editor "A" puts in "John Smith said that he does not kick dogs" and sources it to a statement by John Smith on his web site. Then a Wikilawyer POV warrior editor "B" knocks it out, saying it's not (suitably) sourced because it's not a RS (primary, or some other reason). If the wikilawyer had to say "I question the veracity of the statement that John Smith said that or the source" he is less likely to make that challenge. North8000 (talk) 12:47, 29 April 2022 (UTC)

Recent back and forth

Regarding the recent back and forth of inclusion of https://en.wikipedia.org/w/index.php?title=Wikipedia%3ANo_original_research&type=revision&diff=1085258114&oldid=1085105135 I do support Crossroad's at least temporary removal. From a process standpoint this is newly added material and so removal is reversion to the last stable version. I think that the statement does allude to / try to explain an important point which should be fully covered somewhere. But I think that we should be reducing overlap/duplication between wp:NOR and wp:ver, not increasing it. Also, in addition to being a bit brief and abstract, putting it in that particular location adds to the confusion. Sincerely, North8000 (talk) 13:15, 29 April 2022 (UTC)

@Crossroads, if you only object to the "whether" phrase, why don't you stick back in the main part of the sentence, which is about the location? The location of any citation is unrelated to whether a source directly supports the material. (omitting "—including whether one is present in the article at all—") would be closer to WP:V than removing the whole sentence.
On the broader point, the policy says (and has said basically this for years):
For example, the statement "the capital of France is Paris" does not require a source to be cited, nor is it original research, because it's not something you thought up and it is easily verifiable; therefore, no one is likely to object to it and we know that sources exist for it even if they are not cited. The statement is verifiable, even if not verified.
I have underlined the words that I believe will confuse some newer editors, namely the statement that completely uncited material can be "easily verifiable". It is not unusual for me to encounter a newer editor who believes that content is only verifiable if it can be verified in the exact source that is cited. Does nobody else encounter these editors? (I freely grant that my editing patterns are not average.) Does everyone else think that this sentence is perfectly clear, and that almost everyone will understand the circumstances under which a completely uncited statement can be considered "easily verifiable"? WhatamIdoing (talk) 16:22, 29 April 2022 (UTC)
I think “easily verifiable” will be understood… but if we wanted to clarify further, I suppose we could say: “… and so many sources exist that support the statement that it is easily verifiable…” Blueboar (talk) 16:49, 29 April 2022 (UTC)
Is there any other way to word it? "Many" sources seems to give the impression that a primary source alone could not be sufficient for a simple verification, or that secondary sourcing must be required for simple verification, and I think this is wrong. Huggums537 (talk) 19:09, 29 April 2022 (UTC)
Also, every time I have to say the words, "many sources", or "significant coverage" in a discussion, I always sense the flavor of notability as a bad aftertaste in my mouth, but I'm 100% positive that isn't even close to anything you had in mind when you said, "many sources". Huggums537 (talk) 19:20, 29 April 2022 (UTC)
Perhaps "...because any editor who searched for sources could easily find reliable sources that support this statement that it..."? That removes "many sources", but at the cost of extra words. WhatamIdoing (talk) 22:19, 30 April 2022 (UTC)
Except, if there are only a few sources that support a statement, then it is hardly a BLUESKY statement - no matter how easy it is to find those few sources. What makes the difference in BLUESKY situations is the sheer volume of potential sources that can be used to verify the information. Blueboar (talk) 23:24, 30 April 2022 (UTC)
I'm not just talking about BLUESKY statements, and neither should the policy only be talking about just BLUESKY either, I'm talking about plot summary type statements that only need the subject of the article (or primary source) to verify them, and this policy should be talking about that too. Huggums537 (talk) 02:03, 1 May 2022 (UTC)
So there are two cases: one is that reliable sources are readily available and abundant in number (e.g., "The capital of France is Paris") and the other is the source is obvious (e.g., the plot of a named work of fiction).
Technically, a statement like "Romeo and Juliet opens with a fight scene" already contains an inline citation to the source. If you don't refer directly to the work, then there isn't technically a citation, but it is still obvious. WhatamIdoing (talk) 16:12, 1 May 2022 (UTC)
Yes, this is exactly my point. If you don't refer directly to the work, then it is still obvious that the only citation that would technically be needed is the direct reference to the work, not "sheer volumes" of sources to verify it. So, why are we pushing this idea about "sheer volumes" when we have just proven the case that a single citation is sufficient for some obvious facts? I'm certain it is not just the case for plots, but other obvious facts that don't need citations as well... Huggums537 (talk) 16:49, 1 May 2022 (UTC)
Merely adding "the location of any citation" is not helpful either. When people are being told to cite sources to demonstrate they are not adding OR, it does no good to imply that the location isn't that important. It is important. People should be citing their source soon after the claim they are supporting, rather than, say, clumping three paragraphs worth of sources at the end or the beginning.
There is no need for further clarification of easily verifiable. That paragraph already says, must be verifiable in a reliable, published source, even if not already verified via an inline citation. And in the footnote: By "exists", the community means that the reliable source must have been published and still exist—somewhere in the world, in any language, whether or not it is reachable online—even if no source is currently named in the article. Articles that currently name zero references of any type may be fully compliant with this policy—so long as there is a reasonable expectation that every bit of material is supported by a published, reliable source. Hammering yet more on this point of 'it can be verifiable even if the sources aren't cited' is undue and confusing. It's already perfectly clear that, under the current rules, something can be verifiable even if not currently cited.
Citations not supporting the text they purport to is a problem, actually. Editors caught misrepresenting sources are usually warned, and blocked if they persist. Catching an instance of this after the fact should be commended; worrying about the newbie accidentally misusing the jargony sense of "verifiable" is a case of misplaced priorities. More to the point, I've never seen a newbie use "verifiable"; they usually say that it was not in the source or something like that. And people, such as WP:Student editors, citing sources to support some peripheral point rather than their main claim is a vastly more common issue. There have been many times I check some claim against the underlying source and find it doesn't support it.
That experience is partly why I tire of these efforts to cover in excruciating detail why and when we don't need to cite sources. It has the issues backwards. Wikipedia has far more issues with editor failure to cite relevant sources than anything involving too much citation. Crossroads -talk- 05:23, 1 May 2022 (UTC)
"When people are being told to cite sources", then they need to cite the sources. This bit is about whether people need to cite sources even when nobody is telling them to cite sources for that particular bit of obvious information.
I don't think you're understanding the "location" thing, so we probably need to clarify that. After all, if you don't get it, then lots of less experienced editors won't, either.
So, from the top:
  • Is the US census a "reliable source" that "directly supports" the claim that the population of Paris, TX is about 25,000 people? Yes.
  • Is the US census a "reliable source" that "directly supports" the claim that the population of Paris, TX is about 25,000 people even before it occurs to anyone to put that citation in the article? Yes. (Should they cite that source? Yes. Statistics, even simple statistics like these, are WP:LIKELY to be challenged.)
  • Is the US census a "reliable source" that "directly supports" the claim that the population of Paris, TX is about 25,000 people even if someone (perhaps a newbie) types that citation in the wrong part of the article? Yes. (Does that formatting need fixed? Also yes.)
The US census is a "reliable source" that "directly supports" the claim that the population of Paris, TX is about 25,000 people because the US census actually says that the population of Paris, TX is about 25,000 people. Source says it = source directly supports it.
Now:
  • Is the latest MEDRS-compliant review article in a top-notch medical journal about breast cancer rates in China a "reliable source" that "directly supports" the claim that the population of Paris, TX is about 25,000 people? No.
  • Is the latest MEDRS-compliant review article in a top-notch medical journal about breast cancer rates in China a "reliable source" that "directly supports" the claim that the population of Paris, TX is about 25,000 people even if someone (perhaps a newbie) puts a beautifully formatted citation immediately at the end of the sentence about the population of that city? No.
  • Is there anything an editor can do about the location of this citation that will turn this MEDRS-compliant source into a reliable source that directly supports this statement? No.
The latest MEDRS-compliant review article in a top-notch medical journal about breast cancer rates in China is not a "reliable source" that "directly supports" the claim that the population of Paris, TX is about 25,000 people because that source doesn't say that the population of Paris, TX is about 25,000 people. Source doesn't say it = source doesn't directly support it.
The relevance of this to NOR in general should be obvious; NOR is reminding editors that sources need to actually say the things that they're citing them for. WhatamIdoing (talk) 16:25, 1 May 2022 (UTC)
Collapse to keep flow of discussion
What the... ? The population of Paris is just over 2 million no matter what the US census says. What's a TX? S Marshall T/C 18:12, 1 May 2022 (UTC)
Oh, never mind, it means you're talking about some also-ran Paris out there in the colonies.—S Marshall T/C 18:15, 1 May 2022 (UTC)
Texas was never part of the original 13 colonies. ;-) WhatamIdoing (talk) 03:03, 2 May 2022 (UTC)
Lol, that's so true, but I never bother the brits when they still refer to any of us as the colonies because I don't expect them to know great details about U.S. history. They pushed US history onto us pretty hard just as I'm sure they pushed UK history onto them pretty hard, but we do have a great number of copycat cities across the pond. Texas also has a Rome, and New London among others, and many states have more of the same. Huggums537 (talk) 03:28, 2 May 2022 (UTC)
There are a lot of Paris, Amsterdam, Peru, and Lisbon in the states! Huggums537 (talk) 03:37, 2 May 2022 (UTC)
Anyway, this is off topic. Sorry. Huggums537 (talk) 03:44, 2 May 2022 (UTC)

It appears that the author of this article thinks that there is no distinction between not-for-profit and for-profit higher education in the United States, even going so far as to claim Title IV funding is available to for-profit schools, when it is in fact expressly denied, and that for-profit schools receive federal research funding and have endowments, all without proof, and simply citing lists of non-profit endowments, for example. I have removed a bunch of material as a result, and I would request that this article be reviewed for return of said material and original research, as it appears to be a focus topic for the author. 146.115.153.133 (talk) 01:05, 6 June 2022 (UTC)

Lengthy summaries of court decisions

This happened frequently with older articles on cases but has come up again recently, such as with a case decided today, Carson v. Makin. Editors are building out long sections which purportedly summarize the decision, but to such an excessive degree that I worry there is effectively original research going on by way of deciding what are the most important points of the decision to summarize (this can happen with any lengthy passage that rests on one single source). To wit, there are news articles that highlight key statements and points from the opinions today, but they would not at all support a lengthy section. Maybe law articles in the future will give analysis at this level to support it.

I feel it is necessary to address this to a bit more degree here, and if generalizing it further is possible (re using single primary sources to support a lengthy summary) perhaps here; this is related to having some skill of the art as an editor to "use" the source this way, which may involve facets that an editor not skilled in the art would not see. And that such summaries of primary sources should avoid use of "skill of art" knowledge. (I can see this also applying to some of our math and science articles that are deep textbook dives into derivations, which really shouldn't be at WP.) --Masem (t) 18:32, 21 June 2022 (UTC)

IMO direct summary by an editor of the actual decision should never be done. Such will usually have errors and things that the court really didn't say and represent heavy OR / interpretation of primary sources. But for US Supreme Court decisions there is generally already a good summary available. The released document contains the actual decision but is also preceded by a Syllabus (summary) which is not the official decision but instead a summary of it written by the Reporter of Decisions. So an immensely expert, NPOV credible third party summary is already available. North8000 (talk) 19:05, 21 June 2022 (UTC)
I would still be in question of being overly detailed just from the synopsis, since even there there is some skill of the art to work through the most salient points. At least with SCOTUS, nearly every major decision has mainstream coverage (NYTimes, Scotusblog at worst) that I find is nearly the best way to pull the most salient points and interesting quotes. Masem (t) 19:10, 21 June 2022 (UTC)
Descriptions should come only from secondary or tertiary sources (possibly together with cautious use of direct quotations to illustrate something one of these sources says). Court documents of any kind (pleadings, opinions, etc.) are full of technicalities and terms of art which we should not be interpreting for ourselves. EEng 20:18, 21 June 2022 (UTC)

Assistance requested with SYNTH issue in country demographic articles

Your assistance is requested in this discussion at Wikipedia talk:WikiProject Countries, regarding a SYNTH issue in several dozen country-related demographic articles. Thanks, Mathglot (talk) 20:56, 13 July 2022 (UTC)

Can your text messages be used to justify a person's accused state of mind when being accused of sending said messages that were of a violent nature?

Is it possible as a form of defense in a criminal prosecution? 2601:642:4C0A:579D:6D8C:FB69:4D5E:841C (talk) 17:01, 16 July 2022 (UTC)

If you want legal advice, contact a lawyer. Headbomb {t · c · p · b} 17:50, 16 July 2022 (UTC)

P/S/T sources; P/S/T sourcing; or P/S/T source materials, etc.

Pinging editors previously engaged in discussion on (primary) sources in archive: Crossroads Huggums537 Jc3s5h North8000 Paul Siebert Rjensen SMcCandlish WhatamIdoing in case of their potential interest here.

It's confusing. Editors can view "sources" as being synonymous with, for instance, publications. After a period away I know I steered/was steered/became compliant toward this kind of view.

The NOR section currently titled Primary, secondary and tertiary sources begins with the sentence Wikipedia articles should be based on reliable, published secondary sources and, to a lesser extent, on tertiary sources and primary sources.

I think that a proportion of Wikipedia editors may stop reading at this point while a proportion more may be influenced by views of the kind of editors that do. Editors I experience can seem to view, for instance, a "secondary source" as a kind of monolith solely composed of material defined by some sort of prescribed sourcing as "secondary". There's also the potential that other editors, while understanding the issues involved, may utilise the lack of clarity for their own ends. There's a clarity problem.

I'm not proposing any definite wording suggestion but think that problem-addressing wording might read something like: "Wikipedia articles should be based on reliable, published secondary source material and, to a lesser extent, on tertiary source materials and primary source materials." A problem-addressing section title might refer to something like "Primary, secondary and tertiary sourcing".

I'd suggest that various content within the section could be reworked and, for what it's worth, here's my provisionally bold attempt.

Within the current NOR text that begins "A secondary source provides..." a fair use of an example presents that: "A book by a military historian about the Second World War might be a secondary source about the war, but where it includes details of the author's own war experiences, it would be a primary source for those experiences." (I'll summarise this as the book including secondary source material with personal primary source material included). In more common practice "A book ... about ..." a topic might include photographs and various accounts related to that topic, (as potentially primary source material within secondary source material). News articles, if they provide any of the "author's own thinking" often take this to the extreme presenting predominantly primary source material with anything down to a potential mere border or tagging on of sometimes questionable secondary source opinion. (As an aside, this all also got to thinking on a potential juxtaposition between the secondary source requirement for the "author's own thinking" and the ruling against opinion). But back at the example quoted, of the "book by a military historian", if this book made reference to, say, another person's war experiences, that account may itself have included that "author's own thinking" on, for instance, personal perceived follies/successes in previous wars, (secondary source material within primary source material). Closer to home, Wikipedia calls itself "Wikipedia" (as primary within tertiary with a whole encyclopaedia's worth of PST references placed into a phenomenally voluminous content along the way).

It's the material (or something like that) that counts.

GregKaye 09:32, 27 June 2022 (UTC)

This is complicated, including that I think you are dealing with at least two different areas (primary vs. secondary and trying to separate out opinion in that context) but I think that you are on to an important and useful point and direction there. One challenge is that secondary inherently includes the author's opinion, going the full range from choice of words in an unbiased / objective summary e.g "most" in "most of the people on the plane were Americans" to those which incorporate highly biased / distorted / spun summaries or choices of words and decisions on what to include/exclude. North8000 (talk) 12:03, 27 June 2022 (UTC)
Secondary does not inherently include the author's opinion. A Meta-analysis is a secondary source. It is not an opinion.
I am not sure why you believe that a statement like "most of the people on the plane were Americans" is an opinion. This is either true or false; it is a statement of fact. It is not a statement that depends upon your viewpoint. People cannot reasonably claim "According to Alice's viewpoint, most of them were Americans, but Bob holds the equally valid viewpoint that only a few people were Americans and most of them were French." Instead, the reasonable claim would sound like "Alice says they're mostly Americans, Bob says they're mostly not Americans, and either Alice or Bob is factually wrong."
I contrast this with actual opinions: "Alice says the coffee tastes good, but Bob holds the equally valid viewpoint that the coffee does not taste good". Whether or not the coffee tastes good is an opinion because it is actually subjective, meaning that the answer depends on who (=the subject) is speaking. WhatamIdoing (talk) 16:04, 27 June 2022 (UTC)
@WhatamIdoing: This is going to be so abstract that it will sound backwards, but not only do I agree with the point that you are making, but my post was actually to promote the line of thought that you just expressed. Basically to make the distinction between objective factual summary that isn't a reach and biased opinion. But one can derail that quest by claiming that even the most straightforward summaries involve author subjective choices. E.g., in "most of the people on the plane were Americans", can you exclude using "most" for a 49% plurality? Or might 51% be not enough to use the term? And is "American" just US citizens vs residents? To avoid that derailing, you need to be able to say "yeah, but it has no signs of bias or highly subjective stuff, so we call that objective", and so objective accuracy does exist. And lastly I noted some of the less recognized forms of bias as flags that it isn't such. North8000 (talk) 17:50, 27 June 2022 (UTC)
Part of this sounds like a situation in which multiple valid definitions of this word exist, and the statement is only true for some of those definitions, so a POV pusher tries to pretend that the statement is wrong for all definitions. Consider "Females can get pregnant", and someone rejecting the statement because that's not how it works in seahorses, or because some females are sterile, or currently too old or too young to produce offspring. The statement isn't wrong, but you do need to know what the statement means, which is somewhat different from "every single female organism can get pregnant right this minute".
Similarly: It is true that "most of the people on the plane were Americans" if even a slight majority of the people on the plane were US citizens and/or US residents and/or US nationals. If there were 51% US residents, including 10% that were not US citizens, then it is still true that "most" were Americans. Depending upon DUE and BALASP considerations, one might wish to be more precise ("Most were US residents" instead of "Most were Americans"), but the more generic wording isn't wrong just because you could misunderstand it.
I don't think we should accept claims that this is "subjective" or "an opinion" at all. If someone says "Their count of Americans on board is biased subjective opinion because the One True™ Definition of this word with multiple definitions is mine", then maybe we should consider whether that editor has CIR or NOTHERE problems. It's true that definitions can be hugely important (consider counting "patients with the common cold" vs "people with the common cold": since most people don't seek professional medical care for the common cold, these two counts produce wildly different numbers), but Wikipedia editors shouldn't be deciding that the sources are wrong because we wish that they counted the other group. WhatamIdoing (talk) 23:55, 28 June 2022 (UTC)
Totally agree with this. However, I am fascinated by the ideas presented by the OP that there is a seemingly apparent juxtaposition between NOTOPINION, and an "author's own thinking" as well as the idea that Wikipedia could be seen as primary within a tertiary source. This user really thinks outside the box, and I like that. I think more users should do this more often. Huggums537 (talk) 14:17, 17 July 2022 (UTC)
When it comes to deciding if a passage in a secondary source is primary because it reflects the author's own thinking, I think we need to distinguish cases where the author's thinking is based on the secondary sources she read, or is based on her own experience. If the author wrote a book about a war after spending a couple of years going through a defense department archive of documents related to the war, and quoted a passage from a report by a 1st lieutenant about conditions at a fire base, and described the report as typical of hundreds of other reports, the quote is secondary because it is based on multiple sources that the author read. If, later in the book, the author described an incident in a communications center in Hawaii that she took part in, that would be primary. Jc3s5h (talk) 12:55, 27 June 2022 (UTC)
I agree with Jc3s5h.
I think the problem @GregKaye is asking about is whether we should be implying "secondary source document" rather than saying "secondary source material". I agree with his identification of the problem and accept his solution. Given our still-shaky grasp of the difference between secondary and independent and the potential for abuse by POV pushers, I'm a little leery of encouraging editors to decide that this sentence or that sentence is primary/secondary, but it's worth a try.
On the question of "author's own thinking", I can see that the language is misleading. We want the kind of thinking that results in a transformation of primary sources – comparing, contrasting, analyzing, etc. We don't want the other kind of "thinking". "I think I'll go fix breakfast in a minute" might be this "author's own thinking", but it's not a secondary source. WhatamIdoing (talk) 16:10, 27 June 2022 (UTC)
We need to be indicating "secondary source material". There is no such thing as a "secondary source document/publication" except when one just incidentally happens to contain nothing but secondary material; there isn't a publication type that is categorically secondary from start to finish. "Editors can view "sources" as being synonymous with, for instance, publications." That's an error. For example, any given newspaper contains various forms of primary-source material, like advice columns, editorials, op-eds, advertising, etc. Only certain kinds of writing within the publication are secondary (and even then might actually contain bits of primary; to use the above example of someone writing an investigative journalism piece after doing a bunch of DoD document research, if they close out the piece with a policy-change recommendation, that last bit (an opinion) is primary).  — SMcCandlish ¢ 😼  21:00, 27 June 2022 (UTC)
One of the hardest things on wiki has been to convince editors that most of the news in the newspaper is WP:PRIMARYNEWS content. Almost nothing in a traditional newspaper is secondary source material; in the modern age, some analytical work has become more common. You are actually more likely to find some secondary material in the opinion section than in the news. "Paul Politician gave a speech at the city council meeting last night" is pure primary. "Carol Challenger and Paul Politician are evenly matched candidates. They agree on property taxes, the importance of local agriculture, zoning, and land-use issues, but Challenger is stronger on climate change and Politician is stronger on education" is secondary material (because it's a compare-and-contrast analysis).
But since GNG requires "secondary" sources, and news articles are all editors have for some subjects, we insist that news articles must be secondary sources, because I must have my article, and all else – including the definitions of independent, secondary, and significant coverage – will bow to the goal of keeping my article. WhatamIdoing (talk) 00:10, 29 June 2022 (UTC)

I wanted to leave the initial thread question open but my inclination would be to steer away from reference to P/S/T sources and toward reference to both P/S/T sourcing and P/S/T source materials, etc. A difficult question (though maybe for another time) is where the border lies between an opinion piece by a rookie journalist and the thoughtful analysis of an accomplished scholar. An expedient solution could be to say news opinion pieces are out and other content with "author's own thinking" can be evaluated. Otherwise some articles may be swamped with potentially fad or similar opinion. I'm sure it may also be in a newspaper's interest to get articles cited in Wikipedia and I'm doubtful that potentially developing dynamics would help us build encyclopaedic content.

Revisiting the in use example: "A book by a military historian about the Second World War might be a secondary source about the war, but where it includes details of the author's own war experiences, it would be a primary source for those experiences." I see at least four-six potential categories of information here:

  1. things the author recollects or had personally recorded about ---self
  2. things the author recollects or had personally recorded about other people or things
  3. (things the author recollects or had personally recorded about a group of people perhaps inc. things that included the author)
  4. things the author has later researched so as to learn more about their own ---time situation (and the rest mirrors the above)
  5. things the author has later researched so as to learn more about other people or things from the time
  6. (things the author has later researched so as to learn more about a group of people perhaps inc. things that included the author)

The "author's own thinking" could be applied to any of this, or it could all equally apply to events that occurred for another author just hours or days previously. And we describe it as the "author's own thinking", but is it though? How do we know, for instance, that it isn't the "author's mate-from-down-the-pub's own thinking" or that it isn't plagiarised in some other way? I suspect this stuff may get complicated when addressed in future time. But it may certainly be worth noting in relation to ways we present classification terminologies. Later if the issues I mentioned can be tackled, I confess to suspecting that we might need something like that expedient solution I mentioned earlier. GregKaye 22:43, 27 June 2022 (UTC)

I think that the end result should be to do our best at assessing the expertise and objectivity of the source with respect to the text/item which cited it. Obviously not a no-brainer to determine, but even setting that as the standard/goal would solve a zillion problems in Wikipedia. North8000 (talk) 00:27, 28 June 2022 (UTC)
I think the stuff about opinions could fill a page by itself. It's far more complicated than we admit here. WhatamIdoing (talk) 00:12, 29 June 2022 (UTC)

side bar re PST

  • This might be a good time to remind editors about how the PST section came to be. The policy originally warned editors not to add OR to articles, because doing so turned WIKIPEDIA into a primary source.
It was then thought that (to better explain what we were talking about) we needed to define what a “primary source” was. That, in turn, led to us defining what secondary and tertiary sources were. However, along the way we somehow lost that original crucial point: that WIKIPEDIA should not be primary.
PST has grown into a distracting side discussion about the nature of our sources, when it should be about the nature of WIKIPEDIA.
I think any discussion of PST needs to return this original crucial point… because NOR isn’t really about the sources we use, but what WIKIPEDIA is. Rant over. Blueboar (talk) 21:54, 27 June 2022 (UTC)
Rant/grunt or whatever away. I read this when reviewing archive refs to primary and had in mind to quote you anyway. Awesome stuff. GregKaye 22:46, 27 June 2022 (UTC)
I have been thinking about this problem. Could we WP:SPLIT the PSTS section out of here, to its own page? Then this page could have a {{Main}} link to PSTS, and say the parts that are actually relevant to NOR, namely "Wikipedia is not a primary source, so don't stuff your own original research into articles". WhatamIdoing (talk) 00:13, 29 June 2022 (UTC)

Defining primary ...

Perhaps something like this would work:
Primary sources are composed of direct or firsthand evidence in regard to a topic. Often primary source materials are created at the time when the events or conditions occurred, but they can also include content in autobiographies, memoirs, and oral histories recorded later. Reproductions of primary source materials retain their primary status and different citations used in Wikipedia will link to references presenting varying extents of primary source content.
I think that this fits well with the WP:Identifying and using primary sources, Yale... citation GregKaye 10:05, 28 June 2022 (UTC)

Defining secondary ...

Perhaps:
Secondary sources describe, interpret, analyze, evaluate, comment on, or discuss primary source materials that they will often also quote. They are works which are one or more steps removed from the event or information they refer to, being written after the fact with the benefit of hindsight, but they do not need to be independent.
This draws from the WP:Identifying and using primary sources, James Cook University... citation GregKaye 10:35, 28 June 2022 (UTC)

Introduction

Working on from OP 3rd para, perhaps:
Wikipedia articles should ideally be based on reliable, published secondary sources and, to a lesser extent, on tertiary sources and primary sources. The establishment of the topic's notability will also typically be achieved through reference in secondary or tertiary sources which are also helpful for the avoidance of novel interpretations of primary source content. All analyses and interpretive or synthetic claims about primary sources must be referenced to a secondary or tertiary source and must not be an original analysis of the primary-source material by Wikipedia editors.

Appropriate sourcing can be a complicated issue, and these are general rules. Deciding whether the use of primary, secondary, or tertiary source material is appropriate in any given instance is a matter of good editorial judgment and common sense, and should be discussed on article talk pages. A cited source may be considered primary for one statement but secondary for a different one. Even a given source can contain both primary and secondary source material for one particular statement. For the purposes of this policy, primary, secondary and tertiary sources are defined as follows:

I added the word "ideally" towards the beginning because, if secondary sources are written with hindsight, there may be none about when some articles are first written. In the second paragraph I think that text might be cut (or edited back) from "A cited source..." as the issues mentioned may be otherwise dealt with something like in the proposed texts. GregKaye 11:21, 28 June 2022 (UTC)

  • Question… I am unclear as to the relevance of the proposed changes to the underlying concept of this policy page (ie don’t add your own original research). Would it be something that would be better placed on some other policy/guideline page?

Two main things that I think would be useful here:

  1. I'd like to clarify from the beginning that a section of a content can be a P/S/T source and that the entire content does not necessarily get labelled as primary, secondary or tertiary.
  2. I'd like to clarify from an early stage that a secondary source is not just a content that quotes a primary source. I think this is a common misunderstanding.

It's also all something that I'm trying to get my head around. To now I've been under the misunderstanding that Wikipedia sourcing was against citation of contemporary opinion which WP:RSOPINION indicates is not the case and retracted one short comment I'd previously made. GregKaye 16:49, 28 June 2022 (UTC)

All valid questions and concerns… BUT… how do they relate to “don’t add YOUR OWN original research”? I think you may have gotten distracted by focusing on the nature of the source, when this policy is (or should be) focused on the nature of what WE do with the source (ie what we write based on our sources). NOR is a content policy, not a sourcing policy. Blueboar (talk) 17:05, 28 June 2022 (UTC)

Fictional characters and original research

Currently, most articles on fictional characters include significant amounts of original research, either unsourced or sourced to the works themselves. This includes both lists such as List of Heroes characters, as well as individual articles on the characters, like Blood Brothers (comics).

Is this permitted under policy, and if not how can we address this given the extent of the problem? BilledMammal (talk) 00:39, 18 July 2022 (UTC)

The answer is WP:WAF. Introduce WP:WAF to the enthusiastic authors. Convert the writing style to real world perspective, which probably means cutting a lot of material that lacks real world perspective sourcing. Don’t go too quick to deletion policy, as characters can almost always be merged, smerged or redirected. It’s rare that WP:NOR is a good reason for deletion for anything that is verifiable. —SmokeyJoe (talk) 12:51, 18 July 2022 (UTC)
We definitely need to start setting some guidance on length of character summaries when only the primary source (the show, movie, etc.) is referenced implicitly, or even with direct primary sources (like to specific episodes). These should not be of similar or larger length than the summaries of the works themselves. And I would definitely agree that for a character in a multi-work series, references to the main works where things happen need to be provided so as to meet WP:V. --Masem (t) 01:10, 19 July 2022 (UTC)
@BilledMammal, I wonder if you could give an example (e.g., a single sentence) of original research from one of those articles.
Keep in mind that the definition of OR is that the material has never been published in any reliable source (including the works themselves), so "Bob Bishop was introduced in season two" is not actually OR if there's a pair of DVDs somewhere that say "season one" and "season two" on it, and the character appears for the first time on the "season two" disk. WhatamIdoing (talk) 22:07, 27 July 2022 (UTC)
It's hard to say, as I don't know either of those fictional works, but lines like despite his evil ways, Linderman still possessed a shred of decency inside him and healed the damage done to Angela's mind so she'd know the truth about what Arthur was doing, with no other choice Eden kills herself before Sylar could steal her powers, and Sylar, driven by his aggression and hunger for power killed him and stole his ability, making him the first of several victims (emphasis mine) appear likely to be lines that are not supported by the primary source and instead require interpretation of it - in other words, original research. BilledMammal (talk) 05:05, 30 July 2022 (UTC)
Are you really objecting to the "unencyclopedic" writing style, or to the factual claim that the character is motivated by "hunger for power"? WhatamIdoing (talk) 22:14, 30 July 2022 (UTC)

Simple synthesis is not original research

I have noticed recently some editors claiming that simply compiling information from multiple sources and grouping it together by theme or obvious relationship could be SYNTH or OR. WP:NOTOBVIOUSSYNTH, Wikipedia:What SYNTH is not#SYNTH is not just any synthesis, WP:MNA, Wikipedia:What SYNTH is not#SYNTH is not a rigid rule, Wikipedia:What SYNTH is not#SYNTH is not obvious II, Wikipedia:These are not original research#Compiling facts and information, all say that this is not SYNTH. Is there a way we could clarify the policy to make it clearer that "simple synthesis" is not necessarily SYNTH/OR, and that there are acceptable bounds to table stakes assumptions? For example, one source says that Elon Musk had a secret relationship with Sergey Brin's wife, and another source says that Sergey Brin sought a divorce from his wife. We could reasonably list those 2 facts together. Even though perhaps this implies that the divorce was due to the secret relationship, we aren't stating that conclusion, we are simply grouping related information. Andrevan@ 22:32, 26 July 2022 (UTC)

When it comes to BLP and accusations, blatantly connecting those two facts without any other source making the connection is SYNTH and inappropriate on a BLP page. On the other hand, if those two "facts" just happened to be documented under a personal life section without any direct implication one begat the other, that would be okay, but we should be playing very carefully with inclusion of accusations in the first place. We are not celebrity gossip, which this stuff borders on, and it would simply be better to have more factual clarity before including it.
--Masem (t) 22:49, 26 July 2022 (UTC)
I haven't specifically edited the Elon/Sergey example, but let's assume for the sake of argument that it did have enough RS that it was graduating out of the gossip column. I disagree with the interpretation that BLP would prevent putting that information if attested in RS. Andrevan@ 22:51, 26 July 2022 (UTC)
We should not be including claims and accusations even against PUBLICFIGURES until we have two or more reputable sources independently confirming or supporting the claims. But even if both example statements have that support, we would have to be super careful to make sure our wording does not directly imply one resulted from the other, unless that inference is also reported by at least one reputable source. We have too many problems with what may seem like low level synth that makes it really easy to create slander in wikivoice from unconnected facts. Masem (t) 22:56, 26 July 2022 (UTC)
So that's what I'm asking in this thread, given we have enough sources, is it SYNTH to report related facts together? We are not making an inference, and we are not explicitly concluding anything, but since both belong in the "personal life" section one after the other, there is no insinuation, it is simply grouping of related facts and information per all the policy links I cited. Given that we clearly disagree on this, I wonder if others agree with your interpretation, mine, or something else. Andrevan@ 22:59, 26 July 2022 (UTC)
Yes, if you are implying something a singular source doesnt support, and you can imply by placement, that is still OR. You are the one deciding those facts are related. A source has to do so. We dont just wink wink our way out of SYNTH. nableezy - 23:06, 26 July 2022 (UTC)
So you believe it would be original research to write in the article two sentences, one after another, about Sergey Brin's personal life, because some people might interpret that it is implying a causal effect when none was explicitly noted. Even though these statements are obviously correlated, though we are not claiming or seeking a causal connection. How do you square that with the policy statement that collecting related information under a common heading, Let the readers draw their own conclusions after seeing related facts in juxtaposition. is not original research? Andrevan@ 23:12, 26 July 2022 (UTC)
Well in this specific case theres a Wall Street Journal report linking the two. But I would still say that Musk supposedly having an affair with Brin's wife doesnt belong in Brin's BLP regardless. We dont need to document salacious details about a couple's marriage unless it becomes unavoidable by dint of its widespread coverage. And that has not happened yet. nableezy - 23:28, 26 July 2022 (UTC)
That's fair but I was trying to make a hypothetical example. Let's just say we didn't have the WSJ article linking the two explicitly, but it still had achieved widespread coverage. I'm asking about SYNTH not WEIGHT. You've made your view of it clear and that aligns exactly with Masem. I'm not urgently trying to add this incident to the article so let's maybe let some others opine. I'm curious if I am alone in my interpretation or if there is a difference of opinion within the community on what is an acceptable baseline assumption or if there even is such a thing. Andrevan@ 23:33, 26 July 2022 (UTC)
Id prefer a less salacious and BLP involved example tbh, but to take an example in a world I often edit in. I can easily source that the acquisition of territory by force is a war crime. I can likewise easily source that Israel has claimed territories it has acquired by force (the Golan and East Jerusalem). I cant, in the article Israel, have these lines next to each other without one source connecting the two: The acquisition of territory by force is a war crime. Israel has acquired East Jerusalem and the Golan Heights through force. nableezy - 00:04, 27 July 2022 (UTC)
That isn't a great example because the first statement reads like a non-sequitur, since it doesn't relate to Israel and is just a general statement. In my example, both statements were related to the subject of the article assuming Sergey Brin's personal life was the section. However, I think your example would be valid if we had the statements (taken from List_of_United_Nations_resolutions_concerning_Israel): As of 2013, the State of Israel had been condemned in 45 resolutions by the United Nations Human Rights Council (UNHRC). The United Nations General Assembly (UNGA) has adopted a number of resolutions stating that Israel's strategic relationship with the United States, a superpower and permanent member of the Security Council with veto power, encourages the former to pursue aggressive and expansionist policies and practices in the Israeli–Palestinian conflict. The United States responded to the frequent criticism from United Nations organs by adopting the Negroponte doctrine of opposing any UNSC resolutions criticizing Israel that did not also denounce Palestinian militant activity. Those two statements are separate, but related. Andrevan@ 00:10, 27 July 2022 (UTC)
Given that this type of information is on the order of gossip mongering, simply wait until sources have corroborated the details, including any correlation. We are far too eager to push these gossipy items when it is policy from BLP to wait and see. Masem (t) 23:16, 26 July 2022 (UTC)
I'm specifically picking on the SYNTH issue about stating related facts together that some editors think implies negative or slanderous information. Let's assume that there are sufficient sources covering it. Can we cover them together by grouping only? If not, how do you square that with the policy? Andrevan@ 23:18, 26 July 2022 (UTC)
It doesnt have to be negative or slanderous to be an OR issue, thats a BLP issue. But let me ask you this. Do you actually think that X had an affair with Y's wife in January. Y filed for divorce in February does not imply that the divorce is due to the affair? Do you think that if no source directly makes that connection that we should be making it explicitly? Implicitly? If not explicitly, why is it acceptable to make it implicitly? nableezy - 00:16, 27 July 2022 (UTC)
I agree with you that a reasonable person might see two statements like that and assume that one caused the other, but I do not believe that it is OR or SYNTH to write those two statements together in the same section. Because it is the person making the conclusion and we aren't doing any crazy POV twisting, we are simply grouping related statements. We are simply reporting the facts per policy, juxtaposition without a conclusion is not OR/SYNTH. Sure, probably, they are related somehow, but not necessarily causal. Andrevan@ 00:22, 27 July 2022 (UTC)
Ok, but you didnt answer any of my questions besides possibly are we implying anything (I think you said yes?). nableezy - 00:26, 27 July 2022 (UTC)
I do not agree we are making the connection implicitly or explicitly, but a reasonable reader might assume that there is a causal link, and might reasonably infer they are correlated or connected. But we are not doing anything other than grouping related facts that relate to the same subject in the same way. So I don't think we are implying that there is a causal link, but it is reasonable to assume that they are both related to Sergey Brin's personal life and can therefore be grouped. Andrevan@ 00:40, 27 July 2022 (UTC)
I dont get how you can say a reasonable person might see two statements like that and assume that one caused the other and then follow that up I do not agree we are making the connection implicitly. One follows the other. nableezy - 01:09, 27 July 2022 (UTC)
Correlation is not causation - the two facts happen contemporaneously, were reported together, we don't know if A caused B, B caused A, or some third event C caused A and B. So we aren't implying causation. We do believe they are correlated - they relate to the same thing and to each other. Andrevan@ 01:16, 27 July 2022 (UTC)
That isnt the point though. If you feel that a reasonable person might see two statements like that and assume that one caused the other, then you are saying that it is implied. Implication meaning the conclusion that can be drawn from something although it is not explicitly stated. nableezy - 01:24, 27 July 2022 (UTC)
No, that's inference. An inference is a reasonably drawn conclusion that a reader may make. An implication implies intentionality on our part. Andrevan@ 01:32, 27 July 2022 (UTC)
define: implication - the conclusion that can be drawn from something although it is not explicitly stated. nableezy - 01:34, 27 July 2022 (UTC)
Reasonable people can assume that Sergey divorced his wife because of Elon, it's also possible that they got divorced first and then the affair, it's also possible that he cheated on her first and then they got divorced and then Elon got involved. We aren't implying why the facts are related, but a reasonable person might infer that one caused the other. They could also be wrong about that. Stranger things have happened. It's not strictly implied that one caused the other - that's what the reader is inferring. Andrevan@ 01:35, 27 July 2022 (UTC)
I really think we should stop using a real world example involving three living people. But from my reading of this, you seem to want to be able to hint at something that you cannot directly say, and in my view that is still inappropriate. nableezy - 02:10, 27 July 2022 (UTC)
We can come up with a different example. My point is that it's not "hinting" to juxtapose related facts. It's not contravened by policy that a reasonable person might be able to make a conclusion, so long as we don't make it. Within reason of course, I can think of ways that it wouldn't be valid. Andrevan@ 02:14, 27 July 2022 (UTC)
Disagree totally, especially with your example. Avoiding SYNTH is easy. If you can find reliable sources which connect two facts, its not SYNTH. If you can't, then it is SYNTH and should be avoided. If you want to include such "simple synthesis" without corroborating reliable sources, then you are in the wrong. -- Netoholic @ 23:33, 26 July 2022 (UTC)
How do you square that with the policy statement that collecting related information under a common heading, Let the readers draw their own conclusions after seeing related facts in juxtaposition. is not original research? Maybe the policy is unclear on the OR of implication because I can't find this. If that is the consensus of editors then so be it, but can you elaborate on the policy justification? Andrevan@ 23:36, 26 July 2022 (UTC)
BLP overrides NPOV as it has legal implications. If you cannot see how even simply putting these two statements next to each other causes a BLP problem, you really need to step back to understand the importance of BLP management on WP. Masem (t) 23:42, 26 July 2022 (UTC)
Please explain how, if we could separately source the statements, Sergey Brin's wife had an inappropriate relationship with Elon Musk, and Sergey Brin is seeking divorce from his wife, are a violation of BLP if taken together. Andrevan@ 23:45, 26 July 2022 (UTC)
You are creating the implication that the divorce was a result of the inappropriate relationship. While that may be Occam's razor for why it is happening, we cannot make even such apparent leaps of logic when dealing with BLP. Maybe the affair was the last straw but there were other more pressing reasons why Brin sought a divorce. This is why it is far better to wait to have corroboration of accusations and around personal life claims like divorces. Now, obviously, it does look like the WSJ has made the connection here based on Elon Musk's page, so this is a null and void example since the situation appears resolved, but without the WSJ article, we would likely be best to avoid including not-yet-verified claims until we have that evidence even if they seem widely reported. Masem (t) 00:22, 27 July 2022 (UTC)
I agree we cannot make that leap, but that's why we aren't. You made the leap. The reader might reasonably assume that, but we aren't. Andrevan@ 00:23, 27 July 2022 (UTC)
I think thats too cute by a smidge. Almost Trumpian, a lot of people are saying level. Im not saying, but a lot of people are talking about it. nableezy - 00:27, 27 July 2022 (UTC)
Trumpianism would be making an unsourced statement, and then claiming it was a rumor that he heard, when he really made it up himself. I'm talking about adding 2 completely true, sourced statements to an article, and whether it would be OR/SYNTH to list them in the same section near each other because they both relate to a related topic, when that might create some implication in readers' minds that the events are connected even though we did not say they were. Andrevan@ 00:33, 27 July 2022 (UTC)
I said almost Trumpian; the part that's similar is the hinting at, suggesting, but being able to say "hey, I didn't say that". Like I said, too cute by a smidge. nableezy - 01:37, 27 July 2022 (UTC)
"Suggesting" and "hinting" also go with the stronger usage of "implying" but that all connotes some expressive intent on our part, when we are simply juxtaposing related info and letting the reader decide what they think or how they want to conclude/interpret, per the policy links I quoted. I do think "almost Trumpian" is kind of a personal attack. It's implying we are trying to smear someone. Who exactly are we smearing by reporting that Sergey got divorced? His wife? Elon? Andrevan@ 01:50, 27 July 2022 (UTC)
You're reading something into that phrase that isn't there. I said up above "It doesn't have to be negative or slanderous." It's the hinting at something but still trying to claim "I didn't say it" that I'm describing like that, but sure, if you take that as an attack, then let's just call it inappropriately trying to run around the OR policy by making implications while maintaining a veneer of plausible deniability. nableezy - 02:14, 27 July 2022 (UTC)
We are supposed to play very much safe and middle ground with BLP. I agree that there may be a handful of readers that will not make that implication, but there are definitely some that will, and that's why we should avoid it.
Also, you are wrong that these are "true" statements. They are verified to the RSes making the claim, but (at least at this time) the only thing that is true is the divorce happened. We still do not have "truth" related to the affair. So it is not the case of simply just putting two true statements next to each other. If the affair was 100% confirmed, and the divorce 100% happened, it would be hard to argue that those two statements can't be placed next to each other even absent any suggested connection. It would be like a case of "Country X's economy sank 2000% in 2020. In late 2020, President of X was voted out of office." - it would be implied that the economy was the cause for X losing, but as these are both 100% statements with truth behind them, we can connect them without the concerns of the above. It's when we're dealing with accusations and other yet-to-be-proven statements that it can be a problem. Masem (t) 00:46, 27 July 2022 (UTC)
Mmm, ok. So you believe it would be OK if the two statements were completely verifiable and fully sourced for truth in wiki voice, and the case where it is not OK is when one is simply an allegation that hasn't been proven. Your BLP argument was that putting a true statement next to an allegation may lend credence to the allegation. Is that what you mean? So if they are both unambiguous facts that are related, we can list them together, but not if some of the statements are controversial or are attributed to opponents/critics. Is that fair? Andrevan@ 01:01, 27 July 2022 (UTC)
Masem, I would call your example of “the economy falling and President X losing the election” a classic SYNTH. Sure, a bad economy can often lead to an election loss… but not always. There could be a dozen other factors that are not mentioned. We cannot imply that X led to Y unless we have a source that directly connects X and Y. One way to tell if two factual statements, placed near each other, are forming an original conclusion - just switch the order in which they are presented. If doing this changes the conclusions the reader will form from the facts being in proximity, you have engaged in SYNTH. Blueboar (talk) 01:24, 27 July 2022 (UTC)
Depending on the context, I think those two statements are interchangeable. For example in a background article about the election, we might want to write some background details about the economy or the pandemic to set the scene, and then say what happened during the election, especially if those election issues were relevant during the campaign. We can't say the bad economy caused the bad election but we can talk about the economy being something that came up. We could also state the results first, and then talk about the conditions and the background. It's not SYNTH to talk about the state of the economy and the economic issues from the campaign, we know that they are related to the election, even though we don't know a proximate cause of the results. But a reasonable reader might assume that the economy was a main reason, and we don't need to try to avoid that if it's going to happen due to the proximity of factual sourced info. Andrevan@ 01:41, 27 July 2022 (UTC)
"Background" article sections are one of the easiest places to slip into SYNTH. If particular facts are appropriate background, then it should be easy to find attributable reliable sources that offer up those facts as background. The danger is in editors adding in facts that they believe are background, creating an OR/SYNTH situation. -- Netoholic @ 02:25, 27 July 2022 (UTC)
So if it is the position of editors commenting that there's no such thing as "simple synth," how do we interpret the policies I posted at the top of the thread? Can we give an example of what is "allowable not-OR synthesis"? Or such a thing doesn't exist? Andrevan@ 02:45, 27 July 2022 (UTC)
One example, I think, of "allowable" synthesis is in list articles. For example, List of unusual deaths has a defined inclusion guideline, but there is no reliable source that includes all of the list's entries (some external lists might include some of them), so that collection is a Wikipedia collection, more exhaustive than any one external source is. Some of the policies you posted are really just essays, so are one interpretation of the main WP:NOR policy. Some editors wish to see softened guidance, some prefer stricter. I stand on the side of preferring external reliable sources do the synthesis, and us just citing those sources and not going beyond them. -- Netoholic @ 03:08, 27 July 2022 (UTC)
True, that's actually right, my bad. Masem (t) 02:16, 27 July 2022 (UTC)
On the topic of synthesis (which is in the WP:NOR policy), you also have to connect it to WP:NPOV. "Undue weight can be given in several ways, including but not limited to the depth of detail, the quantity of text, prominence of placement, the juxtaposition of statements, and the use of imagery". So it's absolutely possible to imply something that isn't verifiably true or neutral, just by juxtaposing statements.
Let's say we were talking about a political candidate who dropped out. A reliable secondary source is summarized as: "The campaign found that they were slowly running out of money. On August 5th, they announced they were suspending their campaign."[1] A wikipedia editor comes in and inserts a statement. "The campaign found that they were slowly running out of money.[1] Journalists also criticized the candidate's debate performance.[2] On August 5th, they announced they were suspending their campaign."[1] All those statements are verifiable, but we've substantially changed the meaning to something that isn't really stated by the secondary sources. (You could imagine how malicious an editor could be with selecting what statement to insert, even a verifiable one.)
The policies are meant to be taken in their collective spirit. Most WP:SYNTH issues will at least raise a potential issue with WP:VERIFIABILITY, and even WP:NPOV in some cases. The point of WP:NOR is that editors are not supposed to be creatively assembling facts to present novel ideas, comparing or grouping things that no reliable source has found. Wikipedia builds articles based on a WP:SUMMARYSTYLE. It's not an easy line to define, but it's something we have to watch for. Shooterwalker (talk) 15:53, 27 July 2022 (UTC)
But let's look at this statement: "The campaign found that they were slowly running out of money. On August 5th, they announced they were suspending their campaign."[1] I think that statement is not OR/SYNTH because it's a reasonable assumption that the campaign's finances are related to how long the campaign can stay around. However, according to the interpretation of some editors, this would be OR/SYNTH unless we explicitly found a source saying that the campaign shut down due to their financial situation. Andrevan@ 15:56, 27 July 2022 (UTC)
I know it's hard to talk about hypotheticals. But in this hypothetical, the source is making the connection. (Though I think there's always a valid argument to be had about whether we are summarizing the source properly.) It becomes a much bigger issue when you insert the middle statement, and source it using a [1] [2] [1] scheme. Like I said, it's a hard line to define, but the policy exists to promote discussion about cases exactly like this. Editors should always ask if we are implying something that isn't verifiable in the sources. Much stricter standard if it involves WP:BLP and/or controversy. Shooterwalker (talk) 16:01, 27 July 2022 (UTC)
I agree the [1][2][1] example in this case feels problematic, but what about a situation where source [1] discusses the campaign's finances only, and source [2] discusses the campaign being suspended only? I contend it is not OR/SYNTH to group the two statements sequentially with 2 sources. Andrevan@ 16:49, 27 July 2022 (UTC)
No… that is precisely SYNTH. You are taking two statements and linking them to imply a conclusion that neither source states. A+B= C(implied). Blueboar (talk) 17:05, 27 July 2022 (UTC)
This is a major issue with our writing nowadays...editors want to craft articles around narratives they think exist on topics, but really are narratives of their own creation. Thus we get SYNTH like this, where two disparately sourced statements are written together in a manner that implies one follows or connects to the other, fitting their personal narrative but not necessarily supported by sources. Same type of argument on the NPOV page about including votes from lawmakers. It is why we really need to get out of writing on immediate current events and instead wait until they are later covered by RSes. Masem (t) 17:19, 27 July 2022 (UTC)
I really don't agree. It's common for campaigns to run out of money, and it's common for them to be suspended. I do not think we are making an untoward, original research, novel synthetic implication by listing them together, even though a reasonable reader might assume that the campaign suspended due to running out of money. It's just a given in politics that "the campaign ran out of money" and "the campaign was suspended" are related statements that go in sequence. We aren't making a conclusion or a connection beyond the obvious one. Andrevan@ 18:33, 27 July 2022 (UTC)
I want to echo the other editors here. It's hard to discuss hypotheticals, but the [1][2][1] example is very likely to be WP:SYNTH. The [1][1][1] is less likely to be a problem, but it's always possible that someone cherrypicks statements from a source, and juxtaposes statements to imply something that isn't there in the source. That's why the WP:SYNTH policy exists, to remind editors that we're not supposed to draw comparisons / sequences that aren't there. Each case will need to be discussed by a consensus of editors until you have something that is plain from the sources, without any POV pushing or new ideas. Shooterwalker (talk) 18:36, 27 July 2022 (UTC)
I absolutely agree. It's case-by-case and a consensus of editors will decide, POV pushing or novel ideas are never good. I think Shooterwalker you are pointing out that there DO exist situations of "simple synthesis" as per the policy, which is distinct from the position that some editors are taking that "any synth is OR." Andrevan@ 18:39, 27 July 2022 (UTC)
The hypotheticals may be getting in the way of the discussion. A lot of the time, I will personally go through a reliable source and pull out what I think are the most important points, and arrange them into a sequence that is logical for readers. There is no intention of synthesizing a new idea. But someone may come along and say "actually, you make it sound like these two things are connected, and the source doesn't explicitly say that". Assuming they're right, I have two choices. One is that we rephrase until we find something that is closer to the original source. The other is that I find another reliable source to support the point that I think is being made. The point is that it's something we have to be sensitive to, but there will always be some grey area between summarizing and synthesizing. Shooterwalker (talk) 18:43, 27 July 2022 (UTC)
I absolutely agree. "There will always be some grey area between summarizing and synthesizing." I wish the policy would say that if that is the consensus of editors that such gray area does exist and is nonzero, of course we must be careful. Andrevan@ 18:45, 27 July 2022 (UTC)

It ain't in the policy

User:Andrevan has asked several times about how people square the actual OR policy with a statement that ...isn't in the policy. It isn't in any policy or guideline.

The words in question are a partial, out-of-order quotation from the "explanatory essay" Wikipedia:These are not original research. In full, they say:

  • Comparing and contrasting conflicting facts and opinion is not original research, as long as any characterization of the conflict is sourced to reliable sources. If reliable references cannot be found to explain the apparent discrepancy, editors should resist the temptation to add their own explanation. Present the material within the context contained in reliable sources, but avoid presenting the information in a way that "begs the question". An unpublished synthesis or analysis should not be presented for the readers' "benefit". Let the readers draw their own conclusions after seeing related facts in juxtaposition.
  • Identifying synonymous terms, and collecting related information under a common heading is also part of writing an encyclopedia. Reliable sources do not always use consistent terminology, and it is sometimes necessary to determine when two sources are calling the same thing by different names. This does not require a third source to state this explicitly, as long as the conclusion is obvious from the context of the sources. Articles should follow the naming conventions in selecting the heading under which the combined material is presented.

The first snippet that Andrevan quotes was added in this edit, and the second was added in this edit. Neither of these editors is especially active these days, and there have been few detailed discussions about them. However, I feel like Andrevan is quoting them out of context. Where are the "conflicting facts" that readers should draw their own conclusions about? I don't see any in these discussions. Where are the "synonymous terms" that need to be lumped together in these discussions? I don't see any. This is not especially relevant. WhatamIdoing (talk) 22:32, 27 July 2022 (UTC)

Fair, it's true that what I quoted/linked to there were essays and not policy, and I apologize for not distinguishing clearly between policy and essays. The essay notes I have been citing have stood for many years and I think they still should be thought through, though if consensus has changed, then it has changed. I believe those essays do reflect the views of 2007 accurately. The essay "What SYNTH is not" is also prominently linked from the WP:NOR main page. It's also true that in the discussion above, everyone except me has argued that an implication that could be reasonably inferred, is still WP:SYNTH. So noted. But there are policies that apply: Wikipedia:Neutral point of view#Making necessary assumptions, for example. I do think that regardless of what you think of my argument, there is still some gray area where "simple synthesis" is not original research. However, perhaps I failed to make a compelling argument or illustrate with reasonable examples. Andrevan@ 22:42, 27 July 2022 (UTC)
I agree in principle that some sorts of limited, "simple synthesis" statements will not violate WP:OR. One could, for example, combine a source about the population of Canada with a separate source about the population of the US, and a third source that has the population of Mexico, and produce an estimated number of humans living in North America. This is simple enough that it is not OR. But the example given above is not so simple, and it could easily be abused, especially for writing about politicians ("He has been condemned by <long list of organizations>. He voted for the Freedom and Liberty Act" – hmm, makes it sound like these organizations opposed that bill, doesn't it?).
I think that a more productive line of inquiry for right now might be "what should I do?" rather than "does it technically violate a rule?" To give you a slightly less fraught example, imagine a substub that says "Alice was an HIV activist known for promoting equitable access to healthcare. She died in 1996."
This might improperly imply that Alice died of AIDS. But what to do? Well, one thing is to find information to expand the article in ways that clear up the potential misconception: "Alice (1903–1996) was a nun who founded an HIV hospice in 1984 and became an HIV activist. She died of breast cancer at the age of 93." Another is to separate it, so that each of the two sentences is in a different paragraph or a different section. This little gap can sometimes help, at least a little bit. (I know that Miss Snodgrass told you that a paragraph can't have just one sentence, but that's not the English Wikipedia's rule.) A third option is to leave out the less important information (Alice's death isn't relevant to her notability). There might be more options. WhatamIdoing (talk) 00:56, 28 July 2022 (UTC)
Makes sense, thanks for the thoughts. [01:00, 28 July 2022 (UTC)] So WhatamIdoing, would you agree that the following statement is not problematic synth: "Joe Colonialperson was a moderate abolitionist on the issue of slavery according to his writing[good secondary source citation]. Joe Colonialperson owned 5 slaves[cited to a GitHub account from The Washington Post which has a table of slaveowners]." These events are connected but we aren't making any implication as to how the first impacts the second or vice versa, but they are both related to the issue of slavery. We're "compiling facts" "grouping under a related heading." Or am I synthing OR now? If so, what is the OR conclusion other than a general sense of the "yet"ness of the 2nd statement? Am I making sense or totally off the reservation here? Andrevan@ 02:28, 28 July 2022 (UTC)
You are creating the impression that Colonialperson was hypocritical (wanted to end slavery but owned slaves) even though that's not said. That's a problem. Masem (t) 02:54, 28 July 2022 (UTC)
So keeping in mind WhatamIdoing's earlier comment, "what to do" now? Do we have to delete the 2nd statement about Colonialperson? Andrevan@ 02:55, 28 July 2022 (UTC)
Understanding the nature of slavery and abolition at the time, either you find secondary sources that speak to the compounded statement, or in this case, the fact that he owned slaves seems extremely trivial when considering that being an abolitionist was likely more important. Masem (t) 03:09, 28 July 2022 (UTC)
So by your logic, even if there is a source that is reliable showing that Colonialperson owned 5 slaves, you believe it is inappropriate SYNTH to add this fact to the article at all, because it would suggest the conclusion that Colonialperson was a hypocrite, since he was also stated to be an abolitionist. I really can't agree with that, I think we have a responsibility to report this information if it's properly referenced and accurate information. If that is the consensus of editors obviously I will abide by it, but it seems like a wrong and bad policy to me. We shouldn't whitewash facts and history for fear of implying something negative about a person, especially one that died hundreds of years ago (hypothetically). Andrevan@ 03:14, 28 July 2022 (UTC)
No, we don't have that responsibility. We are here to summarize sources, not repeat every detail they give. I would have a really hard time believing that, if there are a fair number of sources reporting on both aspects here, none of them cover the intersection of those ideas (that is, to explain why he owned slaves if he was an abolitionist). But I can see the situation where there's plenty of coverage of the abolitionist aspect but where the slave owning was only in 1 or 2 sources. In such a case, we'd just not include that trivial information.
Just because the person is long dead, it still is a SYNTH issue which applies to any topic. We have to be fully aware of this throughout any writing, but moreso when it is a BLP. Masem (t) 03:26, 28 July 2022 (UTC)
Consensus may be that you are correct, and I am mistaken, but permit me to argue for the lesser interpretation of SYNTH. If consensus is that I am wrong, I will abide by that.
Given the scenario which we agree is possible, that a few sources cover that Colonialperson owned slaves, but the vast majority of more "classic" sources like mainstream textbooks generally just talk about Colonialperson as an "abolitionist." That means we should reduce the WEIGHT of the Colonialperson-slavery info. I don't agree with the argument to leave it out altogether or even that it is a form of original research to include it. Original research is about pushing novel ideas and new conclusions. There are and should be reasonable examples of "allowable synthesis," namely applying simple logic and organization, such as basic calculations, grouping, or substituting synonyms etc. I would argue this also extends to juxtaposing related facts. It makes sense that we should take care to avoid suggesting new ideas by implication, but you are actually arguing that any inclusion of this fact constitutes original research due to unavoidable implications of the coexistence of two facts.
I do think we have a responsibility to report verifiable, reliable information, about living or dead persons, and especially if they are public figures. I think we have a responsibility to NPOV not to try to preserve the good image of people just because something unpopular might be true about them. Similarly, we need to, for NPOV, appropriately characterize the opponents' views of a person (writing for the opponent) or qualify/attribute relevant minority views. I don't agree that the original research policy was designed to prevent editors from doing any kind of logical organization or providing additional verifiable information that would be educational and informative, and is encyclopedic.
The slavery example is somewhat contrived but also realistic. Summarizing sources does not mean excluding relevant, pertinent, sourced information because it only appears in a small fraction of the sources. That is an argument to reduce WEIGHT, not to eliminate the information. And the idea that it is original research because the reader might judge the subject of the article harshly for their actions, just by knowing 2 facts, goes beyond simply the idea that we are consciously creating an implication by placement. Andrevan@ 03:38, 28 July 2022 (UTC)
WP:NOT and WP:V - there is no assurance that every bit of information that has been reliably published needs to be captured in an article. And if the tradeoff of not including a minor point about owning slaves is to avoid implied SYNTH, we're going to prefer the latter. SYNTH starts, immediately, "Do not combine material from multiple sources to reach or imply a conclusion not explicitly stated by any source." and that's exactly what the situation is here. It's pretty clear we absolutely take steps to avoid potential implied conclusions. And if that means we need to drop less-covered information, so be it. We are a summary source and absolutely do not need to include every detail that is published.
Yes, organization of information is not necessarily original research, but the wrong approach to organization in a manner that is not similar to how the topic is already covered can create synthesis and original research, as well as create NPOV problems. (eg this is why we try to have editors avoid "controversy" sections on topics and instead work that commentary into the overall article, as such controversy sections often bring in lots of this synthesis-like OR by how statements are grouped.) Masem (t) 04:47, 28 July 2022 (UTC)
I agree that not all information should be included necessarily, but suppose there was a consensus of editors that the WEIGHT/NPOV was due, I am concerned about the SYNTH issue in that case. Andrevan@ 19:51, 30 July 2022 (UTC)
The slave owner example is interesting but we would have to be very careful with such information if we can't find a RS that explains the apparent discrepancy. Consider a few possible explanations: 1. They are cynically hypocritical (do as I say, not as I do). 2. They are actually torn by it but still own the slaves because they feel it is needed at this time (I suspect Jefferson was in this group). 3. They inherited a plantation that came with slaves. Once the property transfer was in order they freed the slaves but they were the legal "owner" for the time between inheritance and freeing them. 4. They were a slave owner, decided it was wrong and then became an abolitionist (John Newton). For the given facts these associated contexts are very different and we need to be very careful about presenting the given facts in a way that implies which, if any, of the contexts is true. If the only source for owning the slaves is a primary source then it probably isn't due. I do get that modern sensibilities consider slave ownership to be very notable. However, if WP:RS about the subject don't note the slave ownership then we shouldn't either. Else we are engaging in OR to establish relative WEIGHT. Springee (talk) 02:42, 29 July 2022 (UTC)
I agree that those 4 possibilities exist, but I don't agree that we need another source to tell us which it is. We can leave it unknown and allow the reader to make their own conclusions. We should avoid introducing SYNTH but just adding the information, provided it had appropriate sourcing and sufficient weight, is not default SYNTH IMHO. Andrevan@ 19:50, 30 July 2022 (UTC)
"You are creating the impression that Colonialperson was hypocritical": Not necessarily. Consider the similar (in logical structure, not in moral weight) "Joe Restaurant advocates for legally prohibiting smoking in restaurants, but allows smoking in his restaurant because it is currently legal and customers want it." This was a pretty typical stance in the US restaurant industry a few decades back: they didn't really want smoking in their restaurants, but they also didn't want a third of their customers leaving for places that permitted it, while only a tiny number of new customers would come to the lone smoke-free restaurant. It's not always hypocritical to advocate for a systemic change without making individual changes in advance. WhatamIdoing (talk) 01:43, 29 July 2022 (UTC)
In your example, this is presuming that "because it is currently legal and customers want it" is sourced. Without that, the phrase returns to synthesis. Masem (t) 01:47, 29 July 2022 (UTC)
I agree. Andrevan@ 19:49, 30 July 2022 (UTC)
Without a source, none of it could appear in an article. However, I give this as an example of the logical mistake. Compare:
  • "Joe Restaurant advocated for legally prohibiting smoking in restaurants, but allowed smoking in his restaurant"
  • "Joe Colonialperson advocated for legally prohibiting slavery, but owned 5 slaves".
Hypocrisy isn't the only reason why someone would advocate banning a system they're exploiting. WhatamIdoing (talk) 22:21, 30 July 2022 (UTC)

On combining sources

I think that when editors are deciding whether to combine sources to get to a conclusion, I would like them to consider four things.

  1. Are the sources really comparable? (Are they equally rigorous? Do they use the same methodology? Are they definitely talking about the same place, or really comparable places? Are they definitely talking about the same time period, or really comparable time periods?)
  2. Does combining the sources get to a conclusion that's helpful or interesting to the reader?
  3. What's our purpose in combining them? Are we trying to lead readers towards a particular conclusion?
  4. Why can't we find a reliable source that's combined them already?

There are certainly cases where all four questions have good answers.—S Marshall T/C 19:22, 30 July 2022 (UTC)

These seem like valid questions, but the case that I think maybe we lack clarity on or we aren't properly communicating on (or maybe I am just wrong on this, that is indeed possible), is when combining sources simply for organizational purposes, making reasonable assumptions about the meaning of their content, and to show a change over time or otherwise construct good writing and communication, based on table stakes assumptions about the context of the information.
For example, to question 3, the answer is emphatically no in all the cases I have offered, and I think explicitly using them to get to a conclusion does make it synth; where there seems to be a gray area is in what can be considered inappropriately implied. To question 2, I don't think presenting related facts that show different or conflicting views at different times implies that the person is a hypocrite and is therefore SYNTH. The person's views may have evolved, and we don't necessarily know why. Or to WhatamIdoing's point, sometimes there is a strategic or pragmatic reason to say A and do B. SYNTH is about original research and advancing new ideas, not the basic facts and assumptions needed to organize an article and write effectively about history or science or whatever.
For example, totally made-up and historically nonexistent: Congressperson Abraham Adams supported the General Sedition Act in 1863.[1] However by 1867 Adams told the New York Gazette that legislation pertaining to sedition was a violation of the right to free speech.[2] Adams voted against the 1877 Alien and Sedition Acts.[3] To your question #4, sometimes we just don't have that source which makes the connection. That doesn't mean they aren't still connected, but I contend that in order to fit these facts together, it's reasonable to know that if source 1 says Adams supported the act, source 2 shows that he opposed the act and is later in time, and source 3 says he voted against the act which is even later still, that this is a normal situation in politics. We aren't unduly implying that Adams' actions were hypocritical or creating original research. It is known in politics that if you take a position on an issue, and vote for or against that issue, those facts are related by virtue of the basic way the system and actors work. Andrevan@ 19:35, 30 July 2022 (UTC)
I'm afraid my ignorance of US politics is profound, so please forgive my failure to follow your point. How does this example combine sources to reach a conclusion?—S Marshall T/C 20:10, 30 July 2022 (UTC)
I agree, it does not "reach a conclusion," but editors above and in other discussions have argued that such description would be SYNTH/OR. Andrevan@ 20:12, 30 July 2022 (UTC)
Your example here is not like the previous examples, because of the source where Adams explained his change of mind, which thus is not an issue with synthesis. It would be synth if that statement was not available, and resulted in an implicit suggestion that Adams was hypocritical or the like, which we cannot do. The previous examples were similar - two facts were presented that presented contrary statements but without any input of why that contradiction existed, so it made the person written about appear hypocritical. Masem (t) 20:41, 30 July 2022 (UTC)
I am glad we agree this series of statements is OK, but Masem, source number 3 doesn't say why Adams voted against the bill, so by your earlier logic, if we create this sequence, we are implying that his 1867 statements are the reason for his 1877 vote, even if this source didn't tie them together explicitly. I contend this is "fair synth" but you earlier led me to believe you did not. Perhaps I am mistaken. Or to use more of a real example: "Marjorie Taylor Greene was critical of NATO.[1] Marjorie Taylor Greene stated that the US should leave NATO.[2] Marjorie Taylor Greene voted that Finland and Sweden should not join NATO.[3]" Andrevan@ 20:44, 30 July 2022 (UTC)
I am saying that the key line in your example is the 1867 statements that explain the change of mind, which gives reason of why it is fine to mention, in sequence his earlier "for" vote and then his later "against" vote. Without his statement, putting the two vote stances next to each other creates synth in the implication of hypocritical nature. In the case of the MTG example, if you did not include the first statement that gave her stance on NATO, then putting the other two sets of sources together implies her being critical, which again we cannot do. But the first statement gives the necessary OR from RSes that then is fine to link them. Masem (t) 21:05, 30 July 2022 (UTC)
Masem, I think you're assuming far too much. Most people would not expect us to protect politicians against all hints of hypocrisy – the profession is rather known for that quality, or at least for prioritizing party loyalty above principles – but we shouldn't be fixated on such concerns. In this case, the politician might not have changed his mind at all. The difference could have been the bills in question (some forms of seditious speech are not legally protected free speech in this country; perhaps one bill infringed on free speech and the other didn't), or the difference could have been party politics (perhaps the real difference in the two bills had nothing to do with sedition or free speech, but an unrelated "Christmas tree" clause that the opposition party hung on it), or the difference could have been circumstantial (politicians are more likely to vote against free speech during a war), or the difference could have been the attitude of the voters, or any number of things.
The problem here is not "He voted for Bill #1, and he voted against Bill #2". If there is a NOR problem here, it is in claiming that the two votes were inconsistent (the "However" language) and claiming his second vote was motivated by free speech concerns. WhatamIdoing (talk) 22:35, 30 July 2022 (UTC)
A lot depends on context too. Let's say this lawmaker was a Democrat, and the way the votes fell were against normal Democratic patterns (e.g., say #1 was for more military spending and #2 against expanded immigration). Now, those statements may be factual, but just their presence without any further explanation or context (given that we don't normally report how a lawmaker votes on every bill) would appear to be critical of these voting patterns. That's why we prefer editors to wait and work from secondary sources (not primary news stories) that provide analysis and context to better describe, in this case, the political positions of a lawmaker, so that we absolutely avoid the potential OR by trying to analyze ourselves. Masem (t) 23:10, 30 July 2022 (UTC)
If their vote on the bill was suitably referenced in a number of reliable sources with non-trivial mentions, I believe that vote could be appropriate, if it relates to their positions on the topic, i.e. the context. If their vote was somehow unique or different, if anything, that is equal or more reason to mention it. Breaking with one's party is often notable. Assuming already verifiable, reliable, and due weight for notability reasons, this idea that we're doing original research by listing the votes that people have taken, simply because someone might interpret that negatively or critically, is quite objectionable to me. Andrevan@ 23:19, 30 July 2022 (UTC)
I would expect that if many sources documented these votes that there would be context in those sources to explain why they are important, if they are specifically calling that lawmaker by name and not just mentioning the voting call. The secondary-like content of those news stories is necessary to include, not just the primary-factual details that, without the secondary information, are just data points that we as editors cannot connect. We have gotten really really sloppy on this type of writing overall on WP, far too much focus on prose-line style writing rather than looking for summaries and analysis with editors thinking they know what's best to include. That's why it is very important to know that improper synth can come from combining sources, particularly those that are only carrying primary information and not secondary analysis. Masem (t) 23:29, 30 July 2022 (UTC)
If you have a number of reliable sources with non-trivial mentions, you should be able to provide a sourced explanation of why these votes matter and what they represent. WhatamIdoing (talk) 04:47, 31 July 2022 (UTC)

Combining sources is necessary -- we rightly expect and require that articles will have multiple sources. Selecting which information from each source to include is necessary -- we are meant to summarize what the reliable sources say, not obsessively regurgitate every detail. So the wording of SYNTH needs to be quite nuanced. There's a lot, in the previous discussion, about how you can violate SYNTH through the sequence of ideas -- the order in which you say things -- and I think that's true but when we're thinking about how to write policy, we need to let editors write articles. Most articles should present related facts in chronological order. It should not be a SYNTH violation to do so. In fact, I think SYNTH should be written conservatively with a lot of nuance.—S Marshall T/C 08:55, 31 July 2022 (UTC)

I fully agree with S Marshall on this 100%. Huggums537 (talk) 09:34, 31 July 2022 (UTC)
While it is important that articles should provide a good chronologic order to events relative to a topic, our goal should still be focused on a long-term, looking-back view of the topic, and that may mean that a strict chronologic order may not be ideal. For example, to use lawmakers again, a strict chronological order of how they voted is far less helpful and more prone to synthesis than working from the lawmaker's key stances on issues and what they did in their government role to support or oppose that. Or another way to look at this is that we should be working to emulate how secondary sources structure content, rather than trying to piece together too many primary sources, which is where SYNTH can easily come into play. Masem (t) 12:58, 31 July 2022 (UTC)
It is the case, that much synth/notsynth is a matter of the way the content is written (often with use of implication), and that it is sometimes a cross-over with some ways of undue emphasis. -- Alanscottwalker (talk) 13:52, 31 July 2022 (UTC)
Agreed with this Andrevan@ 17:29, 31 July 2022 (UTC)

Lexical cohesion in sources

I couldn't find any guidance on lexical cohesion in the policy and started a thread at WP:NORN#Treating lexical cohesion in sources.

Looking at the archives, might've been better suited for this talk page. Would appreciate a wider input at the WP:NORN discussion. PaulT2022 (talk) 23:35, 31 July 2022 (UTC)

If so then NOR won't help much. NOR doesn't say you have to use the words the sources use. It says your articles have to mean what the sources mean.
In this case the sources mean "war crimes" and I think you can and should say that.—S Marshall T/C 07:08, 1 August 2022 (UTC)
I don’t disagree… however, a caution is required: Different people can read the same source, and interpret the words it uses as “meaning” very different things. Blueboar (talk) 13:23, 1 August 2022 (UTC)
Exactly. I don't advocate writing those words in the policy, although there's what might turn into an essay about it in my userspace.—S Marshall T/C 17:36, 1 August 2022 (UTC)
I do think an essay summarizing these discussions might be worthwhile, maybe as part of WP:NOTOR. Andrevan@ 20:31, 1 August 2022 (UTC)
An expansion of Wikipedia:These are not original research#Paraphrasing?
Etymological fallacy and Semantic change are also things that need to be addressed. The advice that seems most salient to me is that when Major Authority™ says that something's name (or spelling) has changed, then you should use the new name even if you are citing a source that uses the old name. For example, the older names for what's now called Intellectual disability should be replaced by the current name (all the usual exceptions apply: in direct quotations, redirects, etc.). Similarly, if a group says that an old term is inappropriate for some reason (a reason that is compatible with encyclopedic purposes), then editors should use the more appropriate/clearer/less offensive term. Here I am thinking of words like Miscarriage, which most major medical organizations and patient groups declared to be preferable to the older term spontaneous abortion decades ago. The older term sounds like women are just popping out for an abortion on a whim. I am not thinking of euphemisms ("lost his battle with depression") or obviously problematic terms ("Don't call me a thief; call me a professional property reapportioner"). WhatamIdoing (talk) 23:26, 25 August 2022 (UTC)
Also: When a variety of sources use a variety of words, then that's an excellent opportunity to employ elegant variation. I understand that in some languages, students are taught to find the best word and use it repeatedly, but that's not considered good writing style in English. If half your sources say "Auto racing" and the other half say "Car racing", then use both. WhatamIdoing (talk) 23:31, 25 August 2022 (UTC)
I agree about the elegant variation piece. However in other cases there is the issue of WP:COMMONNAME. Andre🚐 23:39, 25 August 2022 (UTC)
There have been several discussions about this recently. Last weekend, I was thinking about some potential advice and realized that it would be possible, by dint of someone quietly cherry-picking sources and then insisting on using whatever name is most commonly used in the currently cited sources for an article, to end up with an article that isn't allowed to mention its title after the first sentence. WhatamIdoing (talk) 16:17, 26 August 2022 (UTC)

Community research should be encouraged! Community research would unleash the potential of humanity!

Original research can go in a special box and still be based on every other Wiki policy and principle, for example: consensus and neutral point of view.

I think this policy page flies only because in the early days of Wikipedia Jimbo Wales sent a mailing list post and everybody else now hops on. Altanner1991 (talk) 04:50, 31 August 2022 (UTC); edited 05:09, 31 August 2022 (UTC)

Disagreed. No original research is a very important policy. It's about the idea that Wikipedia isn't a place for publishing original thought. It's a place for compiling secondary source references to support distillation of verifiable information into general purpose reference. Andre🚐 04:54, 31 August 2022 (UTC)
This has 0% chance of happening, but anyway: instead of a separate box, how about a separate website? We can call it Wordpress, Blogspot, YouTube, or Twitter. Crossroads -talk- 04:51, 1 September 2022 (UTC)
Well, I thought it was the greatest suggestion. Wikimedia is far superior. Blogs and social networking? No, their lack of collaborative potential renders them irrelevant. Altanner1991 (talk) 05:44, 1 September 2022 (UTC)
There are all kinds of things that Wikipedia is WP:NOT. Some of them are even good in the right circumstance -- democracy, databases, speculation, websites. But original research is outside the scope of an encyclopedia. Shooterwalker (talk) 05:03, 1 September 2022 (UTC)
I suppose a sister project would have been a more conservative suggestion. Altanner1991 (talk) 05:45, 1 September 2022 (UTC)

Yes, I might agree that the encyclopedia is best (at least, at this point in time) kept "neat", meaning no major deviations from the traditional "encyclopedia concept". But the idea of "community-led research" in a sister project is something I would find exciting. Altanner1991 (talk) 05:51, 1 September 2022 (UTC)

i'm n00b. what of the case of simple calculations that any wikipedia reader can verify for themselves?

asking for this: https://en.wikipedia.org/wiki/Talk:Fischer_random_chess#How_do_I_go_about_adding_statistics?

I propose to add statistics that I'll calculate myself (eg how often white wins vs black wins vs draw) and then people can verify for themselves but it'll take about 15 minutes to verify. is this original research? Thewriter006 (talk) 13:39, 28 September 2022 (UTC)

Yes it is, don't do it. If your results are valid, perhaps you can just search existing publications for the figures you came up with, and then cite those publications. Do not add unsourced material to the article based on something you came up with yourself, even if you are the author of the definitive work on the applications of statistical analysis to chess. Mathglot (talk) 23:14, 28 September 2022 (UTC)
how many minutes is the cut-off here: is 10 seconds acceptable? Eg 'White has about, on average, a 7% increased advantage in these 90 positions (Evaluation is 0.1913) compared to the remaining 870 positions (Evaluation is 0.1790).' There's no source for 7%, but there is for 0.1913 and 0.1790. And then you can calculate for yourself 7% in 10 seconds. So 10 seconds is ok but 15 minutes is not. Hmmmm...what's the cut off? Or is the 7% even O.R. too?
P.S. This is chess960 not chess. ;) Thewriter006 (talk) 08:09, 29 September 2022 (UTC)
The whole % calculation assumes these are linear ratio scales, which is nontrivial. So while the calculation may (to some extent) be easy, the interpretation may be nonsensical (which is why it should not be added). E.g. it also makes no sense to claim that going from 32 Fahrenheit to 48 Fahrenheit is a 50% temperature increase, as becomes blatantly evident when we use the Celsius equivalent (going from 0 to 8.9 Celsius), which would amount to an infinite % of temperature increase. Arnoutf (talk) 15:08, 29 September 2022 (UTC)
Agree with Arnoutf. Yes, there is an evaluation of 0.1913. Yes, there is another evaluation of 0.1790. Yes, 0.1913 is about 7% greater than 0.1790. But if you're not an expert in writing about chess960, you won't know what an evaluation is, and what are valid ways to compare one evaluation to another. Which is why a reliable source should be making the comparison, and we should cite the reliable source. Jc3s5h (talk) 15:36, 29 September 2022 (UTC)
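As an illustration of the two points above, here is a minimal Python sketch. The helper names are made up for this example, and the Kelvin conversion is an addition that nobody in the thread used. It only shows that the arithmetic is easy while the result depends on where the scale puts its zero. It says nothing about whether chess960 evaluations form a ratio scale.

```python
def percent_change(old, new):
    """Relative change from old to new, as a percentage of old."""
    if old == 0:
        # A zero baseline is the "infinite %" case from the Celsius example.
        raise ZeroDivisionError("percent change is undefined when the baseline is zero")
    return (new - old) / old * 100.0


def fahrenheit_to_celsius(f):
    return (f - 32.0) * 5.0 / 9.0


def fahrenheit_to_kelvin(f):
    # Kelvin is an assumption added here; it is the only one of the three
    # scales whose zero is a true physical zero.
    return fahrenheit_to_celsius(f) + 273.15


# The two evaluations quoted in the thread.
eval_change = percent_change(0.1790, 0.1913)

# The same physical warming, expressed on two scales.
f_change = percent_change(32.0, 48.0)
k_change = percent_change(fahrenheit_to_kelvin(32.0), fahrenheit_to_kelvin(48.0))

print(f"evaluations: {eval_change:.2f}%")
print(f"Fahrenheit:  {f_change:.2f}%")
print(f"Kelvin:      {k_change:.2f}%")
```

The same 16-degree Fahrenheit step gives a different percentage on each scale, and on the Celsius scale `percent_change` refuses to answer at all. That is the sense in which the interpretation, not the division, is the hard part.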
  • When we say that simple calculations are not OR, we are talking about very basic arithmetic - adding two numbers together, converting feet into meters… things that the average 10 year old would understand. Statistical calculations are not that basic. When in doubt, cite a source. Blueboar (talk) 16:32, 29 September 2022 (UTC)
    Agree with Blueboar. Couldn't have said it better myself. Altanner1991 (talk) 03:24, 30 September 2022 (UTC)
    I think usually it should be allowable to use the kind of calculation the source intends be used. For example, if citing a table that was written with the expectation that the reader would interpolate between values, and the value being looked up is in between two tabular values, it would be appropriate for the Wikipedia editor to interpolate. Jc3s5h (talk) 17:33, 30 September 2022 (UTC)
  • I collegially differ from my colleagues' remarks above about WP:CALC, and I believe it should go much farther than they suggest. Mathematics is a lot like a foreign language. It is not needful that anyone should be able to understand your calculations. All that matters is that someone should be able to understand it. We have some really top notch mathematicians on Wikipedia who can verify the more rarefied calculations for you. And indeed, in practice, articles within the scope of WikiProject Mathematics do rightly allow some pretty advanced maths, because it's impossible to explain mathematics successfully without examples and we can't rip off examples from textbooks because of copyright.
For example, our article on Tensor product of modules has a footnote that reads:

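First, if [formula missing] then the claimed identification is given by [formula missing] with [formula missing]. In general, [formula missing] has the structure of a right R-module by [formula missing]. Thus, for any [formula missing]-bilinear map f, f′ is R-linear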

I wouldn't expect a humanities graduate to follow that. But I put it to you that it is a good and valid way of verifying the claim it makes, and my position is that it does and should fall within the scope of WP:CALC.—S Marshall T/C 18:28, 30 September 2022 (UTC)

Avoiding original research in determining superlatives

I have been having a discussion with others on Talk:Longest flights about how to verify "lists of superlatives." The case in question is a list of the longest flight currently operated by each type of aircraft. Since all commercial flight data is available on various commercial websites, in principle this information is just a case of sorting, which I suppose is a routine calculation as allowed by this policy. However, the number of flights is large enough that in practice this is done by a user running a script every week to scrape all of the flight data off of a flight data website and then sorting to find the longest flights. Is there an accepted way to cite these claims so that they are verifiable and do not fall afoul of the no-original-research policy? CapitalSasha ~ talk 13:36, 3 October 2022 (UTC)

That's synth because it requires making assumptions (like, did the carrier at one point offer a longer flight that is no longer offered?). Superlatives in WP's voice should always be taken as OR. Masem (t) 14:04, 3 October 2022 (UTC)
That was my initial thought, but it was pointed out that the same criticisms apply to lists like List of tallest buildings that seem to be well-accepted. (No actual source is provided saying that The Marina Torch in Dubai is the 77th-tallest building in the world.) CapitalSasha ~ talk 16:31, 3 October 2022 (UTC)
Except there, there are clear standards for how building height is measured and separate listing of these buildings relative to each other. Adding a new building to a well defined list like that as long as the standards for measurement have been set is not the same issue. Masem (t) 16:58, 3 October 2022 (UTC)
Standards for flight length have also been determined though. They are measured by great circle distance. FlyingScotsman72 (talk) 01:19, 5 October 2022 (UTC)
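For readers unfamiliar with the measure, a rough Python sketch of the great-circle calculation and the "longest route per aircraft type" sort described above might look like the following. It is not the script used at Talk:Longest flights. The haversine formula, the mean Earth radius constant, the function names and the sample rows are all assumptions made for illustration.

```python
import math

# Mean Earth radius in km; an assumed constant, not taken from the discussion.
EARTH_RADIUS_KM = 6371.0088


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def longest_per_type(flights):
    """Return {aircraft_type: (route, distance_km)} for each type's longest route."""
    best = {}
    for aircraft, route, lat1, lon1, lat2, lon2 in flights:
        d = great_circle_km(lat1, lon1, lat2, lon2)
        if aircraft not in best or d > best[aircraft][1]:
            best[aircraft] = (route, d)
    return best


# Hypothetical rows: (aircraft type, route label, origin lat/lon, destination lat/lon).
SAMPLE = [
    ("Type A", "P-Q", 0.0, 0.0, 0.0, 90.0),
    ("Type A", "P-R", 0.0, 0.0, 0.0, 45.0),
    ("Type B", "S-T", 10.0, 20.0, 10.0, 20.0),
]

for aircraft, (route, km) in longest_per_type(SAMPLE).items():
    print(f"{aircraft}: {route} ({km:.0f} km)")
```

The sketch only covers the sorting step. It does not address the verifiability question raised above, since the result still depends on which flight data set was scraped and when.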

Credit tally

High-quality source X says Actor Y made films (plural, number unspecified) for Studio Z. Within the context of a given Wikipedia article, the specific number is pertinent. Is tallying up the relevant credits in IMDb (or an authoritative print filmography) to specify the number a routine calculation or original research? 24.90.253.80 (talk) 00:40, 25 October 2022 (UTC)

Can we sidestep the question and instead provide a list of the films? That is, avoid saying Joe Film made three films for Studio Z and instead write something like Joe Film made several films for Studio Z: Amazing Alice, Bob's Business, and Carl v. Carol. This could be awkward if the list is very long, but there is a lower risk of an OR challenge from it.
If you need a number, then it's usually okay to find a filmography that lists the films and count them up. Searching in different places to find all the films you can carries a bigger risk (both in terms of policy compliance and in terms of getting the wrong answer, which would be a very big problem). WhatamIdoing (talk) 06:16, 31 October 2022 (UTC)

Talk:Trump

There is a discussion at Talk:Donald Trump#Airliner shot down that may benefit from editors familiar with this policy of No original research. Bob K31416 (talk) 10:05, 6 November 2022 (UTC)

Let's finally split PSTS to its own page

This has come up time and again, and I think we should just do it. I realize that this will entail an RFC and probably some hand-wringing over whether the same words on a separate page still say the same thing. I get it; change is hard, and we want to get this right. But on the other side:

  • the only reason this was ever in this page is because we wanted to tell people that Wikipedia is not a primary source, so we'd appreciate if editors didn't just make stuff up themselves and stick it in articles (i.e., "original research", as in "Wikipedia is not a publisher of original thought", with the numbered list beginning with WP:NOT "Primary (original) research"),
  • whether a source is primary, secondary, or tertiary doesn't have much to do with whether a claim is verifiable and therefore not original research (nothing that's actually verifiable is a violation of OR),
  • the concept of PSTS is important to multiple policies and guidelines, not just this one. Actually, not even mostly this one. The words primary, secondary, and tertiary do not appear anywhere in this whole policy except in the one ===subsection===.

Looking it over, there will have to be a few changes, but they all seem surmountable. I might try to mock this up in the sandbox later, but so far, it looks like we'll need a new nutshell for the split-off policy (the existing one doesn't mention PSTS at all), and we'll need to decide whether PSTS should be called a "core" content policy in Template:Content policy list, or if it should be listed in "Other", next to BLP and NOT. There's also the more mechanical matter of repointing various shortcuts, but that's easy.

It looks to me like the lead of the current page won't need a single word changed, and it's possible that nothing else will either, except to copy the existing PSTS subsection to another page. The new page would probably benefit from a couple of introductory sentences.

@GregKaye, this is partly inspired by your comments above, so I'd like to know whether you see any problems with this. What do you (all) think?

WhatamIdoing (talk) 21:35, 14 October 2022 (UTC)

No.
WP:PSTS is the fundamental core of WP:NOR. Satisfying NOR requires a balance of primary and secondary sources, as nicely laid out at PSTS.
Verifiability is another policy. If you think NOR and V should be merged, let’s return to WP:A (a very good idea but failed catastrophically due to poor change management).
“Primary” means “original”. It appears in the title. “Secondary source” is mostly every source that is reputable and not primary.
PSTS is core policy. It is the meat of NOR. It immediately goes to source typing, which is essential in writing an encyclopedia as opposed to writing a random collection of facts. Wikipedia is the first. Google serves for the second. SmokeyJoe (talk) 23:25, 14 October 2022 (UTC)
The top of NOR says "The phrase "original research" (OR) is used on Wikipedia to refer to material—such as facts, allegations, and ideas—for which no reliable, published sources exist."
Therefore, I conclude:
  • reliable, published source exists – not OR
  • reliable, published source does not exist – OR
Note the complete absence of any words like primary or secondary in that definition. That's because they're not technically relevant to the question of whether a given claim is OR.
I disagree that any policy requires a "balance" of primary and secondary sources. This could only be true if you think that "balance" could involve zero primary sources, which is definitely the desirable "balance" for articles like Cancer. There is no reason for that article to cite any primary sources at all.
But even if you had the wrong balance of source types, the result wouldn't be "material—such as facts, allegations, and ideas—for which no reliable, published sources exist." It would just be another article with verifiable, non-OR contents that needed more work. WhatamIdoing (talk) 23:44, 14 October 2022 (UTC)
The top part, the current lede, is bloated and not very good. The early versions of the page began better. The part you quote is particularly poor. Your conclusions suggest that you are going straight to WP:V.
You disagree that WP:PSTS requires a balance of primary and secondary sources? I’m astounded. It squarely does, and it is the most important part of this core policy, to require a balance of primary sources (sources of facts) and secondary sources (evidence of interest, and contextualisation of those facts). This policy establishes the need for the balance so clearly that it need not be repeated elsewhere.
Failure against PSTS usually means the article needs more work. Where the balance utterly fails, all facts no secondary sources, it is the extreme case covered by WP:N, which is an explicit WP:DEL#REASON and is regularly enforced. SmokeyJoe (talk) 23:54, 14 October 2022 (UTC)
Which sections are "most important" isn't really the point. The point is that the fundamental definition of "OR" has nothing to do with historiography. PSTS could continue being the most important policy even if PSTS's words weren't located on the same page as WP:CALC's words.
The early versions of the page don't mention PSTS at all. See, e.g., the first day:
---
Wikipedia is not
the place for original research such as "new" scientific theories.
From a mailing list post by Jimbo Wales:
If your viewpoint is in the majority, then it should be easy to substantiate it with reference to commonly accepted reference texts.
If your viewpoint is held by a significant scientific minority, then it should be easy to name prominent adherents, and the article should certainly address the controversy without taking sides.
If your viewpoint is held by an extremely small minority, then whether it's true or not, whether you can prove it or not, it doesn't belong in Wikipedia, except perhaps in some ancilliary article. Wikipedia is not the place for original research.
---
Two months later, it got its first mention of primary sources, and that was to say that "Wikipedia is not the place for original research such as "new" theories. Wikipedia is not a primary source."
If you start an article with only plain, simple, obvious facts and no contextualization, you are not engaging in original research. You're just not writing a very good article. The point made in the early versions was that Wikipedia is not a publisher of original research. PSTS is not really about that rule at all. WhatamIdoing (talk) 00:44, 15 October 2022 (UTC)
If you start with just plain simple facts, you’re taking a wild chance that what you are writing about is Wikipedia-notable. It is extremely poor advice to tell a newcomer that they can do this, even if they can. SmokeyJoe (talk) 01:59, 15 October 2022 (UTC)
PSTS isn't about notability, and it isn't written for newcomers. WhatamIdoing (talk) 03:03, 15 October 2022 (UTC)
PSTS is the foundation of WP:N, the requirement that each article has two secondary sources. WP:N covers the extreme end of applicability of PSTS, where secondary sources don’t exist, and to attempt to write on the topic can only violate WP:NOR. WP:N doesn’t limit content, it is only for deletion/merge decisions. If you want to limit coverage of a subtopic within an article due to lack of subtopic notability, the policy basis for doing this is WP:NOR, specifically WP:PSTS.
All core policy should be considered to be written for newcomers. There is a history of leading Wikipedians using policy editing to engage in high-language debates with each other, but these pages should instead be regarded as basic policy that should be amongst the first pages that newcomers are pointed to, as is the case. SmokeyJoe (talk) 03:21, 15 October 2022 (UTC)
I'm struggling to understand how WP:N's reference to secondary source could possibly be harmed if the exact same words are on a page with a {{policy}} tag at the top that's called Wikipedia:Primary, secondary, and tertiary sources instead of being located on a page with the same tag at the top that's called Wikipedia:No original research.
When editors want to limit coverage of a subtopic within an article due to lack of subtopic pertinence, the most commonly invoked policy basis for doing this is WP:DUE, but even if you like to invoke PSTS for this, I again fail to see how that goal could possibly be harmed by putting the exact same words on a separate page. WhatamIdoing (talk) 05:27, 15 October 2022 (UTC)
Harm?? WP:N has a foundation in PSTS was the point.
DUE is good for most cases. PSTS may be better sometimes, like when someone wants to add data that no source ever commented on. SmokeyJoe (talk) 08:37, 15 October 2022 (UTC)
Let's stipulate that WP:N has a foundation in PSTS.
What would happen to WP:N if we decided to WP:MOVE this page to a different title? Nothing, right? Not a single word of PSTS would change, and WP:N would not be affected at all.
What would happen to WP:N if we decided to put WP:SYNTH on a separate page? Nothing, right? Not a single word of PSTS would change, and WP:N would not be affected at all.
I suggest to you that cutting and pasting the text of PSTS to a separate page, also marked as policy, still linked straight there by WP:N, without changing a single word of PSTS, would equally have no effect on WP:N.
I am literally asking you to tell me what could possibly change if WP:N links to these exact words: "Wikipedia articles should be based on reliable, published secondary sources and, to a lesser extent, on tertiary sources and primary sources...." either
  • in a subsection on a policy page, versus
  • at the top of a policy page.
What difference does the location of the words make to WP:N? WhatamIdoing (talk) 15:42, 20 October 2022 (UTC)
The word “viewpoint” necessarily implies a secondary source in the historiographical meaning. Jimbo’s post says that others’ secondary sources are required. SmokeyJoe (talk) 02:09, 15 October 2022 (UTC)
In practice, we don't use the historiographical meaning, and I don't think his famous comment about "viewpoints" has any connection. All he says about OR in that message is "Wikipedia is not the place for original research", and he says this at the end of this paragraph: "If your viewpoint is held by an extremely small minority, then _whether it's true or not, whether you can prove it or not_, it doesn't belong in Wikipedia, except perhaps in some ancilliary article", in a thread about (literally) whether a Wikipedia editor had proven Albert Einstein wrong about special relativity. Think about that. That is the origin of our rule against original research. It has nothing to do with the value of secondary sources. It has everything to do with crackpots making stuff up and trying to get it published as The Truth™ in Wikipedia. WhatamIdoing (talk) 05:21, 15 October 2022 (UTC)
I don’t know who is your “we”. Wikipedia should use the historiographical definitions because an encyclopedia is an historiographical work, as opposed to a science report, or journalism (the main competition).
Jimbo was responding to the late 1990s thing of many amateur physicists determined to publish their theories, anywhere. I think it diminished due to the arrival of good search engines, when they could search for their discoveries and discover that they weren’t new at all. SmokeyJoe (talk) 08:42, 15 October 2022 (UTC)
The English Wikipedia uses articles from celebrity magazines and breaking news as the sole basis for articles. Either:
  • We don't use the historiographical meaning of secondary, or
  • We don't technically require true secondary sources.
Take your pick, but don't waste your time trying to convince me that articles sourced entirely to WP:PRIMARYNEWS contain any source that a historian would, if looking back from even 20 years in the future, call a true secondary source.
If you feel like Wikipedia therefore isn't really an encyclopedia, then I won't contest your conclusion. Some may decry this and some may acclaim it, but regardless of individual opinions about whether it's desirable, it's a fact that we regularly accept articles that don't have any true secondary sources. WhatamIdoing (talk) 15:50, 20 October 2022 (UTC)
Cancer contains many primary sources. Note that source typing, primary vs secondary, is not inherent but depends on how the source is being used. SmokeyJoe (talk) 23:57, 14 October 2022 (UTC)
I didn't say that cancer doesn't cite primary sources; I said that it shouldn't. WhatamIdoing (talk) 00:35, 15 October 2022 (UTC)
It should. An article should stand alone. It needs to define things, and give examples. These go to primary sources. All pure secondary sources, all opinion not facts, like running editorials containing running commentary assuming you already know the topic, do not make acceptable articles. Articles need both facts and contextualisation. SmokeyJoe (talk) 01:58, 15 October 2022 (UTC)
A meta-analysis is a secondary source. It is, at its heart, a mathematical calculation. Do you think that meta-analyses are "opinion not facts"? Or is it your opinion that it's not a secondary source, even though multiple reliable sources say that it is?
It is common in scientific articles to source facts to secondary sources. High-school chemistry textbooks are not primary sources for facts about chemistry. WhatamIdoing (talk) 05:31, 15 October 2022 (UTC)
If it’s a standard analysis, used in its standard way, then it is neither opinion, nor a secondary source, it is just standard data processing. If the analysis is new, or its use is not standard, then the applicability and interpretations are opinion. To better do this test, can I have some real examples?
It is common in scientific articles to find all sorts of atrocious referencing and other nonsense. Wikipedia should do better than some common things. Wikipedia should never reference high-school textbooks. Among other things, it is not the purpose of a textbook to be a reference work, but a teaching tool. SmokeyJoe (talk) 08:49, 15 October 2022 (UTC)
doi:10.1111/j.1471-0528.1990.tb01711.x is one of the most famous meta-analyses. The creative analysis comes in deciding which things to analyze, not in how one does the math. WhatamIdoing (talk) 15:57, 20 October 2022 (UTC)
I can’t agree that at its heart, a meta-analysis is a mathematical calculation.
“Opinion” is a simple typical example of a description of secondary source content. More generally it is anything that is transformative of the primary source information. SmokeyJoe (talk) 13:29, 15 October 2022 (UTC)
"I like dark chocolate" is an opinion, and it is not secondary material. WhatamIdoing (talk) 15:58, 20 October 2022 (UTC)
I think the question of whether WP:PSTS is both correct and good advice to newcomers needs to be resolved first. You appear to have a beef with PSTS. SmokeyJoe (talk) 02:02, 15 October 2022 (UTC)
My only concerns with PSTS are that:
  • It doesn't have much to do with editors making stuff up ("original research") and trying to cram it into Wikipedia ("Wikipedia is not a publisher of original thought"), so it doesn't belong on this page.
  • There are too few editors who understand that Wikipedia:Independent does not mean secondary.
Note that I haven't proposed changing a single word of PSTS. I just want it "physically" located on a separate page – a policy in its own right, not a subsection of a policy that doesn't mention PSTS at all outside of that one subsection. WhatamIdoing (talk) 05:35, 15 October 2022 (UTC)
“Making stuff up” is not the focus of intent of NOR to counter, but the creative combination of facts by editors.
I think your essays on source typing are excellent. I don’t know how splitting out PSTS would help there.
I think PSTS shouldn’t be split out because PSTS is the core of NOR. NOR needs to include source typing and the need to balance primary and secondary sources, as defined historiographically, as per the articles primary source and secondary source. SmokeyJoe (talk) 08:56, 15 October 2022 (UTC)
"Making stuff up" really is the focus and intent of NOR. SYNTH is all about editors making stuff up by saying "when I put this source next to that one, I get this new conclusion". The definition of NOR is "material—such as facts, allegations, and ideas—for which no reliable, published sources exist" – in other words, "stuff made up by editors" (or, in these latter days, stuff copied by editors from obviously unreliable sources). WhatamIdoing (talk) 16:00, 20 October 2022 (UTC)
No, because at best this would be a bunch of laborious re-arranging for not much or any benefit, but also because I can see this is premised on the same erroneous 'it is not OR if any source anywhere on Earth says the same thing' POV discussed to death here and here in these very archives.
PSTS is on this page because synthesizing primary sources into a narrative is a form of original research and hence forbidden. It matters not one iota that the individual statements are supported by the primary sources. Wikipedians are to cite secondary sources, not write them (something very similar is said at WP:MEDRS, but the general principle of relying on secondary sources applies everywhere).
Do not combine material from multiple sources to reach or imply a conclusion not explicitly stated by any source. Similarly, do not combine different parts of one source to reach or imply a conclusion not explicitly stated by the source. If one reliable source says A and another reliable source says B, do not join A and B together to imply a conclusion C not mentioned by either of the sources. (emphasis added) Crossroads -talk- 02:40, 15 October 2022 (UTC)
This statement: "PSTS is on this page because synthesizing primary sources into a narrative" does not happen to be factually true. This would be clear if you had been editing back in the day, or even if you just spent all day reading the archives and stepping through the history of the policy.
PSTS is on this page because editors were fond of saying that "Wikipedia is not a primary source", and then they had to explain what that meant, and since none of the pages except NOT, which was already a mile long, said anything about this, the longer explanation ended up here.
SYNTH is wrong whether you do it with primary sources or secondary sources or tertiary sources or a combination of any of them. There is absolutely nothing about PSTS concepts that is relevant for understanding SYNTH. SYNTH could get along just fine if PSTS had never existed (NB: the same cannot be said for other policies and guidelines), and SYNTH will definitely get along just fine if the exact same words are present, complete with the exact same policy tag at the top of the page, on a separate page. WhatamIdoing (talk) 05:41, 15 October 2022 (UTC)
BTW, it's not "laborious" at all to split that section out. You can see it at Wikipedia:No original research/PSTS with a few notes from me in red. Because PSTS is not integrated into NOR, or even mentioned at all outside the one section, then removing it from NOR would take about ten seconds, and setting up the separate page took only a few minutes. WhatamIdoing (talk) 06:00, 15 October 2022 (UTC)
I don't think the history particularly matters when it comes to reasons for how things should be now. Regardless of how easy it is to move the text, one of the most common forms of OR is misuse of primary sources. That's a rationale for keeping it here. Crossroads -talk- 00:24, 17 October 2022 (UTC)
  • My primary concern with the current PSTS section is that it focuses the reader on the wrong thing… it focuses on evaluating the source, rather than evaluating the text of our articles - ie what we write, based on that source. OR does not stem from the type of source being used, but what we do with that source.
One advantage of splitting the PSTS section off into its own policy/guideline is that we could expand it… explaining HOW to use primary, secondary and tertiary sources appropriately, and HOW to avoid using them inappropriately. Blueboar (talk) 13:11, 15 October 2022 (UTC)
Interesting, Blueboar. Evaluating the source, typing the source as primary or secondary, depends on how it is being used. Can you point to an example of where editors have had trouble with this?
How to use appropriately, that is always going to be an essay. While much of the current word count could be better explained in a dedicated essay, the core point has to remain policy, surely? SmokeyJoe (talk) 13:21, 15 October 2022 (UTC)
Of course it does. That's why the core point should be moved to a separate policy page. I have absolutely no intention of "demoting" PSTS from policy status. In fact, I think it would be more accurate for you to think about this as a suggestion to "promote" it as its own, separate, stand-alone policy. WhatamIdoing (talk) 16:08, 20 October 2022 (UTC)
I endorse that reasoning, and would strongly support putting it into its own page. I'm thinking of this absolutely excellent essay, which would belong in a split-off version of this policy. Wikipedia:Identifying and using primary sources DFlhb (talk) 15:02, 12 November 2022 (UTC)
  • I don't think we should eliminate discussion of PSTS on this page, as their basic definition is essential to understanding when OR comes up, but I do think we would benefit from a guideline page to explain in more depth what these sources are, how to identify them (we have that primary essay, but there should be similar advice for all three), inclusion of what Blueboar says above, that a work can be primary for one topic and secondary for another, there's no catchall here. Perhaps there's also consideration of how that type of page would intersect with the existing WP:RS. I agree we cannot completely separate PSTS discussion from OR, but I think the matters around PSTS need more than essay pages to flesh out. --Masem (t) 13:37, 15 October 2022 (UTC)
    Can you share an example of why you really need to understand PSTS to figure out if someone's putting stuff in an article that isn't in any source?
    Indisputable NOR violations include:
    • I can't see the Earth curving, so it's flat. (No reliable source says this, so it's an OR violation.)
    • This tweet says he got married today, and this other tweet says he's in City this evening, so obviously the wedding happened in City. (Straightforward SYNTH)
    • Paul is an actor.[source saying he's not] (Assuming no other reliable sources say this, it's an OR violation.)
    I don't need to know which of these are primary, secondary, or tertiary sources to figure out that these are NOR violations. Every single one of them would be a NOR violation no matter which type of source was claimed. WhatamIdoing (talk) 16:06, 20 October 2022 (UTC)
    On your idea of keeping some basic information, NOR already has a section Wikipedia:No original research#Related policies. There are similar sections in WP:V and WP:NPOV. We could add a similar summary of WP:PSTS to that section, or use a Wikipedia:Summary style approach to shorten what's in the ==Using sources== section, with a {{Main}} link to the new policy page. WhatamIdoing (talk) 16:11, 20 October 2022 (UTC)
  • This has felt out of place for a while. They are related, but not any more than WP:V and WP:NPOV. WhatamIdoing is right that we can link to related policies and keep a short explanation of whatever is relevant. Shooterwalker (talk) 02:16, 21 October 2022 (UTC)

"Secondary sources" extremely questionable!

The penetrating and annoying calls for "secondary sources" overlook the fact that these - if not illegally copied from some encyclopedia - are usually written by non-specialist journalists, and all too often with very little understanding and misleading interpretation of the facts. Perhaps you should take a look at the book "Factfulness". Hans J.J.G.Holm 2A02:8108:9640:1A68:31D1:F57B:3DCE:D718 (talk) 09:18, 22 November 2022 (UTC)

Welcome to Wikipedia.
Newspaper articles are generally primary sources. Desirable secondary sources include things like a history book published by reputable academic press, or a review article in a good scientific journal. Of course you can't expect to find scholarly sources for popular culture, but in every subject, we're hoping to find a source that provides some sort of analysis. Even a simple compare-and-contrast analysis is helpful, so that an article can say things like "This team won more games than that team" or "This was the first full-length color movie filmed in Ruritania". WhatamIdoing (talk) 04:46, 23 November 2022 (UTC)
Welcome to Wikipedia, Hans.
You need to appreciate the field you are in. Wikipedia is not journalism. Neither is it science. Wikipedia is history. The field is historiography.
In journalism and science, a primary source is quite a different thing to a primary source in historiography. Similarly, a secondary source. Read these articles, and their references. Unlike in journalism, a secondary source is not a second hand source. In historiography, a primary source contains the facts, and a secondary source contains opinion and contextualisation for the facts. A secondary source does not replace primary sources, but proves there are people who care about these facts, and gives contextualisation and other meaningful information about the facts. Through this construct, it follows that Wikipedia is not a collection of random facts.
- SmokeyJoe (talk) 09:35, 23 November 2022 (UTC)
Wikipedia is not a collection of random facts??? I'm rather skeptical of this assertion. It sure feels like "a collection of random facts" quite accurately describes WP. That's consistent with the idea that ideally, there should be citations for each claim. Jimbo may have envisioned that WP shouldn't be a collection of random facts, but IMO, WP rules tend to make the "collection of random facts" an accurate characterization of WP. Fabrickator (talk) 09:17, 3 December 2022 (UTC)
Random facts would be if I bought a 40 year old telephone book at a garage sale and started adding people's old addresses and old phone numbers into the encyclopedia. Random facts would be the precise height and precise 3D location of every single blade of grass on my front lawn, and a comprehensive list of every single piece of junk mail I have ever received since my birth. But I do not try to add these types of facts because I exercise good editorial judgment. Your comments indicate that you do not really understand the concept of random facts, and by extension, do not understand Notability. That's OK, unless you hope to keep contributing to Wikipedia articles. Cullen328 (talk) 09:49, 3 December 2022 (UTC)
The notability guideline only applies to the issue of whether the topic may have its own article, not to what content is suitable for inclusion in an article. Fabrickator (talk) 21:16, 3 December 2022 (UTC)

directly related to the topic of the article

Can someone point me to the content in the body of this policy that justifies the bolded wording? I see no need for it.

To demonstrate that you are not adding original research, you must be able to cite reliable, published sources that are directly related to the topic of the article and directly support[a] the material being presented.

It makes sense to guard against coatracking "off-topic" content into an article, but what does the wording above have to do with OR, rather than just to off-topic content? The essence of OR is "content not based on RS". It is another matter when unsourced or reliably-sourced content is placed in the wrong article. It just doesn't belong there. -- Valjean (talk) (PING me) 23:57, 5 November 2022 (UTC)

I have seen cases (but can't recall) where editors use a whole host of RSes to come to a conclusion about a topic where none of those RSes actually make that claim directly, typically trying to claim some statement must be included by way of analogy, or often in cases of controversial material that is not seen as controversial by RSes, by pointing out analogies of other cases or other types of faulty logic to make their case. That falls out of the WP:SYNTH aspect, which is its own part of the OR policy. Masem (t) 00:03, 6 November 2022 (UTC)
Yes, that is an abuse of sources that can certainly be OR. SYNTH is one type of such abuse. I see that as related to our reasonable requirement that sources must "directly support the material being presented." I'm referring to something else. -- Valjean (talk) (PING me) 00:37, 6 November 2022 (UTC)
I've never liked this sentence as I don't think its meaning is clear. The meaning of "directly supports" is crystal clear and at the heart of NOR. But how can a source "directly support" a statement yet not be "directly related" to it? The only times I've seen "directly related" employed in a content dispute is by someone who argues that even though a statement is explicitly provided by a source, the source as a whole concerns another topic and so isn't "directly related". I think this is a misuse of the sentence, but what is an example of a proper usage that wouldn't be equally served by just having "directly support"? Zerotalk 01:39, 6 November 2022 (UTC)
A hypothetical example would be that someone would want to argue that specific actions Russia has done are war crimes, by way of citing numerous academic sources that point out that other similar acts in past wars were considered war crimes (directly supporting the information), but not a word from RSes that state that Russia's acts are also considered war crimes. The editor is creating inappropriate OR that while the material directly supports the information, it does not directly reference the topic. Masem (t) 02:06, 6 November 2022 (UTC)
I would say that's a case where the source does not directly support the statement, but only provides a basis for an argument. The statement "Russia committed war crimes" requires the argument part, which would be OR. Zerotalk 03:49, 6 November 2022 (UTC)
I would find that there are editors that would state that saying "here's all these RSes that said if a country did X those are war crimes" to justify "Russia doing X is a war crime", justifying that the RSes talking about war crimes are "directly related", presuming we're talking an article like the Ukraine-Russian war. The lack of any source to connect "Russia doing X" to being a war crime is certainly a basis of argument but I've seen editors try to logic this approach on other topics. I know this is all covered by the principle of SYNTH, but that's what I'm seeing in the lede is trying to capture briefly the section of SYNTH in the lede. Masem (t) 15:25, 6 November 2022 (UTC)
As a second hypothetical but also of what I've reminded of what I've seen before, say we have a high quality RS that is a focus on a person X, likely a critique of their political or ideological position, which is 100% valid to use on the article about X. But within that we get a line like "Like Y, X shares (this view)." where Y is a different person that is only mentioned briefly in that context. In that case, that RS would not be sufficient to use to justify "Y has (this view)" on the article page about Y because the article, while mentioning Y, is not directly about the topic. Masem (t) 15:49, 6 November 2022 (UTC)
@Masem: I agree that we should devalue sources that support some text only by passing mention. Once I had a dispute about the use of a historical claim made in passing in a newspaper cooking column. The question here is whether the words "directly related" in the policy are intended to indicate this issue. If so, it isn't clear enough and needs expanding on rather than relying on editors to grasp the proper intention of those two words. Zerotalk 00:08, 7 November 2022 (UTC)
That's my point: You go to the body in the SYNTH and the "directly related" language is right there. It is not like that is magically appearing out of nowhere here. Masem (t) 01:49, 7 November 2022 (UTC)
User:Masem, that exact language is not found at the SYNTH section. The lead is the only place where that wording appears. Maybe you're thinking of some synonyms that mean the same thing? If so, please quote them here. -- Valjean (talk) (PING me) 15:17, 8 November 2022 (UTC)
Quoting from SYNTH ""A and B, therefore, C" is acceptable only if a reliable source has published the same argument concerning the topic of the article." Masem (t) 15:40, 8 November 2022 (UTC)
Okay, now I see what you mean. The wording in the lead, which is what I'm discussing, covers two things, whereas the SYNTH wording discusses only the first of the two.
LEAD: "sources that are directly related to the (1) topic of the article and directly support the (2) material being presented."
SYNTH: "source has published the same argument concerning the topic of the article."
Their mentions seem to be about very different topics. LEAD is about "related to the topic" and SYNTH is about the "same argument concerning the topic." The first is a meta aspect and the second is a very specific aspect. The first is about "Trump" in an article about Trump with no regard to any specifics, IOW the source must mention Trump. The second is about an intricate argument within the article about Trump, IOW the source must mention Trump and connect him to the argument about him. Slightly related, but not always.
The source should support the argument, and that requires it already is related to the topic of the article, IOW the part about "related to the topic of the article" seems superfluous. -- Valjean (talk) (PING me) 22:20, 8 November 2022 (UTC)

Suppose there is a Wikipedia article about Russia and consider the following case.

Russia targeted civilians in the war.[1] Targeting civilians is a war crime.[2]

where RS [1] is about Russia and RS [2] is not about Russia, yet it directly supports its sentence. This would be OR because it implies that Russia committed war crimes without an RS saying so. Having the phrase "published sources that are directly related to the topic of the article" would prevent this OR. Whereas just having the phrase "directly support[b] the material being presented" would allow the OR because the RS [2] directly supports the material that it is associated with, which is the sentence, "Targeting civilians is a war crime." Bob K31416 (talk) 09:47, 6 November 2022 (UTC)

That example is a textbook case of SYNTH. Ref [1] is a good source for the first sentence and ref [2] is a good source for the second sentence. Neither is problematic in isolation. However, the juxtaposition of the two sentences is clearly intended to tell the reader that Russia committed war crimes, which is not directly supported by either source. Zerotalk 10:08, 6 November 2022 (UTC)
I'm not sure whether you are agreeing or disagreeing with what I wrote. Could you explain more. Bob K31416 (talk) 10:42, 6 November 2022 (UTC)
In your example, the (unstated but clearly intended) conclusion that Russia committed war crimes is NOT directly supported by either of the sources. So this use of sources runs afoul of the "directly supports" rule (not to mention the SYNTH rule). It is not a case where the "directly related" part makes a difference. I'll posit that there is no case where the addition of "directly related" to the policy outlaws anything that is not already outlawed and, moreover, that the concept of "directly related" is too vague to be useful. Zerotalk 10:54, 6 November 2022 (UTC)
I think my original message refutes what you are saying, so I'll leave it at that, except to say that "published sources that are directly related to the topic of the article" has been a part of the policy's lead for at least 14 years and I have found it useful for understanding the policy. Bob K31416 (talk) 11:07, 6 November 2022 (UTC)
Bob K31416, your example is an excellent demonstration of SYNTH. That part of NOR is good and explains how one type of source abuse is covered by NOR. There are other types of source abuse that cause other problems. -- Valjean (talk) (PING me) 16:05, 6 November 2022 (UTC)
Val. Is this question, related to an ongoing discussion at Donald Trump's talkpage? GoodDay (talk) 15:18, 6 November 2022 (UTC)
User:GoodDay, it is triggered by that discussion, but because it is more of a policy question that has implications everywhere, I chose to discuss it here. We can't change policy at Talk:Donald Trump. If this results in a change that will affect that discussion, then we can deal with it there. -- Valjean (talk) (PING me) 16:00, 6 November 2022 (UTC)

Okay, let's approach this from a slightly different angle. Would we lose anything by eliminating that phrase? In what situation is that phrase actually necessary for THIS policy?

To demonstrate that you are not adding original research, you must be able to cite reliable, published sources that are directly related to the topic of the article and directly support[b] the material being presented.

How's that? I don't see that OFF-TOPIC is directly related to this policy. It's just off-topic and should not happen. Not all forms of source abuse are NOR. -- Valjean (talk) (PING me) 16:11, 6 November 2022 (UTC)

Perhaps this is the only policy level P&G that I know of that warns about using off-topic sources to try to justify content in articles. It is nutshell'ing this line in the body " "A and B, therefore, C" is acceptable only if a reliable source has published the same argument concerning the topic of the article." Masem (t) 16:14, 6 November 2022 (UTC)

It's explaining WP:SYNTH. We should not combine sources about A (the topic of the article) and B (sources not talking about the topic of the article) to imply C (some claim that B is somehow relevant to the article topic). Crossroads -talk- 23:06, 6 November 2022 (UTC)

This particular phrase, as written, does not explain SYNTH, though based on the archived discussions, I think it might have been intended to.
What it actually says is that editors shouldn't use sources that aren't about the subject of the article (e.g., do not cite medical journals while writing Box office, even if a journal article mentions box offices; do not use film industry magazines while writing SARS-CoV-2, even if a magazine article mentions that virus). A specific warning to "generally" avoid passing mentions was added around the same time. That line probably belongs to Wikipedia:Neutral point of view#Balancing aspects ("if all you can find is a passing mention, it probably doesn't belong in the article"), not to NOR anyway.
As these points are made elsewhere, and as the application is more general than absolute (e.g., if you are writing a sentence about the effect of pandemic lockdowns on movie theaters, you might cite a variety of sources about lockdown effects, and not exclusively sources that are primarily about Box office or SARS-CoV-2), I don't think that the words "are directly related to the topic of the article and" truly need to be in the first paragraph of the policy. SYNTH will still be 100% banned even if those exact words aren't in the lead. I am slightly inclined to remove those words, for less confusion and more concision. This should be understood as changing the wording but not the meaning of the overall policy. WhatamIdoing (talk) 02:19, 13 November 2022 (UTC)
I agree. I don't think it adds anything and also agree that nothing is lost by deleting those words. -- Valjean (talk) (PING me) 02:45, 13 November 2022 (UTC)

While I acknowledge the technical arguments for removing that text (in addition to keeping the text shorter, removing the text makes the policies more composable), I think there are practical benefits to keeping it. Synthesis is one of the more insidious challenges we have when building a neutral encyclopedia and whatever we can do to briefly explain our long-standing position to editors is beneficial. I agree that we should not bloat our policies yet IMO the clarity provided to our editors for this particular issue outweighs the minor loss of conciseness. Orange Suede Sofa (talk) 04:15, 14 November 2022 (UTC)

?? SYNTH is untouched by this. -- Valjean (talk) (PING me) 04:39, 14 November 2022 (UTC)
@Orange Suede Sofa: From the discussion here, it is clear that even highly experienced editors cannot agree on the purpose of those words. So far from clarifying anything, the evidence is that they are more confusing than helpful. SYNTH is far better described by its own section. Zerotalk 06:02, 14 November 2022 (UTC)
Bingo! The SYNTH explanation is good. We are not supposed to abuse sources by making content not backed by those sources. -- Valjean (talk) (PING me) 06:37, 14 November 2022 (UTC)
  • OK, I have a "directly related" question. Take a case where information changes over time. Take theory X which was originally, widely viewed as not true but was later found to be true. How do we deal with a case where a person/organization is declared by RSs to be wrong for supporting the theory but later RSs don't reverse that claim when the new information comes out? How does "directly related" apply? Consider Mr Smith's BLP says he was wrong when he claimed X. This is cited to sources that directly make the claim. A few years later understanding shifts on the topic. We don't have new sources saying "Mr Smith turned out to be correct". What should be done? One option would be to remove the accusation. That might be OK but it kind of buries that the person was publicly declared wrong. Essentially this would be saying the original RSs are no longer due because they are not accurate. Another thing that might happen is an editor says "Mr Smith was found to be correct [source that says theory is true but doesn't mention Smith]". My feeling is this option is synth since the source didn't say "Smith was right". A third option is to simply state "Theory X has since been found to be correct [source stating X is true]". This is true to the sources but opens the question, is this a form of synth since it clearly implies Smith was right even though no sources state that. However, the actual claims are true to the sources and facts about the theory are, in my view, DUE when they relate directly to the nexus of Smith and the theory. Would removing "directly related" change this? Springee (talk) 13:50, 14 November 2022 (UTC)
I would say that in the scenario you lay out, a source saying that Mr. Smith is wrong should be considered obsolete… and thus no longer reliable except as a primary source for saying “X thought that Smith was wrong”. However… I would also argue that mentioning X’s obsolete opinion is UNDUE, unless that opinion is noted by more modern sources. Thus, the correct action is to omit the discussion of Mr. Smith’s rightness/wrongness altogether. What people in the past thought of him is now irrelevant. Blueboar (talk) 15:03, 14 November 2022 (UTC)
  • The issue is that SYNTH is widespread, and we should include clarification of it in the lead of this policy to remind editors to avoid it. Removing the phrase would IMO result in more cases of SYNTH popping up across the encyclopedia, which would just waste editor time. I strongly support keeping it if the choice is binary between keep or remove, but I'd also support removing and putting a clearer explanation of SYNTH in the lead. DFlhb (talk) 14:07, 14 November 2022 (UTC)
    It won't have any effect on SYNTH, and it's not representing SYNTH anyway. The words being discussed say "directly related to the topic of the article".
    What that says is: Please go revert your edit today to Alt-lite, because the source you cited is "directly related" to reviewing a couple of books about Anti-fascism, which means that the source is not directly related to the subject of Alt-lite. Similarly, a bunch of sources in your edit to Mike Cernovich are "directly related" to other subjects and only mention him in passing, or not at all (example), so you should go revert that, too.
    If you care about SYNTH, you should be looking at the immediately previous sentence, which says "This includes any analysis or synthesis of published material that serves to reach or imply a conclusion not stated by the sources." WhatamIdoing (talk) 21:15, 14 November 2022 (UTC)
    Being "wikistalked" (I'm kidding obviously!) by an editor I respect is an honor :)
    Very fair point. That exact thought actually occurred to me as I edited these articles after posting here. I see the limitations of my reasoning: there are many ways to use sources that aren't directly related, yet are used in a SYNTH-compliant way. I've reviewed the discussion on Donald Trump about the Iran plane thing (that prompted this) and there's likewise no issues with how the sources are used there.
    I support removing the "directly related" passage. DFlhb (talk) 22:07, 14 November 2022 (UTC)
    Thank you for the compliment, and also for reviewing that discussion, which I couldn't make myself read completely. WhatamIdoing (talk) 06:27, 15 November 2022 (UTC)
    More than a month after the last comment, I have removed these words from the policy. It appears that everyone who
    If anyone is concerned, please note that SYNTH is still 100% prohibited, and these words were not in or about WP:SYNTH anyway. Restoring these words should probably be contingent upon someone finding a reliable source that does verify the material in question, isn't directly on the subject of the article, and shouldn't be used because it's not directly about the subject of the article. WhatamIdoing (talk) 23:13, 22 December 2022 (UTC)
    @Crossroads has reverted it. Crossroads, please give an example of a situation in which these words help editors. For example, explain why this policy should require 100% of sources to be "directly related to the topic of the article", even though you reverted back in a source that is "directly about" a style of psychotherapy in an article whose topic is Anal sex. You made this policy say (again) that the only sources that can be accepted in the article Anal sex are the sources that are "directly about" the topic of the article – which is anal sex – and not any sources at all that are directly about psychotherapy, even if the sources that are technically "directly about" some other topic are reliable, appropriate, and directly support relevant content for this article. Is that what you really want to achieve? WhatamIdoing (talk) 01:15, 23 December 2022 (UTC)
    That source is talking "directly about" the topic of the article in that portion. That this phrase requires the entire source be about the topic, rather than forbidding SYNTH tangent coatracks of sources that don't mention the topic of the article, is debatable at best.
    If you want to change the text of the policy after a discussion with significant disagreement, you need to start an RfC. There have been quite a few very long discussions on this page, and just because editors who disagree haven't kept repeating the same points, that isn't license to change the policy. We work by consensus, not exhaustion of opponents. Crossroads -talk- 05:51, 23 December 2022 (UTC)
    But you can't use a source that is only directly about the topic of a paragraph in an article. The policy literally says that the source must be "directly related to the topic of the article". NB that it says all sources must be directly related to "the topic of the article", not "the topic of a paragraph, sub-section, or some other reasonable portion of the article". The difference between sensible practice and the literal wording is why we shouldn't have these words in the article.
    You pretty much have to choose:
    1. The current wording is good, it should be kept, and you will go revert your edit;
    2. The current wording is bad, it should be removed, and your edit should be kept; or
    3. The current wording is bad, but you don't want policies and guidelines to say what they mean.
    Take your pick, and let me know which one, okay? If you pick #3, maybe you'll write a ringing defense of misleading policies for a future RFC. WhatamIdoing (talk) 22:28, 27 December 2022 (UTC)

Maps, OR, and SYNTHESIS

Many maps displayed on Wikipedia don't mention any source in their description: e.g., Eastern Orthodoxy in Europe, Tunisian Arabic, Hispano-Celtic languages, Iberian shrew, 16th-century Basque literature, Portuguese language. I assume they are WP:OR, even though some of them look more or less correct. I wonder:

  1. Should we automatically remove WP:OR maps from articles? (at least if the author doesn't answer or cannot provide reliable sources)
  2. If we cannot find a free map that can perfectly replace the removed one, how should we proceed?
    1. Would it be OR to create a map based on various text sources? or on various other maps? For instance, File:Tunisian dialect 1.png says it was done "by comparing the comtemporary linguistic works about Tunisian". Similarly, a detailed map of Eastern Orthodoxy in Europe would probably require combining many sources from various countries, sometimes contradicting each other, into one map.
    2. How to avoid Wikipedia:SYNTHESIS?

A455bcd9 (talk) 22:09, 25 November 2022 (UTC)

User-made maps definitely should include one or more sources where the information used to create the map was pulled from, but otherwise making maps from multiple reliable sources should not be seen as OR. Masem (t) 22:49, 25 November 2022 (UTC)
Isn’t this covered by WP:OI? Blueboar (talk) 22:56, 25 November 2022 (UTC)
WP:OI is below "What is not original research" so it's not clear. It only says: "Original images created by a Wikipedian are not considered original research, so long as they do not illustrate or introduce unpublished ideas or arguments". The maps I listed above do not stricto sensu introduce unpublished ideas or arguments. But they're not verifiable. On the other hand, WP:IMAGEPOL says that "user-made images may be wholly original" and gives two examples that don't provide any external sources: File:Conventional 18-wheeler truck diagram.svg and File:Checker_shadow_illusion.svg. So it seems that currently not all user-made images have to be sourced. So I would suggest adding something like:
  • "User-made images such as graphs, charts, drawings, and maps must conform with Wikipedia's policies of reliability and verifiability: they must include in their description references of the reliable sources used to create them."
What do you think?
@Masem: "making maps from multiple reliable sources should not be seen as OR" => in practice I find it hard to avoid OR and SYNTHESIS. For instance, here are two maps of Arabic dialects in Algeria: Map A and Map B. Assuming (for the sake of the argument) that the two maps come from equally reliable sources and you want to create a map of Arabic dialects in Algeria: which one do you follow? which one do you include in an article? how do you map dialects from one map to the other one? you most likely need to use your own knowledge to guess that "Saharan Ksouri" and "Sahel Algerian" (map A) are part of what map B calls "Algerian Saharan Spoken Arabic" and draw a map that is a SYNTHESIS of the two with borders that are somewhat in the middle between those of Map A and Map B? (it's not only a theoretical problem as it's a discussion we currently have in Talk:Arabic#Proposal_to_Remove_Two_Maps). A455bcd9 (talk) 10:41, 26 November 2022 (UTC)
When you have conflicting information from two or more reliable sources, you have to figure out how to resolve it in a map the same way you would in prose. A map that managed to include both prior maps without changing information from either would be fine, but synthesizing wholly new lines from the combination of the maps would be inappropriate. Masem (t) 14:24, 26 November 2022 (UTC)
It's way easier in prose as you can say "X thinks that way and Y thinks the other way". In a map, you cannot include information from let's say 3 different maps. So if "synthesizing wholly new lines from the combination of the maps would be inappropriate" then it means that in practice you need to follow only one map/source.
Should we add this somewhere? A455bcd9 (talk) 14:35, 26 November 2022 (UTC)
While I don't doubt that SYNTH can be violated using a map, a blanket one-source requirement would severely damage the usefulness of maps. As an example, a historical map of a region is enhanced by indicating modern borders (indicated as such). This helps the reader to understand what the map shows. It shouldn't be necessary for the map maker to find a single source map that shows everything. Zerotalk 01:35, 27 November 2022 (UTC)
You're right @Zero0000: I think overlapping/overlaying different maps is fine (e.g., historical events and modern borders). It would not be WP:SYNTH as it does not "imply a new conclusion". But what about synthesizing maps? If we have two sources for a historical map of a battle, one saying that one regiment was in region A and another one saying that the same regiment was in region B: can we put the regiment somewhere in the middle between A and B? Or do we have to stick to one source? (alternatively, we could indicate both positions and mention "according to source x" and "according to source y" in the legend and description, which would then be equivalent to overlay = OK). I suggest adding the following to WP:IMAGEOR:
  • "User-made images such as graphs, charts, drawings, and maps must conform with Wikipedia's core content policies of verifiability and "No original research". They must include in their description references to the reliable sources used to create them. When multiple sources are combined to create an image, this image shouldn't describe something not explicitly stated by any source. Improper editorial synthesis includes synthesizing new lines from the combination of two maps. Acceptable editorial synthesis includes overlaying modern borders (indicated as such) on a historical map."
A455bcd9 (talk) 06:35, 27 November 2022 (UTC)
I'm dubious. WP:V and WP:NOR are complex policy pages and there is a danger of unforeseen consequences. Also, remember that articles are constructed by taking material from different sources and placing them beside each other. Mere juxtaposition is not synthesis. Synthesis is the drawing of original conclusions from the combination of sources, but trying to apply that to a map sounds like a good source of disputes. It is already open to editors to dispute that a map is accurate or suitable for a page, without the need for new rules. It would be ok to write that map makers should (not must, so as to not immediately disqualify many existing maps) record the reliable sources for their map. Zerotalk 07:45, 27 November 2022 (UTC)
I agree about the risk of unforeseen consequences. So what about: "User-made images such as graphs, charts, drawings, and maps should conform with Wikipedia's core content policies of verifiability and no original research: their file description page should mention their sources and they should avoid improper editorial synthesis."? A455bcd9 (talk) 12:19, 27 November 2022 (UTC)
  • I think this is complex and has a lot of facets. My two primary concerns here are:
(1) Reliable maps are typically copyrighted, and tracing them would create a copyvio. This severely limits the precision of user-made maps and it should stop us from using a single map as a source -- ever. I do feel that there are sound, copyright-related reasons why editor-made maps should always be compiled from several source maps.
(2) The scale of the source maps is a key consideration. For example, an historic building might appropriately be labelled on a 1:1,250 scale map, but if we label the same building on a 1:10,000 scale map, then we're implying that it's very important.—S Marshall T/C 16:53, 27 November 2022 (UTC)
(1) According to WP:IMAGEPOL: "User-made images can also include the recreation of graphs, charts, drawings, and maps directly from available data, as long as the user-created format does not mimic the exact style of the original work. Technical data is uncopyrightable, lacking creativity, but the presentation of data in a graph or chart can be copyrighted, so a user-made version should be sufficiently different in presentation from the original to remain free." So I think the copyright issue is okay. Also, if you assume that there's a copyvio risk, compiling several source maps wouldn't remove that threat, it would just increase it.
(2) This is probably already covered by WP:WEIGHT? "This rule applies not only to article text but to images, wikilinks, external links, categories, templates, and all other material as well." A455bcd9 (talk) 17:53, 27 November 2022 (UTC)
Not all reliable maps are copyrighted, if they are just presenting factual data. They can copyright aspects like color choices or the like, but elements like roads, towns, borders, etc are not copyrightable as raw data. Masem (t) 18:13, 27 November 2022 (UTC)
The Ordnance Survey begs to differ from you! Their maps, at least the ones published before 2015, are Crown Copyright. A map isn't raw data: it's a large quantity of data compiled and presented in a human-readable way.—S Marshall T/C 18:58, 27 November 2022 (UTC)

(ec) I agree with the good points made by Masem, S Marshall and Zero. Also, maps are highly useful in articles and a new extremely strict interpretation would virtually eliminate maps in Wikipedia. I think that the status quo strikes a good balance. Finally, having gotten into it pretty deep on the IP side on a few articles with some of the best wiki IP experts, maps that pass an extremely broad and strict interpretation of synthesis are likely to violate a strict interpretation of IP laws because they are copying, not transforming the scheme/content. IMO the best approach is the status quo.North8000 (talk) 18:05, 27 November 2022 (UTC)

@North8000:
  1. I'm not suggesting "a new extremely strict interpretation". Just to add that maps shouldn't be OR. For instance, we've been discussing in Talk:Arabic#Proposal_to_Remove_Two_Maps since August 2022 to remove a map that was pure OR. The debate would have been settled long ago if we had an explicit policy on this.
  2. Is the "status quo" satisfying? I don't think so: there are poor quality maps that are pure OR, unverifiable, and full of errors in important articles: Eastern Orthodoxy in Europe, Tunisian Arabic, Hispano-Celtic languages, Iberian shrew, 16th-century Basque literature, Portuguese language.
  3. On the other hand, I don't suggest any strict interpretation about WP:SYNTHESIS. For instance, maps in FAC Punic Wars and Abd al-Rahman ibn Muhammad ibn al-Ash'ath are great and all correctly sourced: File:First Punic War 264 BC v3.png, File:Sicilia - prima guerra punica key en.svg, File:First Punic War 237 BC.jpg, File:Map of Rome and Carthage at the start of the Second Punic War 2.svg, File:Second Punic war (cropped).png, File:Caliphate 750.jpg, File:Iraq ca. 875.svg. I would love all maps on the English Wikipedia to follow the same standards.
A455bcd9 (talk) 18:17, 27 November 2022 (UTC)
I think that it's unfortunate that we're blending OR and synthesis in the same discussion. OR is where the "synthesis" term comes from but it is also a duplication/reiteration of wp:ver. If something looks questionable from an accuracy standpoint it should be challengeable / removable from the article. This has more of a relation to wp:ver but by necessity a less strict application of it for images. North8000 (talk) 18:28, 27 November 2022 (UTC)
@North8000: you're right. That's why at the very least I think we should add "User-made images such as graphs, charts, drawings, and maps should conform with Wikipedia's core content policies of verifiability and no original research: their file description page should mention their sources." Then, the issue of synthesis is more complex... A455bcd9 (talk) 18:31, 27 November 2022 (UTC)
I think that there is a flaw in that (core content policies are for text so this would be modifying them, not applying them) and also that it would go too far. But something that implements what I described in my previous post might be good.North8000 (talk) 18:41, 27 November 2022 (UTC)
Core content policies are not for text only @North8000:
  • WP:OR's introduction: "The prohibition against original research means that all material added to articles must be verifiable in a reliable, published source, even if not already verified via an inline citation."
  • WP:V's introduction: "All material in Wikipedia mainspace, including everything in articles, lists, and captions, must be verifiable."
A455bcd9 (talk) 18:49, 27 November 2022 (UTC)
  • I think there need to be citations provided when they are used on Wikipedia; otherwise anyone with a good graphics design capability can make any map and make it look professional, while being deceptive to readers. The readers have a right to know where this stuff is being generated from. I have experienced this on graphs on demographics and other kinds of research. Ramos1990 (talk) 18:37, 27 November 2022 (UTC)
  • As has been mentioned before, WP:V and WP:NOR are complex policy pages; any change to them should be done for the right reasons and not just to settle an ongoing dispute (which is the prime purpose of this discussion). M.Bitton (talk) 21:33, 27 November 2022 (UTC)
    Please Wikipedia:Assume good faith: the purpose of this discussion isn't to settle an ongoing dispute but for me to better understand how core content policies apply to maps. By the way, based on the above discussion, updating these policies isn't necessary stricto sensu as we already have:
    • WP:OR: "The prohibition against original research means that all material added to articles must be verifiable in a reliable, published source, even if not already verified via an inline citation."
    • WP:V: "All material in Wikipedia mainspace, including everything in articles, lists, and captions, must be verifiable."
    • WP:IMAGEPOL: "Diagrams and other images [...] In such cases, it is required to include verification of the source(s) of the original data when uploading such images."
    • Help:File description page: "What pre-existing sources (free images, photos, etc.) were used as inputs?"
    • WP:CITE: "For an image or other media file, details of its origin and copyright status should appear on its file page."
    • WP:RS: "Like text, media must be produced by a reliable source and be properly cited."
    But it would be easier to make things more explicit when it comes to user-created illustrations. A455bcd9 (talk) 22:01, 27 November 2022 (UTC)
    Assuming good faith (which I do) does not prevent one from telling it like it is, especially when the consequences are far reaching. This whole thing (presented from a well chosen angle) is about you trying to settle an ongoing dispute in which you are involved. You're not just trying to understand how the policies and guidelines work, you're trying to alter them. M.Bitton (talk) 22:10, 27 November 2022 (UTC)
    No, that's not the case. But if you want to believe it, I'm afraid I can't do anything to prove my good faith. I feel sorry for you. Cheers, A455bcd9 (talk) 22:25, 27 November 2022 (UTC)
    I feel sorry for you. There's no reason to make personal comments. M.Bitton (talk) 00:44, 28 November 2022 (UTC)
    I apologize @M.Bitton. A455bcd9 (talk) 07:11, 28 November 2022 (UTC)
    Thank you. Apology accepted. M.Bitton (talk) 19:59, 1 December 2022 (UTC)
A455bcd9 (talk · contribs · deleted contribs · logs · filter log · block user · block log)
@A455bcd9: As I just said on your user talk, I don't see a consensus for removal here, so it is premature and unsupported for you to be going through the 'pedia and removing all these maps, with only the edit summary of, "OR". What I see here is not only a lack of agreement with you, but you continually repeating yourself. I suggest you WP:DROPTHESTICK and move on to something more productive. - CorbieVreccan 22:26, 27 November 2022 (UTC)
Hi @CorbieVreccan: I indeed removed a few images that didn't have any sources on Commons and marked others with Template:Imagefact (the equivalent of "Citation needed" for images). I'll stop doing it right now as I'm afraid your message may be a threat but I do not understand why my actions were problematic. I initiated this discussion to better understand our policies and how they apply to maps. During this discussion, it appeared that WP:OR and WP:V apply to "all material" and that WP:IMAGEPOL, WP:CITE, and WP:RS already require images to cite references. So I got my answer. (the question about WP:SYNTHESIS remains though) Please let me know if I misunderstood something. Cheers, A455bcd9 (talk) 22:34, 27 November 2022 (UTC)
I would definitely not go removing images on that basis. If something appears incorrect about a map, I would bring that up in talk at the article, and if it both appears that there is an error and it's unsourced, I think that it would be pretty easy to get an agreement to remove the image from the article. IMO that is the de facto status quo for images/diagrams etc. in the fuzzy Wikipedia ecosystem. I also don't agree that your 22:25, 27 November 2022 response refuted what I said. Two of the five items listed do not even relate to what you are asserting. Also, when I said that the core policies are written for text I meant in their structure and how they are applied in the fuzzy Wikipedia ecosystem. For example, a piece of text may "contain" 2-3 statements; a map may "contain" thousands of statements and we can't be saying that it fails because one of those thousands is unsourced. North8000 (talk) 23:24, 27 November 2022 (UTC)
Yes, I added {{Imagefact}} most of the time when I identified a problem. Sometimes I removed the image, I may have been too WP:BOLD... However, I don't think "that is the de facto status quo for images/diagrams etc. in the fuzzy Wikipedia ecosystem": for instance, WP:FAC requires all images to be properly sourced for the article to be promoted. Also, what are the "Two of the five items listed [that] do not even [relate] to what [I was] asserting"? A455bcd9 (talk) 23:33, 27 November 2022 (UTC)
I feel like this whole thing is based on a misunderstanding of the policy. Wikipedia:Uncited does not mean unverifiable. All material needs to be verifiable, with the emphasis on the "-able". That means that someone (not necessarily you) is able to verify that it came from some source. Cited material can be unverifiable; uncited material can be verifiable.
As for the rest, I fully agree with the concerns about unintended side effects. This is a difficult area and changes need not only careful thought, but a review of how it would affect dozens or hundreds of different subject areas. I see lots of potential for sweeping declarations to be rejected as stupid (we don't need a source cited for "the space on this map is called the Atlantic Ocean" or for "This graph shows an exponential curve"), and a gift to certain kinds of anti-science POV pushers ("You can't convince me that COVID-19 exists, so you can't use any diagrams of the virus's structure unless you jump through these bureaucratic hoops"). WhatamIdoing (talk) 00:55, 3 December 2022 (UTC)
@WhatamIdoing: yes, the question is "Does WP:V apply to user-made maps in the same way it applies to text?" (i.e. "must be verifiable", and if "challenged or likely to be challenged" => "must include an inline citation to a reliable source"). In any case, application of the policy shouldn't be stricter for images. (And there's also the issue of Wikipedia:SYNTHESIS, but that one is even more complex...) A455bcd9 (talk) 07:41, 3 December 2022 (UTC)
I think that images, including maps, should be examined with respect to the main reason why it's in the article, and not with respect to anything else. For example, imagine that an article says (in plain old text) that these three molecules combine to form a protein spike on the surface of a viral particle. The image is added to illustrate the idea of three molecules combining in a single structure. This image is not only verifiable, but already cited (in the article text). Objections about anything else (e.g., a silly objection over an educationally appropriate use of false color or a more serious objection to an error in a different/irrelevant part of the image) should be rejected. (Of course, if you can swap in an image that is equally good for the local purpose and also doesn't have an error in another part, then that's great.)
Maps have some of the same problems as any other image:
  1. Sometimes, the image already is a reliable source (i.e., not user-made).
  2. Strong sourcing is impossible for most user-created images. There are no reliable sources that say "Yup, the photo she uploaded to Commons really is the neighborhood she said that it is", nor any reliable sources that say "Yup, the map she uploaded to Commons really is the neighborhood she said that it is".
    Worse, even if an external source later endorses that image as being accurate, someone's going to claim that it's still all wrong and bad and unverifiable, because having a reliable source saying that the image is correct after you've uploaded it would be WP:CIRCULAR, and someone else will say that having the reliable source endorse the image before you upload it will be rejected as the source buying a pig in a poke. The game is rigged against image verification.
  3. Sometimes, the user-made image is the kind of "Paris is in France" or "Here is a street sign that says Maple Street" simplicity that we don't actually want sources for.
Maps have at least three specific problems:
  1. We shouldn't have different standards for a "user-drawn map" and a "user-labeled aerial photograph" that would convey the same information.
  2. In areas with geopolitical disputes, it's possible to cite any outcome you want.
  3. Sometimes, citing the image would require an unreasonable number of sources.
To give you an example of this last one, let me tell you that a couple of decades ago, long before Wikipedia existed, I saw a map being produced. The sources were basically an aerial photograph (which he was turning into a line-art drawing) and the signs on the buildings (which he used to label all the different buildings in the photo). If we had a copy of that on Commons, the list of citations would have been a photograph in the local history museum plus one {{cite sign}} for every business in town. That would be fully cited. The information even would have been true (at the time), rather than merely being just verifiable. But providing those sources would have no practical value.
In your dispute, I suggest that you spend a while thinking deeply about why you care about these particular maps. What would you personally do differently if the source(s) were added on the Commons page tomorrow? WhatamIdoing (talk) 21:26, 3 December 2022 (UTC)
Thanks for your answer @WhatamIdoing: I'm not sure what to conclude from it though 😅.
However, to answer your last point: I don't have a dispute. The question emerged from a discussion here about how to improve an existing map (that already has a source). Do we need sources to fix mistakes identified by contributors? What kind of sources? How to avoid WP:SYNTHESIS? etc.
And then, besides this specific example, I wondered what the general policy was as on the one hand, WP:FAC has strong sourcing requirements for user-made maps, while on the other hand you can find unsourced maps in many articles. A455bcd9 (talk) 22:18, 3 December 2022 (UTC)
Do we need sources to fix mistakes identified by contributors? We would first want to know whether there is any mistake. If a patriotic Indian editor makes a map that draws the line between China and India in a different place than an equally patriotic Chinese editor would draw that line, there is no "mistake"; there is just an understandable and verifiable difference in their views of which country's claim to the territory around the Line of Actual Control should be represented in the map.
If there is a mistake that is truly a mistake, and you can provide verifiable corrections, then editors are generally happy to fix it. WhatamIdoing (talk) 02:50, 4 December 2022 (UTC)

Evidence-based-mapping

I've just discovered @Nederlandse Leeuw's essay Commons:Commons:Evidence-based mapping where they cite two deletion discussion precedents (Wikipedia:Templates for discussion/Log/2021 May 28#Template:Legality of zoophilia by country or territory and Wikipedia:Templates for discussion/Log/2021 August 16#Template:World laws pertaining to animal sentience) and conclude:
  • Maps that are based largely on original research (WP:OR) should be removed from English Wikipedia, and any templates which embed such maps should be deleted.
  • Maps are a visual representation of data, and data must be sourced (WP:UNSOURCED). Therefore, lack of sourcing is a valid rationale for deletion of templates that embed them, and removing such maps from English Wikipedia.
  • Maps that synthesise data from multiple sources in order to reach a conclusion not found in any source, or bring together data from multiple sources that are not compatible (e.g. population data in which children were only included in some sources), commit WP:SYNTH. Therefore, such maps may be removed from English Wikipedia, and any templates which embed such maps may be deleted.
  • Merely bringing together data from multiple compatible sources, without extrapolating one's own conclusions from them, is not prohibited in WP:SYNTH or anywhere else, and so no valid reason for removal of maps or deletion of map-embedding templates from English Wikipedia. Therefore, this is a valid way of making maps on Commons and using them on English Wikipedia.
  • No conclusion was reached about whether sources should be listed in the 'Source' parameter in the map's description page on Commons (as this essay recommends), inside the English Wikipedia article or map-embedding template in the form of references (as some Wikipedians argued), or both.
Is Nederlandse Leeuw's conclusion above representative of the consensus on the English Wikipedia? A455bcd9 (talk) 20:43, 29 November 2022 (UTC)
(found nothing at that link but let's assume that what you describe is in an essay) From a process standpoint, clearly no, an essay derived from two template deletion discussions is many levels away from the process that would be required to be considered to be a Wikipedia consensus. North8000 (talk) 20:54, 29 November 2022 (UTC)
Sorry, I've just fixed the link @North8000.
By the way, MOS:IMAGES also says: "Each image has a corresponding description page, which documents the image's source, author and copyright status; descriptive (who, what, when, where, why) information; and technical (equipment, software, etc.) data useful to readers and later editors. [...] Reliable sources, if any, may be listed on the image's description page. Generally, Wikipedia assumes in good faith that image creators are correctly identifying the contents of photographs they have taken. If such sources are available, it is helpful to provide them. This is particularly important for technical drawings, as someone may want to verify that the image is accurate."
Does this guideline for technical drawings also apply to maps ("as someone may want to verify that the image is accurate")? A455bcd9 (talk) 21:05, 29 November 2022 (UTC)
Your questions keep ginning up things that you find into things that they aren't and then asking for a "yes" answer that your creation is the rule. You already received your answer on what common and accepted practice is. You should just listen to it. Sincerely, North8000 (talk) 21:21, 29 November 2022 (UTC)
But what is accepted practice @North8000? If I sum up other contributors' answers it seems to be something like "User-made maps should cite sources but we shouldn't modify the policies because it's a complex issue and we want to avoid unintended consequences":
  • Masem: "User-made maps definitely should include one or more sources where the information used to create the map was pulled from"
  • Blueboar: "Isn’t this covered by WP:OI?"
  • Zero0000: "It would be ok to write that map makers should (not must, so as to not immediately disqualify many existing maps) record the reliable sources for their map."
  • S Marshall: "I think this is complex and has a lot of facets"
  • Ramos1990: "I think there needs to be citations provided when they are used on wikipedia"
  • M.Bitton: "WP:V and WP:NOR are complex policy pages"
A455bcd9 (talk) 21:52, 29 November 2022 (UTC)
Just chiming in here as A455bcd9 tagged me and cited part of my Commons essay "Evidence-based mapping". On the whole, North8000 is correct that the essay as such is just that: an essay. However, what I sought to do is take the first step towards a more comprehensive policy on the requirements of basing maps on evidence. For that, I've gathered all sorts of English Wikipedia & Wikimedia Commons policies, guidelines, rules, precedents, conventions, plus my own suggestions, in that text; these have widely varying levels of consensus, from strong (official policies) to weak (my suggestions). And the excerpt quoted by A455bcd9 is derived from AfD precedents, which reflect a kind of jurisprudence consensus, if you will. And the first two points cited therein answer A455bcd9's first question in the positive: Should we automatically remove WP:OR maps from articles? (at least if the author doesn't answer or cannot provide reliable sources). Yes, we should. The two precedents clearly state we should per WP:OR and WP:UNSOURCED.
To answer A455bcd9's second question Would it be OR to create a map based on various text sources? or on various other maps?: It depends on whether the sources for all the data used in the map are mentioned clearly and accurately, whether these data are compatible, and whether any conclusions are drawn that are not mentioned in any of the cited sources per WP:SYNTH (except for simple calculations per WP:CALC). This is also stated in the third and fourth points of the quoted essay excerpt. These are not just my opinions; these are relevant map precedents based on some of English Wikipedia's core policies. Cheers, Nederlandse Leeuw (talk) 22:03, 29 November 2022 (UTC)
PS: To add to that: some of the templates that were nominated for deletion embedded maps that I had created based on multiple but compatible, clearly and accurately cited sources. The community approved of my evidence-based mapping practices, and kept those templates which embedded my properly sourced maps, and deleted a bunch of templates embedding maps with no or poor sourcing.
Also note the difference between English Wikipedia and Commons: Although it takes a very high threshold to have a map deleted from Commons (as it values free artistic expression without copyright violation above all else, including accuracy and verifiability), the threshold for removing an unsourced OR map from English Wikipedia is very low (as it values accuracy and verifiability above free artistic expression). In other words, you can make all the maps you want without even trying to support your claims with sources and then upload them to Commons, but don't expect English Wikipedia to accept your rubbish maps inside its articles and templates. Cheers, Nederlandse Leeuw (talk) 22:18, 29 November 2022 (UTC)
PPS: I should add that I am not in favour of automatically removing unsourced maps, but on a case-by-case basis. After all, if a user sees an unsourced map, but thinks it is valuable, they can look for and add the sources the map was probably based on or could be based on, perhaps improving the map in the process. I've been in that position several times. Unsourced maps aren't necessarily worthless (so I agree with Zero0000 that existing maps should not be immediately disqualified), but their presence on English Wikipedia should be tolerated with caution, and sourcing should be provided ASAP, depending on how important/controversial the unsourced information is. Unsourced maps with highly controversial information, e.g. about what is legal or illegal to do in a certain country or region and thus could influence the behaviour of people who read Wikipedia, should be removed immediately according to the zoophilia AfD. Cheers, Nederlandse Leeuw (talk) 22:39, 29 November 2022 (UTC)
@Nederlandse Leeuw: yes, WP:V should apply to images in the same way it applies to text: we don't automatically delete unsourced (or poorly sourced) sentences. We would first tag them with Template:Citation needed, try to find a source, modify the text to match the sources found if necessary, or start a discussion on the talk page. (Unless the unsourced text is obviously inaccurate and in that case it's better to be bold and delete it right away. For images, we can also immediately replace an unsourced one by a sourced alternative.). A455bcd9 (talk) 07:13, 30 November 2022 (UTC)
Yes, generally speaking I agree with you. Of course, there is not always an alternative image/map readily available, and not everyone has the know-how or time to create one themselves. Until 2018 I always used Microsoft Paint (a simple but primitive and outdated programme as far as mapping is concerned) and then switched to InkScape (which I still don't fully understand, but at least produces .svg images that are generally easy to edit, scale, translate etc.). Since then I've been able to produce some quality evidence-based law-related maps (if I do say so myself, but last year's AfDs confirmed the community appreciated my approach above the unsourced and controversial law-related mapping path). In that sense, replacing bad unsourced texts may be easier than replacing bad unsourced maps, although granted, writing text on English Wikipedia also requires some skill that many people do not have: sufficient mastery of the English language, a Wikipedia-like encyclopedic style and tone, and complying with all our policies, guidelines, precedents, conventions etc. Therefore, a case-by-case approach is best: when a map (or a text) is unsourced, but plausibly accurate, we can (A) try and fix it ourselves, or (B) add a {{datasource missing}} or {{citation needed}} template respectively in hopes that the creator/author or another user will come along to fix it for us, unless (C) the map or text is so bad that it should be removed immediately. Law-related maps might perhaps be given a similar status to texts in a WP:BLP; people reading Wikipedia should be able to trust maps about what is and isn't legal to do in a given country or region, especially in the domain of criminal law. Cheers, Nederlandse Leeuw (talk) 12:14, 30 November 2022 (UTC)
PS: With regards to maps portraying the distribution of the usage of languages or the adherence to religions per given territory, these tend to be cases where immediate removal is not necessary or warranted, depending on how inaccurate or unsourced / badly sourced the information is. When I encounter a map like File:OrthodoxyInEurope.png (your first example given), my standard approach is (B): adding a {{Datasource missing}} template if no sourcing or evidence is given, and perhaps even a {{Inaccurate-map-disputed}} template if I have got evidence to the contrary. But in order to be (C) removed immediately from English Wikipedia, I must have indications that the map is significantly misleading, perhaps intentionally, e.g. for reasons of ethnic, religious or linguistic nationalism (incompatible with WP:NPOV and WP:SOAPBOX). For examples of this kind, I refer to relatively recent AfDs: Wikipedia:Articles for deletion/Eastern Orthodox Slavs Wikipedia:Articles for deletion/Muslim Slavs Wikipedia:Articles for deletion/North Slavs. File:OrthodoxyInEurope.png might also be removed from English Wikipedia for this reason if there are indications that this map was made not for informative or educational purposes, but for some sort of political message about how all people in a given territory "belong" to a certain religion (rather than a majority or a specific percentage, which is not indicated anywhere in the description), which might or might not be the case here. Cheers, Nederlandse Leeuw (talk) 12:44, 30 November 2022 (UTC)
Yes, I agree File:OrthodoxyInEurope.png seems okay-ish. The worst part is that there's no legend, so it's unclear why some areas aren't colored (e.g., Tatarstan in Russia) even though they have large Orthodox populations (but probably not the majority of the population). On the other hand it's super fine-grained in Bulgaria to exclude towns with Bulgarian Turks. Ideally we would create a new map citing WP:RS.
Anyway, have we reached a consensus on the following questions:
  • Does WP:V apply to user-made maps?
  • Does WP:OR apply to user-made maps?
    • If yes, in practice, how to avoid WP:SYNTHESIS when building maps?
A455bcd9 (talk) 16:48, 2 December 2022 (UTC)
The roads projects used to use a process to create SVG maps similar to Wikipedia:WikiProject U.S. Roads/Maps task force/Tutorial, based on GIS data. At FAC they sometimes did ask us if the sources were declared on the Commons page, and that was generally it. I appreciate that this probably will not work for every application described here. (And I say used to, because since then we have shifted to dynamic maps). --Rschen7754 19:23, 3 December 2022 (UTC)
  • The answers to A455bcd9's questions are:-
1) Yes, WP:V and WP:OR are core content policies that apply to everything displayed on a rendered mainspace Wikipedia page.
2) If the image is hosted on Wikipedia and you think it violates a core content policy, either move it to Commons or begin a FFD.
3) If the image is hosted on Commons and you think it violates a core content policy, unlink it so it doesn't appear on a Wikipedia page. Unlinking is a bold edit to which WP:BRD applies, so if you're reverted, do not counter-revert but proceed to the talk page.
4) You probably weren't going to do this at all, and it's probably quite needless for me to say it, but long experience of Wikipedia is forcing me to type this out: proceed slowly and don't begin a map-related campaign or crusade of any kind.
5) WP:SYNTH is where you combine sources to reach, imply or suggest a conclusion that's not contained in any of those sources. Synth is a problem when, for example, editors combine statistics from two different studies. You should only do that if the studies are comparable (used a similar method, took place at a similar time, covered a geographically similar area, etc.)
That last paragraph needs some elaboration.
5a) When applying WP:SYNTH we need to bear in mind that editors must combine sources. WP:N requires multiple sources covering a topic. Editors are supposed to read all the sources, evaluate which are the best, and then generate an encyclopaedia article that summarizes what the sources say. We must allow editors to do this.
5b) SYNTH is only a problem if it leads the reader towards a novel conclusion that isn't found in the more reliable of the sources.
5c) An encyclopaedia article is an easily-readable summary. This means editors have to make their content accessible to the general public. A map that appears in an encyclopaedia article can and often should be a simplified version of more complex maps from the sources.
5d) Editors aren't allowed to trace copyrighted maps. This is more of a problem in some parts of the world than others, but for example the standard and most reliable maps of the UK, those by the Ordnance Survey, are often Crown Copyright. Because we can't trace maps, some inexactitude has to be tolerated.
Hope this helps.—S Marshall T/C 23:28, 3 December 2022 (UTC)
Thanks @S Marshall. I have the same understanding.
WP:SYNTH is harder to apply to maps. I think it means in practice following one and only one source for a given area. Because if one source says that the % of Eastern Orthodox believers in an area is 10% and another source says 20% for the same area: you cannot say it's 15% on the map and you have to choose one source. The most reliable and/or most recent for instance. Then you can do a patchwork of different reliable sources to cover different areas, as long as these sources are comparable (similar method, definition, place in time, etc.).
Regarding copyright, I asked the question on Commons. Although some complex maps are copyrighted (e.g., Ordnance Survey in the UK) it seems that we can retrace information contained in simple maps on a new free background as "it is a fundamental tenet of copyright law that what is protected is the creative expression of ideas, not the ideas themselves". It's not 100% clear though... A455bcd9 (talk) 06:48, 4 December 2022 (UTC)
I disagree with you about single sources. I still think combining sources is the ideal way to work.—S Marshall T/C 09:45, 4 December 2022 (UTC)
I agree that combining various reliable sources can be great. For instance, on this historical map: I assume the borders come from one source, the dynasties from another, etc. Or on File:Arabic Varieties Map.svg (requested by me): different sources support each area+dialect combination.
However, in many cases, I don't think combining sources for the same area and topic is possible. For example, if I want to update File:OrthodoxyInEurope.png using reliable sources, I would find one recent RS for Russia (e.g., the source used here), one for Ukraine, one for Belarus (e.g., the source cited here), etc. If there are two sources for Russia, I wouldn't display the average or the median of their values but choose the most recent and/or reliable one. If I had one more recent RS for one specific federal subject of Russia, I wouldn't use it either for consistency with other Russian regions. (Unless values for that federal subject are missing from the national source, or grossly inaccurate.) What do you think about this approach? A455bcd9 (talk) 13:05, 4 December 2022 (UTC)
I think that approach corresponds fairly exactly with what I've been suggesting as best practice. OrthodoxyInEurope.png is a mosaic made up of other sources, and where the sources don't agree, you evaluate which one is most reliable and follow that one.
I think a very high proportion of our maps are in articles about geography and history: particularly military history, which has a large and productive taskforce. A post on WT:MILHIST might attract other interested editors.—S Marshall T/C 20:13, 4 December 2022 (UTC)
Thanks, I agree 100% then. It may be unnecessary to post there, as you've already summed up the situation well. Cheers, A455bcd9 (talk) 20:39, 4 December 2022 (UTC)

Just dropping in. FYI, this discussion is one of about 3 or 4 going on simultaneously that directly or tangentially touch the issues the essay WP:MAPCITE was intended to address. The good news is that so far the conclusions reached here seem to be in agreement with the contents of that essay, so I don't think any action is required. But just to make everybody here aware: that essay exists, and revisions to it are being debated in a similar vein to this discussion. Dave (talk) 14:45, 5 December 2022 (UTC)

I'm in full agreement with the answers S Marshall has given to A455bcd9's questions. I'm glad to see they also agree with each other and that Dave agrees with all of us. Well, how constructive haha! Good to see WP:MAPCITE is getting a revision as well. If you happen to have any suggestions for Commons:Commons:Evidence-based mapping (which A455bcd9 and I have recently expanded and refined as a result of this discussion), you'd be welcome. Cheers, Nederlandse Leeuw (talk) 23:05, 6 December 2022 (UTC)
We should really add this from @S Marshall to a policy page or essay somewhere: "When applying WP:SYNTH we need to bear in mind that editors must combine sources. WP:N requires multiple sources covering a topic. Editors are supposed to read all the sources, evaluate which are the best, and then generate an encyclopaedia article that summarizes what the sources say. We must allow editors to do this. 5b) SYNTH is only a problem if it leads the reader towards a novel conclusion that isn't found in the more reliable of the sources." cc @Rschen7754 Andre🚐 05:02, 6 January 2023 (UTC)

