Wikipedia talk:Manual of Style/Mathematics/Archive 5

From Wikipedia, the free encyclopedia

Can cursive theta be substituted?

Is it acceptable in all mathematical notation to use U+03B8 θ GREEK SMALL LETTER THETA instead of the cursive variant U+03D1 ϑ GREEK THETA SYMBOL? They are visually distinct, but presumably the actual Greek letter is easier to type because it is on the Greek pulldown for desktop editors, it's on some people's physical keyboards, and it's on my phone's keyboard because I enabled the Greek language for when I'm editing technical articles. The fewer variants we have the easier it is for readers and editors to search articles and for search engines to index them. I've never run across the cursive variant outside of Wikipedia, but I'm an engineer and not a reader of mathematical papers. For some math articles just talking about angles, I've converted the cursive form to the better-known form, for consistency and clarity. For a general audience, I'd say the non-cursive theta is clearer since it's familiar from trigonometry and is less likely to be confused for a cursive d or something.

There are only a few math articles that use ϑ in HTML markup, namely:

and a bunch not listed above use only \vartheta in <math>...</math> markup:

Something like Bicycle and motorcycle geometry definitely seems like it should be using a regular theta for clarity, but I'm not sure about the more obscure branches of math represented here.

Thanks! -- Beland (talk) 18:21, 7 February 2022 (UTC)

I think the answer is probably "yes, you are welcome to use {{mvar|&theta;}} (θ) or {{mvar|θ}} (θ)". Note that the "mvar" version of the character is more legible than the (usual) sans-serif font's "θ".
However, the WP MOS forbids changing article styling of any kind (unless the existing style is itself forbidden by the MOS). Changing "ϑ" to "θ" is indeed a style change, and article authors and editors who chose "ϑ" will have a legitimate gripe against you, and watchful editors may decide to revert your edits because it noticeably and directly violates the "preserve existing, consistent style" rules.
On the other hand, if you converted "ϑ" (U+03D1) to &vartheta; (which also renders as "ϑ") and clearly and succinctly explain that it was to enable successful searches for "theta", then I can see no thoughtful justification for an objection: The rendered character, after all, is identical, and the motivation to enhance searchability is legitimate, although not overwhelming.
However, some Wikipedia editors are neither thoughtful nor reasonable, but rather reactionary zealots who jealously guard "their" articles. There will be objections from those quarters. It might help if you specifically post warnings and reasons on the Talk pages a month or two ahead of time (and not via links to a central article – only identical, independent text in each); you might even embed very short comment text at the first two or three occurrences of the change to &vartheta;, which would also call the attention of the editor inspecting the change to the slightly less brief explanation already inserted in each article's Talk page a while back.
Slow, patient, extended diplomacy is the key, and the first rule for good relations is "no surprises".
You should never replace LaTeX with HTML, ever. Leave \theta and \vartheta alone. I'm not certain, but I think that searches on the character U+03B8 (θ) will miss anything in HTML symbolic form (&theta;); they might also miss all of the LaTeX \vartheta and \theta symbols. However, will a search on "theta" pick all of them up (except the Unicode characters)?
My off-the-cuff guess is that there's a 70% chance that a search on "theta" will catch all the HTML and LaTeX forms, since all of them contain the embedded text "theta". You should actually test that before you make any changes, both for searches via the Wikipedia search bar and via your browser's "search in page" option (ctrl+F), to see if making the changes is worth your effort.
And again, whatever works or doesn't work for searching, you must never replace LaTeX with HTML or Unicode. Ever. Regardless of your respectable intention to make articles searchable.
Regarding readability, I actually think the variant form ϑ or ϑ or ϑ is better than the standard Greek language form (θ), particularly in the default sans-serif font. I suspect that was the motivation for using the script theta / Unicode SYMBOL FORM: It looks passable even without embedding it in either the "math" or "mvar" font template, so a little more convenient for some authors.
Astro-Tom-ical (talk) 16:21, 4 March 2022 (UTC)
"ϑ" vs. "θ" is more than a style change, since it affects searches. And in any case, not changing between styles is good advice (I'm not sure it's a rule anywhere?) only if there are multiple styles allowed; in this case for consistency reasons we should have one preferred style. For now I've changed all the articles using the cursive theta in HTML markup (except where discussing the character itself), and we'll see if any page editors have any concerns as a result. -- Beland (talk) 09:29, 12 March 2022 (UTC)
@Beland: I reverted your "theta changes" on Eisenstein series, J-invariant, ATS theorem, Chebyshev function, Lovász number, Prime number theorem, Primorial and Shannon capacity of a graph. Please note that both variants of theta are in use (see for example Theta function#Auxiliary functions) and don't change one style to another. A1E6 (talk) 16:41, 12 March 2022 (UTC)
@Astro-Tom-ical and A1E6: Ah, MOS:VAR is the rule alluded to above; thanks for pointing that out. I did some testing, and it seems whenever we have any type of variation, it causes search problems of one type or another. ϑ and θ and &theta; do not show up when searching for "theta" on find-in-page; readers have to search for one of the variants of the character. At least in Firefox, I can find-in-page for either θ or ϑ and it will find the other. Using find-in-page for LaTeX is somewhat broken in Firefox, though all of the input forms do the same thing. The browser reports the correct number of matches, but can't highlight or auto-scroll to them because the text is invisible. This argues for using only HTML instead of LaTeX, but given the complexity of some formulas, this is not feasible. There are larger HTML vs. LaTeX questions anyway, so I'm not trying to do anything about that. I actually converted all instances of &theta; long ago to θ, except on pages that also use LaTeX math formulas (where math editors want to be able to search the wikitext for "theta" and find all the instances reliably). Given this is treated exactly the same as the characters, no changes are needed for searchability there anyway.
The real problem for θ vs. ϑ is external search engines. For example, with Duck Duck Go, searching for "ϑ shannon capacity" makes Shannon capacity of a graph come up, but searching for "θ shannon capacity" does not. Given that "θ" is orders of magnitude more common and much easier to type, I would recommend that all instances of "ϑ" be changed to "θ" and the MOS changed to advise when either is acceptable, to only use "θ" to facilitate searching.
On the LaTeX side, Wikipedia's own internal search engine has problems with variation. Due to whole-word matching, searches for e.g. "theta shannon" and "vartheta shannon" give different results. Given that \theta is by far more common, more intuitive, and matches the visual appearance of the common and preferred HTML variant, I recommend replacing all instances of \vartheta with \theta and adding a guideline to that effect in the MOS as well.
I think these changes will also make it easier for readers who are struggling to understand the material presented. It's often unclear whether variations in typography matter; for example, "r", "r", and "r" might all mean the same thing, or they might mean different things. It's unclear that θ and ϑ are even the same letter, especially given how obscure ϑ is for a general audience and how grownups will have generally encountered θ in geometry class. Though it's understandable if different sources use different typography, I would expect, for example, a math textbook to use the same typography all the way through. If different chapters used different notation, I would be scratching my head over the difference, especially if I didn't have a teacher around to ask for clarification. We do note the variation on Theta so that people who know enough to ask the question, "are these the same letter?" can get an easy answer. But it seems unhelpful if we are creating both puzzlement and difficulty finding articles merely for the sake of aesthetic variation. Do those recommendations make sense to you? -- Beland (talk) 20:37, 12 March 2022 (UTC)
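The code-point distinction behind the search behaviour described above can be illustrated with a minimal Python sketch. It shows only that θ and ϑ are different characters which Unicode NFKC compatibility normalization folds together. Whether any particular browser or search engine applies such folding is not established here, and the helper names `fold` and `folded_contains` are hypothetical.

```python
import unicodedata

THETA = "\u03b8"         # θ GREEK SMALL LETTER THETA
THETA_SYMBOL = "\u03d1"  # ϑ GREEK THETA SYMBOL (the cursive variant)


def fold(text):
    """Apply Unicode NFKC compatibility normalization to text."""
    return unicodedata.normalize("NFKC", text)


def folded_contains(haystack, needle):
    """Substring test after folding both sides (hypothetical helper)."""
    return fold(needle) in fold(haystack)


sample = "The Lovász number ϑ(G) of a graph"

# A naive substring test treats the two thetas as unrelated characters.
print(THETA in sample)                 # False
# After NFKC folding, a search for θ also matches ϑ.
print(folded_contains(sample, THETA))  # True
# Neither character matches the ASCII word "theta".
print("theta" in sample)               # False
```

A search tool that folds both the query and the text this way would match either variant, while one that compares raw code points would not, which is one possible explanation for the differing results reported above.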
"I recommend replacing all instances of \vartheta with \theta and adding a guideline to that effect in the MOS as well." – I don't think this is acceptable by any means. \vartheta is commonly used in math literature. A1E6 (talk) 21:00, 12 March 2022 (UTC)
@A1E6: Could you explain in a bit more detail? Curved quotes like “ and ” are commonly used in English literature, but Wikipedia doesn't use them per MOS:STRAIGHT, in part to improve search results. Is this a similar situation where we could adopt one of two equivalent styles which circulate somewhat randomly, or are you seeing the cursive theta as strongly preferred in certain fields of mathematics? -- Beland (talk) 19:29, 13 March 2022 (UTC)
@Beland: No, it's a different situation. Curved quotes are visually too similar to straight quotes, but \theta and \vartheta are very different from each other. A1E6 (talk) 19:40, 13 March 2022 (UTC)
I'm not sure visual similarity is the issue so much as meaning, convention, and intelligibility. For example, "&" and "and" are very visually distinct, but per WP:AMPERSAND we interchange them. -- Beland (talk) 02:36, 14 March 2022 (UTC)
But it is widely known that "&" and "and" are the same thing. Also changing symbols in math can have far-reaching consequences, as far as the meaning is concerned, not to mention that \theta and \vartheta are sometimes used in the same context with a different meaning. A1E6 (talk) 11:11, 14 March 2022 (UTC)
@Beland: Importantly, changing \vartheta to \theta in articles en masse (you're doing that right now) is a blatant violation of MOS:VAR and you should stop immediately. A1E6 (talk) 11:38, 14 March 2022 (UTC)
Because "\theta" "\vartheta" and "right now" are somewhat ambiguous, for the record, I just want to point out that, I did not change any LaTeX thetas, and I did not change any HTML thetas after the initial batch on 12 March 2022 and the subsequent reverts. -- Beland (talk) 21:46, 1 April 2022 (UTC)
@Beland: These [1][2][3][4][5] edits are yours and are from 14 Mar 2022. A1E6 (talk) 00:22, 2 April 2022 (UTC)
My apologies for casting aspersions; your comment was completely correct! I totally did not remember doing that, and I completely missed those when I went back to my contribution history after seeing your comment to double-check. (I wrongly assumed the edit summaries would link back to the discussion, like the 12 March ones did). Looks like these five are all related to high school geometry angle notation, where it does seem unreasonable to have a cursive theta. I must have changed those to be consistent with the convention in that subfield in preparation to changing scripts to ignore articles that only use one style consistently. -- Beland (talk) 00:49, 2 April 2022 (UTC)
I recommend that we follow standard mathematical notation for whatever subject we are writing about and not make up new bastardized sort-of-similar-looking variations on that notation to appease lobotomized search software. Does that recommendation make sense to you? —David Eppstein (talk) 20:57, 12 March 2022 (UTC)
That's what I was asking in the first place, whether certain areas of math conventionally use the cursive variant. It sounded like the answer was "no", but if the answer is "yes", that changes everything. It would be helpful to know which are which. -- Beland (talk) 17:27, 13 March 2022 (UTC)
It appears that is a standard convention for the Chebyshev function, for instance. It is definitely the standard notation for Lovász number, where in his original work on the topic Lovász chose that notation specifically because it was related to but of distinct appearance to the non-script theta that he used for Shannon capacity. —David Eppstein (talk) 19:21, 13 March 2022 (UTC)
Ah, great, that's exactly the sort of info I was looking for. It seems high school geometry has a strong affinity for regular theta, and those are all already fixed. Rather than try to figure out all the possible preferences in detail now, I'm just going to change my scripts to complain only if an article uses both cursive and regular. Then if there are any flagged articles I'll check to see if it was done accidentally or on purpose. -- Beland (talk) 02:36, 14 March 2022 (UTC)
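A check of the kind described in the comment above ("complain only if an article uses both cursive and regular") might look like the following sketch. It is not Beland's actual script: the function names and the exact set of spellings matched are assumptions, and a flagged article still needs manual review because mixing can be deliberate, as with the Lovász number notation.

```python
import re

# Spellings treated as "regular" theta: the character, the HTML entity,
# and the LaTeX command (not followed by further letters).
PLAIN_THETA = re.compile(r"θ|&theta;|\\theta(?![A-Za-z])")
# Spellings treated as "cursive" theta.
CURSIVE_THETA = re.compile(r"ϑ|&vartheta;|&thetasym;|\\vartheta(?![A-Za-z])")


def theta_styles(wikitext):
    """Return the set of theta styles ('plain', 'cursive') found in wikitext."""
    styles = set()
    if PLAIN_THETA.search(wikitext):
        styles.add("plain")
    if CURSIVE_THETA.search(wikitext):
        styles.add("cursive")
    return styles


def mixes_theta_styles(wikitext):
    """True only when both styles occur, i.e. the article merits a manual look."""
    return len(theta_styles(wikitext)) == 2


print(theta_styles(r"<math>\vartheta(G)</math>"))          # {'cursive'}
print(mixes_theta_styles(r"angle θ and <math>\vartheta</math>"))  # True
```

Articles that use one style consistently are never flagged, which matches the intent of leaving field-specific conventions alone.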
Beland: In my judgement, changing from one symbol to another is indeed a style change, and does fall under MOS:VAR. Facilitating external searches on symbols is a deficient excuse for violating other editors' aesthetics; any reversions of your changes will be legitimate according to "The Rules". However eager you are, it's time to back off, regardless of whatever WP policy violations you've gotten away with in the past.
The very nature of mathematical symbols is that they are not fixed; any conventional symbol may be dispensed with, for the convenience of the writer, with a one-line definition statement. The only case where I would support mandatory inclusion of the name of the symbol on a page (as text, not as a symbol) is where there is a well-established function name (e.g. written out as a word in Abramowitz & Stegun, et al. (1972) Handbook of Mathematical Functions, but not just because some symbol was used by them). There are several different "theta functions" in chapters 16, 17, and 18, and a search on "theta function" (as text) should find them. Supporting any other searches based on a conventional symbol used for a variable or function is bad. The consensus so far is unanimously against your proposal. Cool down Beland: You are engaged in hot pursuit of inane obnoxiousness. Don't do it. Stand down. Back off.
Astro-Tom-ical (talk) 01:40, 13 March 2022 (UTC)
I'm certainly not in favor of changing symbols to visible text. I was only suggesting swapping symbols for symbols (which in the case of LaTeX means changing hidden text). -- Beland (talk) 17:27, 13 March 2022 (UTC)
I was not suggesting replacing symbols with text. I was stating that the only instance where it is important to accommodate external searches is where the conventional name of a function incorporates the name of a letter: "gamma" function, "chi-squared" distribution, "delta" function, "theta" function, etc.
I believe that meddling with every article that uses a symbol to facilitate searches on that symbol is a wrongly ordered priority: The priority is the clarity of the formulas in the Wikipedia. The ability of search engines to recognize them is just a "nice to have"; it is not "need to have".
Astro-Tom-ical (talk) 06:20, 14 March 2022 (UTC)

Adding guideline on colon before equations

Many guidelines on mathematical writing say that equations should be treated as any other part of the sentence they belong to (When writing in math, do you use a comma or colon preceding an equation?). Thus, the rules for when to use colons in front of equations should be similar to when to use colons in general English writing. The general rule in English is that colons can only be used after what would constitute a full sentence (see the Wikipedia article about colon that confirms this). So for example in

The definition of f is:

the colon shouldn't be used because it separates the verb from the object of the sentence. So my suggestion is to add a guideline that a colon can't be used in front of an equation if what precedes the colon is not a full sentence. Morgr2 (talk) 19:48, 20 May 2022 (UTC)

I agree that mathematical writing should be punctuated like normal English, but I think that your description of "only after a full sentence" is a very inaccurate description of the use of colons in English grammar. For one thing, it is never a full sentence, because the full sentence doesn't end until the period; what you appear to mean is an independent clause. For another, it is common in normal English writing to use colons in other contexts than the separation of independent clauses. Just to take an example, the very first colon on this talk page (at the time I write this) is a sentence that begins MOS:MATH#PUNC currently says: "Similarly...". Note the lack of an independent clause. In fact, the second sentence of the colon (punctuation) article that you link so approvingly says "A colon often precedes an explanation, a list, or a quoted sentence." I think the use of a colon to precede a mathematical formula is very similar in nature. —David Eppstein (talk) 20:43, 20 May 2022 (UTC)
I initially posted on Village pump (idea lab), but was advised to post here instead. So the purpose was only to get feedback on the general idea of changing the guidelines, not that what I wrote should be the exact wording of a new proposed guideline.
As I wrote in my initial post, I am not proposing that a colon can't precede an equation. I am only referring to the case where what precedes the colon is a dependent clause or not a complete sentence (that's the term they use in Wikipedia's colon article). Feel free to suggest changes to my suggestion if you agree with it.
It seems that, at least in APA style, the example you cite uses the colon incorrectly. I haven't done any thorough research into which prescriptive language sources have this rule for the use of the colon, but an initial online search suggests that the majority of sources have it. Morgr2 (talk) 08:07, 22 May 2022 (UTC)

Changing special symbol guidance

Hello, everyone! After the previous discussion and a lot of thinking and looking at how these characters are used, I'd like to propose dropping the sentence in Wikipedia:Manual of Style/Mathematics#Special symbols which currently says: "As a rule of thumb, specific mathematical symbols shall be used, not similar-looking ASCII or punctuation symbols, even if corresponding glyphs are indistinguishable." It often seems that in practice, and sometimes explicitly because of this MOS, we actually do the opposite. The easiest replacement would, I think, be to simply give specific guidance for all the affected characters. There aren't that many, it clarifies the decisions a lot, and there are a lot of exceptions no matter what the default rule is.

Screenreaders tend to mangle the non-ASCII variants, reading e.g. "letter 223C" instead of "tilde", so the above preferences should further WP:ACCESS.

For almost all mathematical characters that have two Unicode representations, we already avoid using the special character. Namely:

Likewise, MOS:STRAIGHT prefers the ASCII characters over more-attractive curved quotes, though these are general punctuation not specific to math. ASCII encoding also keeps markup simple, which is a goal of MOS:MARKUP.

Using the ASCII and alphabetic characters would make it easier for readers and editors to find the text in question when using "find in page", the Wikipedia search engine, and external search engines. The ASCII characters are much easier to type, so that's what people would intuitively try first (and many don't know how to type something not on their keyboard). It's difficult or impossible to tell these character pairs apart visually, and many people don't know there are math-specific variants in Unicode, leading to a lot of potential frustration trying to get things to match up.

I propose adding the following specific guidance:


  • Characters that appear on keyboards are easier to type and to search for consistently, and more likely to be read properly by screenreaders. These are preferred over the identical-looking glyphs they can be confused with for Greek letters, slash (/), colon (:), tilde (~), and backslash (\).[1]
  • Use U+2032 PRIME or U+2033 DOUBLE PRIME where the prime symbol is appropriate; do not use the ASCII apostrophe (') or double quote (") in these cases.
  • Use U+002A * ASTERISK when the character should render like a superscript, which is typical when it is used as a postfix. This character also appears on keyboards, and is thus easier to type and search for. Example: C*-algebra. Use U+2217 ASTERISK OPERATOR (&lowast;) for subscripts and when the bottom of the character should roughly align with the baseline of neighboring characters, which is typical when it is used as a prefix or infix operator, or as a standalone character. Usage should be consistent across articles covering the same subfield of mathematics; see Asterisk#Mathematics for a canonical list.

References

  1. ^ Specifically:
    • U+03BC μ GREEK SMALL LETTER MU instead of U+00B5 µ MICRO SIGN
    • U+03A3 Σ GREEK CAPITAL LETTER SIGMA instead of U+2211 N-ARY SUMMATION for Capital-sigma notation
    • U+03A0 Π GREEK CAPITAL LETTER PI instead of U+220F N-ARY PRODUCT for Capital-pi notation
    • U+002F / SOLIDUS instead of U+2215 DIVISION SLASH
    • U+003A : COLON instead of U+2236 RATIO
    • U+007E ~ TILDE instead of U+223C TILDE OPERATOR
    • U+005C \ REVERSE SOLIDUS instead of U+2216 SET MINUS
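The footnote's character pairs can be written down as a lookup table. The following is a minimal Python sketch of how such a table could be used to list or substitute the look-alike characters. It only illustrates the proposed mapping, which is contested in this discussion, and the names `PREFERRED`, `find_discouraged`, and `apply_preferences` are hypothetical, not part of any existing tool.

```python
# Look-alike character -> character the proposal prefers (from the footnote).
PREFERRED = {
    "\u00b5": "\u03bc",  # MICRO SIGN        -> GREEK SMALL LETTER MU
    "\u2211": "\u03a3",  # N-ARY SUMMATION   -> GREEK CAPITAL LETTER SIGMA
    "\u220f": "\u03a0",  # N-ARY PRODUCT     -> GREEK CAPITAL LETTER PI
    "\u2215": "/",       # DIVISION SLASH    -> SOLIDUS
    "\u2236": ":",       # RATIO             -> COLON
    "\u223c": "~",       # TILDE OPERATOR    -> TILDE
    "\u2216": "\\",      # SET MINUS         -> REVERSE SOLIDUS
}


def find_discouraged(text):
    """List (index, found, preferred) for each look-alike character in text."""
    return [(i, ch, PREFERRED[ch]) for i, ch in enumerate(text) if ch in PREFERRED]


def apply_preferences(text):
    """Return text with each look-alike replaced by its preferred character."""
    return text.translate(str.maketrans(PREFERRED))


print(find_discouraged("a\u2236b"))          # [(1, '∶', ':')]
print(apply_preferences("5 \u00b5m"))        # 5 μm
```

Characters outside the table, such as U+2032 PRIME and U+2212 MINUS SIGN, pass through unchanged, consistent with the guidance that those are to be kept.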

Existing guidance on specific characters would remain in place, including the non-ASCII minus sign at Wikipedia:Manual of Style/Mathematics#Minus sign.

More specifics:

According to Micro-, the Unicode standard itself prefers U+03BC, and the other seems to be only intended for compatibility with legacy character encodings. It seems logical to me to prefer sigma and pi over other characters because these characters are actually on keyboards, and so are easier for readers and editors to input and to search. Some browsers will not find one character by searching for the Greek alphabet equivalent, and users would not necessarily know that there are two Unicode characters that look identical, and wouldn't even know they might need to input a character in a different way, much less how.

I'll invite Comp.arch to comment on mu vs. micro; they raised objections (here). I'm not sure if the above is persuasive or if some editors wish to argue for specific exceptions?

The summation and product characters are unused in articles, except where the characters themselves are being discussed.

MOS:FRAC already requires use of the ASCII slash for fractions-as-inline-division. I went through and changed all existing uses (fewer than 100 articles) of the division slash, many of which were inappropriate as they did not actually involve divisions, and so far I've gotten no objections.

It's hard to tell which uses of the colon character are for time or Bible verses vs. ratio, but I'd estimate tens of thousands of articles already use the colon to express a ratio. Fewer than 100 articles used the ratio character, so I changed them all. Once again, many were used incorrectly, and so far no objections.

Proportionality (mathematics), Tilde, and Approximation all used the ASCII ~, and this is by far more common. It's a bit unclear, but it's possible the tilde operator U+223C is supposed to mean "varies proportionally" and not "approximately" which is for the ASCII tilde? Or are they completely interchangeable? I changed hundreds of "approximately" usages from U+223C to U+007E, and so far have gotten no objections.

Fewer than 100 articles use the set minus character, some of which only do so to discuss the character itself. I had intended to have this discussion before changing these, because I was less certain about them, but forgot to do so before they came up on my list. (Apologies!) Jacobolus objected, and I'm curious if my rationale is persuasive, and also what other editors think about the pros and cons of maintaining this symbol as distinct from an ASCII backslash.

-- Beland (talk) 21:50, 14 July 2023 (UTC)

  • Oppose This is a misguided effort that helps nobody and should have been discussed first before you started wantonly making automatic replacements project wide. Nobody cares about composed fractions, roman numerals, or unicode exponents, none of which is really helpful or appropriate in technical articles; these are not at all comparable to the use of appropriate math symbols instead of incorrect kinda-sorta-similar-looking ASCII glyphs. But replacing set minus with backslash \ is a big mistake, and automatically eliminating every use of &thinsp; or &hairsp; that was deliberately added by someone for legibility is unacceptable and frankly extremely rude. There should also be no effort to automatically eliminate other unicode glyphs like the tilde operator, asterisk operator, blackboard bold letters, or whatever glyphs for capital Sigma or Pi seem most appropriate in context to the editors at each page. If you want to manually change a formula somewhere as a one-off, or have a local per-page discussion about it, that's fine, but using automatic or semi-automatic tools just leaves a giant mess behind. If individual editors have trouble typing unusual glyphs they can use whatever substitute is at hand, and someone else later can fix it as appropriate. But we shouldn't encourage broken typography in the name of populist aid to technically ignorant editors without any evidence that this is actually causing anyone a problem.
    Better guidance overall would be to recommend that wherever possible authors of technical articles should prefer LaTeX <math> tags to plain ''italicized'' variable names or {{math}} or {{mvar}} templates, with the caveat that in some special circumstances LaTeX can't be used for technical reasons, e.g. in image captions. (Aside: MOS:STRAIGHT is an abomination, but unfortunately now impossible to ever fix.) –jacobolus (t) 00:10, 15 July 2023 (UTC)
    • Well, editors apparently did care enough about fractions and Roman numerals to put them in the Manual of Style, the whole point of which is to avoid inconsistency across articles. Simply changing one character to an identical-looking character shouldn't result in a "giant mess", and that can be done (and is faster to do) in dedicated runs if you want to avoid mixing that with other edits. Whitespace changes are not part of this proposal. It's also worth pointing out that some people refuse to consider these questions for the Manual of Style unless there is an actual dispute between two or more editors, so I get procedural complaints no matter which procedural choice I make. 8( Based on what do you determine what "correct" typography is? The Unicode standard itself, for example, prefers mu over micro, which is apparently only there for backward compatibility. -- Beland (talk) 02:00, 15 July 2023 (UTC)
      I personally have no input on mu vs. micro; you'd have to ask scientists who deal with a lot of small units about that one. My own inclination would be to just leave the choice of either one up to individual article authors, because for the most part readers won't care and neither choice is hurting anyone. –jacobolus (t) 02:04, 15 July 2023 (UTC)
      Based on what do you determine what "correct" typography is? – Absent some strong reason with clear site-wide consensus, "correct" means whatever the local consensus is on each article, with a bias toward preserving existing conventions and stylistic preferences, see MOS:STYLEVAR. To quote directly: "enforcing optional style in a bot-like fashion without prior consensus, is never acceptable"–jacobolus (t) 02:50, 15 July 2023 (UTC)
      It seems, though, that using other characters when ASCII characters are available does indeed hurt people using screenreaders, and having inconsistent or difficult-to-type characters hurts people who are searching the text (who I would mostly expect to be readers, not editors). The MOS currently does not allow individual articles to choose which of these characters to use; it actually requires the math-specific ones. I think there's a good argument to be made for consistency for math notation: small typographic differences are sometimes very meaningful in mathematics, and for people just learning a new notation, introducing meaningless typographic differences can create confusion or uncertainty. For example, a bold "N" means something different than a non-bold "N", so does that mean that a thin "∶" means something different than a bigger ":" ? I would expect a professionally-typeset math textbook to use exactly the same characters when they mean exactly the same thing. -- Beland (talk) 19:38, 15 July 2023 (UTC)
      does indeed hurt people using screenreaders – This argument would be much more convincing if we had some mathematically savvy visually impaired readers here saying "specific article X was nicely accessible and I could make complete sense of it except for these particular unicode glyphs which rendered as gibberish". But I haven't actually seen any such testimonies. –jacobolus (t) 23:26, 15 July 2023 (UTC)
      Wikipedia is not written for a mathematically savvy audience, it's written for a general audience. Nor does it seem useless to partially solve a problem even if it cannot be completely solved. What would we gain by using non-ASCII characters that's important enough to make parts of an article unintelligible to some readers? -- Beland (talk) 03:02, 16 July 2023 (UTC)
      You are ignoring my point and redirecting the conversation away from it toward a triviality irrelevant to my argument. What I am hearing from you is, paraphrased, "I, a sighted person, speculate that a hypothetical visually impaired reader using an unknown poorly implemented screen reader might hypothetically run into an accessibility problem on this set of pages because I did an automatic search and found a unicode 'set minus' character". But that doesn't demonstrate an actual harm; it's purely speculative. (Apologies if the language here sounds sharp or seems unfair; it's not intended that way.)
      Can you find a single example of a specific person who tried to read a specific article and was stymied by a specific unicode symbol? If so, how was their experience on the rest of the page? Is this unicode symbol really the problem they are facing, or is it a tiny irrelevancy in a sea of accessibility problems with mathematical formulas? (As a sighted person with very limited screen reader experience, I honestly don't know the answer to that question.)
      I mentioned a mathematically savvy reader because many of the articles you are changing are quite technically demanding and full of jargon, advanced concepts, and tricky formulas. It's not very helpful to ask a layperson or high school student if they have an accessibility problem with a unicode glyph as a one-off, because that person might plausibly just say "WTF is all this gibberish text, I don't understand a thing with or without the formulas or symbols." –jacobolus (t) 03:21, 16 July 2023 (UTC)
      These characters are used in both technical and non-technical articles, so sometimes, yes, it's only one character on a page that isn't read properly to a sixth grader, and sometimes, yes, without the screen reader being able to read <math>...</math> content, the article is almost entirely inaccessible to a college math major. What I don't want is to have a standard of proof so high that we never address any accessibility problems that we can and should have predicted. Nor do I think visually impaired people are the only legitimate users of screen readers, even though it's most important to fix screenreader problems for the people who can't use their eyes as a workaround. We are gathering evidence in the section below on what problems screen readers actually encounter, and in practice we seem to mostly agree on the solutions that should be used for specific characters (and that we need to improve how screen readers handle LaTeX-like markup), so I'm not going to attempt to reach agreement on our philosophical reasons. -- Beland (talk) 17:47, 16 July 2023 (UTC)
      I would expect a professionally-typeset math textbook to use exactly the same characters when they mean exactly the same thing. – This is an argument toward using LaTeX consistently and avoiding raw text or {{math}} templates wherever possible, not an argument for substituting the nearest available ASCII character any time we hit a technical symbol. –jacobolus (t) 23:32, 15 July 2023 (UTC)
      Preferring LaTeX-like markup is in fact the solution we adopted for MOS:BBB. It cannot be used in image captions due to a rendering bug for mobile readers, but otherwise could work. I'd be fine with that for rare characters like U+2216 SET MINUS. Would that be an acceptable resolution for you for that character? Colon for "ratio", tilde for "approximate", and slash for "division" or "fraction" (the latter endorsed by MOS:FRAC) are used on tens of thousands of pages, most of which are not on mathematical topics. I don't think it would be desirable to inject LaTeX-like markup into those pages, because it would require non-mathematical editors to learn an entirely new syntax to express something they could simply type on their keyboard (which is what they would end up doing) and would significantly worsen the experience for people using screen readers on those pages. -- Beland (talk) 03:02, 16 July 2023 (UTC)
      Colon for ratio is fine, as is slash for division (the 'fraction slash' is inappropriate, the 'division sign' is not generally seen outside primary school, and the 'division slash' is usually just a slightly taller slash that (marginally) better aligns for formulas but doesn't really make much practical difference). Tilde for 'similar to' is probably not ideal but might generally work okay. I don't like backslash for 'set minus'. Hyphen or en dash for 'minus' would be entirely unacceptable. 'Multiplication sign' and 'center dot' should be left alone. Sum and product symbols, integral symbols, the partial derivative symbol, etc. should be left alone.
      most of which are not on mathematical topics – Authors on non-technical pages should do whatever seems best to them.
      @David Eppstein has long advocated for using LaTeX markup wherever possible due to font mismatches with other solutions. That seems like the best choice to me personally, but I don't necessarily think we should force any editors or pages to adopt it against their preferences. –jacobolus (t) 03:31, 16 July 2023 (UTC)
    • As an additional point, I've been really, really tired, for quite some time, of hearing how we have to do X or Y to cater to broken screen readers -- though in most cases no one offers any evidence that these broken screen readers actually exist. EEng 01:49, 15 July 2023 (UTC)
      • For MOS:FRAC we got empirical screenreader data from Graham87 which actually allowed us to use a wider variety of characters. Any comment on the above list? -- Beland (talk) 02:03, 15 July 2023 (UTC)
        I tried closing my eyes and using the built-in Mac Voice Over screen reader tool to navigate a handful of technical articles, and my ignorant novice impression was that all of our formulas (both LaTeX and math template) were close to completely unusable. There was no distinction made between numerator/denominator of fractions, no indication of exponents, weird extra navigation hops, non-printing directives read aloud, etc. I found it impossible to make sense of even relatively simple mathematical notation despite already knowing beforehand what it was supposed to look like.
        But I admit I know almost nothing about screen readers. It's plausible that either (a) a better screen reader would do a better job, or (b) an expert user would know how to interpret the results better.
        If we care about screen reader users, I would recommend starting a project to recruit mathematically savvy screen reader users (ideally even some completely blind people) and work directly with them to figure out what their needs are. –jacobolus (t) 02:10, 15 July 2023 (UTC)
        Yeah maths and blindness is a very hard area indeed and way beyond the scope of Wikipedia. VoiceOver isn't the best ... apparently MathML is only supported using Safari on VoiceOver. That link (though a bit out-dated) describes the other options reasonably well. As it says, out of the major Windows screen readers, JAWS is the only one to have mathematical support built-in (it's adequate, a lot better than nothing). As for the proposal at hand, I agree with using ASCII characters where practical; screen reader support for non-ASCII characters has generally improved but whether such a character is spoken across all major screen readers/configurations cannot be guaranteed. Graham87 03:54, 15 July 2023 (UTC)
        The Wikipedia appearance preferences include a checkbox for rendering LaTeX formulas as "LaTeX source (for text browsers)". Note that LaTeX source is generally ASCII. I strongly suspect that using LaTeX, with this option and a screenreader that reads this source, strikes a much better balance between readability for vision-impaired users and accurate and consistent formatting of mathematical notation, than any attempt to substitute SIX-BIT ASCII BECAUSE APPARENTLY WE'RE STILL STUCK IN THE 1960S for properly formatted mathematical notation. —David Eppstein (talk) 20:37, 15 July 2023 (UTC)
        @Graham87 how much mathematical training do you have? Do you know if there are any visually impaired Wikipedians with, say, an undergraduate technical degree who might be able to give feedback about the screen-reader accessibility of our formulas (either those written using math templates with ASCII or Unicode symbols, or LaTeX rendered into SVG/MathML, or raw LaTeX source code)?
        @David Eppstein But I also don't want us to demand that LaTeX source code has to be optimally legible per se. It is often necessary (or at least helpful) to make various kinds of workarounds that somewhat clutter/obscure the source for the sake of improvements to the (visual) legibility of the output. –jacobolus (t) 23:24, 15 July 2023 (UTC)
        I did the second-highest level of maths available at my high school, which consisted of introductory probability, statistics, logarithms, calculus, and combinatorics, and also an externally run mathematical enrichment course for junior high school students where we covered the basics of proof by contradiction/induction and modulo arithmetic (along with the usual early algebra, geometry, and trigonometry). It's very rare for blind people to get even that far in mathematics, to the point that my main maths tutor and I conducted a presentation about my mathematical journey at a national conference on educating blind people. Having said all that, my grasp of that material wasn't always the best and I wasn't always a good student.
        To get a bit more on-topic, of the few blind people who are into maths, I think some really like LaTeX and some really like MathML ... I for one am in the latter camp (when things are working OK) ... but that's what the preferences section is for. Graham87 04:33, 16 July 2023 (UTC)
        Congrats! Math is definitely not generally made easy for folks with visual (or other) impairments. It's hard for me to even imagine manipulating symbolic notation, conceptualizing complicated geometric diagrams, etc. in a non-visual way. Do you think it would help overall to have some kind of plain-text fallback for mathematical notation (written down using prose the way a lecturer might read it aloud)? Or would you rather have some kind of hierarchical layout that the screen reader knows how to navigate that you can traverse using the keyboard or similar? I imagine there are some who are fine with raw LaTeX markup but that's definitely not ideally reader friendly. –jacobolus (t) 05:34, 16 July 2023 (UTC)
        Thanks! I prefer a hierarchical layout (which is what the MathML generates with screen readers). Graham87 06:03, 16 July 2023 (UTC)

Screen reader test

1. This camera has a 3:4 aspect ratio. (ASCII)

2. This camera has a 3∶4 aspect ratio. (non-ASCII)

3. This camera has a <math>3:4</math> aspect ratio. (LaTeX-like)

4. 1/2 the class is on the blue team. (ASCII)

5. 1∕2 the class is on the blue team. (non-ASCII)

6. <math>1/2</math> the class is on the blue team. (LaTeX-like)

7. Add ~1 cup of water (ASCII)

8. Add ∼1 cup of water (non-ASCII)

8B. Add ≈1 cup of water (double tilde)

9. Add <math>\sim 1</math> cup of water (LaTeX-like)

10. Define on V \ 0 the binary relation v ~ w to hold when there exists a nonzero real number t such that v = tw. (ASCII)

11. Define on V ∖ 0 the binary relation v ∼ w to hold when there exists a nonzero real number t such that v = tw. (non-ASCII)

12. Define on <math>V \setminus 0</math> the binary relation <math>v \sim w</math> to hold when there exists a nonzero real number <math>t</math> such that <math>v = tw</math>. (LaTeX-like)
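
Several of the ASCII and non-ASCII test items above look nearly identical on screen, so it can help to confirm which code point a given character actually is. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` and the `PAIRS` list are illustrative assumptions, not part of any Wikipedia tooling.

```python
import unicodedata

# Look-alike pairs from the test items above: (ASCII, non-ASCII).
# This list is an illustrative assumption, not an official inventory.
PAIRS = [
    (":", "\u2236"),   # colon vs. ratio
    ("/", "\u2215"),   # solidus vs. division slash
    ("~", "\u223C"),   # tilde vs. tilde operator
    ("\\", "\u2216"),  # reverse solidus vs. set minus
]


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for ascii_ch, other_ch in PAIRS:
        print(f"{describe(ascii_ch)}  vs.  {describe(other_ch)}")
```

Running the script prints one line per pair, which makes it easy to tell whether a character copied out of an article is the ASCII form or its mathematical look-alike.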

Screen reader test discussion

I frequently listen to Wikipedia articles using the Voice Aloud Reader for Android, so I can do other things at the same time, like gardening. I had that app speak the above section. The LaTeX-like markup is completely ignored, just skipped over. (1) reads "three four aspect ratio" which is intelligible, (2) reads "three to four aspect ratio" which is perfect, (4) reads "one half the class" which is perfect, (5) reads "one division slash two the class" which is intelligible but distracting, (7) reads "approximately one cup" which is perfect, and (8) reads "tilde operator one cup", which is momentarily confusing and only intelligible to people who know what "tilde" means. -- Beland (talk) 03:18, 16 July 2023 (UTC)

Listening to Laplace transform (which I learned about in class at MIT) the <math>...</math> markup is once again completely dropped. Some of the {{math}} markup reads fine, like "t ≥ 0" but "F(s)" is pronounced like a word. Often it's not possible to make sense because punctuation is ignored in expressions like "[0, ∞)". -- Beland (talk) 03:25, 16 July 2023 (UTC)
Right, so we should not make too much effort as individual page authors of highly technical pages to accommodate Android's Voice Aloud per se; any fix for "completely dropped" is going to need to come either at the Mediawiki level or the screen reader implementation level. We could try to pressure both Mediawiki developers and Android developers here though. –jacobolus (t) 03:42, 16 July 2023 (UTC)
Good luck with that. Graham87 04:33, 16 July 2023 (UTC)
I would expect us to use any feasible workarounds to make math expressions accessible in the meantime, but what is it that we would be asking those developers to change in the long run? Apparently there is some technology to turn LaTeX math expressions into speech, though I can't get Chromevox to work on my phone; it seems to be for ChromeOS only. -- Beland (talk) 09:25, 16 July 2023 (UTC)

Here are my results in the latest release versions of the two most common Windows screen readers:

  • With JAWS: ratios are "3 colon 4" (ASCII), "3 ratio 4" (non-ASCII), and "3 colon 4" (LaTeX, on the default punctuation setting). Fractions are "1 slash 2" (ASCII), "1 divided by 2" (non-ASCII), and "1 slash 2" (LaTeX). Approximations are "tilde 1" (ASCII), "tilde operator" (non-ASCII), and "tilde operator" (LaTeX). As for items 10–12, the "V \ 0" is read as "v backslash 0", "∖" is read as "set minus", and the LaTeX-like markup is read the same as the second item.
  • With NVDA: broadly similar, except the LaTeX content isn't read at all (because I don't have a MathML plugin installed ... most people wouldn't) and the non-ASCII "∼" is read as "similar to".

Both these screen readers can have wildly varying configurations about which speech engine is used, which can produce different results. I've heard that the markup within math tags is also ignored by VoiceOver on the iPhone as well as Siri. Therefore we should minimise their use in non-mathematical articles. Graham87 04:33, 16 July 2023 (UTC)

@Beland it sounds like 'similar to' should be preferred to 'tilde' for the mathematical uses with the NVDA reader, but either one should be about the same for JAWS, and for both readers 'set minus' should be preferred to backslash. For non-mathematical uses I would nearly always recommend using English words "roughly", "about", or "approximately" instead of the ~ character. In some contexts the character would be appropriate, but ~ meaning 'roughly' is typically kind of sloppy.
@Graham87 what happens in these screen readers if you toggle the math appearance preferences setting to "LaTeX source (for text browsers)" instead of SVG? –jacobolus (t) 05:09, 16 July 2023 (UTC)
I remember that setting very well because the LaTeX used to be the alt text of the maths images before MathML became the default (I don't miss it). Screen readers will read the LaTeX exactly as it's written ... with punctuation marks like "left brace", "backslash", etc. Graham87 06:03, 16 July 2023 (UTC)
Which, if you are familiar with LaTeX, is probably a pretty good choice. Unfortunately we also need many of our mathematics articles to be understandable by less mathematically-sophisticated users. —David Eppstein (talk) 06:31, 16 July 2023 (UTC)
FWIW, for me, just changing that setting doesn't have an effect on the read-out-loud audio...the app fetches pages on its own, separately from Chrome, so it gets what not-logged-in readers see. There is an option to "load from browser", and if I do that and also know enough to log in and go back to the page of interest, it no longer skips the <math>...</math> parts, but it says weird things, e.g. "one dollar per two dollars" when it sees "$ 1/2 $". Visually, instead of properly typeset formulas I see dollar signs and LaTeX source code, so this is not a setting I can leave turned on. So what happens if math comes up in an article is that I either pull out my phone and read it visually, or just give up on the article and listen to something else. -- Beland (talk) 07:51, 16 July 2023 (UTC)
I was wondering if we could simply add an "alt" attribute to <math>...</math>. Apparently, not anymore. I also found Wikipedia:Rendering math to be a helpful overview for researching technical options, BTW. Adding the CSS at mw:Extension:Math/advancedSettings#CSS for the MathML with SVG fallback mode gets me copy-paste-able MathML output in all browsers; this extension does the same for Firefox only. However, even with the "load in browser" trick, this does not transfer to Voice Aloud, which continues to skip the <math>...</math> parts entirely. Going back to the default setting, it appears this app simply skips images instead of reading the "alt" text (which I do see in the rendered HTML), which is what I would expect it to do, and I can't find any setting that would change that behavior. I can press on an image to see the LaTeX source visually because that's the alt text, but Android "select to speak" (in Accessibility settings) and "read aloud" (after highlighting) don't read that. Instead they read some of the ASCII characters but leave out punctuation and items like "integral" and "infinity". -- Beland (talk) 09:25, 16 July 2023 (UTC)
Orca is built into Gnome, and when I use that with Firefox, it seems to almost entirely ignore MathML and SVG alt text. It can read the LaTeX source if I set my preferences to that, but it's so full of punctuation that I have no idea what it's saying (having it read the first equation on Laplace transform.) -- Beland (talk) 09:58, 16 July 2023 (UTC)

I just added a version with "alt" set on the LaTeX elements, but it doesn't seem to actually work. That is, the alt attribute inside is not set to what I suggested, but is instead still set to the LaTeX source. (Also, Mac VoiceOver at least doesn't seem to pay any attention to it.) Perhaps Help:Displaying a formula should be altered to indicate that trying to explicitly set 'alt' doesn't do anything. –jacobolus (t) 09:05, 16 July 2023 (UTC)

It already does, in Help:Displaying a formula#Rendering. -- Beland (talk) 09:26, 16 July 2023 (UTC)
If it is now 6+ years out of date, we should just remove the whole paragraph. –jacobolus (t) 11:20, 16 July 2023 (UTC)
FTR, jacobolus has removed it. -- Beland (talk) 17:42, 18 July 2023 (UTC)

Summing up

It sounds like in some of the listed cases, ASCII is preferred, and in others full <math>...</math> markup (with the presumption of lobbying for better screen reader support)? No one has complained about the guidance on prime and asterisk, so putting that all together, how do people feel about the below?


Other

Outside of <math>...</math> markup:

  • Use U+2032 PRIME or U+2033 DOUBLE PRIME where the prime symbol is appropriate; do not use the ASCII apostrophe (') or double quote (") in these cases.
  • Use U+002A * ASTERISK when the character should render like a superscript, and is typical when used as a postfix. This character also appears on keyboards, and is thus easier to type and search for. Example: C*-algebra. Use U+2217 ASTERISK OPERATOR (&lowast;) for subscripts and when the bottom of the character should roughly align with the baseline of neighboring characters, which is typical when used as a prefix or infix operator, or a standalone character. Usage should be consistent across articles covering the same subfield of mathematics; see Asterisk#Mathematics for a canonical list.
  • Use U+002F / SOLIDUS instead of U+2215 DIVISION SLASH
  • Use U+003A : COLON instead of U+2236 RATIO
  • Use U+007E ~ TILDE instead of U+223C TILDE OPERATOR for "similar to" and "≈" for "approximately" in mathematical formulas
  • Use <math>...</math> markup instead of U+2216 SET MINUS or U+005C \ REVERSE SOLIDUS for set subtraction.
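
A draft list like the one above could be checked against article text mechanically. The following is a minimal sketch, assuming a hypothetical helper `find_discouraged` and a `SUGGESTED` mapping copied from the draft bullets; it is not an existing Wikipedia tool, and it only reports occurrences rather than changing anything. Primes and asterisks are left out because choosing between them depends on context.

```python
import unicodedata

# Hypothetical mapping copied from the draft list above:
# discouraged character -> suggested replacement outside <math> markup.
SUGGESTED = {
    "\u2215": "/",                         # DIVISION SLASH -> SOLIDUS
    "\u2236": ":",                         # RATIO -> COLON
    "\u223C": "~",                         # TILDE OPERATOR -> TILDE
    "\u2216": "<math>\\setminus</math>",   # SET MINUS -> LaTeX-like markup
}


def find_discouraged(text):
    """Return (index, character, Unicode name, suggestion) for each hit."""
    hits = []
    for index, ch in enumerate(text):
        if ch in SUGGESTED:
            hits.append((index, ch, unicodedata.name(ch), SUGGESTED[ch]))
    return hits


if __name__ == "__main__":
    sample = "This camera has a 3\u22364 aspect ratio."
    for index, ch, name, suggestion in find_discouraged(sample):
        print(f"{index}: U+{ord(ch):04X} {name} -> {suggestion}")
```

Because the sketch only lists positions and suggestions, an editor still decides case by case whether a replacement is appropriate.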

I introduced "≈" here...it's read for me as "approximately equal" by Voice Aloud; the built-in Android Select to Speak skips both "~" and "≈". -- Beland (talk) 18:08, 18 July 2023 (UTC)

In mathematics, "~" and "≈" often have different meanings. "≈" unambiguously means "approximately equals". "~", instead, can mean "has the probability distribution" or "is asymptotically equivalent to". It can also sometimes be used for logical negation. I don't think we should be advising people to use them interchangeably. So I think we need greater clarity that the "similar to" above refers only to Similarity (geometry), and not some general colloquial meaning of being somewhat alike. Additionally, those other usages of ~ could be mentioned. —David Eppstein (talk) 18:14, 18 July 2023 (UTC)
I think this list should be less prescriptive/demanding, and bots or script-assisted editors should be discouraged from making mass changes to implement any recommendations.
I don't think it's helpful to insist on a preference for ~ 'tilde' vs. ∼ 'tilde operator' [note unicode synonyms: varies with (proportional to), difference between, similar to, not, cycle, APL tilde]. I would instead recommend altogether avoiding the ~ 'tilde' character, using either ≈ 'almost equal to' [unicode synonym: asymptotic to] or ≅ 'approximately equal to' to indicate an approximate value or approximate equality, and sticking to ∼ 'tilde operator' in technical contexts where an operator is needed (it's unfortunate that the glyph is not included in the default text or math font used in Wikipedia skins, and doesn't really appear as intended in the fallback fonts on most platforms; it should look approximately like the LaTeX <math>\sim</math> rather than like ~; the glyph included in STIX fonts is much better).
I would leave 'set minus' out of this list, but if you want to include it, directly mention <math>... \setminus ...</math> or <math>... \smallsetminus ...</math> (the choice of which should be left to author preference / local consensus), and then recommend that ∖ 'set minus' is fine in places where LaTeX is technically infeasible. –jacobolus (t) 18:25, 18 July 2023 (UTC)
While we're on this subject, I wonder if we could press for Wikipedia's CSS to include STIX fonts for {{math}} templates. They are designed to match the Times New Roman we currently specify as a math font, but with much broader coverage of technical symbols, meaning unicode math symbols would appear much more consistent and correct across platforms. (using the STIX 2.0 font might be even better, as it's slightly closer to LaTeX's computer modern).–jacobolus (t) 18:43, 18 July 2023 (UTC)
Good points on tilde. Trying to rewrite that to list all the mathematical uses got quite cumbersome, so I'm thinking a link to our own articles might be better. It does seem like a good idea to avoid the tilde in contexts where it could be ambiguous or there are more specific alternatives. How about this phrasing:
Care should be taken to use other symbols to mean "approximately" (and ¬ for negation) in contexts where tilde has another mathematical meaning.
I do think it is problematic for people trying to learn about the notation, search in a page, or use a search engine to have two different characters to mean exactly the same thing, so for consistency I think for mathematical expressions where tilde is used as an operator, we should pick one of U+007E ~ TILDE, U+223C TILDE OPERATOR, or <math>...</math> markup.
Readers and editors are already familiar with typing and searching for the ASCII tilde; it is used over 40,000 times in English Wikipedia (not counting URLs). ( is used about 10,000 times.) The tilde operator is not on anyone's keyboard and isn't in the pull-down of math characters. It appears even math editors preferentially use the ASCII tilde; for math expressions with operators, I can find only 21 articles that use the tilde operator but 80-100 that use ASCII tilde. Using the tilde operator character would create search difficulties for the majority who naturally type the ASCII tilde. -- Beland (talk) 22:02, 21 July 2023 (UTC)
Well, I put together all the feedback from above, polished a few things up, and updated the guideline page; I'm open to tweaks as needed. It sounds like it's now possible to get screenreaders to read MathML properly. Would it be helpful to make something like Template:Contains special characters that links to a guide for configuring the popular screen reading software to do that? We'd presumably have to start a new page to be the link target, unless there's one floating around somewhere? -- Beland (talk) 19:51, 25 July 2023 (UTC)
I don't think your changes reflect consensus. The recommendations are too dogmatic, and the demands are not technically necessary. –jacobolus (t) 09:07, 6 August 2023 (UTC)

Blackboard bold caption restriction

I added the following:

Due to bug T263572, <math>...</math> markup cannot be used in image captions; use regular bold throughout the article instead.

35.139.154.158 reverted with the edit summary:

it *can* be used in captions, just not in the image viewer; so it should be avoided there....avoiding in the rest of the article is a non sequitur though

I'm not sure that IP editors can receive pings, but here goes anyway...

It's not possible to put these characters in captions without them showing up in the image viewer for mobile readers, so I consider those to be equivalent. I'm open to an alternative phrasing if you wish. A previous paragraph reads "each article should be consistent with itself", so if we must use regular bold in image captions, then that implies we must use it consistently throughout the article? -- Beland (talk) 03:14, 13 July 2023 (UTC)

Having gotten no reply, I've re-added this note, but rephrased in light of the above points. -- Beland (talk) 20:47, 5 August 2023 (UTC)
<math>...</math> markup cannot be used in image captions is too restrictive, imo. Apparently, you just mean blackboard bold inside <math>...</math>, don't you? - Jochen Burghardt (talk) 07:48, 6 August 2023 (UTC)
No, <math> tags cannot be used whatsoever inside image captions because the math notation does not render in the full-window popup image view that opens when a reader clicks the image. You can click on the bug report linked above to read about it. –jacobolus (t) 08:44, 6 August 2023 (UTC)
In that case, the warning should be given in a more prominent place on Wikipedia:Manual_of_Style/Mathematics#Using_LaTeX_markup, preferably before the start of the first subsection ("Deprecated formatting"). I guess a great number of captions in math articles are affected, and I do not understand why bug T263572 has been assigned a low priority. - Jochen Burghardt (talk) 11:18, 6 August 2023 (UTC)
Your change is very confusing. I would personally recommend editors just use unicode symbols in image captions. They render more or less okay on nearly all devices nowadays. –jacobolus (t) 08:49, 6 August 2023 (UTC)