Wikipedia:Manual of Style/Glossaries/DD bug test cases

This page provides test cases for, an explanation of, and information on working around, a long-standing MediaWiki bug with the <dd>...</dd> HTML element (the description/definition part of a description list, also known as a definition list or association list).

Overview

Due to MediaWiki's very brittle handling of HTML list structures generally (in particular, the inability to handle multiple paragraphs except under a very specific circumstance, outlined below), the ; and : wikimarkup features cannot be used to produce any but simplistic and easily-broken description lists, unlike real HTML. Some of the interrelated bug reports include: 1584 (unresolved as of September 2014), 5178 (unresolved as of September 2014), 6200, etc.

The main problem is detailed below, but there are also other problems with the wikimarkup version of definition lists. Bug reports include: 6776, 11894, and many others.[needs update]

It is unlikely that all these problems will be fixed any time within the next several years, especially as they are seen as "features" by some, including the termination of a list item by a line break. (See detailed comments in the bug reports for more information.)

Examination of the main problem

In ; and : markup, if a single description/definition (or term, for that matter) uses multiple paragraphs, it is necessary to specify the paragraphs explicitly with a <p> or the full <p>...</p> paragraph markup, and to do so without line breaks. We cannot just do the seemingly rational thing of creating a second description line for a second paragraph, as this indicates two separate definitions, not a multi-paragraph definition, and doing this causes all sorts of problems.

Due to bugs in (or alleged features of) the MediaWiki software (see above), the paragraphs must not be separated by a newline either inside or outside of the paragraph, in wikimarkup-based list items such as descriptions/definitions:

;term 1
:<p>Part of definition of term 1.</p><p>More of definition of term 1.</p>

or more compactly:

;term 1
:Part of definition of term 1.<p>More of definition of term 1.</p>

and it is highly unlikely that any regularly edited text would retain such precious formatting for long before someone broke it. The same holds true of the use (again without linebreaks) of <br /><br />, as suggested (along with <p>...</p>) at Help:List, which leads to accessibility problems anyway: text visually broken with simply <br /> line breaks this way will not be treated as separate paragraphs by screen readers.
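For illustration, the <br /><br /> variant mentioned above might look like the following sketch, which reuses the placeholder text from the earlier examples. It is shown only to make the pattern concrete, not as a recommendation: the result is still one run of text with a visual gap, not two paragraphs.

```wikitext
;term 1
:Part of definition of term 1.<br /><br />More of definition of term 1.
```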

To be clear, this example does not work:

;term 1
:<p>
Part of the definition of term 1.
</p><p>
More of definition of term 1.
</p>

and neither does this one:

;term 1
:<p>Part of definition of term 1.</p>
<p>More of definition of term 1.</p>

nor anything like them (see test cases below).

For the same reason, if a :-initiated definition/description requires an indented segment, one has to use something like <span style="margin-left: 1.6em;">...</span> (1.6em is how far rightward a MediaWiki-generated list item is indented) around it to get valid code as the result, and must butt those tags against any preceding or following paragraph tags in the same definition. Trying to use <blockquote>...</blockquote> for this would be semantically wrong (unless the content actually consists of a quotation), and would still have this no-line-breaking requirement.
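A minimal sketch of the indented-segment approach just described, using placeholder text. The exact rendering is an assumption and may vary by skin. The point is that the span is butted directly against the adjacent paragraph tags, with no line break anywhere inside the definition:

```wikitext
;term 1
:<p>Beginning of definition.</p><span style="margin-left: 1.6em;">Indented text in definition.</span><p>Conclusion of definition.</p>
```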

Help:List's suggestion to use : as an indent produces semantically invalid markup, which will also not validate because it is missing the <dt> element.[1] While this is not an enormous concern with quick-and-dirty wikimarkup lists, its poorly accessible and poorly reusable output would be a major problem for structured glossaries:

Sloppy code:

{{glossary}}
{{term|term 1}}
{{defn|1=Beginning of definition.
:Indented text in definition.
Conclusion of definition.
}}
{{glossend}}

Looks okay...

term 1
Beginning of definition.
Indented text in definition.
Conclusion of definition.

But output is poor:

<dl>
<dt>term 1</dt>
<dd>Beginning of definition.
<dl>
<dd>Indented text in definition.</dd>
</dl>
Conclusion of definition.</dd>
</dl>

The rendered output of this looks correct to the fully-sighted human reader in a graphical browser, but in reality, the MediaWiki parser has made the :-"indented" item into an entire new definition list (glossary), which is broken anyway (in having no term), as well as defeating the intent of the original list it's embedded in! Wikimarkup definition list structures cannot be mixed with structured markup.

The brittle list handling "bug/feature" is quite general, and also affects ordered (#) and unordered (*) lists.

From the perspective of non-technical editors, simple tricks can be used to make the code more readable for some editors. One such kluge is to put line breaks between definitions, giving the false appearance of a well-spaced list, when in reality MediaWiki creates a whole slew of pointless micro-lists, ruining the semantic value and accessibility that were the point of using definition list markup in the first place. Another is to code a multi-paragraph definition as multiple definitions, but write the prose as if it were a single definition in two paragraphs. And, as already noted, this just blatantly falsifies the semantic markup, resulting in pretty-looking but technically awful output.
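As a sketch, the two kluges described above might look like this in the edit window (all terms and definitions are placeholders). For terms 1 and 2, the blank line between the entries ends one list and starts another, so each entry becomes its own micro-list. For term 3, each : line becomes a separate definition even though the prose reads as one definition in two paragraphs:

```wikitext
;term 1
:Definition of term 1.

;term 2
:Definition of term 2.

;term 3
:First paragraph of the definition of term 3.
:Second paragraph of the same definition, marked up as if it were a separate definition.
```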

Such hacks will lead to confusing Wikipedia code (that other editors are likely to correct anyway), redundant MediaWiki output, and blatantly invalid HTML, among other problems, resulting in accessibility and usability issues.

To repeat: It just doesn't work, due to MediaWiki "bug/features", but there is an easy solution as shown below.

NB: Replacing : with a real <dd>...</dd> structure has no effect if ; is used, or vice-versa with <dt>...</dt> and :. The entire structure must be entirely HTML in order to function properly. Which brings us to...

Workaround

The workaround, as illustrated below, is to abandon ; and : entirely for any case in which one intends to produce rich definition lists, including glossaries, and instead use pure HTML markup: <dl><dt>...</dt><dd>...</dd>...</dl>. And there is no reason to use HTML manually to produce a glossary when the easy-to-use structured glossary templates will do this for you, and do it consistently with other glossary articles. For non-structured glossaries, use bullet lists or use subheadings and plain-text entries.

Test cases

<dl>
;term 1
:<p>
This is part of the definition.
</p><p>
This is more of the definition.
</p>
</dl>
term 1

This is part of the definition.

This is more of the definition.


Failure: Code invalid (two <dl> elements created, one broken by having sub-elements other than <dt> or <dd> directly inside). Visually, definitions not indented.

<dl>
;term 2
:<p>This is part of the definition.</p>
<p>This is more of the definition.</p>
</dl>
term 2

This is part of the definition.

This is more of the definition.


Failure: Code broken (<dl> terminates early and the continuance of the definition is not part of the list). Visually, only one definition indented.

<dl>
;term 3
<dd>
<p>
This is part of the definition.
</p><p>
This is more of the definition.
</p>
</dd>
</dl>
term 3

This is part of the definition.

This is more of the definition.


Failure: Definitions not indented.

<dl>
;term 4
<dd>
<p>This is part of the definition.</p>
<p>This is more of the definition.</p>
</dd>
</dl>
term 4

This is part of the definition.

This is more of the definition.


Failure: Definitions not indented.

<dl>
;term 5
<dd><p>
This is part of the definition.
</p><p>
This is more of the definition.
</p></dd>
</dl>
term 5

This is part of the definition.

This is more of the definition.


Failure: Definitions not indented.

<dl>
;term 6
<dd><p>This is part of the definition.</p>
<p>This is more of the definition.</p></dd>
</dl>
term 6

This is part of the definition.

This is more of the definition.


Failure: Definitions not indented.

<dl>
;term 7
<dd>This is the entire definition.</dd>
</dl>
term 7
This is the entire definition.

Failure: Definition not indented.

<dl>
;term 8
<dd><p>This is the entire definition.</p></dd>
</dl>
term 8

This is the entire definition.


Failure: Definition not indented.

<dl>
;term 9
:<p>This is part of the definition.</p><p>This is more of the definition.</p>
</dl>
term 9

This is part of the definition.

This is more of the definition.


Poor: Indentation is typical WP style for this markup, but undesirable for proper use of the tags to create a glossary, because even the term is indented.

<dl>
;term 10
<dd><p>This is part of the definition.</p><p>This is more of the definition.</p></dd>
</dl>
term 10

This is part of the definition.

This is more of the definition.


Failure: Definitions not indented.

<dl>
<dt>term 11</dt>
<dd><p>This is part of the definition.</p><p>This is more of the definition.</p></dd>
</dl>
term 11

This is part of the definition.

This is more of the definition.


Poor: Undesirable indentation of the term is gone, so it looks right, but it requires </p><p> on the same line, which is easily broken.

<dl>
<dt>term 12</dt>
<dd>1. This is the first definition.</dd>
<dd><p>2. This is part of the second definition.</p>
<p>This is more of the second definition.</p></dd>
</dl>

term 12
1. This is the first definition.

2. This is part of the second definition.

This is more of the second definition.


Success: Perfect, and supports <p> properly, which means it will also support <blockquote>, nested lists, etc.

<dl>
<dt>term 13</dt>
:<p>This is part of the definition.</p>
<p>This is more of the definition.</p></dd>
</dl>
term 13

This is part of the definition.

This is more of the definition.


Failure: Only one definition indented.

<dl>
<dt>term 14</dt>

<dd>1. This is the first definition.</dd>

<dd>
2. This is part of the second definition.

This is more of the second definition.
</dd>
</dl>
term 14
1. This is the first definition.
2. This is part of the second definition. This is more of the second definition.

Failure: Lack of auto-markup of paragraphs. This used to work perfectly, because MediaWiki would auto-generate paragraph markup for plain text entered on isolated lines, even inside a <dd>...</dd> wrapper, and didn't have any problem with definitions spaced apart from terms, either. This broke in late 2013.

<dl>
<dt>term 15</dt>
<dd>1. This is the first definition.
<dd><p>2. This is part of the second definition.</p>
<p>This is more of the second definition.</p></dd>
</dl>
term 15
1. This is the first definition.

2. This is part of the second definition.

This is more of the second definition.


Success: Perfect, even with a two-fold coding error (missing </p></dd> on the first definition)

<dl>
<dt>term 16</dt>
<dd>1. This is the first definition.</dd>
<dd>2. This is part of the second definition.
<p>This is more of the second definition.</p></dd>
</dl>
term 16
1. This is the first definition.
2. This is part of the second definition.

This is more of the second definition.


Acceptable: Vertical spacing isn't quite right, but with real paragraphs of content in place, no one would really notice or care.

<dl>
<dt>term 17</dt>
<dd>1. This is the first definition.</dd>
<dd><p>2. This is part of the second definition.</p>
<p>This is more of the second definition.</p></dd>
</dl>
term 17
1. This is the first definition.

2. This is part of the second definition.

This is more of the second definition.


Success: Perfect; no p markup required on single-paragraph entries.

<dl>
<dt>term 18</dt>
<dd>1. This is the first definition.</dd>
<dd><p>2. This is part of the second definition.</p>
<p>This is more of the second definition.</p></dd>
<dd>3. This is a third, complex definition:
{{gbq|With a block quotation.}}
Another paragraph, and
* An embedded list
* More list
Conclusion of definition 3.</dd>
<dd>4. Fourth definition, with blank line

to cause paragraph break.</dd>
</dl>
term 18
1. This is the first definition.

2. This is part of the second definition.

This is more of the second definition.

3. This is a third, complex definition:

With a block quotation.

Another paragraph, and

  • An embedded list
  • More list
Conclusion of definition 3.
4. Fourth definition, with blank line to cause paragraph break.

Success: Perfect – everything works as expected.

{{glossary}}
{{term|term A}}
{{defn|This is the definition.}}
{{term|term B}}
{{defn|{{ghat|A hatnote.}}
1. This is the first definition.}}
{{defn|<p>2. This is part of the second definition.</p>
<p>This is more of the second definition.</p>
}}
{{defn|3. This is a third, complex definition:
{{gbq|With a block quotation.}}
Another paragraph, and
* An embedded list
* More list
Conclusion of definition 3.
}}
{{defn|4. Fourth definition, with blank line

to cause paragraph break.}}
{{term|term C}}
{{defn|This is the definition.}}
{{glossary end}}
term A
This is the definition.
term B

A hatnote.

1. This is the first definition.

2. This is part of the second definition.

This is more of the second definition.

3. This is a third, complex definition:

With a block quotation.

Another paragraph, and

  • An embedded list
  • More list

Conclusion of definition 3.

4. Fourth definition, with blank line to cause paragraph break.
term C
This is the definition.

Success: Perfect – everything works as expected, using templated version of code.

Developer views

In the process of working on the now-closed MediaWiki bug report Phabricator: T3584 (formerly bugzilla 1584, now tracked as T11996), one developer commented:

"Real HTML <li>[...]</li> tags are very rarely used, and then only by HTML-savvy editors who want to apply a class or style attribute, or by editors who cut-and-paste some existing HTML code. Since block or inline content is allowed in list items, this would be the way to allow correct and more complex formatting of the contents of list items, but at the cost of complex and unusual markup in the edit field, incompatible with wikitext lists."

References

  1. "Markup Validation Service: Check the markup (HTML, XHTML, …) of Web documents". Validator.W3.org. v1.3+hg. World Wide Web Consortium. 2017. Retrieved December 13, 2017.