User:Kayau/Why Wikipedia's syntax coverage sucks

Wikipedia's coverage of syntax sucks, in many ways and to different degrees. At least half the articles would benefit from a complete rewrite. Relative clause and some of the grammatical category articles are probably the only ones of reasonable quality.

Morphology (which overlaps a lot with syntax anyway) and phonology are not much better, but if one ventures far enough from syntax, into the domain of phonetics, one immediately finds a collection of fine articles that accurately summarise linguists' work on the subject. The same goes for other sciences like statistics, where detailed articles are easy to find. This essay will not speculate on why this is the case, but will rather point out how the syntax articles are bad and suggest ways to fix them.

The problems with syntax articles

There are, broadly, at least six huge problems in Wikipedia's syntactic coverage, plus two smaller ones:

  1. Gaps in coverage. Minor syntactic phenomena are often not covered or only given stubs, e.g. Heavy NP shift. In some cases, major phenomena such as clause chaining and locative inversion have no article at all (as of January 2017). Lack of comprehensiveness is clearly against Wikipedia's goal of covering all notable topics.
  2. Eurocentrism. In many cases, articles list only English examples. Those with huge lists of examples generally have huge Indo-European lists, plus a few Finno-Ugric, Semitic or East Asian languages. Moreover, structures found in European languages are sometimes taken to be central or prototypical, while others are given short shrift. Control (linguistics) gives no hint of the fact that control varies greatly across languages, for instance. Where are Kroeger's analysis of Tagalog and Givón's analysis of Ute?
  3. Articles are about languages, not language. Lists of examples should not be created solely for the sake of listing examples. Adding Chinese to Reflexive verbs is a good move, because it is an important language where long-distance 'anaphora' was identified. Adding all six major Romance languages is generally pointless. The focus should be on the linguistic study of syntactic phenomena. One might argue that we're writing for a general audience rather than linguists/linguistics students, and that dedicating most of the space to major languages will therefore be beneficial; but this, again, is just another manifestation of Eurocentrism. Examples should be used to shed light on syntactic phenomena and typological patterns.
  4. Lack of theoretical treatments. This is not to say that all analyses ever published in the NLLT deserve mentions. The majority do not. However, when influential analyses like Austin and Bresnan (1996) get no more than a sentence in Non-configurational language while dependency grammar gets a full section, there is a problem. In fact, it almost looks as if Timothy Osborne has gone through the majority of Wikipedia's syntax articles, adding DG analyses, while theoreticians from every other framework have ignored WP. This is a huge problem. Hengeveld's FDG account of non-verbal predication belongs on Copula (linguistics), and ditransitive verb needs Larson's account (nonsense though it is). This is not to denigrate DG as a framework. It is hugely influential in computational linguistics, so if a construction poses a notable challenge in the field, the computational challenge - along with DG solutions - ought to be presented. But there is little need to go into detailed theoretical DG analyses in subjects where little DG work has been done (and the work that has been done is rarely cited).
  5. Lack of synchronic typology. How can Split ergativity not mention the Silverstein hierarchy? Ironically, Bybee was cited in grammatical category not for her discovery of the iconicity involved in the marking of categories, but for her summary of generative perspectives (!) on grammar. Typology is a fruitful and important line of research, and must not be excluded.
  6. Lack of diachrony. Recurrent paths of grammaticalisation have been a central area of linguistic research since Heine and Reh. For syntax articles to exclude diachrony is totally unacceptable, and is another symptom of the 'languages, rather than language' disease prevalent in syntax articles.
  7. Lack of psycholinguistics. In many cases, influential work has been done on the acquisition, processing or production of syntactic constructions. This work ought to be included in the relevant article whenever possible.
  8. Primary sources are used. Fieldworkers' grammars, foundational work, original NLLT papers and other related material are nice to put in the bibliography section if they are influential, but WP articles should be based primarily on secondary sources. Syntax textbooks, handbooks and reviews should be our main sources of info - not Chomsky (1957) or Dixon (1972).

Hall of shame

It must be noted that this hall of shame is not actually intended to shame articles or their contributors. The two articles listed are typical of WP's syntax coverage. They are presented only as exemplars.

Nominalisation

Nominalisation is not purely a morphological phenomenon. Clauses can be nominalised, and so much research has been devoted to clausal nominalisation in both the formalist and functionalist traditions (including Chomsky, 1970, one of the most-cited works in syntax) that it would be unthinkable to exclude all of it. Yet this is precisely what the article has done, except for a brief, unreferenced note on Eastern Shina. Where is the DP-IP parallel? Where are the formalist and typological-diachronic explanations for it? Where is the section on the diachronic origins of nominalisation markers? The relation between nominalisation and finiteness? None of this was covered.

Case (grammar)

At the very beginning of the article, case is defined as marking grammatical functions only. This is clearly false, as case can mark thematic roles in many languages (as the lead itself concedes). Quirky subject was not mentioned or even linked to. There was practically no theory. Case grammar and lexicase grammar, and how their conceptions of case affected the general theory of linguistics, are missing. The Chomskyan notion of 'abstract case' was not mentioned - again, it may be nonsense, but it's influential enough to merit a mention. There is, as usual for grammatical categories, a huge list of languages with case, but languages that are actually important for the general theory of case - like, say, Icelandic (quirky case) or Warlpiri (where case is closely related to discontinuous constituents) - are only given brief mentions with no regard to their significance.

The way out

There is no solution other than to go through all of WP's syntax articles and revamp every single one, addressing each of the points made above. It's not an easy thing to do, but with a group of passionate syntax nerds, it can be done, bit by bit. Relative clause, despite its long list of examples, scores better than most other WP syntax articles in many respects, and can be used, pro tem, as a model. But the huge influence of Keenan and Comrie (1977) makes this article, in many respects, easier than the rest to write, and we must bear in mind that other topics are not quite so simple.