Wikipedia talk:Wikipedia Signpost/2012-09-10/News and notes

Discuss this story

Is there no simple way for editors to find out how complex their writing is? Lotus Word Pro used to have the means for doing this, but I don't know of any other word processors that can do it. Maybe we might have a gizmo as part of the edit page. Apwoolrich (talk) 08:40, 11 September 2012 (UTC)
I had not read to the bottom of the article when writing the above! Apwoolrich (talk) 08:43, 11 September 2012 (UTC)

You definitely used to be able to do it using MS Word too. I doubt they removed the functionality. Egg Centric 09:27, 11 September 2012 (UTC)

MSWord in Office 2011/Mac has a "Show readability statistics" option in the spelling/grammar preferences (results are displayed as part of the grammar-checker tool). DMacks (talk) 13:08, 11 September 2012 (UTC)

The demo site is nice, but I think you're on the right track to suggest something easier to access from WP directly would be helpful. I checked a few of "my" articles and was shocked at how low the scores were. I'll definitely make a conscious effort to check the readability of my contributions in the future. Matt Deres (talk) 10:58, 11 September 2012 (UTC)

I find the Readability Calculator at online-utility.org (recommended at Simple Wikipedia) to be a big help; it gives other readability indices in addition to Flesch. It even tells you which sentences particularly require improvement. Having said that, writing "simply" is fiendishly difficult. I've written a few articles at Simple Wikipedia and really struggle to get the Flesch above 65. This one has a Flesch of 79 and it took a lot of work to get it there. Voceditenore (talk) 13:12, 11 September 2012 (UTC)
A grade school primer ("Dick can run. Jane can run, too. See Dick and Jane run!") scores very high in "readability," but an encyclopedia written thus would sound idiotic. How do the scholarly encyclopedia Britannica and the children's encyclopedia World Book score? Edison (talk) 14:58, 11 September 2012 (UTC)
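For anyone who wants to see what the Flesch number actually measures, here is a minimal sketch in Python. It uses the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word). The function names and the syllable counter are assumptions made for illustration: the counter is a crude vowel-group heuristic, so real tools (which use dictionaries or better rules) will give somewhat different scores on ordinary prose.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count runs of vowels and drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease from sentence, word and syllable counts."""
    # Treat runs of '.', '!' or '?' as sentence boundaries.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with an internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


primer = "Dick can run. Jane can run, too. See Dick and Jane run!"
print(round(flesch_reading_ease(primer), 3))
```

The primer has 12 one-syllable words in 3 sentences, so the formula gives 206.835 - 1.015 × 4 - 84.6 × 1, which is about 118. The scale is not capped at 100, which rather supports the point that a very high score says nothing about whether the text is worth reading.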
I edited Simple English Wikipedia for a couple of years, and drop back occasionally. There are several ongoing problems:

There is a recommended word list. The words on the list were devised by someone who thought they knew what words people really needed. Not all the words on the list are "easy", and not all are "essential". In fact, many of the words that are absolutely essential for encyclopedic writing are left out. I did make up a list, but don't have it to hand.

To write successful Simple English, one needs to have a very good grasp of the language. One needs to have the flexibility to reduce complex ideas to simple language. Basically, to do it well one needs to have experience in writing for children or for people who are learning English.

Some people who author those pages have limited language skills. Some of them with whom I have communicated have seen themselves as well suited for the purpose for the very reason that they had limited English.

Having a background in teaching, and specifically in writing educational material for a museum, I know that this isn't the case.

Clarity, in simple words, is not easy to achieve. This is demonstrated by the fact that Simple Wiki editors sometimes string together a dozen one-syllable words in a non-grammatical way, just to avoid using one three- or four-syllable word correctly.

Among the requirements for writing good simple English articles are:

Deciding what vocabulary is essential for an article. This needs determining for every main subject area, and sometimes every individual topic. It may include some complex words that need to be explained within the context of the article itself.

Thinking around subjects for the easiest way to express them. The easy way doesn't always comply with the MOS of English Wikipedia.

I believe that Simple English Wikipedia could be very useful, if well done. But, unfortunately, the way to get it well done is to get it done by experts.

And that defeats the purpose of Wikipedia. What we need is teamwork, because the people who have the knowledge, the people who are prepared to research the subjects, and the people who are good at writing simple, readable language are not always the same people.

Amandajm (talk) 15:40, 11 September 2012 (UTC)

An excellent analysis, Amandajm! Re Edison's query about how WP compares to children's encyclopedias. Britannica has the first paragraphs of entries in its Children's Encyclopedia (Ages 6-10) online. I compared the Flesch scores for "Navy" and "Frog" with the equivalent number of sentences from the first paragraphs in the Wikipedia and Simple English Wikipedia articles:

Navy

Children's Britannica = 79

Simple English Wikipedia = 76

Wikipedia = 18 (!!!)

Frog

Children's Britannica = 93

Simple English Wikipedia = 63

Wikipedia = 51

– Voceditenore (talk) 16:20, 11 September 2012 (UTC)

I will have to pay attention to this as well (and, thank you for bringing this to my attention): I plugged in six lengthy passages of personal writing (some used on Wikipedia, some not, although my style is similar in both), and came back with Flesch Reading Ease scores between 7.09 [edit: 1.33 for a page out of my thesis - it doesn't get any worse than that!] and 27.28 - but I have always believed my writing was clear and understandable! My scores on the other metrics were equally abysmal, the worst being a 29.68 on the Gunning Fog index (that's, according to Wikipedia, "post-post-post-post-post-doctoral"); a 16.03 on the Coleman-Liau; a 24.48 on the Flesch-Kincaid grade level; a 28.45 on the Automated Readability Index; and a 21.11 on the SMOG index (oddly enough for numerologists, another piece of my writing was a 17.77). I plugged in some of Gibbon's Decline and Fall and it scored marginally better than my average. I take it from the above comments that these are abysmally abysmal scores, and that my writing is essentially nonsensical to a large segment of the population. Any tips or suggestions on how to bring my writing into, at the least, the Flesch Reading Ease 50-60 level? I tried explaining some difficult philosophical and theological concepts in simple terms and scored equally poorly, although I managed to get a 48.21 writing about the Fall of Constantinople. St John Chrysostom ^Δόξα_τω^Θεώ 21:06, 11 September 2012 (UTC)

While I don't question the conclusion per se, the use of such a primitive metric is suspect. You can't measure how clear or accessible a piece is by sentence and syllable counts, but the difference to the reader is enormous. Simple routinely defines big words and then uses them repeatedly - this inflates the metric but doesn't pose any serious conceptual challenge to the reader. That said, Simple does need help with expansion and simplifying in a lot of areas if it is to keep growing while achieving its goal, rather than turning into an awkward subset of En. Dcoetzee 21:22, 11 September 2012 (UTC)

See Wikipedia:Village pump (proposals)/Archive 18#Suggestion: readability test(s) for Wikipedia articles (January 2008).

—Wavelength (talk) 21:24, 11 September 2012 (UTC)

To move away from the feature, I would advise users to go and have a look at the submissions for the new main page. I was pleasantly surprised, when I reviewed the submissions, due to a particular standout revision of the main page that caught my eye. --Izno (talk) 23:52, 11 September 2012 (UTC)

This comment copied from SEWP. This is what I like to call a typical "So what?" research article. The data is freely available, and the tools to generate some numbers are easy to use. But, so what? What is the actual relationship between Flesch Reading Ease and readability? They fail to demonstrate that, so the numbers don't mean anything to actual readers. Quote: "the whole concept of readability cannot be covered". Then don't present this as having anything to do with serving readers.

Plus, their methodology is pretty weak. They remove all of the headings and incomplete sentences. Yet, any editor knows that good use of headings and image captions can make a text much easier to understand. But, if they don't do that, their handy, convenient measurement won't work. Pretty poor. Also, it would have been reasonable to exclude every article on SEWP that is tagged complex from the sample. The community has marked these specifically for readers as not simple. These articles are basically placeholder texts there to be simplified. How about vocabulary frequency (common or uncommon vocabulary) and vocabulary recycling (how often a word occurs in the text)? I could go on, but I won't.

Finally, they seem to be unaware of newer, more sophisticated tools like Coh-Metrix that go some way towards identifying writing qualities and measuring things like coherence and cohesion that are critical to ease of understanding. Gotanda (talk) 05:40, 12 September 2012 (UTC)

I wouldn't disagree with you, especially being the author of a very similar paper which was released earlier, and which covers the same topic in more detail and tries to provide some deeper understanding of the phenomenon, instead of just reporting some numbers. 130.233.245.45 (talk) 11:42, 19 September 2012 (UTC)

Simple English Wikipedia has a page, Simple:Textual difficulty, which explains some of the basics. The Flesch formula has been extensively researched, far more than any other readability formula, and I can provide references to substantiate this. However, I want to make a more important point.

These formulae were all developed for, and tested on, readers of printed material. Our material is a hypertext (obviously), and this has consequences. In particular, links and wiktionary give ways for readers to check immediately the meaning of words they do not know. In practice, well-constructed Simple pages give the reader these facilities. Therefore, the readability scores are universally overestimating difficulty. On the other hand, printed text in the form of a book is still outstanding in legibility and searchability.

To do better, what we really need is research on people: how our readers actually use our system. In detail, how they find out the answers to questions or topics they wish to know about, how they deal with problems, what their reading ability is, and so on. Who needs to know that some of our articles are terrible? Of course some are. But the system is self-improving, and I can think of many pages which are far better than they were. This is another of the fundamental issues which get ignored by outsiders looking in. Macdonald-ross (talk) 10:02, 12 September 2012 (UTC)

You definitely can't easily go from en:wp to Simple. There are many rules, like don't use "phrasal verbs". It would take me a long time to figure out what's ok there. And they're not so concerned with sourcing, in my experience. It's definitely a different culture. As far as how useful it is, I don't know. Are there measures of page views and such? MathewTownsend (talk) 00:49, 17 September 2012 (UTC)

Re Hypertext

The comment in the project page about hypertext making Simple English Wik easier than the scores indicate needs qualification. First, many of us in less-developed countries do not have fast computers. This means that linking to hypertexts is often eschewed or, if done, breaks the train of thought. Understanding a sentence entails comprehending its entirety. When one goes to a reference and then returns, the train of thought is disturbed. Try this in a language that you barely know to get the feel of such action. You will also see that the slow pace raises your affective barrier. Another problem with hypertext is that the explanations are often not much direct help: they are often so long that a low-level reader (or a person not conversant in the field under consideration) totally gets lost or confused. To see what I mean here, look at almost any article on a technical or even somewhat technical subject in Wik/Eng (e.g., arithmetic): more than likely, you will be stumbling through one definition after another trying to get some sort of understanding. This is in part because Wik is a hypertext with no hierarchy of definitions, unlike, say, a standard text (in, e.g., mathematics), which usually has monothetic definitions. Another problem with Wik hypertextualization is that the blue text (along with other font variations) adds a factor of "difficultization" not considered in Flesch. Kdammers (talk) 04:41, 21 September 2012 (UTC)