User:PAC2/Should wikipedians rebel against ChatGPT?

From Wikipedia, the free encyclopedia

ChatGPT is the new hype. It looks like magic. But if you think about it, you may become more skeptical and may want to resist this new hype. Here are some obvious reasons why we as Wikipedians should resist this new trend.

Google is complementary to Wikipedia whereas ChatGPT tries to replace Wikipedia

From the end of the 1990s to the beginning of the 2020s, the web has been dominated by search engines such as Google. Search engines are complementary to Wikipedia, since they need to put reliable pages that answer users' queries at the top of their results. Wikipedia's success may be largely due to the ranking of Wikipedia articles in Google's results. ChatGPT's design is very different: it generates a new answer without any reference to any source. It may write an answer that is similar to a Wikipedia article, but it will never give the reference to Wikipedia. If people use ChatGPT instead of Google in the future, Wikipedia's traffic may shrink.

ChatGPT relies on the exploitation of people from the global South

ChatGPT has been made possible by exploiting people from the global South. An investigation by Billy Perrigo for Time magazine shows that OpenAI used Kenyan workers to label explicit content and make ChatGPT safer[1].

ChatGPT has appropriated open resources such as Wikipedia without any concern for the CC-BY-SA license

ChatGPT has been trained on a wide corpus of text available on the web. The Wikipedia corpus has been used to train the model[2]. Wikipedia's content is licensed under the CC-BY-SA license, which means that people may reuse it only if they credit the Wikipedia contributors and share their work under the same conditions. ChatGPT is a commercial and proprietary piece of software developed using Wikipedia. I'm not a lawyer, and it's not clear whether ChatGPT should be considered a derivative work of Wikipedia. But I think we should have a debate about whether the license has been respected or not. There is already a lawsuit against GitHub Copilot, an AI based on GPT-3 that has been trained on free and open source software to generate code. It will be interesting to see whether it succeeds[3].

ChatGPT-generated texts are much more difficult to correct than human-written texts

CNET, a tech media outlet, has published 78 articles written by ChatGPT. When this was discovered, they had to correct the articles. However, correcting articles written by a natural language generator is especially difficult, since such models are only optimised to produce plausible text, without any reference to truth[4].

Usually, when an editor approaches an article (particularly an explainer as basic as “What is Compound Interest”), it’s safe to assume that the writer has done their best to provide accurate information. But with AI, there is no intent, only the product. An editor evaluating an AI-generated text cannot assume anything, and instead has to take an exacting, critical eye to every phrase, word, and punctuation mark.

Lauren Leffer, "CNET Is Reviewing the Accuracy of All Its AI-Written Articles After Multiple Major Corrections", Gizmodo, https://gizmodo.com/cnet-ai-chatgpt-news-robot-1849996151

The CNET experiment shows us that it may be a very bad idea to use ChatGPT-generated texts in Wikipedia.

ChatGPT is a threat to epistemic communities

Joseph Reagle points out that ChatGPT and other similar bots are very efficient at producing verisimilitude: content that looks like truth but is bullshit. This is a big threat to epistemic communities such as Reddit, Stack Overflow, and of course Wikipedia. Moderation may become a nightmare in the coming months[5].

References