Talk:Transformer (deep learning architecture)/GA1


GA Review

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Nominator: Cosmia Nebula (talk · contribs) 19:04, 9 August 2024 (UTC)

Reviewer: Phlsph7 (talk · contribs) 08:14, 12 August 2024 (UTC)


Hello Cosmia Nebula and thanks for all your improvements to this article. Despite these improvements, however, the article fails criterion 2b since there are too many unreferenced paragraphs. Examples are the paragraphs starting with "For many years, sequence modelling", "As the Transformer architecture natively processes", and "A positional encoding is a fixed-size vector". According to criterion 2b, these passages require inline citations "no later than the end of the paragraph".

The article cites many papers from arXiv. These are usually considered self-published sources, making them unreliable; see WP:ARXIV. Some of them may also be published in reliable journals, in which case you could cite those versions instead. You would probably have to replace the rest with other sources.

I suggest that you add all the missing references and replace the arXiv papers before a renomination.

A few other observations:

  • WP:EARWIG detects no copyvios
  • "Linear transformers were first developed as an improvement over previous architectures for machine translation, but has found many applications since then." There is a problem with the clause starting with "but"; should it be "..., but many additional applications have been found for them since then"?
  • "An well-cited early example was" – replace "An well-cited" with "A well-cited" or maybe with "An often-cited".
  • "One key innovation was use of an attention mechanism" – add "the" before "use".
  • "by removing its recurrence to processes all tokens in parallel" – should this be "to process" instead of "to processes"?

Phlsph7 (talk) 08:14, 12 August 2024 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.