Talk:Exome sequencing

	This article is within the scope of WikiProject Molecular Biology, a collaborative effort to improve the coverage of Molecular Biology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
	This article has not yet received a rating on the importance scale.
	This article is supported by the Genetics task force (assessed as High-importance).
	This article is supported by the Computational Biology task force (assessed as High-importance).

Exome sequencing received a peer review by Wikipedia editors, which is now archived. It may contain ideas you can use to improve this article.

Wiki Education Foundation-supported course assignment

This article was the subject of a Wiki Education Foundation-supported course assignment, between 24 March 2020 and 29 April 2020. Further details are available on the course page. Student editor(s): Amieeanes.

Above undated message substituted from Template:Dashboard.wikiedu.org assignment by PrimeBOT (talk) 21:02, 16 January 2022 (UTC)

Old reference is misquoted

The article claims that the protein-coding regions of the human genome contain about 85% of disease-causing mutations, citing Choi et al (2005). This needs to be corrected for two reasons. First, the article misattributes and misquotes Choi et al (2005): the authors refer to Cooper et al (1995), who state "Among the approximately 2,600 Mendelian diseases that have been solved, the overwhelming majority are caused by rare mutations that affect the function of individual proteins; at individual Mendelian loci, approximately 85% of the disease-causing mutations can typically be found in the coding region or in canonical splice sites". Second, Mendelian diseases are only a subset of human diseases, and genome-wide association studies have identified few hits in protein-coding regions for common diseases. Surely we could find a more recent reference, one written after genome-wide association studies became common? — Preceding unsigned comment added by Ovenn (talk • contribs) 19:05, 15 November 2013 (UTC)

Article needs rewriting

Many phrases are copied directly; they should be rephrased to comply. -RobertMel (talk) 03:15, 4 March 2010 (UTC)

Excuse me, sir, can you please indicate instances in the article that are plagiarized? Every effort was made to paraphrase points, and referencing was done accordingly. If we infringed on your research, please indicate where and we will make the necessary adjustments. —Preceding unsigned comment added by 99.225.16.164 (talk) 03:35, 4 March 2010 (UTC)

Sorry, my description above was not accurate (where was my mind?). I added the tags not exactly for copyright, but for copyright of content. The same sources are used extensively in the article. For example, Sarah B Ng et al. is cited 12 times, attribution is absent from the body of the article, and the information is presented as fact. These are scientific papers, so you should consult secondary sources. It's OK to use the introductions of primary sources when they review the subject, but the rest should at least be backed by secondary sources. Let me see if I can find a more appropriate tag. -RobertMel (talk) 05:17, 4 March 2010 (UTC)

Also known as WXS

Although WES is the more commonly used acronym, WXS appears in many places and is worth mentioning. 132.66.95.80 (talk) 08:47, 10 March 2016 (UTC)

"Sequencing" section should be deleted

The Sequencing section (https://en.wikipedia.org/wiki/Exome_sequencing#Sequencing) should be deleted. It references sequencing technology that is many years old (GAII!), and the actual sequencing technology is not relevant. Speedyboy (talk) 20:39, 16 March 2016 (UTC)

I added a link to high-throughput sequencing in general. I agree that the section is out of date, but I feel it could do with a discussion of HTS as applied to WES: perhaps the advantages and disadvantages of the technologies, whether Nextera and other library preps can be used, and so on. Jmc200 (talk) 12:21, 22 December 2016 (UTC)

Target enrichment section

I'm a bit concerned that the Target enrichment section is about target enrichment in general, and not about its application to WES. I don't know of PCR and MIP being used in actual exome studies, so perhaps they should be deleted. Additionally, I think the section could do with more secondary sources (I've made a start on this). Jmc200 (talk) 12:21, 22 December 2016 (UTC)

OK, I've moved the non-WES target enrichment strategies here. I don't think this information belongs in this article, but perhaps there is room for it elsewhere on Wikipedia? Or perhaps it could be summarised and put in a History section? Jmc200 (talk) 13:08, 22 December 2016 (UTC)

PCR

Polymerase chain reaction (PCR) has been one of the most widely used enrichment strategies for over 20 years.[1] PCR is used to amplify specific DNA sequences, using short single-stranded pieces of DNA (primers) as the starting points for amplification. Uniplex PCR uses a single primer pair to amplify one target, whereas multiplex PCR uses multiple primer pairs, so that multiple genes can be targeted simultaneously. The approach is known to be useful in classical Sanger sequencing because the amplicon produced by a uniplex PCR is comparable in length to a typical Sanger read. Multiplex PCR reactions, which require several primers, are challenging, although strategies to get around this have been developed. The PCR-based approach is highly effective, but the workload, cost, and quantity of DNA required make it infeasible for genomic targets that are several megabases in size.

Molecular inversion probes (MIP)

The molecular inversion probe (MIP) technique uses single-stranded DNA oligonucleotide probes flanked by target-specific ends. The gap between the flanking sequences is filled and ligated to form a circular DNA fragment. Probes that did not undergo the reaction remain linear and are removed using exonucleases.[2][3] This is an enzymatic technique that amplifies targeted genomic regions in multiplex, based on target circularization. Accurate genotypes can be obtained from massively parallel sequencing using this method, which is suggested to be useful for small numbers of targets in a large number of samples. The major disadvantages of this method for target enrichment are poor capture uniformity and the cost of covering large target sets.[1]

References

[1] Kahvejian A, Quackenbush J, Thompson JF (2008). "What would you do if you could sequence everything?". Nature Biotechnology. 26 (10): 1125–1133. doi:10.1038/nbt1494. PMID 18846086.
[2] Turner EH, Ng SB, Nickerson DA, Shendure J (2009). "Methods for Genomic Partitioning". Annu Rev Genomics Hum Genet. 10: 30–35. doi:10.1146/annurev-genom-082908-150112. PMID 19630561.
[3] Mertes F, Elsharawy A, Sauer S, van Helvoort JM, van der Zaag PJ, Franke A, Nilsson M, Lehrach H, Brookes AJ (2011). "Targeted enrichment of genomic DNA regions for next-generation sequencing". Brief Funct Genomics. 10 (6): 374–386. doi:10.1093/bfgp/elr033. PMC 3245553. PMID 22121152.

Copyright problem

Dear all,

A particular copyright issue with this article is being discussed here:

Wikipedia_talk:Copyright_problems#Wikipedia_page_"later"_published_in_scientific_paper:_copyvio?

I would value everyone's input.

--Steven Fruitsmaak (Reply) 20:19, 20 July 2019 (UTC)

Wiki Education assignment: Bioinformatics

This article was the subject of a Wiki Education Foundation-supported course assignment, between 29 August 2022 and 15 December 2022. Further details are available on the course page. Student editor(s): Albentley 99 (article contribs).

— Assignment last updated by Albentley 99 (talk) 17:57, 29 September 2022 (UTC)

Exon: what is it? An element of RNA splicing, not translation

Perhaps unintentionally, by stating that the subset of DNA that is protein-coding is called exons, you have created a definition of the exon which is inaccurate. The purpose of Wikipedia is to provide even specialized information in a format which can be accessed by the layperson. Often this means simplification or generalization, even when this produces concepts which aren't "exactly true". While I can endorse approximating the truth for purposes of education, I would draw the line at teaching inaccuracies or falsehoods which, if the topic were pursued, would have to be "unlearned". Unlearning is far more difficult than learning. In eukaryotes, any RNA transcript made by RNA polymerase II is considered to be a pre-messenger RNA until it is properly processed. The primary transcript is composed of nucleotides that have only two possible fates. Stretches of nucleotides that will be retained in the message are called exons, and stretches of nucleotides that will be removed by splicing are called introns. (In the original formulation, EX- stood for expressed, while INTR- indicated intervening sequence.) There is no relationship between exons and protein-coding sequences other than that translated sequences must be part of a message (mRNA) and therefore in exons. The portion of an mRNA between the first nucleotide and the protein-coding region is called the 5' untranslated region (5' UTR). The portion of an mRNA between the protein-coding region and the last templated nucleotide is called the 3' untranslated region (3' UTR). Whole branches of molecular biology (gene regulation) are devoted to studying the function of these regions in translational control and RNA stability. This article would seem to erroneously define exons as translated regions, thereby eliminating all UTRs. This is not only inaccurate but problematic. Exons and introns are defined by the process of RNA splicing, not translation. This needs clarification.
The exome focuses on sequence that is found in mRNA, in part because protein-coding sequences make up a much larger proportion of exon sequence than of total genomic sequence. Thus, although the text is not 100% wrong, as it stands I find that either of two meanings can be drawn from it, one correct and one incorrect. Genesanddevelopment (talk) 07:13, 14 August 2023 (UTC)
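
To make the distinction above concrete, here is a minimal Python sketch of how the exons of one transcript divide into 5' UTR, coding sequence (CDS), and 3' UTR. Everything in it is an illustrative assumption rather than anything taken from the article: the function name split_exons, the forward-strand half-open coordinates, and the made-up exon and CDS positions.

```python
from typing import List, Tuple

# Half-open interval [start, end) on a forward-strand transcript (assumed convention).
Interval = Tuple[int, int]


def split_exons(
    exons: List[Interval], cds_start: int, cds_end: int
) -> Tuple[List[Interval], List[Interval], List[Interval]]:
    """Split exon intervals into 5' UTR, CDS and 3' UTR pieces.

    Exons are whatever survives splicing; the CDS is only the part of
    the spliced message that lies between cds_start and cds_end.
    """
    five_utr: List[Interval] = []
    cds: List[Interval] = []
    three_utr: List[Interval] = []
    for start, end in exons:
        if start < cds_start:
            # Exonic sequence upstream of the start codon.
            five_utr.append((start, min(end, cds_start)))
        lo, hi = max(start, cds_start), min(end, cds_end)
        if lo < hi:
            # Exonic sequence that is actually translated.
            cds.append((lo, hi))
        if end > cds_end:
            # Exonic sequence downstream of the stop codon.
            three_utr.append((max(start, cds_end), end))
    return five_utr, cds, three_utr


def total(intervals: List[Interval]) -> int:
    """Total number of bases covered by a list of non-overlapping intervals."""
    return sum(end - start for start, end in intervals)


# Hypothetical three-exon transcript; the gaps between exons are introns.
exons = [(0, 100), (500, 650), (900, 1200)]
five, cds, three = split_exons(exons, cds_start=60, cds_end=1000)

print("exonic bases:", total(exons))
print("5' UTR:", five, "CDS:", cds, "3' UTR:", three)
```

In this toy transcript every base of the CDS lies in an exon, but a sizeable share of the exonic bases belong to the UTRs, which is the point being made: "exon" and "protein-coding" are not synonyms.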

Comparison with transcriptome sequencing

Ku CS, Wu M, Cooper DN, Naidoo N, Pawitan Y, Pang B, Iacopetta B, Soong R. Exome versus transcriptome sequencing in identifying coding region variants. Expert Rev Mol Diagn. 2012 Apr;12(3):241-51. doi: 10.1586/erm.12.10. PMID: 22468815.

It might be useful to add how sequencing exons and mRNAs compare, as they should contain the same information: protein-coding parts of genes. Hanif Al Husaini (talk) 02:16, 10 March 2024 (UTC)
