Single-nucleotide polymorphism

From Citizendium
DNA molecule 1 differs from DNA molecule 2 at a single base-pair location (a C/T polymorphism).

A single-nucleotide polymorphism (SNP, pronounced "snip") is a genetic polymorphism in which a single nucleotide (A, T, C, or G) in the genome (or other shared sequence) differs between members of a species (or between paired chromosomes in an individual). For example, two sequenced DNA fragments from different individuals, AAGCCTA and AAGCTTA, differ at a single nucleotide. In this case we say that there are two alleles: C and T. Almost all common SNPs have only two alleles.
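As a minimal illustration of the example above, the following Python sketch compares two fragments position by position and reports where they differ. It assumes the fragments are already aligned and of equal length; the function name is hypothetical and chosen only for this example.

```python
def single_nucleotide_differences(seq_a, seq_b):
    """Return (position, base_a, base_b) for every mismatch.

    Positions are 1-based. The sequences are assumed to be aligned
    and of equal length; no alignment is performed here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (index + 1, base_a, base_b)
        for index, (base_a, base_b) in enumerate(zip(seq_a, seq_b))
        if base_a != base_b
    ]


# The two fragments from the text differ only at position 5 (C versus T).
print(single_nucleotide_differences("AAGCCTA", "AAGCTTA"))
# [(5, 'C', 'T')]
```

A difference found between two individuals in this way is only a candidate; whether it counts as a polymorphism depends on how it is distributed in the population.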

Within a population, each SNP can be assigned a minor allele frequency: the lowest allele frequency at a locus that is observed in a particular population. For a SNP with two alleles, this is simply the lesser of the two allele frequencies.[1] Allele frequencies vary between human populations, so a SNP allele that is common in one geographical or ethnic group may be much rarer in another.
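The definition can be sketched in a few lines of Python. The allele lists below are invented purely for illustration and do not come from any real population; the function name is likewise hypothetical, and the sketch handles only SNPs with at most two alleles.

```python
from collections import Counter


def minor_allele_frequency(observed_alleles):
    """Return the lesser of the two allele frequencies at one locus.

    `observed_alleles` is an iterable of single-letter alleles, one per
    sampled chromosome. Only biallelic loci are handled in this sketch.
    """
    counts = Counter(observed_alleles)
    if not counts:
        raise ValueError("no alleles observed")
    if len(counts) > 2:
        raise ValueError("this sketch only handles at most two alleles")
    if len(counts) == 1:
        return 0.0  # only one allele seen: the locus is not polymorphic here
    return min(counts.values()) / sum(counts.values())


# Hypothetical samples from two populations at the same C/T locus.
population_a = "CCCCCCTT"    # T is the minor allele
population_b = "CTTTTTTTTT"  # C is the minor allele

print(minor_allele_frequency(population_a))  # 0.25
print(minor_allele_frequency(population_b))  # 0.1
```

In this made-up example the allele that is common in one sample is rare in the other, so the minor allele itself differs between the two populations.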


In the past, only variations with a minor allele frequency greater than or equal to a threshold such as 1% (or 0.5%, etc.) were given the title "SNP".[1] Some used "mutation" to refer to variations with low allele frequency. With the advent of modern bioinformatics and a better understanding of evolution, this distinction is no longer necessary; for example, the dbSNP database includes "SNPs" with allele frequencies below one percent.[2]

Types of SNPs

  • Non-coding region
  • Coding region
    • Synonymous
    • Nonsynonymous
      • Missense
      • Nonsense

Single-nucleotide polymorphisms may fall within coding sequences of genes, non-coding regions of genes, or in the intergenic regions between genes. SNPs within a coding sequence will not necessarily change the amino acid sequence of the protein that is produced, due to degeneracy of the genetic code. A SNP in which both forms lead to the same polypeptide sequence is termed synonymous (sometimes called a silent mutation); if a different polypeptide sequence is produced, the SNP is nonsynonymous. A nonsynonymous change may be either missense or nonsense: a missense change results in a different amino acid, while a nonsense change results in a premature stop codon. SNPs that are not in protein-coding regions may still have consequences for gene splicing, transcription factor binding, or the sequence of non-coding RNA.
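The classification of coding-region SNPs can be sketched in Python using the standard genetic code. This is an illustrative sketch only: it assumes the reference and variant codons are given on the coding DNA strand and in the correct reading frame, that the reference codon is not itself a stop codon, and that the standard (nuclear) codon table applies. The function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(reference_codon, variant_codon):
    """Classify a single-base codon change as synonymous, missense or nonsense."""
    mismatches = sum(a != b for a, b in zip(reference_codon, variant_codon))
    if len(reference_codon) != 3 or len(variant_codon) != 3 or mismatches != 1:
        raise ValueError("codons must be three bases long and differ at one base")
    reference_aa = CODON_TABLE[reference_codon]
    variant_aa = CODON_TABLE[variant_codon]
    if reference_aa == "*":
        # Changes to an existing stop codon fall outside the scheme in the text.
        raise ValueError("reference codon is a stop codon")
    if variant_aa == reference_aa:
        return "synonymous"
    if variant_aa == "*":
        return "nonsense"
    return "missense"


print(classify_coding_snp("GAA", "GAG"))  # synonymous (Glu -> Glu)
print(classify_coding_snp("GAG", "GTG"))  # missense (Glu -> Val)
print(classify_coding_snp("TGG", "TGA"))  # nonsense (Trp -> stop)
```

The first example shows the degeneracy of the genetic code at work: GAA and GAG both encode glutamic acid, so the change leaves the polypeptide sequence unaltered.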

Use and importance of SNPs

Variations in the DNA sequences of humans can affect how humans develop diseases and respond to pathogens, chemicals, drugs, vaccines, and other agents. SNPs are also thought to be key enablers in realizing the concept of personalized medicine.[3] However, their greatest importance in biomedical research is for comparing regions of the genome between cohorts (such as matched cohorts with and without a disease).

The study of single-nucleotide polymorphisms is also important in crop and livestock breeding programs (see genotyping). See SNP genotyping for details on the various methods used to identify SNPs.


Examples of SNPs are rs6311 and rs6313 in the HTR2A gene. A SNP in the F5 gene produces the Factor V Leiden variant, which causes a hypercoagulability disorder. An example of a triallelic SNP is rs3091244.[4]


As with genes, there are bioinformatics databases for SNPs. dbSNP is a SNP database from the National Center for Biotechnology Information (NCBI). SNPedia is a wiki-style database from a private company. The OMIM database describes associations between polymorphisms and, for example, diseases.


The nomenclature for SNPs can be confusing: several variant names can exist for an individual SNP, and consensus has not yet been achieved. One approach is to write SNPs with a prefix, a period, and a greater-than sign separating the wild-type and altered nucleotide or amino acid; for example, c.76A>T.[5][6][7]
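To make the structure of this notation concrete, the following Python sketch splits a description such as c.76A>T into its parts. It covers only the simple single-nucleotide substitution form shown above, not the full set of recommendations in the cited references, and the class and function names are hypothetical.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    prefix: str      # letter before the period, e.g. "c"
    position: int    # numeric position of the changed nucleotide
    reference: str   # wild-type nucleotide
    variant: str     # altered nucleotide


# prefix, period, position, wild-type base, ">", altered base
_SUBSTITUTION = re.compile(r"([a-z])\.(\d+)([ACGT])>([ACGT])")


def parse_substitution(text):
    """Parse a simple nucleotide substitution such as 'c.76A>T'."""
    match = _SUBSTITUTION.fullmatch(text)
    if match is None:
        raise ValueError(f"not a simple nucleotide substitution: {text!r}")
    prefix, position, reference, variant = match.groups()
    return Substitution(prefix, int(position), reference, variant)


print(parse_substitution("c.76A>T"))
# Substitution(prefix='c', position=76, reference='A', variant='T')
```

Descriptions that do not fit this one pattern, including amino acid changes, are rejected with an error rather than guessed at.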


  1. E.g., Methods for Discovering and Scoring Single Nucleotide Polymorphisms. National Human Genome Research Institute.
  2. SNP Population Grows at NCBI, NCBI News, NCBI.
  3. Bruce Carlson. SNPs — A Shortcut to Personalized Medicine, Genetic Engineering & Biotechnology News, Mary Ann Liebert, Inc., 2008-06-15, p. 12. Retrieved on 2008-07-06. “(subtitle) Medical applications are where the market's growth is expected”
  4. Akihiko Morita, Tomohiro Nakayama, Nobutaka Doba, Shigeaki Hinohara, Tomohiko Mizutani and Masayoshi Soma (June 2007). "Genotyping of triallelic SNPs using TaqMan® PCR". Molecular and Cellular Probes 21 (3): 171–176.
  5. J.T. Den Dunnen (2008-02-20). Recommendations for the description of sequence variants. Human Genome Variation Society. Retrieved on 2008-09-05.
  6. Johan T. den Dunnen & Stylianos E. Antonarakis (2000). "Mutation Nomenclature Extensions and Suggestions to Describe Complex Mutations: A Discussion". Human Mutation 15: 7–12.
  7. Shuji Ogino, Margaret L. Gulley, Johan T. den Dunnen, Robert B. Wilson and the Association for Molecular Pathology Training and Education Committee (2007). "Standard Mutation Nomenclature in Molecular Diagnostics". The Journal of Molecular Diagnostics 9 (1). DOI:10.2353/jmoldx.2007.060081.