This document provides an overview of different types of molecular markers that can be used in genetics and plant breeding. It discusses morphological markers that are visible traits, as well as several classes of molecular markers including biochemical markers such as allozymes, and DNA-based markers such as RFLPs, microsatellites, AFLPs, STS, and others. The advantages and disadvantages of each type of marker are described. Morphological markers are influenced by the environment while molecular markers directly reflect genotypes. DNA-based markers allow more precise tracking of genetic material and are not affected by the environment.
2. Submitted by
VIKAS KUMAR SINGH
M. SC. (Ag),
Department of Genetics & Plant
Breeding, Chaudhary Charan Singh
University,
Meerut
3. Classification Of Markers
A marker is generally defined as an object used to indicate a
position, place, or route.
Markers are classified as:
1. Morphological Markers
2. Molecular Markers
a) Biochemical Markers
b) DNA Based Markers
5. Morphological markers
These markers are often detectable by eye, by simple visual
inspection. They are classical markers, also known as
“naked eye polymorphism”.
They are selected based on the experience of the breeder, who correlates a
phenotypic trait with a trait of interest.
Examples of this type of marker include the presence or absence of
an awn, leaf sheath coloration, height, grain color, aroma of rice etc.
In well characterized crops like maize, tomato, pea, barley & wheat
tens or hundreds of genes that determine morphological traits have
been mapped to specific chromosome locations.
A good example is the use of leaf tip necrosis (LTN) as a phenotypic
marker to predict the presence of the durable rust resistance gene pair
Lr34/Yr18 in wheat.
Another example is the use of pseudo-black chaff, a dark pigmentation seen
on the glumes and below the nodes, as a morphological marker to predict the
presence of the stem rust resistance gene Sr2 in wheat.
6. Molecular markers
• A sequence of DNA or protein that can be screened
to reveal key attributes of its state or composition
and thus used to reveal genetic variation
Also known as “Genetic Marker”.
Genetic markers are sequences of DNA which
have been traced to a specific location on the
chromosomes and are associated with particular traits.
Molecular Markers are classified as:
1. Protein Based Markers/ Biochemical Markers
2. DNA Based Markers
7. Comparison of molecular & morphological
markers
Morphological markers:
Represent the actual polymorphism of the phenotype important for the breeder.
Generally scored quickly, simply and without laboratory equipment.
Influenced by environment, time consuming, require large population.
Molecular markers:
Represent naturally occurring polymorphism in DNA sequence (i.e. base pair deletion, substitution, addition or patterns).
Generally scored in laboratory.
Comparatively less time consuming and do not require large population.
8. Protein based or Biochemical
markers
Isozyme analysis has been used for over 60 years for various research
purposes in biology, viz. to delineate phylogenetic relationships, to
estimate genetic variability and taxonomy, to study population genetics and
developmental biology, and to characterize material in plant genetic resources
management and plant breeding.
Isozymes were defined as structurally different molecular forms of an
enzyme with, qualitatively, the same catalytic function. Isozymes originate
through amino acid alterations, which cause changes in net charge, or the
spatial structure (conformation) of the enzyme molecules and also,
therefore, their electrophoretic mobility. After specific staining the isozyme
profile of individual samples can be observed.
Allozymes are allelic variants of enzymes encoded by structural genes.
Allelic variation can be detected by gel electrophoresis and subsequent
enzyme-specific stains that contain substrate for the enzyme.
9. Advantages:
• Allozyme analysis does not require DNA extraction or the
availability of sequence information, primers or probes,
so allozymes are quick and easy to use.
• Simple analytical procedures allow some allozymes to be
applied at relatively low cost, depending on the enzyme
staining reagents used.
• Allozymes are codominant markers that have high
reproducibility. Zymograms (the banding pattern of
isozymes) can be readily interpreted in terms of loci and
alleles.
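Because allozymes are codominant, each zymogram can be read as a genotype and allele frequencies can be counted directly. The following is a minimal sketch of that counting step; the band labels F (fast) and S (slow) and the five plant scores are hypothetical illustration data, not results from the source.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Count alleles directly from codominant genotypes (one pair per plant)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical zymogram scores at one diploid locus: F = fast band, S = slow band.
plants = [("F", "F"), ("F", "S"), ("S", "S"), ("F", "S"), ("F", "F")]
freqs = allele_frequencies(plants)
print(freqs)
```

Heterozygotes (F, S) contribute one copy of each allele, which is only possible to score because both bands are visible.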
10. Disadvantages:
•The main weakness of allozymes is their relatively low abundance
and low level of polymorphism.
•Proteins with identical electrophoretic mobility (co-migration) may
not be homologous for distantly related germplasm.
• Lastly, allozymes are often considered molecular markers since they
represent enzyme variants, and enzymes are molecules. However,
allozymes are in fact phenotypic markers, and as such they may be
affected by environmental conditions. For example, the banding
profile obtained for a particular allozyme marker may change
depending on the type of tissue used for the analysis (e.g. root vs.
leaf). This is because a gene that is being expressed in one tissue might
not be expressed in other tissues. In contrast, DNA-based markers,
because they are based on differences in the DNA sequence, are not
environmentally influenced, which means that the same banding
profiles can be expected at all times for the same genotype.
11. DNA Based Markers
DNA marker = direct reflection of genotype.
“Any unique DNA sequence which can be used in DNA
hybridization, PCR or restriction mapping experiments
to identify target sequence.”
In 1980, it was observed that variations in the pattern of DNA
fragments generated by restriction enzyme digestion of
genomic DNA could be used as genetic markers.
DNA markers allow breeders to introduce into their
cultivated plant only the gene(s) of interest from a
related species, in contrast to conventional breeding.
12. CONVENTIONAL BACKCROSSING
P1 x P2
P1 x F1
BC1
VISUAL SELECTION OF BC1 PLANTS THAT MOST
CLOSELY RESEMBLE RECURRENT PARENT
BC2
MARKER-ASSISTED BACKCROSSING
P1 x P2
P1 x F1
BC1
USE ‘BACKGROUND’ MARKERS TO SELECT PLANTS THAT
HAVE MOST RP MARKERS AND SMALLEST % OF
DONOR GENOME
BC2
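The background-selection step in the scheme above can be sketched as follows. This is an illustrative example only: the scoring convention ('A' = homozygous for the recurrent parent allele, 'H' = heterozygous), the plant names and the marker scores are assumptions, and the expected value uses the standard result that, without selection, the donor share halves with each backcross.

```python
def expected_rp_fraction(backcross_generation):
    """Expected recurrent-parent (RP) genome fraction in BCn without selection."""
    return 1 - 0.5 ** (backcross_generation + 1)


def rp_genome_share(scores):
    """Estimate the RP genome share of one plant from background marker scores.

    'A' = homozygous for the RP allele, 'H' = heterozygous (one donor allele).
    """
    return sum(1.0 if s == "A" else 0.5 for s in scores) / len(scores)


# Hypothetical BC1 plants scored at eight background marker loci.
bc1_plants = {
    "plant_1": "AAHAHAAH",
    "plant_2": "AHHHAHHA",
    "plant_3": "AAAAHAAA",
}

best = max(bc1_plants, key=lambda name: rp_genome_share(bc1_plants[name]))
print("expected BC1 average:", expected_rp_fraction(1))
for name, scores in bc1_plants.items():
    print(name, rp_genome_share(scores))
print("selected:", best)
```

Selecting the plant whose marker profile exceeds the BC1 expectation is what allows marker-assisted backcrossing to recover the recurrent parent in fewer generations than visual selection.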
13. DNA markers allow undesired genomic regions to be eliminated in a
few generations.
They segregate as single genes.
They are not affected by the environment.
DNA based markers are classified as:
1) Anonymous Markers
2) Defined Markers
Anonymous Markers:
“A cloned random DNA fragment whose function or
specific features are not known”, e.g. microsatellites
and AFLP. These marker types generally measure
apparently neutral DNA variations.
14. Defined Marker OR Polymorphic marker:
“A defined marker may contain a gene or some other
specific features, e.g. restriction sites for cutting by
the restriction enzymes, etc.”
Polymorphism of DNA Marker:
“DNA markers representing polymorphism in the
actual base sequence of DNA.”
This can be represented by:
• Mutation at restriction site.
• Insertion or deletion between restriction sites.
• Mutations at single nucleotide.
• Changes in number of repeat unit between
restriction site or PCR primer sites.
15. Classification of Markers
Name of the technique: Discoverer
A. Biochemical Markers
Allozymes: Tanksley and Orton 1983; Kephart 1990; May 1992
B. Molecular Markers
i) Non-PCR based techniques
Restriction Fragment Length Polymorphisms (RFLP): Botstein et al. 1980; Neale and Williams 1991
Minisatellites or Variable Number of Tandem Repeats (VNTR): Jeffreys et al. 1985
ii) PCR-based techniques
DNA sequencing
Multi-copy DNA, Internal Transcribed Spacer regions of nuclear ribosomal genes (ITS): Takaiwa et al. 1985; Dillon et al. 2001
Single-copy DNA, including both introns and exons: Sanger et al. 1977; Clegg 1993a
16. Sequence Tagged Sites (STS)
Microsatellites, Simple Sequence Repeat (SSR), Short Tandem Repeat (STR), Sequence Tagged Microsatellite (STMS) or Simple Sequence Length Polymorphism (SSLP): Litt and Luty 1989; Hearne et al. 1992; Morgante and Olivieri 1993; Jarne and Lagoda 1996
Sequence Characterized Amplified Region (SCAR): Michelmore et al. 1991; Martin et al. 1991; Paran and Michelmore 1993
Cleaved Amplified Polymorphic Sequence (CAPS): Akopyanz et al. 1992; Konieczny and Ausubel 1993
Single-Strand Conformation Polymorphism (SSCP): Hayashi 1992
17. Denaturing Gradient Gel Electrophoresis (DGGE): Riedel et al. 1990
Thermal Gradient Gel Electrophoresis (TGGE): Riesner et al. 1989
Heteroduplex Analysis (HDA): Perez et al. 1999; Schneider et al. 1999
Denaturing High Performance Liquid Chromatography (DHPLC): Hauser et al. 1998; Steinmetz et al. 2000; Kota et al. 2001
Multiple Arbitrary Amplicon Profiling (MAAP): Caetano-Anolles 1996; Caetano-Anolles et al. 1992
Random Amplified Polymorphic DNA (RAPD): Williams et al. 1990; Hadrys et al. 1992
18. DNA Amplification Fingerprinting (DAF): Caetano-Anolles et al. 1991
Arbitrarily Primed Polymerase Chain Reaction (AP-PCR): Welsh and McClelland 1990; Williams et al. 1990
Inter-Simple Sequence Repeat (ISSR): Zietkiewicz et al. 1994; Godwin et al. 1997
Single Primer Amplification Reaction (SPAR): Staub et al. 1996
Directed Amplification of Minisatellites DNA (DAMD): Heath et al. 1993; Somers and Demmon 2002
Amplified Fragment Length Polymorphism (AFLP): Vos et al. 1995
Selectively Amplified Microsatellite Polymorphic Loci (SAMPL): Witsenboer et al. 1997
19. Non-PCR or Hybridization
Based Markers
Variation in the length of DNA fragments produced by
specific restriction endonucleases from the genomic DNAs of
two or more individuals of a species is detected by hybridization
with a labelled probe, and markers produced by this technique are called
hybridization based molecular markers.
This type of hybridization marker includes RFLP.
20. RFLP
“RFLP is restriction fragment length
polymorphism: variations in the DNA
sequence of an individual that may be detected
by restriction endonucleases, which cut
double stranded DNA wherever they recognize a
highly specific oligonucleotide sequence, or
restriction site.”
“A molecular method of genetic
analysis that allows individuals to
be identified on the basis of
unique patterns of restriction
enzyme cutting in specific regions
of DNA. It is an application of the
Southern hybridization procedure.”
RFLP Analysis
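How a mutation at a restriction site changes the fragment pattern can be illustrated with a small in-silico digest. This is a sketch under stated assumptions: the two allele sequences are invented toy sequences, the enzyme is taken to be EcoRI (recognition site GAATTC, cutting after the first G), and all fragments are listed, whereas a real Southern blot shows only the fragments that hybridize to the probe.

```python
def fragment_lengths(sequence, site="GAATTC", cut_offset=1):
    """Return sorted fragment lengths after cutting a linear sequence at every site."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)  # EcoRI cuts between G and AATTC
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]))


# Toy alleles: allele_b carries a point mutation that destroys the first site.
allele_a = "ATGC" * 5 + "GAATTC" + "TTAGG" * 6 + "GAATTC" + "CCGTA" * 4
allele_b = allele_a.replace("GAATTC", "GAGTTC", 1)

print("allele A fragments:", fragment_lengths(allele_a))
print("allele B fragments:", fragment_lengths(allele_b))
```

Losing one site merges two fragments into a longer one, which is the length difference scored as an RFLP.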
22. Advantages of RFLP-
They are co-dominant.
Measure variation at the level of DNA sequence, not protein
sequence.
RFLP loci are very numerous, so even very small segments of
chromosomes can be mapped and phylogenetic relationships
can be studied.
Very reliable for linkage analysis and for detecting the coupling phase
of DNA molecules.
Disadvantages of RFLP-
Requires a relatively large amount of DNA.
Requirement of radioactive probe makes the analysis expensive
and hazardous.
They are not useful for detecting single base changes or point
mutations unless these fall within a restriction site.
It is time consuming, laborious, and expensive.
The level of polymorphism is low.
23. Polymerase Chain Reaction (PCR)-
Based Markers
1. Random Amplified Polymorphic DNA (RAPD)
Developed by Williams et al. in 1990; the closely related AP-PCR was described by Welsh and McClelland in 1990
• RAPDs are DNA fragments amplified by the PCR using short synthetic primers
(generally 10 bp) of random sequence.
• The method is based on enzymatic amplification of target or random DNA
segments with arbitrary primers.
• In this reaction, a single species of primer anneals to the genomic DNA at two
different sites on complementary strands of the DNA template. If these priming sites
are within an amplifiable range of each other, a discrete DNA product is formed
through thermocyclic amplification. On average, each primer directs
amplification of several discrete loci in the genome, making the assay useful for
efficient screening of nucleotide sequence polymorphism between individuals.
• Amplified products (usually within the 0.5–5 kb size range) are separated on
agarose gels in the presence of ethidium bromide and viewed under ultraviolet
light (Jones et al. 1997), and the presence or absence of each band is scored.
These polymorphisms are considered to be primarily due to variation in the
primer annealing sites.
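The single-primer logic described above can be sketched in a few lines. This is a simplified illustration, not a PCR simulator: it assumes exact primer matches only, and the 10-mer primer and the two plant templates are hypothetical. A product is predicted when the primer sequence occurs on the top strand and its reverse complement occurs downstream within the amplifiable range.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(seq, pattern):
    """Return every start index of pattern in seq, including overlapping hits."""
    positions = []
    start = seq.find(pattern)
    while start != -1:
        positions.append(start)
        start = seq.find(pattern, start + 1)
    return positions


def rapd_products(template, primer, max_size=5000):
    """Predict product sizes for one arbitrary primer priming on both strands."""
    forward_sites = find_all(template, primer)
    reverse_sites = find_all(template, reverse_complement(primer))
    sizes = []
    for f in forward_sites:
        for r in reverse_sites:
            size = r + len(primer) - f
            if r > f and size <= max_size:
                sizes.append(size)
    return sorted(sizes)


primer = "GGTGCGGGAA"  # hypothetical 10-mer
plant_1 = "A" * 10 + primer + "T" * 30 + reverse_complement(primer) + "A" * 10
plant_2 = plant_1.replace(primer, "GGTGCGAGAA")  # point mutation in one priming site

print("plant 1 bands:", rapd_products(plant_1, primer))
print("plant 2 bands:", rapd_products(plant_2, primer))
```

The band present in plant 1 and absent in plant 2 is the kind of presence/absence polymorphism scored on the gel.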
25. Advantages:
They are quick and easy to assay.
Because PCR is involved, only low quantities of template DNA are
required.
No sequence data for primer construction are needed.
RAPDs have a very high genomic abundance and are randomly
distributed throughout the genome.
Disadvantages:
Low reproducibility.
RAPD analyses generally require purified, high molecular weight DNA,
and precautions are needed to avoid contamination of DNA samples
because short random primers are used that are able to amplify DNA
fragments in a variety of organisms.
The inherent problems of reproducibility make RAPDs unsuitable
markers for transference or comparison of results among research
teams working in a similar species and subject.
RAPD markers are not locus-specific, band profiles cannot be
interpreted in terms of loci and alleles (dominance of markers), and
similar sized fragments may not be homologous.
26. AFLP (Amplified Fragment Length
Polymorphism)
Keygene 1990; Zabeau & Vos 1993
• AFLP is a technique based on the detection of genomic restriction
fragments by PCR amplification and can be used for DNAs of any
origin or complexity. The fingerprints are produced, without any
prior knowledge of sequence, using a limited set of generic
primers. The AFLP procedure mainly involves 3 steps-
• (a) Restriction of DNA using a rare cutting and a frequently cutting
restriction enzyme simultaneously (such as EcoRI and MseI, respectively),
followed by ligation of oligonucleotide adapters of defined
sequence to the respective restriction enzyme sites.
• (b) Selective amplification of sets of restriction fragments, using
specifically designed primers. To achieve this, the 5' region of each
primer is complementary to the adapter and the adjacent restriction
enzyme site, while the 3' end extends for a few arbitrarily
chosen nucleotides into the restriction fragment.
• (c) Gel analysis of the amplified fragments.
28. Advantages & Disadvantages of
AFLP
• AFLP analysis depicts unique fingerprints regardless of the origin and
complexity of the genome. Most AFLP fragments correspond to unique
positions on the genome and hence can be exploited as landmarks in
genetic and physical mapping. AFLPs are extremely useful as tools for
DNA fingerprinting and also for cloning and mapping of variety-specific
genomic DNA sequences. Thus AFLP provides an
important tool for a variety of applications.
Advantages:
• Fast
• Relatively inexpensive
• Highly variable
Disadvantages:
• Markers are dominant
• Presence of a band could mean the individual is either homozygous
or heterozygous for the sequence; the two cannot be distinguished
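One practical consequence of dominance is that allele frequencies cannot be counted directly and must be inferred from the band-absent class. The sketch below shows one common way to do this, assuming Hardy-Weinberg proportions (an assumption not made in the source); the counts of 16 band-absent plants out of 100 are hypothetical.

```python
import math


def null_allele_frequency(n_band_absent, n_total):
    """Estimate the band-absent (null) allele frequency q, assuming Hardy-Weinberg.

    Band-absent plants are the null homozygotes, expected at frequency q**2.
    """
    return math.sqrt(n_band_absent / n_total)


# Hypothetical survey: 16 of 100 plants lack the band.
q = null_allele_frequency(16, 100)
p = 1 - q
expected_heterozygotes = 2 * p * q  # hidden among the band-present plants

print("q =", round(q, 3))
print("expected heterozygote frequency =", round(expected_heterozygotes, 3))
```

With a codominant marker these heterozygotes would be observed directly instead of being estimated.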
29. Sequence Tagged Sites (STS):
In genomics, a sequence tagged site (STS) is a short DNA sequence that has a single
copy in a genome and whose location and base sequence are known. The DNA
sequence of an STS may contain repetitive elements, sequences that appear
elsewhere in the genome, but as long as the sequences at both ends of the site are
unique and conserved, researchers can uniquely identify this portion of the genome
using tools usually present in any laboratory.
Thus, in a broad sense, STSs include such markers as microsatellites (SSRs, STMS),
SCARs, CAPS, and ISSRs.
Main features of STS markers are given below.
1. STSs are short DNA sequences (200-500 nucleotides long).
2. STSs occur only once in the genome.
3. STSs are detected by PCR in the presence of all other genomic sequences.
4. STSs can be derived from cDNAs.
Advantages-
STSs are useful in physical mapping of genes.
This technique permits sharing of data across laboratories.
It is more rapid and more specific than DNA hybridization techniques.
It has a high degree of accuracy.
It can be automated.
Disadvantages:
Development of STSs is a difficult task. It is a time consuming and labour-intensive
technique.
It requires high technical skill.
30. SSR (Simple sequence repeat)
Microsatellites, also known as Simple Sequence Repeats (SSRs)
or Short Tandem Repeats (STRs), are tandemly repeated
sequences of 1–6 base pairs of DNA. They
are a type of Variable Number Tandem Repeat (VNTR).
Microsatellites are typically co-dominant.
Sequence
Primer
ACTGTCGACACACACACACACGCTAGCT (AC)7
TGACAGCTGTGTGTGTGTGTGCGATCGA
ACTGTCGACACACACACACACACGCTAGCT (AC)8
TGACAGCTGTGTGTGTGTGTGTGCGATCGA
ACTGTCGACACACACACACACACACACGCTAGCT (AC)10
TGACAGCTGTGTGTGTGTGTGTGTGTGCGATCGA
ACTGTCGACACACACACACACACACACACACGCTAGCT (AC)12
TGACAGCTGTGTGTGTGTGTGTGTGTGTGTGCGATCGA
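The (AC)n alleles shown above differ only in the number of repeat units between the same flanking sequences, so a PCR product from the flanking primers differs in length by two bases per repeat. A minimal sketch of scoring the repeat number follows; the flank strings are taken from the top strands above, and the helper name `longest_repeat_count` is hypothetical.

```python
import re

LEFT_FLANK = "ACTGTCG"   # flanking sequence upstream of the repeat
RIGHT_FLANK = "GCTAGCT"  # flanking sequence downstream of the repeat


def longest_repeat_count(sequence, motif="AC"):
    """Return the number of units in the longest uninterrupted run of motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    if not runs:
        return 0
    return max(len(run) for run in runs) // len(motif)


# Rebuild the four alleles as flank + (AC)n + flank.
alleles = {n: LEFT_FLANK + "AC" * n + RIGHT_FLANK for n in (7, 8, 10, 12)}

for n, seq in alleles.items():
    print(f"(AC){n}: scored repeats = {longest_repeat_count(seq)}, length = {len(seq)} bp")
```

The isolated AC at the start of the left flank is ignored because only the longest tandem run is counted.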
32. • Because of their high degree of variability, microsatellite sequences are
especially suited to distinguishing closely related genotypes; they are therefore
favoured in population studies and for the identification of closely related
cultivars.
• Microsatellite polymorphism can be detected by Southern hybridization or PCR.
• Microsatellites, like minisatellites, represent tandem repeats, but their repeat
motifs are shorter (1–6 base pairs). If nucleotide sequences in the flanking
regions of the microsatellite are known, specific primers (generally 20–25 bp)
can be designed to amplify the microsatellite by PCR.
• Microsatellites and their flanking sequences can be identified by constructing a
small-insert genomic library, screening the library with a synthetically labelled
oligonucleotide repeat and sequencing the positive clones.
• Alternatively, microsatellite may be identified by screening sequence databases
for microsatellite sequence motifs from which adjacent primers may then be
designed. In addition, primers may be used that have already been designed for
closely related species.
• Polymerase slippage during DNA replication, or slipped strand mispairing, is
considered to be the main cause of variation in the number of repeat units of a
microsatellite, resulting in length polymorphisms that can be detected by gel
electrophoresis.
35. Advantages-
SSR markers tend to be highly polymorphic, and many are multi-allelic.
The genotyping throughput is high.
This is a simple PCR assay.
Most SSRs are co-dominant and locus specific.
No special equipment is needed for performing manual SSR assays; however,
special equipment is needed for some assay methods.
Start-up costs are low for manual assay methods (once the markers
are developed). SSR assays can be performed using very small DNA
samples (~100 ng per individual).
SSR markers are easily shared between laboratories.
Disadvantages:
The development of SSRs is labor intensive
SSR marker development costs are very high
Start-up costs are high for automated SSR assay methods.
Developing PCR multiplexes is difficult and expensive.
Some markers may not multiplex.
36. SNPs
(Single Nucleotide Polymorphisms)
• In human beings, 99.9 percent of bases are the same.
• The remaining 0.1 percent makes a person unique.
– Different attributes / characteristics / traits,
e.g. how a person looks
• These variations can be:
– Harmless (change in phenotype)
– Harmful (diabetes, cancer, heart disease, Huntington's
disease, and hemophilia)
– Latent (variations found in coding and regulatory
regions that are not harmful on their own; the change
in each gene only becomes apparent under certain
conditions, e.g. susceptibility to lung cancer)
37. • SNPs are found in
– coding and (mostly) non coding regions.
• They occur with a very high frequency
– from about 1 in 1000 bases to 1 in every 100 to 300 bases.
• The abundance of SNPs and the ease with which
they can be measured make these genetic variations
significant.
• SNPs close to a particular gene act as markers for
that gene.
• SNPs in coding regions may alter the structure of
the protein encoded by that region.
39. • The fact that in many organisms most polymorphisms result from changes in a
single nucleotide position (point mutations) has led to the
development of techniques to study single nucleotide polymorphisms
(SNPs).
• Analytical procedures require sequence information for the design of
allele-specific PCR primers or oligonucleotide probes.
• SNPs and flanking sequences can be found by library construction
and sequencing or through the screening of readily available sequence
databases.
• Once the location of SNPs is identified and appropriate primers are
designed, one of the advantages they offer is the possibility of high
throughput automation.
• SNP analysis may be useful for cultivar discrimination in crops where
it is difficult to find polymorphisms.
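Finding candidate SNPs by comparing sequences can be sketched as a position-by-position comparison. This is a minimal illustration assuming the two sequences are already aligned and of equal length; the cultivar sequences are hypothetical.

```python
def snp_positions(seq_a, seq_b):
    """List (1-based position, base in seq_a, base in seq_b) where aligned bases differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# Hypothetical aligned sequences from two cultivars.
cultivar_1 = "ATGGCTAGCTTACGGA"
cultivar_2 = "ATGGCTGGCTTACAGA"

for position, base_1, base_2 in snp_positions(cultivar_1, cultivar_2):
    print(f"position {position}: {base_1} -> {base_2}")
```

Real SNP discovery additionally needs sequence alignment and filtering of sequencing errors, which this sketch does not attempt.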
40. Advantages-
SNP markers are useful in gene mapping.
SNPs help in detection of mutations at molecular level.
SNP markers are useful in positional cloning of a mutant
locus.
SNP markers are useful in detection of disease causing
genes.
Disadvantages:
Most SNPs are biallelic and less informative than
SSRs.
Multiplexing is not possible for all loci.
Some SNP assay techniques are costly.
Development of SNP markers is labour-intensive.
More SNPs (about three times as many) are required in preparing
genetic maps than SSR markers.
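Why a biallelic SNP is less informative than a multi-allelic SSR can be illustrated with the polymorphism information content (PIC), a standard measure that the source does not itself mention. The sketch below uses the usual PIC formula; the allele frequencies are hypothetical and chosen as equal for simplicity.

```python
def pic(allele_freqs):
    """Polymorphism information content for one locus.

    PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2
    """
    n = len(allele_freqs)
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(
        2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1 - homozygosity - cross_term


snp_pic = pic([0.5, 0.5])                # biallelic SNP, equal frequencies
ssr_pic = pic([0.25, 0.25, 0.25, 0.25])  # SSR with four equally frequent alleles

print("SNP PIC:", snp_pic)
print("SSR PIC:", ssr_pic)
```

Even at its most informative, a biallelic locus stays well below a locus with several alleles, which is why more SNPs than SSRs are needed for comparable mapping information.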
41. Diversity Array Technology (DArT)
A high-throughput marker system
No sequence information is needed
DArT is based on microarray hybridizations
Detects the presence vs. absence of individual
fragments
Efficiently and economically scans from
hundreds to thousands of polymorphic
markers.
42. DArT technology consists of several
steps:
1. Complexity reduction of the DNA of interest
2. Library creation
3. Microarraying fragments onto glass slides
4. Hybridisation of fluoro-labelled DNA onto slides
5. Scanning of slides for hybridisation signal
6. Data analysis and extraction.
43. Principle of DArT
DArT operates on the principle that the genomic
'representation' contains two types of fragments:
• Constant fragments: found in any 'representation'
prepared from a DNA sample from an individual
belonging to a given species, and
• Variable (polymorphic) fragments: called
molecular markers, found in some but not all
of the 'representations'.
45. Presence vs. absence in a genomic 'representation' is assayed by
hybridizing the 'representation' to a DArT array consisting of a
library of that species.
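The distinction between constant and variable fragments can be sketched from a table of presence/absence calls. This is an illustrative example only: the clone names and the 0/1 scores for four genotypes are hypothetical, and it does not reproduce how DArT software derives calls from hybridisation signal.

```python
def classify_fragments(scores):
    """Split array clones into constant and variable (polymorphic) fragments.

    scores maps each clone name to a list of 1 (present) / 0 (absent) calls,
    one call per genotype's 'representation'.
    """
    constant, variable = [], []
    for clone, calls in scores.items():
        if all(calls):
            constant.append(clone)   # present in every representation
        elif any(calls):
            variable.append(clone)   # present in some but not all
    return constant, variable


# Hypothetical presence/absence calls for four genotypes.
scores = {
    "clone_01": [1, 1, 1, 1],
    "clone_02": [1, 0, 1, 0],
    "clone_03": [0, 0, 1, 1],
    "clone_04": [1, 1, 1, 0],
}

constant, variable = classify_fragments(scores)
print("constant fragments:", constant)
print("variable fragments (markers):", variable)
```

Only the variable clones carry information for distinguishing genotypes, so they are the ones retained as markers.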
47. 2. Library creation
DNA amplification
Cloning
Library in E coli
Each colony contains one of the
fragments from the genomic
'representation'.
48. 3. Microarraying
Selection of clones
Arranged into a plate format
(usually 384-well plates)
Fragments within library
amplified
Spotted onto glass slides
Genotyping array
50. 5. Scanning
The hybridised slides are washed and processed
to remove unbound labelled DNA.
They are then scanned using a scanner to detect the
fluorescent signal emitted from the hybridised
fragments.
The result from each fluorescent channel is
recorded.
The resulting images are stored in 'tif' format.
51. 6. Data analysis
The data from the scanned images is extracted and analysed
using the DArTsoft software and the information is managed
by the DArTdb Laboratory Information Management System.
53. Advantages of DArT technology
Marker density relevant to application
Sequence information and platform independence
High throughput due to a high level of multiplexing
Matching most cost-effective technology with the
application on modern platforms
54. DArT Applications
• Genome profiling and diversity analysis
• Genetic and physical mapping
• Identification of QTL
• Rapid introgression of genomic regions in accelerated
backcrossing programs
• Simultaneous marker-assisted selection for several
traits
• Genomic Selection
• Varietal identification of crops and genetic purity
testing
• Monitoring the composition of complex DNA samples
55. AN IDEAL MARKER SYSTEM
1) Highly polymorphic nature: It must be polymorphic, as it is
polymorphism that is measured in genetic diversity
studies.
2) Codominant inheritance: Allows determination of homozygous and
heterozygous states in diploid organisms.
3) Frequent occurrence in genome: A marker should be evenly
and frequently distributed throughout the genome.
4) Selectively neutral behaviour: The DNA sequences of any
organism are neutral to environmental conditions or
management practices.
5) Easy access (availability): It should be easy, fast and cheap
to detect.
6) Easy and fast assay
7) High reproducibility
8) Easy exchange of data between laboratories.