PolyMarker is an automated bioinformatics pipeline for SNP assay
development that increases the probability of generating
homoeologue-specific assays for polyploid species. PolyMarker generates a
multiple alignment between the target SNP sequence and the selected
reference genome (chosen from the drop-down menu in green below). It then
generates a mask in which informative polymorphic positions between
homoeologues are highlighted with respect to the target genome.
These positions include (see figure for example):
- Varietal polymorphism: this is the SNP that is targeted in the assay (&)
- Genome-specific: this is a homoeologous polymorphism which is only present in the target genome (upper case)
- Genome semi-specific: this is a homoeologous polymorphism which is found in 2 of the 3 genomes, hence it discriminates against one of the off-target genomes (lower case)
- Homoeologous: the target varietal SNP is also a homoeologous polymorphism between genomes (e.g. the A, B and D genomes in the wheat reference Chinese Spring)
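The mask notation above can be illustrated with a small sketch. This is not the PolyMarker implementation: the function name `build_mask`, the use of `-` for non-informative positions, and the assumption that all sequences are already aligned, gap-free and of equal length are hypothetical simplifications. The "Homoeologous" case is not handled.

```python
def build_mask(target, others, snp_index):
    """Return a mask for `target` compared with the off-target sequences.

    target    -- aligned sequence from the target genome
    others    -- aligned sequences from the off-target genomes
    snp_index -- 0-based position of the varietal SNP in the alignment
    """
    mask = []
    for i, base in enumerate(target):
        if i == snp_index:
            mask.append("&")  # varietal polymorphism
            continue
        # Count how many off-target genomes differ from the target here.
        differing = sum(1 for seq in others if seq[i].upper() != base.upper())
        if differing == len(others):
            mask.append(base.upper())  # genome-specific: upper case
        elif differing > 0:
            mask.append(base.lower())  # semi-specific: lower case
        else:
            mask.append("-")  # assumption: marker for non-informative positions
    return "".join(mask)


# Target genome against two off-target genomes, varietal SNP at index 4.
print(build_mask("ACGTAC", ["ACCTAC", "ATCTAC"], 4))
```

In this toy alignment, position 2 differs from both off-target genomes (genome-specific) and position 1 differs from only one of them (semi-specific).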
PolyMarker generates KASP assays, which are based on a three-primer
system. Two diagnostic primers incorporate the alternative
varietal SNP at the 3' end, but are otherwise identical (black boxed
primers in figure). The third, common primer is preferentially selected
to incorporate a genome-specific base at the 3' end (red boxed primer in
figure), or a semi-specific base in the absence of an adequate
genome-specific position.
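To make the idea of the two diagnostic primers concrete, here is a minimal sketch that builds two candidate sequences ending on each allele of the varietal SNP. The function name `diagnostic_primers` and the fixed `length` of 20 bases are assumptions for illustration only; the actual pipeline selects primers with considerably more care, and the common primer is not covered here.

```python
import re

# Matches a varietal SNP written as [A/T] in the input sequence.
SNP_PATTERN = re.compile(r"\[([ACGT])/([ACGT])\]", re.IGNORECASE)


def diagnostic_primers(sequence, length=20):
    """Return two candidate primers that differ only in their 3' base."""
    match = SNP_PATTERN.search(sequence)
    if match is None:
        raise ValueError("no SNP marker of the form [A/T] found")
    upstream = sequence[:match.start()]
    if len(upstream) < length - 1:
        raise ValueError("not enough sequence upstream of the SNP")
    # Shared part of both primers: the bases immediately 5' of the SNP.
    stem = upstream[len(upstream) - (length - 1):]
    # Each primer ends on one of the two alternative alleles.
    return tuple(stem + allele for allele in match.groups())


print(diagnostic_primers("aaccggtt[C/T]ggcc", length=5))
```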
The code of the PolyMarker pipeline is available on GitHub.
Using PolyMarker
- The input file must be uploaded as a CSV file (can be exported from Excel) with the following columns:
  - Gene id: A unique identifier for the assay. It must be unique within each run.
  - Target chromosome: This depends on the reference sequence being used. For wheat use 1A, 2D, 7B, etc. For other species you can find the exact chromosome nomenclature by generating an example on the home page (press the orange “Example” button once the reference is selected).
  - Sequence: The sequence flanking the SNP. The SNP must be marked in the format [A/T] for a varietal SNP with alternative bases A or T.
- PolyMarker takes ~1 minute per marker assuming an input sequence of
200 bp (with the varietal SNP in the middle). Longer sequences can be
used, but this will slow down the initial BLAST against the wheat survey
sequence. We have not seen an improvement in performance with longer
sequences; therefore we recommend 200 bp of input sequence. The final
multiple alignment for the primer design only considers 100 bp on either
side of the target varietal SNP.
- BLAST is used to search for the contigs which align to the SNP. By
default, the minimum identity used to match across the genomes is 90%
and the model used is est2genome.
Example
Input file
The example input file contains three markers to design.
1DS_1905169_Cadenza0423_2404_C2404T,1D,ccgccgtcgtatggagcaggccggccaattccttcaaggagtcaaccacctggcgcaaggaccatgaggtccatgctcacgaggtctctttcgttgacgg[C/T]aaaaacaagacggcgccaggctttgagttgctcccggctgtggtggatcaccaaggcaacccgcagccgaccttggtggggatccacgttggccatcccaa
1DS_40060_Cadenza0423_2998_G2998A,1D,ccagcagcgcccgtcccccttctcccccgaatccgccggagcccagcggacgccggccatgagcacctccgagtagtaagtccccggcgccgccgccgcc[G/A]ccgatctttctttctttctcgcttgatttgtctgcgtttcttttgttccgggtgattgattgatgtgcgtgggctgctgcagcgactacctcttcaagctg
1DS_1847781_Cadenza0423_2703_G2703A,1D,tttcctctcaaatgtagcttctgcagattcggtggaagggcattcaaccggagaacctcattctcatcacttgcggtcacctctaggtaggacaaaaact[G/A]catctgaataagagactcacagaggcgttcacagtagattctcttcacattcaataacctcaggcttctcatttgcctcagctctcccagttgtctaacag
The input text box also accepts TAB-separated tables, so you can paste the three columns directly from
Excel.
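Before submitting, it can be useful to check that each line follows the three-column format described in this section. The following is a minimal sketch of such a check, not part of PolyMarker itself; the function name `parse_markers` and the returned dictionary keys are hypothetical.

```python
import re

# Matches a varietal SNP written as [A/T] in the sequence column.
SNP_PATTERN = re.compile(r"\[([ACGT])/([ACGT])\]", re.IGNORECASE)


def parse_markers(text):
    """Parse comma- or TAB-separated marker lines into a list of dicts."""
    markers = []
    seen_ids = set()
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        fields = [field.strip() for field in re.split(r"[,\t]", line)]
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected 3 columns, got {len(fields)}")
        marker_id, chromosome, sequence = fields
        if marker_id in seen_ids:
            raise ValueError(f"line {number}: duplicate id {marker_id!r}")
        seen_ids.add(marker_id)
        # Exactly one SNP of the form [A/T] is expected per sequence.
        if len(SNP_PATTERN.findall(sequence)) != 1:
            raise ValueError(f"line {number}: expected exactly one SNP such as [A/T]")
        markers.append(
            {"id": marker_id, "chromosome": chromosome, "sequence": sequence}
        )
    return markers


example = "m1,1D,acgt[C/T]acgt\nm2\t2A\tggcc[G/A]ttaa\n"
for marker in parse_markers(example):
    print(marker["id"], marker["chromosome"])
```

The ids `m1` and `m2` and their short sequences are made-up placeholders; real input would use flanking sequences like those in the example file.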
Output: mask
The mask contains the details of the local alignment.

REST API
PolyMarker jobs can be submitted via a REST API. To do this, submit a POST request to the URL 'http://www.polymarker.info/snp_files.json'.
The body of the request must have the following structure:
{
"snp_file":
{
"reference":"RefSeq v1.0",
"email":""
},
"polymarker_manual_input":
{
"post":"1DS_1905169_Cadenza0423_2404_C2404T,1D,ccgccgtcgtatggagcaggccggccaattccttcaaggagtcaaccacctggcgcaaggaccatgaggtccatgctcacgaggtctctttcgttgacgg[C/T]aaaaacaagacggcgccaggctttgagttgctcccggctgtggtggatcaccaaggcaacccgcagccgaccttggtggggatccacgttggccatcccaa\n1DS_40060_Cadenza0423_2998_G2998A,1D,ccagcagcgcccgtcccccttctcccccgaatccgccggagcccagcggacgccggccatgagcacctccgagtagtaagtccccggcgccgccgccgcc[G/A]ccgatctttctttctttctcgcttgatttgtctgcgtttcttttgttccgggtgattgattgatgtgcgtgggctgctgcagcgactacctcttcaagctg\n1DS_1847781_Cadenza0423_2703_G2703A,1D,tttcctctcaaatgtagcttctgcagattcggtggaagggcattcaaccggagaacctcattctcatcacttgcggtcacctctaggtaggacaaaaact[G/A]catctgaataagagactcacagaggcgttcacagtagattctcttcacattcaataacctcaggcttctcatttgcctcagctctcccagttgtctaacag"
}
}
The response will contain the ID of the request (XXXXXXXXXXXXXXXXXXXX
in the example) and the URL with the link to the results, as follows:
{
"id":"XXXXXXXXXXXXXXXXXXXX",
"url":"http://www.polymarker.info/snp_files/XXXXXXXXXXXXXXXXXXXX",
"path":"/snp_files/XXXXXXXXXXXXXXXXXXXX"
}
The valid reference values for this instance are:
'bol-1.0'
'brapa-1.0'
'cadenza-1.1'
'cadenza-2'
'chinese_spring_refseq-1.0'
'chinese_spring_refseq-2.1'
'chinese_spring_refseq_pseudomolecules-1.0'
'chinese_spring_refseq_tetraploid-1.0'
'claire-1.1'
'darmor-bzh-4.1'
'fielder-1'
'glycine_max-2.1'
'ibsc-v2'
'kronos-1.1'
'kronos_collapsed_masked-1.1'
'paragon-1.1'
'paragon-3'
'robigus'
'secale_cereale_lo7-2'
'svevo-2'
'tu-2.0'
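The request and response described in this section can be sketched in Python using only the standard library. The function names `build_payload` and `submit` are hypothetical, and sending the body with a `Content-Type: application/json` header is an assumption that is not stated above.

```python
import json
import urllib.request

API_URL = "http://www.polymarker.info/snp_files.json"


def build_payload(marker_lines, reference, email=""):
    """Build the request body; `marker_lines` are 'id,chromosome,sequence' strings."""
    return {
        "snp_file": {"reference": reference, "email": email},
        # Markers are joined with newlines, as in the example body.
        "polymarker_manual_input": {"post": "\n".join(marker_lines)},
    }


def submit(marker_lines, reference, email="", timeout=30):
    """POST the markers and return the decoded JSON response."""
    body = json.dumps(build_payload(marker_lines, reference, email)).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        # Assumption: the server accepts a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (performs a network request, so it is left commented out):
# result = submit(["my_marker,1D,acgt[C/T]acgt"], "chinese_spring_refseq-1.0")
# print(result["id"], result["url"])
```

Keeping the payload construction separate from the network call makes it possible to inspect or log the body before anything is sent.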