Flax (Linum usitatissimum L.) is a self-pollinated annual species (2n = 2x = 30) belonging to the Linaceae family. It has been utilised by humans for some 30,000 years (since the Paleolithic era), was domesticated ~7,000 years ago in the Near East and then spread to the Fertile Crescent, where it was grown for its seed oil and stem fibres. Currently, Canada is the world’s largest producer of linseed (http://publications.gc.ca/collections/collection_2011/statcan/22-007-X/22-007-2011002-eng.pdf).
Flax oil is highly sought after in the fabrication of biodegradable products such as paint, linoleum and varnish, while its oil-free meal is used as livestock feed. Recently, linseed has gained importance as a nutraceutical, primarily because of its α-linolenic acid (ALA) and lignan content. The ALA component of flax oil (an omega-3 fatty acid) improves bone and cardiovascular health [3–5], while lignans are a rich source of antioxidants and precursors of various hormones. Animal feed for cattle and chickens is being fortified with flax to produce omega-3 enriched meat and eggs.
To assess and capitalize upon the genetic variability in flax, genomic resources are needed. A flax genome assembly built from short shotgun reads and a collection of expressed sequence tags (ESTs) from more than 10 different tissue libraries are now available. Genetic mapping remains a commonly used approach to understand the molecular basis of phenotypic traits. Various molecular markers, including random amplified polymorphic DNA (RAPD), restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP) and simple sequence repeat (SSR) markers, have been developed to analyse flax genetic diversity [10–19]. Three bi-parental population-based linkage maps of flax have been published to date: an AFLP map of 213 markers, an RFLP and RAPD map of 94 markers and an SSR map of 113 markers. A recently constructed consensus map of 770 SSR markers based on three populations constitutes a significant improvement over previous maps, but even this marker density remains insufficient for many applications. An ideal molecular approach to generate markers is one that assesses numerous reliable markers covering the entire genome in a single and simple experiment. The discovery of single nucleotide polymorphism (SNP) markers combined with next generation sequencing (NGS) permits the identification of thousands of markers from entire genomes, which can be used for linkage map construction, genetic diversity analyses, marker-trait association and marker-assisted selection. SNPs have been discovered by high throughput sequencing in humans, Drosophila melanogaster, wheat, eggplant, rice [26–28], Arabidopsis thaliana [29, 30], barley [31–33], walnut, lupin, globe artichoke, rapeseed, perennial ryegrass and maize, to name but a few. SNP discovery through genome sequencing is readily accomplished in simpler genomes like rice and Arabidopsis [28, 40], but the task remains challenging for a number of economically important crops [41, 42].
SNP discovery is also impeded by repeat elements, paralogous sequences and reference genomes that are incomplete or inaccurate. The flax genome of CDC Bethune has an estimated size of ~370 Mbp with a high proportion of low copy sequences. Its repetitive fraction consists of ribosomal DNA (~13.8%), known transposable elements (~6.1%) and putative novel repeat elements (~7.4%), which makes the genome highly suitable for SNP discovery.
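The percentages quoted above can be combined into a rough estimate of how much of the genome is repetitive. The short Python sketch below does this arithmetic under two assumptions that the text does not state explicitly: that the three repeat classes do not overlap, and that each percentage refers to the ~370 Mbp estimated genome size. The function and variable names are illustrative only.

```python
# Rough arithmetic on the repeat fractions quoted in the text.
# Assumptions (not stated in the source): the three classes are
# non-overlapping and each percentage is relative to the ~370 Mbp estimate.

GENOME_SIZE_MBP = 370.0

REPEAT_PERCENT = {
    "ribosomal DNA": 13.8,
    "known transposable elements": 6.1,
    "putative novel repeat elements": 7.4,
}


def repeat_summary(genome_size_mbp, repeat_percent):
    """Return (total repeat %, repetitive Mbp, remaining Mbp)."""
    total_pct = sum(repeat_percent.values())
    repeat_mbp = genome_size_mbp * total_pct / 100.0
    remaining_mbp = genome_size_mbp - repeat_mbp
    return total_pct, repeat_mbp, remaining_mbp


total_pct, repeat_mbp, remaining_mbp = repeat_summary(GENOME_SIZE_MBP, REPEAT_PERCENT)
print(f"Repetitive fraction: {total_pct:.1f}%")
print(f"Repetitive sequence: {repeat_mbp:.0f} Mbp")
print(f"Remaining sequence:  {remaining_mbp:.0f} Mbp")
```

Under these assumptions, roughly 27% of the genome (about 100 Mbp) is repetitive, which leaves close to three quarters of the sequence available for SNP discovery.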
Genomic complexity can be reduced using restriction enzymes, high-Cot selection, methylation filtration, microarrays [47, 48] and cDNAs. Trebbi et al. have described the pros and cons of these methods. The use of reduced representation libraries (RRLs) is advantageous because the degree of complexity reduction can be tuned by selecting different enzymes or fragment size ranges. RRL sequencing, first proposed for the human genome, reduces genome complexity, facilitates re-sampling of the same loci and generates sufficient coverage for accurate SNP calling. Deep re-sequencing of RRLs using the sequencing-by-synthesis method has been performed for the purpose of SNP discovery in soybean and sorghum [51, 52].
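The way in which enzyme choice and size selection together determine the degree of complexity reduction can be illustrated with a small in silico digest. The Python sketch below is a toy illustration only and not the procedure used in this study: the sequence is invented, the EcoRI recognition site (G^AATTC) is chosen purely as a familiar example, and the function names and size window are hypothetical.

```python
# Toy in silico reduced representation library (RRL).
# Assumptions: invented sequence, EcoRI site used only as an example,
# arbitrary size window; none of these come from the study itself.


def digest(sequence, site, cut_offset):
    """Cut `sequence` at every occurrence of `site`.

    The cut is placed `cut_offset` bases after the start of the site,
    e.g. cut_offset=1 for G^AATTC.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return [f for f in fragments if f]  # drop empty fragments


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length lies within [min_len, max_len]."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


def retained_fraction(sequence, site, cut_offset, min_len, max_len):
    """Fraction of the input bases that survive digestion and size selection."""
    kept = size_select(digest(sequence, site, cut_offset), min_len, max_len)
    return sum(len(f) for f in kept) / len(sequence)


toy_genome = "AAAA" + "GAATTC" + "C" * 10 + "GAATTC" + "TT"
fragments = digest(toy_genome, "GAATTC", 1)
print([len(f) for f in fragments])
print(size_select(fragments, 10, 20))
print(retained_fraction(toy_genome, "GAATTC", 1, 10, 20))
```

Changing the recognition site or the size window changes which fragments are retained, and therefore how much of the genome is sequenced; this is the tunability referred to above.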
Genotyping of one to several thousand SNPs can be performed simultaneously using various chemistries such as Taqman® probes [53, 54], Invader®, iPLEX®, KASPar™, SNaPshot™, GoldenGate® and Infinium® assays. The high throughput and constantly decreasing cost of sequencing technologies make genotyping-by-sequencing (GBS) an attractive choice for genome-wide SNP genotyping.
The objective of the current study was to discover and validate SNPs in flax using a strategy that combines NGS of RRLs with GBS, together with the updated annotation-based genome-wide SNP discovery pipeline (AGSNP) [34, 61]. The resulting SNP resource promises to have several downstream applications, including the exploitation of flax genetic diversity through a better understanding of important phenotypic traits.