A highly robust and optimized sequence-based approach for genetic polymorphism discovery and genotyping in large plant populations

Zewei Luo

Department of Biostatistics & Computational Biology, SKLG,

School of Life Sciences, Fudan University

zwluo@fudan.edu.cn

The advent of next-generation sequencing techniques has motivated recent interest in sequence-based identification and genotyping of genome-wide genetic variants in large populations, with RAD-seq being a typical example. Because it does not properly account for the fact that chloroplast and rRNA genes may occupy up to 60% of the resulting sequence reads, the current RAD-seq design can be very inefficient for plant and crop species. Building on the original RAD-seq idea, the new method presented here is optimized through an in silico guided selection of an optimal combination of restriction enzymes. This selection serves three purposes: to cut the target genome into DNA segments of a designed length, to remove the abundant and undesirable chloroplast and rRNA DNA from the sequencing libraries, and to guide the choice of DNA segments to be sequenced so that they fall within predefined genome regions.
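The idea of an in silico guided enzyme selection can be illustrated with a minimal Python sketch. It is not the Fortran90 implementation distributed with this abstract. The enzyme panel, the function names, and the scoring rule (fragments of the designed length from the target genome minus such fragments from chloroplast or rRNA sequence) are assumptions chosen for illustration, and sequences are treated as linear, single-strand strings for simplicity.

```python
import re
from itertools import combinations

# Hypothetical enzyme panel: recognition site and cut offset within the site
# (EcoRI G^AATTC, MseI T^TAA, PstI CTGCA^G).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "MseI": ("TTAA", 1),
    "PstI": ("CTGCAG", 5),
}


def cut_positions(seq, site, offset):
    """Return every cut position of one enzyme, including overlapping sites."""
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def digest(seq, enzyme_names):
    """Cut a linear sequence with all named enzymes and return the fragments."""
    cuts = sorted({p for name in enzyme_names
                   for p in cut_positions(seq, *ENZYMES[name])})
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def count_in_window(fragments, min_len, max_len):
    """Count fragments whose length lies in the designed size window."""
    return sum(min_len <= len(f) <= max_len for f in fragments)


def best_combination(target, unwanted, min_len, max_len, k=2):
    """Score each k-enzyme combination (assumed rule: wanted minus unwanted
    fragments in the size window) and return the best (score, combination)."""
    scored = []
    for combo in combinations(sorted(ENZYMES), k):
        wanted = count_in_window(digest(target, combo), min_len, max_len)
        noise = count_in_window(digest(unwanted, combo), min_len, max_len)
        scored.append((wanted - noise, combo))
    return max(scored)
```

In practice the `target` and `unwanted` strings would be read from a reference genome and from chloroplast and rRNA sequences, and the size window would match the fragment lengths the sequencing library is designed to retain.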

Fortran90_executable_and_guide_files.tar.gz