Novel use of whole genome sequence data
Whole genome sequence data provides information on the full DNA profile of an individual. This data is mainly used to identify variation among individuals and to create reference populations for imputation. However, we have now shown that karyotyping can also be performed using whole genome sequence data.
Karyotype abnormalities in pigs
In diploid mammals, a normal karyotype consists of two copies of each chromosome. Unfortunately, various abnormal karyotypes with gain or loss of DNA have been observed in unviable offspring and in individuals with various clinical disorders. In contrast, balanced abnormalities involve no gain or loss of chromosomal material and usually result in viable individuals with no apparent phenotypic consequences, except for reduced fertility (on average a 40% reduction in litter size). The prevalence of such balanced chromosomal abnormalities in pigs is estimated to be 0.47%.
While large gains or losses of chromosomal material can be detected using SNP genotyping arrays, balanced chromosomal abnormalities are more difficult to detect. Currently, karyotyping is performed routinely for AI boars in pig breeding by a laborious method that involves staining the chromosomes and visually assessing their presence, their size and their banding patterns. Although prevalence is low, screening is very important because large-scale transmission of these abnormalities into the population through AI would have an enormous economic impact due to the reduced litter sizes of carriers.
Detection of abnormalities using sequence data
Ten boars with known chromosomal abnormalities (reciprocal translocations), as well as a number of control animals without any chromosomal abnormalities, were sequenced at 30-fold coverage. A bioinformatic pipeline was created to screen the sequence data for the presence and the exact location of each abnormality. We showed that short read sequencing can be used to detect balanced abnormalities in pigs if the sequencing depth is at least 20-fold. However, repetitive regions of the genome are problematic and may require larger fragments or even long read sequence data. The advantage of using sequence data is that it provides insight into the exact position, type and features of the abnormality. For instance, it showed that breakpoints often disrupt genes, and which types of DNA-repair mechanisms were involved, which gives considerable insight into the underlying biology as well.
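To illustrate the general idea behind screening short read data for a reciprocal translocation, the sketch below counts read pairs whose two mates map to different chromosomes and reports clusters of such pairs. This is not the pipeline used in the study: the `ReadPair` type, the `candidate_translocations` function, the window size, the support threshold and the demo coordinates are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class ReadPair(NamedTuple):
    """Mapped positions of the two mates of one read pair (hypothetical type)."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def candidate_translocations(
    pairs: Iterable[ReadPair],
    window: int = 1000,      # assumed bin size in base pairs
    min_support: int = 3,    # assumed minimum number of supporting pairs
) -> List[Tuple[str, int, str, int, int]]:
    """Return (chromA, binStartA, chromB, binStartB, n_pairs) for each pair
    of genomic windows linked by at least `min_support` read pairs whose
    mates map to different chromosomes."""
    counts: Counter = Counter()
    for p in pairs:
        if p.chrom1 == p.chrom2:
            continue  # mates on the same chromosome give no interchromosomal signal
        a = (p.chrom1, p.pos1 // window)
        b = (p.chrom2, p.pos2 // window)
        # Order the two ends so that A->B and B->A pairs are counted together.
        key = (a, b) if a <= b else (b, a)
        counts[key] += 1
    return sorted(
        (a[0], a[1] * window, b[0], b[1] * window, n)
        for (a, b), n in counts.items()
        if n >= min_support
    )


# Made-up demo data: four pairs link chr1 and chr7, one stray pair links
# chr2 and chr9, and one pair maps normally within chr3.
demo = [
    ReadPair("chr1", 5_000_100, "chr7", 12_000_200),
    ReadPair("chr1", 5_000_300, "chr7", 12_000_400),
    ReadPair("chr7", 12_000_600, "chr1", 5_000_500),
    ReadPair("chr1", 5_000_900, "chr7", 12_000_800),
    ReadPair("chr2", 800_000, "chr9", 3_000_000),
    ReadPair("chr3", 100_000, "chr3", 100_350),
]
calls = candidate_translocations(demo)
print(calls)
```

Fixed windows keep the sketch short, but a cluster that straddles a window boundary is split in two, and repetitive regions would produce many spurious interchromosomal pairs. A real analysis would also use mapping quality, read orientation and split reads to pin down the breakpoint.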
Karyotyping by staining of chromosomes is not expensive, but it is laborious and a logistical challenge. Moreover, small abnormalities (smaller than the size of a band) cannot be detected. Hence, karyotyping by sequencing is a convenient and more extensive method, but sequencing costs are currently still rather high for routine screening of AI boars. On the other hand, whole genome sequence data contains an enormous amount of information, most of which is currently not exploited. So the more applications of sequence data we add, the more valuable the data will become.
These results are published (open access) in BMC Genomics. The paper by Bouwman et al. is entitled "Using short read sequencing to characterise balanced reciprocal translocations in pigs". This work was done within Breed4Food in close collaboration with Topigs Norsvin.