Identification of genetic variants associated with production traits in a pig F2 design

One or two decades ago, enormous efforts were put into establishing large resource populations for QTL mapping. Especially in pigs and chickens, huge F2 designs were set up and phenotyped in depth for many traits. These designs were mostly analyzed using sparse microsatellite marker maps, which resulted in large QTL confidence intervals. With the advent of SNP chips and, later, affordable genome sequencing, some designs were revisited to obtain a higher mapping resolution, an approach that is most promising in pooled designs. We overcame the shortcomings of previous studies by pooling four large F2 designs to produce smaller linkage disequilibrium (LD) blocks, and by resequencing the founder generation at high coverage and the F1 generation at low coverage for subsequent imputation of the F2 generation to whole-genome sequencing marker density. This led to the discovery of more than 32 million variants (SNPs and indels < 50 bp), 8 million of which had not been reported previously. Pooling the four F2 designs enabled us to perform a joint genome-wide association study (GWAS), which identified numerous significantly associated variant clusters on chromosomes 1, 2, 4, 7, 17, and 18 for the growth and carcass traits average daily gain, back fat thickness, meat to fat ratio, and carcass length.
That study, however, did not cover the full depth of the genomic data, since it captured only SNPs and short indels. Structural variants (SVs) have so far been widely neglected because of the obstacles to obtaining high-confidence variant calls. We profiled these variants with state-of-the-art strategies in the founder animals of the same F2 pig crosses. This led to the discovery of 13,201 high-confidence structural variants and 103,730 polymorphic tandem repeats (with a repeat length of 2-20 bp). We observed a moderate to high level of co-localization (r from 0.48 to 0.57) between SNPs or small indels and SVs or tandem repeats.
In the GWAS, 56.56% of the significant variants were not in high LD with the significantly associated SNPs and small indels identified for the same traits in the earlier study, and would thus presumably not be tagged in a standard association study. For the four growth and carcass traits investigated, many of the previously proposed candidate genes were confirmed and additional ones were identified. Interestingly, a common pattern emerged in how structural variants or tandem repeats regulate the phenotypic traits. Many of the significant variants were embedded in or located near long non-coding RNAs, which draws attention to their functional importance. Our current focus in this project is on characterizing traits related to muscle enzyme content and activity and their influence on relevant production traits in pigs.
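To illustrate what "not in high LD" means in practice, the following minimal sketch computes the squared genotype correlation (r²) between a structural variant and nearby SNPs and checks whether any SNP tags the variant. It is an illustration only and not the pipeline used in the project: the function names, the toy genotype vectors, and the r² threshold of 0.8 are all assumptions, and the genotype-based r² shown here is a common approximation that ignores phase.

```python
def genotype_r2(a, b):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    if len(a) != len(b) or not a:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Monomorphic variant: r2 is undefined; treated as "no LD" in this sketch.
        return 0.0
    return cov * cov / (var_a * var_b)


def is_tagged(sv, snps, threshold=0.8):
    """True if at least one SNP reaches the (assumed) r2 threshold with the SV."""
    return any(genotype_r2(sv, snp) >= threshold for snp in snps)


# Hypothetical dosages for eight animals (0, 1, or 2 copies of the alternate allele).
sv = [0, 1, 2, 1, 0, 2, 1, 0]
snp_linked = [0, 1, 2, 1, 0, 2, 1, 0]    # identical pattern, so perfectly correlated
snp_unlinked = [1, 1, 0, 2, 1, 1, 0, 2]  # weakly correlated with the SV

print(is_tagged(sv, [snp_unlinked]))              # SV would be missed by this SNP
print(is_tagged(sv, [snp_unlinked, snp_linked]))  # SV is tagged by the linked SNP
```

A variant for which `is_tagged` returns False for all significant SNPs and small indels would be one that a SNP-only association study is unlikely to capture.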

Project publications