The VCF files were imputed using the Sanger Imputation Service. The reference panel was the 1000 Genomes Phase 3 dataset, and the data were pre-phased with EAGLE2 before imputation.
Quality control
Twenty-three imputed VCF files were obtained, one for each of the 22 autosomes and one for the X chromosome. The files were combined with bcftools and prepared for input into PLINK 2.
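    # Combine the per-chromosome VCF files
    bcftools concat 1.vcf.gz 2.vcf.gz 3.vcf.gz 4.vcf.gz 5.vcf.gz 6.vcf.gz \
        7.vcf.gz 8.vcf.gz 9.vcf.gz 10.vcf.gz 11.vcf.gz 12.vcf.gz 13.vcf.gz \
        14.vcf.gz 15.vcf.gz 16.vcf.gz 17.vcf.gz 18.vcf.gz 19.vcf.gz 20.vcf.gz \
        21.vcf.gz 22.vcf.gz -Oz -o combined.vcf.gz
    # Remove records with duplicate IDs (column 3), keeping the first occurrence.
    # The header is written first because -H suppresses it in the second command.
    bcftools view -h combined.vcf.gz > output.vcf
    bcftools view -H combined.vcf.gz | awk '!seen[$3]++' >> output.vcf
    # Compress and index the VCF file
    bgzip output.vcf
    tabix -p vcf output.vcf.gz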
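The duplicate-ID filter is compact, so a small illustration of its behaviour may help. The sketch below runs the same awk expression on five invented, tab-separated records. It also shows one side effect worth checking: records whose ID is "." (missing) all share the same key, so only the first of them is kept. Whether the imputed files contain such records is not stated here, and the second awk expression is only a hypothetical alternative that leaves them in place.

```shell
# Toy VCF body lines (tab-separated); column 3 is the ID field.
# These records are invented for illustration only.
records=$(printf '1\t100\trs1\tA\tG\n1\t200\trs2\tC\tT\n1\t200\trs2\tC\tA\n1\t300\t.\tG\tA\n1\t400\t.\tT\tC\n')

# Same filter as in the pipeline: seen[$3]++ evaluates to 0 the first time an
# ID appears, so !seen[$3]++ is true only for the first record with that ID.
deduped=$(printf '%s\n' "$records" | awk '!seen[$3]++')

# Hypothetical variant: always keep records with a missing ID ("."),
# and deduplicate only records that carry a real ID.
deduped_keep_missing=$(printf '%s\n' "$records" | awk '$3 == "." || !seen[$3]++')

printf '%s\n' "$deduped"               # 3 records: rs1, first rs2, first "."
printf '%s\n' "$deduped_keep_missing"  # 4 records: rs1, first rs2, both "."
```

If missing IDs are present in the real data, counting records before and after filtering (for example with `bcftools view -H file.vcf.gz | wc -l`) would show how many were dropped.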
Obtaining PRS using PLINK 2
The combined, deduplicated dataset was used as the target data to compute polygenic risk scores (PRS) with PLINK 2. The scoring file was PGS000785.
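The exact scoring command is not given, so the following is only a sketch of how this step might look. It assumes the scoring file is an uncompressed, tab-separated file with `#` comment lines followed by a header row containing `rsID`, `effect_allele` and `effect_weight` columns, and that variant IDs in the target VCF match the `rsID` values. The function names, file names and output prefix are hypothetical.

```shell
# Reduce a scoring file to the three columns that --score reads:
# variant ID, effect allele, weight. Column names are assumed (see above).
prepare_score_file() {
  # $1: scoring file (plain text, tab-separated)
  awk -F'\t' -v OFS='\t' '
    /^#/ { next }                       # skip comment lines
    !hdr {                              # first remaining line is the header
      for (i = 1; i <= NF; i++) col[$i] = i
      hdr = 1
      if (!("rsID" in col) || !("effect_allele" in col) || !("effect_weight" in col)) {
        print "expected columns not found" > "/dev/stderr"
        exit 1
      }
      print "ID", "A1", "WEIGHT"
      next
    }
    { print $col["rsID"], $col["effect_allele"], $col["effect_weight"] }
  ' "$1"
}

# Score the target VCF. Columns 1, 2 and 3 of the prepared file hold the
# variant ID, allele and weight; "header" skips the first line, and
# cols=+scoresums adds the score sum to the default per-sample averages.
run_prs() {
  # $1: target VCF, $2: prepared score file, $3: output prefix
  plink2 --vcf "$1" --score "$2" 1 2 3 header cols=+scoresums --out "$3"
}

# Example usage (hypothetical file names):
#   prepare_score_file PGS000785.txt > PGS000785.plink.txt
#   run_prs output.vcf.gz PGS000785.plink.txt prs
```

With this prefix, PLINK 2 would write the per-sample scores to `prs.sscore`. Variants whose IDs or alleles do not match the scoring file are not used in the score, so the match count reported in the log is worth checking.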