Genetic variation at the major histocompatibility complex (MHC) is strongly associated with many human traits and diseases. However, large-scale association testing has primarily been conducted using SNP genotyping chips based on the hg38 human reference genome sequence. Because the MHC is the most polymorphic region in the genome, these signals likely do not capture the full extent of variants that may affect disease risk. To address this challenge and generate high-resolution MHC sequences for association testing, we have developed 1) a set of customized, MHC-enriched target capture probes for paired-end short-read sequencing and 2) a novel sequence assembly algorithm, MHConstructor (Wade et al., 2024), to perform de novo assembly and scaffolding in a high-throughput fashion. With these methodologies, we are able to sensitively query association signals across large disease cohorts.
MHConstructor is freely available from our GitHub page.