Introduction
There is a strong genetic component to Type One Diabetes (T1D), as evidenced by twin and familial studies. In monozygotic twins (identical twins, with the same genome), if one twin develops T1D, there is more than a 65% chance that the other twin will develop it by the age of 60 (1). Siblings of a T1D patient are also at higher risk than members of the general population: about 6% of siblings develop T1D, compared to 0.4% of the white US population (2).
Importantly, T1D is not completely genetically determined; rather, your genetic makeup increases your susceptibility. The monozygotic twin studies mentioned above make this clear: if T1D were fully genetic, identical twins would always share the same outcome (both diabetic or both not), but this is not the case. Therefore, environmental factors must also play a role. Additionally, cases of T1D in western societies have doubled in the past 20 years, a rate of change presumably too quick to be caused by genetics alone.
Type One Diabetes is a polygenic disease, meaning that many different genes affect the probability of developing the condition. This contrasts with diseases such as cystic fibrosis, which arise from mutations in a single gene and whose inheritance follows Mendelian genetics. Genes that increase one’s risk of developing T1D are called susceptibility genes. Researchers have identified over 40 different susceptibility genes for T1D, conferring a range of risk (3). The susceptibility genes that confer the highest risk are found on chromosome six and are part of the major histocompatibility complex II (MHC II, a.k.a. human leukocyte antigen II (HLA-II)). An estimated 50% of the genetic risk for T1D can be attributed to the HLA locus (4).
The MHC II is involved in the immune response against bacteria and viruses. It is a protein complex embedded within the cellular membrane of specific cell types (antigen presenting cells) and is involved in displaying antigens (peptide fragments of proteins that the cell has processed). Antigens that are derived from invading bacterial or viral proteins are displayed on MHC II complexes. This allows the immune system to recognize the invader and mount a response against it. MHC II also displays self-antigens (peptides derived from the body’s own proteins). This acts as a quality control so that the immune system doesn’t destroy your own cells. In autoimmune diseases, self-antigens, MHC complexes, and/or some other aspect of the immune system do not function properly, so the immune system destroys normal, healthy body tissues. In T1D, immune cells destroy the pancreatic β cells. The immune response that is elicited against β cells is a complex topic which I plan to address in a future write-up. The MHC II genotype that confers the highest risk for T1D is DR3/4-DQ8 (DRB1*0301-DQA1*0501-DQB1*0201/DRB1*04-DQA1*0301-DQB1*0302). Given the importance of this genotype in T1D susceptibility, the rest of this article will be dedicated to trying to understand it.
What is DR3/4-DQ8?
The DR3/4-DQ8 genotype refers to specific types of MHC II complexes[1] displayed on specific cell surfaces. There are five types of MHC II complexes that can be displayed: DO, DM, DR, DQ, and DP. These complexes differ in their amino acid sequence (mainly at the peptide (antigen) binding site), and have been classified by serotyping. MHC II complexes are composed of two polypeptides (alpha (α) and beta (β)) that combine to form a heterodimer (two different polypeptides that come together to form a complex).
Therefore, the genotype of an MHC II complex depends on two genes, the α and β, which come together to form DO, DM, DR, DQ, and DP complexes. Given that there are five different types of MHC II complexes (each containing an alpha and a beta polypeptide) and that humans are diploid (carrying two copies of each gene), the variation in MHC II complexes comes from twenty gene copies (5 complexes × 2 chains × 2 chromosomes). The DNA sequence of these genes varies among individuals, giving rise to different alleles for the MHC II complexes. These alleles are indicated by the number following the letter designation (e.g. DR3, DQ8). There can be thousands of different alleles within the HLA-encoding region of chromosome six. Indeed, the HLA region is the most diverse region observed in the human genome, with over six thousand alleles (5).
Figure 1. MHC class II complexes are embedded in the membranes of antigen presenting cells. They are composed of two proteins (alpha and beta) that form an antigen binding site. Each polypeptide (corresponding to a single gene) has intracellular and transmembrane domains, as well as two extracellular domains. The B1 and A1 extracellular domains form the site where processed antigens are presented (the peptide binding site).
So let’s take a closer look at the DR3/4-DQ8 genotype: DRB1*0301-DQA1*0501-DQB1*0201/DRB1*04-DQA1*0301-DQB1*0302.
MHC II complexes are identified by the first two letters (DR and DQ for the above genotype).
Because each complex is composed of an alpha and a beta chain, there should be an A or B designation after the two-letter code. Above we see that DR does not have an alpha (A) gene listed. This is because the DR alpha gene (DRA) is effectively non-variant, so people don’t bother writing it.
The “/” designates the second set of alleles (from the other chromosome—remember, humans are diploid).
Some will refer to DRB1*0301-DQA1*0501-DQB1*0201 as DR3-DQA1*0501-DQB1*0201 or just DR3 or DQB1*0201.
Similarly, DRB1*04-DQA1*0301-DQB1*0302 can be called DR4-DQA1*0301-DQB1*0302 or DR4 or DQB1*0302 or DQ8.
In practice, these genotypes are often abbreviated to just DR3 or DR4 (or DQ8). Often the DR3/4-DQ8 genotype is referred to as “heterozygous”, because the individual has two different alleles for DR (DR3 and DR4; a homozygous individual would be DR3/DR3 or DR4/DR4).
The DQB1*0302 allele is often referred to as DQ8 and is linked to DR4. Presumably, if you are DR4 then you are also DQ8 (as I have suggested above).
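The naming rules above can be sketched as a small parser. This is only an illustration of the notation as described in this article; the function name `parse_hla_genotype` is hypothetical, and it assumes the string uses "/" between the two chromosome copies, "-" between genes, and "*" between gene name and allele.

```python
def parse_hla_genotype(genotype):
    """Split a two-haplotype HLA string into one {gene: allele} dict per chromosome copy."""
    haplotypes = []
    for haplotype in genotype.split("/"):       # "/" separates the two chromosome copies
        alleles = {}
        for locus in haplotype.split("-"):      # "-" separates genes on one chromosome
            gene, allele = locus.split("*")     # "*" separates gene name from allele number
            alleles[gene] = allele
        haplotypes.append(alleles)
    return haplotypes


dr34_dq8 = "DRB1*0301-DQA1*0501-DQB1*0201/DRB1*04-DQA1*0301-DQB1*0302"
for copy_number, haplotype in enumerate(parse_hla_genotype(dr34_dq8), start=1):
    print(f"chromosome copy {copy_number}: {haplotype}")
```

The first copy corresponds to the DR3 haplotype and the second to the DR4 (DQ8) haplotype.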
Figure 2. Chromosome 6 pair representing the gene organization for a DR3/4-DQ8 individual. The genes are transcribed and translated to form the protein complexes below (Figure 3).
Figure 3. Representation of the complexes expressed in a DR3/4-DQ8 cell. Alleles specific to the DR and DQ MHC II complexes increase susceptibility to T1D. These susceptibility alleles are often referred to as DR3 and DR4 (or DQ8). Some hypothesize that heterozygous individuals can form chimeric DQ complexes that have atypical antigen specificity.
DR3/4-DQ8—What’s the risk?
Of people who have T1D, approximately 40% have the DR3/4-DQ8 genotype, compared to 2.4% of the general population (2). If a child is DR3/4-DQ8, the risk of developing diabetes is between 1 in 15 and 1 in 25, compared to 1 in 300 for the general population (6), though it is unclear to me by what age this risk is calculated. Patients with this genotype typically present with the disease at an early age (within childhood), so if you were diagnosed with T1D after childhood, you probably do not have this genotype. However, having part of this genotype can still increase susceptibility: 95% of patients have DR3 or DR4 alleles, compared to ~40% of the U.S. white population (7).
Having recently been diagnosed with T1D and having genotyping data from 23andMe, I wanted to know if I could learn anything about my own genetic susceptibility. It looks like 23andMe used to provide information on T1D susceptibility but probably removed it after issues with the FDA.
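As a rough consistency check, the figures quoted above can be combined with Bayes' rule. This is a back-of-the-envelope sketch, not a clinical estimate: it assumes the 40%, 2.4%, and 1 in 300 figures all describe the same population and the same age horizon, which the cited sources may not guarantee. The variable names are my own.

```python
# Figures quoted in the paragraph above.
p_genotype_given_t1d = 0.40   # ~40% of people with T1D are DR3/4-DQ8 (2)
p_genotype = 0.024            # 2.4% of the general population are DR3/4-DQ8 (2)
p_t1d = 1 / 300               # general-population risk of T1D (6)

# How over-represented the genotype is among patients.
enrichment = p_genotype_given_t1d / p_genotype

# Bayes' rule: P(T1D | genotype) = P(genotype | T1D) * P(T1D) / P(genotype)
p_t1d_given_genotype = p_genotype_given_t1d * p_t1d / p_genotype

print(f"enrichment among patients: {enrichment:.1f}-fold")
print(f"estimated risk for carriers: about 1 in {1 / p_t1d_given_genotype:.0f}")
```

Under these assumptions the estimate comes out at about 1 in 18, which falls inside the reported range of 1 in 15 to 1 in 25.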
However, 23andMe genotypes many positions which are not included in its susceptibility reports. These data can be accessed through the “browse raw data” tab, downloaded, and fed into other programs like Promethease to get more information. The quickest way to determine if an individual is DR3/4-DQ8 is to genotype two positions of the MHC-II locus: rs7454108 and rs2040410 (8). Presumably, this works because alleles are linked, so if you know the genotype at one position you can be reasonably confident of the sequence at neighboring positions (haplotypes). If rs7454108 is C/T then you are DR4 (8). 23andMe doesn’t genotype rs2040410, but you can use a proxy SNP: rs2187668 (9). If rs2187668 is A/G or A/A you are DR3 (8). If you have both (DR3 and DR4) then you are DR3/4-DQ8 (8). These statements are summarized in an algorithm represented in Figure 4. Disclaimer: this algorithm may not be correct; please consult the corresponding references for further information, and please let me know if I missed something.
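The rules in this paragraph can be written down as a short function. This is a minimal sketch of the rules as I have stated them, not a validated test; the names `normalize` and `classify_dr` are hypothetical. It assumes both genotypes are given in the orientation used in the text (C/T for rs7454108, A/G for rs2187668). Counting C/C at rs7454108 as DR4 is my own assumption, since the text only states the C/T case.

```python
def normalize(genotype):
    """Turn 'C/T', 'tc', or 'TC' into a sorted two-letter string such as 'CT'."""
    return "".join(sorted(genotype.upper().replace("/", "")))


def classify_dr(rs7454108, rs2187668):
    """Apply the two-SNP rules: C at rs7454108 tags DR4, A at rs2187668 tags DR3."""
    dr4_genotype = normalize(rs7454108)
    dr3_genotype = normalize(rs2187668)
    has_dr4 = "C" in dr4_genotype   # text states C/T; treating C/C the same is an assumption
    has_dr3 = "A" in dr3_genotype   # text states A/G or A/A
    return {"DR3": has_dr3, "DR4": has_dr4, "DR3/4-DQ8": has_dr3 and has_dr4}


print(classify_dr("C/T", "A/G"))   # carries both tag alleles
print(classify_dr("T/T", "A/G"))   # DR3 tag only
```

Returning the three flags separately keeps partial results (DR3 only or DR4 only) visible rather than collapsing everything into a single yes/no answer.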
The Barker et al. paper that describes these two SNPs (8) has no discussion on using these SNPs for identifying if an individual is homozygous for DR3 or DR4. However, I propose that these SNPs can be used to predict homozygosity for two reasons.
First, of 1,191 DR3/4-DQ8-positive individuals in the T1DGC, 94% were AG/CT (rs2040410/rs7454108) and 4.6% were AA/CT.
Second, DR3/4-DQ8-positive individuals in the British cohort were 87.4% AG/CT and 10% AA/CT, and in the DAISY cohort all positive individuals were AG/CT.
In these cohorts, the CT genotype was always present for DR4 individuals, but the DR3 genotype was either AG or AA. An AA genotype at rs2040410 would be expected to predict DR3 homozygosity, but this does not hold in some of these cohorts (e.g. 10% of DR3/4-DQ8-positive individuals were AA/CT in the British cohort).
Therefore, rs2040410 is not that great for predicting DR3 homozygosity. But the provided algorithm uses a proxy SNP (rs2187668) which seems to be better for predicting homozygosity (10).
Figure 4. A. Algorithm for determining DR3/4-DQ8 genotype from 23andMe data. This chart is largely based on (8). The authors demonstrated that these two SNPs can be used to identify DR3/4-DQ8 individuals. However, they use rs2040410 for DR3, whereas this algorithm uses rs2187668. This is because 23andMe doesn’t genotype rs2040410, but rs2187668 is a proxy SNP for rs2040410 (9). The percentage of individuals with this genotype is reported below (SNPedia, European ancestry). SNPedia uses the complement for rs2187668 (i.e. if 23andMe reports C/C then use G/G, as done here). The heterozygous genotype is CT/AG (rs7454108/rs2187668). If the genotype is A/G for rs2187668 but not C/T for rs7454108 then the individual is DR3/X (~20%), and if rs7454108 is C/T but rs2187668 is not A/G then the individual is DR4/X (~25%), where X is some other DR allele (not DR3 or DR4). B. SNP sequence and corresponding alleles. Percentage of population taken from SNPedia. DRX is some other DR allele.
Interpreting the risk of individual alleles is complicated because some HLA-II alleles seem to play a protective role. There are at least three alleles that seem to protect against T1D: DQB1*0602, DRB1*0403, and DRB1*1401 (11). However, 23andMe does not genotype these positions and/or there is no information regarding these SNPs on SNPedia. Consequently, any protective role of SNPs is left out of the above algorithm.
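For readers working from a downloaded raw data file, the lookup and strand flip described in the caption might look like the sketch below. Several things here are assumptions rather than facts from the references: that the raw file is tab-separated with columns rsid, chromosome, position, genotype and "#" comment lines; that only rs2187668 needs complementing; and the helper names `read_snps` and `label_genotype`. The sample text is made up for illustration, and the positions in it are placeholders.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
FLIP_STRAND = {"rs2187668"}  # per the caption: a reported C/C is read as G/G


def read_snps(raw_text, wanted):
    """Collect sorted genotypes for the wanted rsids from tab-separated raw data text."""
    found = {}
    for line in raw_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        rsid, genotype = fields[0], fields[3]
        if rsid in wanted:
            if rsid in FLIP_STRAND:
                genotype = genotype.translate(COMPLEMENT)
            found[rsid] = "".join(sorted(genotype))
    return found


def label_genotype(snps):
    """Label the cases named in the caption; anything else is left unclassified."""
    is_dr4 = snps.get("rs7454108") == "CT"
    is_dr3 = snps.get("rs2187668") == "AG"
    if is_dr3 and is_dr4:
        return "DR3/4-DQ8"
    if is_dr3:
        return "DR3/X"
    if is_dr4:
        return "DR4/X"
    return "not covered by this sketch"


# Illustrative sample only; positions are placeholders, not real coordinates.
sample = "# rsid\tchromosome\tposition\tgenotype\nrs7454108\t6\t0\tCT\nrs2187668\t6\t0\tCT\n"
snps = read_snps(sample, {"rs7454108", "rs2187668"})
print(snps, label_genotype(snps))
```

Genotypes are sorted so that "GA" and "AG" compare equal, and homozygous cases are deliberately left unclassified because the caption does not spell them out.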
After about ten months on the wait list, I finally met with a genetic counselor to discuss the genetics of T1D and the chance of my potential offspring being DR3/4-DQ8. I learned that, from a clinical perspective, the genetics of T1D is still very much in the research realm, and therefore little can be said about the risk of the disease for my potential offspring.
In her 20 years as a genetic counselor at a well-known university hospital, she has not once seen a T1D patient to discuss the genetics behind the disease. I told her about the SNPs from GWAS and she seemed unimpressed. She informed me that GWAS have not provided anything to genetic counselors yet, and she seemed skeptical about the prospect.
Indeed, since one of the first publications on GWAS in 2007, I cannot find a clear example of it being applied in the clinic. However, this is still a very young field in scientific terms, and I imagine the results of GWAS will start to materialize within the years to come (if not in the clinic at least in the lab).
Nonetheless, I learned from my 23andMe data that I am DR3/DRX, and that I inherited my DR3 allele from my father (he is also DR3/DRX, whereas my mother is DRX/DRX). My wife is DR4/DRX. If I understand correctly, this would mean that the probability of our offspring being DR3/DR4 is 25%. This seems quite high.
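The 25% figure can be checked by enumerating the four possible combinations, as in a Punnett square. This sketch assumes each parent passes on either of their two DR haplotypes with equal probability, independently of the other parent, and that DRX simply stands for any allele other than DR3 or DR4. The function name `offspring_distribution` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_distribution(parent_a, parent_b):
    """Enumerate the equally likely allele pairings from two diploid parents."""
    counts = Counter(tuple(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


distribution = offspring_distribution(("DR3", "DRX"), ("DR4", "DRX"))
for genotype, probability in sorted(distribution.items()):
    print("/".join(genotype), probability)
```

Each of the four outcomes (DR3/DR4, DR3/DRX, DR4/DRX, DRX/DRX) comes out at 1/4, matching the 25% above.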
References
- Redondo, M.J., Jeffrey, J., Fain, P.R., Eisenbarth, G.S. and Orban, T. (2008) Concordance for islet autoimmunity among monozygotic twins. N Engl J Med, 359, 2849-2850.
- Steck, A.K. and Rewers, M.J. (2011) Genetics of type 1 diabetes. Clin Chem, 57, 176-185.
- Barrett, J.C., Clayton, D.G., Concannon, P., Akolkar, B., Cooper, J.D., Erlich, H.A., Julier, C., Morahan, G., Nerup, J., Nierras, C. et al. (2009) Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nat Genet, 41, 703-707.
- Mehers, K.L. and Gillespie, K.M. (2008) The genetic basis for type 1 diabetes. Br Med Bull, 88, 115-129.
- Noble, J.A. and Erlich, H.A. (2012) Genetics of type 1 diabetes. Cold Spring Harb Perspect Med, 2, a007732.
- Rewers, M., Bugawan, T.L., Norris, J.M., Blair, A., Beaty, B., Hoffman, M., McDuffie, R.S., Jr., Hamman, R.F., Klingensmith, G., Eisenbarth, G.S. et al. (1996) Newborn screening for HLA markers associated with IDDM: diabetes autoimmunity study in the young (DAISY). Diabetologia, 39, 807-812.
- Steck, A.K., Armstrong, T.K., Babu, S.R., Eisenbarth, G.S. and Type 1 Diabetes Genetics Consortium (2011) Stepwise or linear decrease in penetrance of type 1 diabetes with lower-risk HLA genotypes over the past 40 years. Diabetes, 60, 1045-1049.
- Barker, J.M., Triolo, T.M., Aly, T.A., Baschal, E.E., Babu, S.R., Kretowski, A., Rewers, M.J. and Eisenbarth, G.S. (2008) Two single nucleotide polymorphisms identify the highest-risk diabetes HLA genotype: potential for rapid screening. Diabetes, 57, 3152-3155.
- Romanos, J. and Wijmenga, C. (2009) Comment on: Barker et al. (2008) Two single nucleotide polymorphisms identify the highest-risk diabetes HLA genotype: Diabetes 57:3152-3155, 2008. Diabetes, 58, e1; author reply e2.
- Monsuur, A.J., de Bakker, P.I., Zhernakova, A., Pinto, D., Verduijn, W., Romanos, J., Auricchio, R., Lopez, A., van Heel, D.A., Crusius, J.B. et al. (2008) Effective detection of human leukocyte antigen risk alleles in celiac disease using tag single nucleotide polymorphisms. PLoS One, 3, e2270.
- Baker, P.R., 2nd and Steck, A.K. (2011) The past, present, and future of genetic associations in type 1 diabetes. Curr Diab Rep, 11, 445-453.
[1] Writing “MHC II complexes” is redundant, as the C in MHC stands for complex, but I want to emphasize the fact that there is more than one MHC II and it’s not clear to me how to pluralize MHC II. Also, using the phrase “MHC II molecules” can be confusing, as an MHC II complex is composed of two protein molecules that come together to form the complex.