Despite the striking variation in the protective efficacy of the BCG vaccine for TB observed among different world populations, the joint impact of host and pathogen variation on novel TB vaccine candidates has never been characterized. Using epitope prediction programs, we investigated the impact of clinical variations in the Mtb72f TB vaccine components, the PPE18 and PepA proteins, on epitope binding to alleles of the Class II HLA DRB1 gene. We identified conserved and unconserved promiscuous epitopes in the PPE18 and PepA proteins using the clinical variants described previously and determined that while 60% of potential CD4+ T-cell epitopes in the PPE18 protein were unconserved, 65% of the actually predicted T-cell epitopes in this protein were unconserved. We furthermore found several DRB1 alleles that were predicted to bind few vaccine epitopes overall and others that were predicted to bind predominantly unconserved or non-TB epitopes, allowing us to identify individual genotypes as well as broader populations in which the Mtb72f vaccine may not offer maximum protection.
Our finding that 60% of potential CD4+ T-cell epitopes in PPE18 are unconserved is consistent with previous research that has found the PPE gene family to be particularly variable among tuberculosis isolates. The significant increase in the proportion of unconserved T-cell epitopes among the actually predicted epitopes compared to the potential epitopes (all nonamer sequences in the vaccine) likely reflects the selective pressure that the immune system places upon the bacterium to alter antigenic protein regions. Based on the clinical protein sequences previously described, no potential epitopes in the PepA protein were classified as unconserved. Although to our knowledge no other studies have compared PepA protein conservation to that of other M. tuberculosis proteins, previous findings that PepA is highly conserved among Mycobacterium species suggest that PepA may be a relatively well-conserved protein. The high conservation of the PepA protein even under selective pressure from the immune response may indicate that little variation in this protein is compatible with protein function, and thus PepA may be a particularly good vaccine target.
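The distinction between potential epitopes (every nonamer window) and conserved or unconserved ones can be illustrated with a minimal Python sketch. It assumes that a nonamer counts as unconserved when it covers at least one residue position that varies among clinical isolates; that rule, the function names, and the toy sequence are illustrative assumptions, not the procedure or data used in this study.

```python
def nonamers(seq):
    """Return (start, peptide) for every 9-residue window in seq (0-based starts)."""
    return [(i, seq[i:i + 9]) for i in range(len(seq) - 8)]


def classify_nonamers(seq, variable_positions):
    """Label each nonamer 'unconserved' if it covers any variable position.

    Assumption: variable_positions holds 0-based residue indices that differ
    in at least one clinical isolate.
    """
    variable = set(variable_positions)
    labels = {}
    for start, peptide in nonamers(seq):
        covers_variant = any(pos in variable for pos in range(start, start + 9))
        labels[(start, peptide)] = "unconserved" if covers_variant else "conserved"
    return labels


# Made-up 12-residue sequence with one hypothetical variable site at index 10.
toy_sequence = "ACDEFGHIKLMN"
labels = classify_nonamers(toy_sequence, variable_positions=[10])
fraction_unconserved = sum(v == "unconserved" for v in labels.values()) / len(labels)
print(fraction_unconserved)
```

In this toy case two of the four windows cover the variable site, so half of the potential epitopes are labeled unconserved.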
We also examined the conservation of promiscuous epitopes. Epitope promiscuity is commonly evaluated in the generation of epitope-based vaccines [22–24], but since they are antigenic, promiscuously binding epitopes should be selected against by a large range of host immune systems. We would therefore expect to find many unconserved promiscuous epitopes. This was found to be the case: seven of the ten promiscuous epitopes predicted for PPE18 were unconserved. However, all five of the promiscuous PepA epitopes were conserved, and thus these epitopes could be particularly good candidates for epitope-based vaccines.
Although the majority of the promiscuous PPE18 epitopes identified were found to be unconserved, it is possible that these epitopes might nevertheless provide protection against the bacterium if they are found in other M. tuberculosis proteins. We therefore investigated whether these epitopes were found in two proteins (PPE19 and PPE60) that are closely related to PPE18. This analysis revealed that only three of the promiscuous epitopes were present in all three proteins. The remaining seven were found only in PPE18.
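Checking whether an epitope occurs in related proteins amounts to an exact substring search, as in the following sketch. The protein names and sequences are invented placeholders, and exact matching is an assumption made for illustration; it is not a description of the tools used here.

```python
def locate_epitopes(epitopes, proteins):
    """Map each epitope to the sorted names of proteins that contain it exactly."""
    return {
        epitope: sorted(name for name, seq in proteins.items() if epitope in seq)
        for epitope in epitopes
    }


# Invented sequences standing in for three related proteins.
toy_proteins = {
    "protein_1": "ACDEFGHIKLMNPQRSTVWY",
    "protein_2": "WWACDEFGHIKLYY",
    "protein_3": "ACDEFGHIKAAA",
}
toy_epitopes = ["ACDEFGHIK", "LMNPQRSTV"]

locations = locate_epitopes(toy_epitopes, toy_proteins)
shared_by_all = [ep for ep, names in locations.items() if len(names) == len(toy_proteins)]
print(shared_by_all)
```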
Our finding that three of the twelve DRB1 alleles selected, DRB1*0301, DRB1*0802, and DRB1*1301, were predicted to bind few total and conserved epitopes in the Mtb72f vaccine suggests that the vaccine might not be maximally protective in people who have one or more of these alleles. By contrast, DRB1*0101 was consistently predicted to bind far more total (median 147) and conserved (median 93) vaccine epitopes than any other DRB1 allele (median 16 total or 9 conserved), suggesting that individuals with this allele might be particularly well protected by the vaccine. DRB1*0101 is a common allele in the US and Belgium, where Phase I clinical trials for the Mtb72f vaccine were conducted, but unfortunately is rarer in Brazil, China, Indonesia, and many other countries where TB is endemic according to the Allele*Frequencies in Worldwide Populations database. Therefore, while the vaccine may be effective in the US and Belgium, it may not be equally effective in other regions.
To generate a list of the alleles of concern in high-burden TB countries, we analyzed all DRB1 alleles in the Allele*Frequencies in Worldwide Populations database that ranked among the three most common alleles in any population in any of the twenty-two countries designated as TB high-burden countries by the WHO. These alleles include DRB1*0301, 0302, 0403, 0411, 0802, 0803, 0807, 1202, 1401, 1403, 1404, 1405, and 1504. Although the epitope predictions for these alleles should be treated with more caution, as predictions for most could be generated by only one or two prediction programs, the importance of these alleles merits their consideration.
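The selection rule described above (any allele that is among the three most common in at least one population) can be sketched as follows. The population names, allele labels, and frequencies are invented placeholders, and the handling of ties is an assumption; none of the values come from the database.

```python
def top_alleles_across_populations(frequencies, top_n=3):
    """Return alleles ranked in the top_n by frequency in at least one population.

    frequencies: {population: {allele: frequency}}.
    Ties are broken by dictionary order because sorted() is stable (an assumption).
    """
    selected = set()
    for allele_freqs in frequencies.values():
        ranked = sorted(allele_freqs, key=allele_freqs.get, reverse=True)
        selected.update(ranked[:top_n])
    return sorted(selected)


# Invented frequencies for placeholder alleles in two placeholder populations.
toy_frequencies = {
    "population_1": {"allele_A": 0.20, "allele_B": 0.15, "allele_C": 0.10,
                     "allele_D": 0.05, "allele_F": 0.01},
    "population_2": {"allele_D": 0.30, "allele_A": 0.12, "allele_E": 0.11,
                     "allele_B": 0.02, "allele_F": 0.01},
}
alleles_of_interest = top_alleles_across_populations(toy_frequencies)
print(alleles_of_interest)
```

Here allele_F is never among the three most common alleles in either population, so it is the only one left out.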
While the number of epitopes capable of binding to a particular MHC allele does not necessarily indicate how strong the immune response will be in that MHC background, the fact that many of these alleles bind very few conserved epitopes is of concern for two reasons. First, the analysis conducted here includes only predictions of epitope binding to the MHC molecule and not the cellular processing that must take place before MHC presentation occurs. Because of this processing, many potential antigenic peptides will not be generated in vivo. For MHC alleles to which few epitopes are predicted to bind, it is possible that none of the potentially binding epitopes will be generated in vivo and thus that no epitopes in the vaccine will be presented on DRB1 proteins. Second, for some DRB1 alleles the number of conserved epitopes predicted to bind is much smaller than the number of unconserved epitopes predicted to bind. In the immune response to a vaccine or pathogen, a single epitope or a small number of epitopes is usually immunodominant even though many epitopes could potentially bind to the MHC alleles in question. For MHC alleles that are predicted to bind many fewer conserved than unconserved epitopes, it is likely that the immune response to the vaccine would be dominated by responses to unconserved epitopes and therefore would not be optimally protective against all M. tuberculosis strains. Most (8/13, ~62%) of the alleles of concern were predicted to bind at least as many unconserved as conserved epitopes, and one was predicted to bind no conserved or unconserved epitopes at all. Only five of the 21 alleles predicted to bind more than four conserved epitopes were predicted to bind at least as many unconserved as conserved epitopes. Individuals with these five alleles (DRB1*1101, 0901, 1402, 1502, and 1602) might also be less efficiently protected by the Mtb72f vaccine.
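The comparison of conserved and unconserved epitope counts per allele can be expressed as a simple filter. The sketch below assumes per-allele counts are already available; the allele labels and counts are invented for illustration and do not reproduce the results reported here.

```python
def dominated_by_unconserved(counts, min_conserved=4):
    """Return alleles with more than min_conserved conserved epitopes whose
    unconserved count is at least as large as their conserved count.

    counts: {allele: (conserved, unconserved)}.
    """
    return sorted(
        allele
        for allele, (conserved, unconserved) in counts.items()
        if conserved > min_conserved and unconserved >= conserved
    )


# Invented (conserved, unconserved) counts for placeholder alleles.
toy_counts = {
    "allele_A": (10, 3),
    "allele_B": (6, 6),
    "allele_C": (5, 9),
    "allele_D": (2, 7),  # too few conserved epitopes to enter this comparison
}
flagged = dominated_by_unconserved(toy_counts)
print(flagged)
```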
Because Mtb72f is a polyprotein containing several non-TB potential epitopes at the junction of the PPE18 and PepA proteins and at the N-terminus of the polyprotein, we investigated whether epitopes present in the Mtb72f vaccine but not in the native PPE18 and PepA proteins might misdirect the immune response upon immunization. Fortunately, most of the alleles of concern noted above did not bind any of the non-TB epitopes in the vaccine, but DRB1*0302, 0403, 0411, and 1301 were each predicted to bind a median of one non-TB epitope. This finding is of particular concern for DRB1*0302, a common allele in South Africa, because it was not predicted to bind any of the protective epitopes (conserved or unconserved) in the vaccine. The DRB1-mediated immune response to Mtb72f in people with this allele might therefore be misdirected against epitopes that are found in the vaccine but not in TB, so this portion of the immune response would not be protective. As South Africa is one of the sites for the Phase II clinical trials of the Mtb72f vaccine, it would be beneficial to collect immunological data from vaccine recipients in order to determine whether the vaccine is effective in persons with the DRB1*0302 allele.
While this analysis should be useful for assessing population coverage of the Mtb72f vaccine, it is important to recognize the limitations inherent in the use of epitope prediction programs. No epitope prediction program is perfectly accurate, and there is substantial disagreement among the epitopes predicted by each program [14, 15]. We increased the accuracy of our analysis by using multiple prediction programs for each DRB1 allele when possible, but there are nevertheless likely to be differences between the predicted and actual epitopes for many alleles. However, as the programs agreed well on which alleles were predicted to bind relatively many or few vaccine epitopes, the lists of alleles and populations of concern would probably not be substantially changed by improvements in the prediction programs or by experimental confirmation.
Although we compared epitope predictions across eight different programs, these programs are not perfectly comparable. Several of the programs, such as Vaxign, provided information on peptide affinity predictions only for epitopes predicted to bind each allele. As we could not obtain affinity information for peptides predicted not to bind, we were unable to evaluate the binding threshold to which each of these programs was automatically set. Furthermore, the binding cutoffs used by several of the programs likely differ from the IC50 ≤ 500 nM cutoff that we imposed on programs generating IC50 predictions. However, as the number of epitopes predicted by programs with non-IC50-based cutoffs generally, though not always, fell within the range of programs that did use the IC50 cutoff, it is unlikely that the differing thresholds among programs severely skewed the results.
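For programs that report IC50 values, the 500 nM cutoff and a per-allele median across programs can be sketched as below. The program names, peptides, and IC50 values are invented, and summarizing by the median number of binders across programs is an assumption made for illustration.

```python
from statistics import median

IC50_CUTOFF_NM = 500.0  # binders are peptides with predicted IC50 <= 500 nM


def predicted_binders(ic50_by_peptide, cutoff_nm=IC50_CUTOFF_NM):
    """Return the set of peptides whose predicted IC50 is at or below the cutoff."""
    return {pep for pep, ic50 in ic50_by_peptide.items() if ic50 <= cutoff_nm}


def median_binder_count(predictions_by_program, cutoff_nm=IC50_CUTOFF_NM):
    """Median number of predicted binders across programs for one allele."""
    return median(
        len(predicted_binders(preds, cutoff_nm))
        for preds in predictions_by_program.values()
    )


# Invented IC50 predictions (nM) from two hypothetical programs for one allele.
toy_predictions = {
    "program_x": {"PEPTIDEAA": 120.0, "PEPTIDEBB": 500.0, "PEPTIDECC": 900.0},
    "program_y": {"PEPTIDEAA": 50.0, "PEPTIDEBB": 2000.0, "PEPTIDECC": 700.0},
}
print(median_binder_count(toy_predictions))
```

Programs that report only a binary bind or no-bind call cannot be passed through this filter, which is the comparability problem noted above.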
Finally, it should be noted that the HLA allele frequencies in different populations that were used to generate the list of alleles of concern were in some cases derived from only a few small studies. It is likely that further study of HLA genotypes would alter our predictions of which populations might be more or less effectively protected by the Mtb72f vaccine.