Bos taurus genome sequence reveals the assortment of immunoglobulin and surrogate light chain genes in domestic cattle
© Ekman et al; licensee BioMed Central Ltd. 2009
Received: 11 December 2008
Accepted: 30 April 2009
Published: 30 April 2009
The assortment of cattle immunoglobulin and surrogate light chain genes has been extracted from version 3.1 of the Bos taurus genome sequence as part of an international effort to sequence and annotate the bovine genome.
63 variable lambda chain and 22 variable kappa chain genes were identified and phylogenetically assigned to 8 and 4 subgroups, respectively. The specified phylogenetic relationships are compatible with the established ruminant light chain variable gene families or subgroups. Because of gaps and uncertainties in the assembled genome sequence, the number of genes might change in future versions of the genome sequence. In addition, three bovine surrogate light chain genes were identified. The corresponding cDNAs were cloned and the expression of the surrogate light chain genes was demonstrated in fetal material.
The bovine kappa gene locus is compact and simple, which may reflect the preferential use of the lambda chain in cattle. The relative orientation of variable and joining genes in both loci is consistent with a deletion mechanism in VJ joining. The orientation of some variable genes cannot be determined from the data available. The number of functional variable genes is moderate when compared to man or mouse. Thus, post-recombinatorial mechanisms might contribute to the generation of the bovine pre-immune antibody repertoire. The heavy chains probably contribute more to recombinational immunoglobulin repertoire diversity than the light chains, but the heavy chain locus could not be annotated from version 3.1 of the Bos taurus genome.
Immunoglobulins are the molecular mediators of the adaptive humoral immune response in jawed vertebrates. Somatic recombination during B lymphoid differentiation is required for immunoglobulin expression . In the germline state, the genes encoding the variable (V), diversity (D) and joining (J) segments are dispersed across a wide genomic stretch. A process called V(D)J joining brings together one gene of each segment type and thereby creates the second exon of a transcriptionally competent immunoglobulin gene. The recombination machinery consists of the two recombination activating gene products, RAG1 and RAG2, as well as various other proteins, reviewed in . The cis-acting recombination signal sequences (RSSs) target the recombination machinery to the correct genomic site. Each RSS consists of heptamer and nonamer motifs flanking a 12 or 23 bp long central spacer. In the rearranging locus, two double strand DNA breaks, separated by a variable distance, are introduced next to one 12 bp and one 23 bp RSS. The nascent non-homologous DNA ends are joined into a coding joint in the middle of the recombined gene. The DNA fragment between the breaks is either deleted or inverted depending on the relative orientation of the recombining genes.
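The pairing of one 12 bp with one 23 bp RSS, and the dependence of the outcome on relative gene orientation, can be sketched in a few lines of Python. This is only an illustration and not part of the original analysis: the `Segment` type and the function names are hypothetical, and the sketch assumes the simplified rule that genes in the same transcriptional orientation are joined by deletion and genes in opposite orientations by inversion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A rearranging gene segment (hypothetical illustration type)."""
    name: str
    strand: int       # +1 or -1: transcriptional orientation on the chromosome
    rss_spacer: int   # spacer length of the flanking RSS in bp (12 or 23)


def obeys_12_23_rule(a: Segment, b: Segment) -> bool:
    """Recombination pairs one 12 bp-spacer RSS with one 23 bp-spacer RSS."""
    return {a.rss_spacer, b.rss_spacer} == {12, 23}


def joining_outcome(a: Segment, b: Segment) -> str:
    """Predict the fate of the DNA between the two double strand breaks."""
    if not obeys_12_23_rule(a, b):
        return "no recombination"
    # Assumed simplification: same orientation -> deletion, opposite -> inversion.
    return "deletion" if a.strand == b.strand else "inversion"
```

For example, `joining_outcome(Segment("V", 1, 23), Segment("J", 1, 12))` evaluates to `"deletion"`, whereas flipping the strand of either segment gives `"inversion"`.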
The immunoglobulin heavy chain and light chain rearrangements in many species are temporally separated during B cell development. In mouse and man but not in chicken, a population of cells can be demonstrated that has undergone rearrangement only in the immunoglobulin heavy chain locus [3, 4]. A surrogate light chain (SLC) is temporarily expressed at this stage of B cell development . SLC is composed of two polypeptides, VPREB and IGLL1, that are homologous to the variable and the constant domain of the immunoglobulin light chain, respectively . In mice, three VPREB paralogues, VPREB1, VPREB2 and VPREB3, have been described [7, 8]. The IGLV-like VPREB2 is missing from the human genome. Gene targeting studies demonstrate the role of SLC genes in the production of B cells .
The genome sequence of Bos taurus permits for the first time a direct estimate of the size of the immunoglobulin light chain gene pool in domestic cattle, one of the most important farm animal species. We have characterized the structure and composition of the bovine immunoglobulin and surrogate light chain gene loci as part of a community effort to annotate the version 3.1 assembly of the Bos taurus genome sequence .
The bovine immunoglobulin lambda (λ) chain locus is located on chromosome 17. In version 3.1 of the genome sequence (Btau_3.1), 63 variable, 3 joining and 5 constant genes could be identified in 10 scaffolds. 25 λ variable genes (ca. 41%) fulfilled the criteria for classification as potentially functional (see Methods and Additional file 1).
Three immunoglobulin lambda joining and five immunoglobulin lambda constant genes were identified (Additional file 1). Two of the J-C gene pairs form apparently functional units. IGLC1 and IGLC2 have identical coding sequences but differ in the 3' UTR. Chen et al.  described four IGLC genes, which correspond to IGLC2-IGLC5 in this paper.
The cDNA and genomic DNA sequence analysis of the surrogate light chain genes revealed several single nucleotide differences in comparison with the reference genomic sequence (Additional file 5). Therefore, it seems that the bovine surrogate light chain genes are polymorphic.
In this paper, we have presented the analysis of the immunoglobulin and surrogate light chain gene assortment extracted from the Bos taurus genome sequence Btau_3.1 . Btau_3.1 is based almost entirely on a whole genome shotgun sequence from a single animal (L1 Dominette 01449) with a 30% inbreeding coefficient [10, 17]. This facilitates the analysis of immunoglobulin genes, which in mixed databases is greatly complicated by gene polymorphism and targeted somatic mutations . Most of the functional light chain genes have probably been included in our gene set, although the exact number of genes is likely to change in future genome versions. 32 λ variable genes were in genomic contigs not assigned to a specific chromosomal location and might include orphons.
An interspecies comparison suggests ruminant-specific adaptations:
Table: Characteristics of CDR1 and CDR2 in the variable regions of bovine, mouse and human light chains. Rows given for both the lambda (IGLV) and the kappa (IGKV) chain variable regions: CDR1 length (amino acids), CDR2 length (amino acids) and unique CDR1/CDR2 pairs.
(3.) The phylogenetic analysis suggests that most of the potentially functional λ genes belong to a single subgroup (subgroup 1, see Additional file 6) that is not apparent in the human or in the mouse genome but is present in the sheep genome. This subgroup comprises 21 variable genes, of which 16 are potentially functional. The CDR1 is either 8 or 9 amino acids long with a characteristic hydrophobic residue at position 30. Based on similarities in primary sequence, the CDR1 structures among the members of subgroup 1 correspond most closely to the canonical loop 1 structures 1 and 2 found in λ chain variable regions . CDR2 is 3 amino acids long and probably adopts a hairpin structure commonly found in the CDR2 of λ and κ light chains . It remains to be seen whether the CDRs actually adopt any of the established canonical immunoglobulin structures. No high resolution structures are currently available for bovine immunoglobulins in the PDB archives .
(4.) The apparent expansion of the pseudogene subgroup 5 is intriguing, although the reasons behind it are currently elusive. Twelve of the 13 subgroup members share an identical stop codon in framework 3.
The data on the overall organization of the bovine λ chain locus is still quite fragmentary (figure 4). It could resemble the human locus, which displays a 900 kb long upstream region of 73 to 74 variable genes followed by 7 to 11 pairs of joining and constant genes all in one transcriptional orientation . However, recombination using inversion cannot be ruled out in the bovine λ chain locus at present. In contrast to what is found in man and cattle, the murine λ chain locus is much reduced in size (only about 240 kb) and contains two small clusters of different immunoglobulin lambda chain genes [reviewed in ].
The heavy chain locus could not be annotated as most of it is missing from Btau_3.1. The available data on the light chain loci suggest that a moderate number of potentially functional light chain genes exist in the bovine genome. Although the heavy chains add more to the recombinatorial diversity of immunoglobulins than the light chains, post-recombinatorial mechanisms might also contribute to a full-blown bovine preimmune repertoire. The relative importance of V(D)J recombination for the generation of the preimmune repertoire in ruminants is currently controversial [14, 18]. In late fetal and neonatal sheep, however, the repertoire is expanded by somatic hypermutation in the ileal Peyer's patch [12, 13].
Surrogate light chain (SLC) is needed to expand the H+L- cell population in species in which heavy and light chain genes are sequentially rearranged. This ensures that a sufficient number of cells productively rearrange both loci . The expression of SLC genes in the bovine fetal tissues (figure 3) confirms their functionality. The data presented in this paper do not permit further conclusions on the role of SLC genes in cattle. Nevertheless, analyses of serial sections by immunohistochemistry have revealed specific sites in the bovine fetus where there are no light chain positive cells but which still contain heavy chain positive cells (Ekman and Iivanainen, unpublished).
This study describes the bovine assortment of immunoglobulin and surrogate light chain genes based on Btau_3.1. A large fraction of the potentially functional variable genes belong to subgroups that are shared between cattle and sheep but not found in man or in mouse. The number of functional light chain variable genes in Btau_3.1 is moderate in comparison with the corresponding number in the human or mouse genomes. The new data on the immunoglobulin light chain genes provide novel insight into the humoral immune system of ruminants and should facilitate the development of vaccines and other therapeutic tools against cattle-specific infectious diseases.
Gene identification and annotation
An iterative BLAST search against the bovine genomic sequence database was performed via the Ensembl genome browser . The initial query sequences were bovine light chain variable gene encoded cDNAs with frequent matches in the dbEST database at the National Center for Biotechnology Information . Genome-wide annotation evidence based on Swiss-Prot, TrEMBL and various other databases at GenBank, EMBL and DDBJ was provided by The Wellcome Trust Sanger Institute  and by the Bovine Genome Database . Annotation of the genomic sequence and its comparison against the various evidence entries were carried out using Apollo , Otterlace  and BLAST .
Functional and phylogenetic analyses of genes
Sequence extractions were done in the European Molecular Biology Open Software Suite . The extracted genes were further analyzed using the following criteria: (a) an uninterrupted open reading frame, (b) consensus splice sites at exon/intron boundaries, (c) the presence of the four conserved framework residues C23, W41, L89 and C104 for the variable and constant genes, and of the F/W-G-X-G motif for the joining genes , and (d) a likely functional recombination signal sequence. In functional recombination assays, the spacer length and the three outermost nucleotides of the heptamer have been shown to be the most critical parameters for efficient recombination .
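Criteria (a) to (d) for a variable gene can be expressed as a small set of checks. The sketch below is a hypothetical illustration, not the procedure actually used for the annotation: all function names are invented here, the splice site test assumes canonical GT...AG intron boundaries, the conserved residues are assumed to be supplied already numbered by IMGT position, and the RSS test is reduced to the two parameters named above, assuming that the heptamer is written starting from the coding end and that its first three nucleotides match the consensus CAC.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Conserved framework residues named in the text, keyed by IMGT position.
CONSERVED_RESIDUES = {23: "C", 41: "W", 89: "L", 104: "C"}


def has_uninterrupted_orf(coding_seq: str) -> bool:
    """Criterion (a): no in-frame stop codon (frame assumed to start at index 0)."""
    seq = coding_seq.upper()
    usable = len(seq) - len(seq) % 3
    return all(seq[i:i + 3] not in STOP_CODONS for i in range(0, usable, 3))


def has_consensus_splice_sites(intron: str) -> bool:
    """Criterion (b): intron begins with GT and ends with AG (assumed consensus)."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def has_conserved_residues(residue_at: dict[int, str]) -> bool:
    """Criterion (c): C23, W41, L89 and C104 are present at their IMGT positions."""
    return all(residue_at.get(pos) == aa for pos, aa in CONSERVED_RESIDUES.items())


def has_plausible_rss(heptamer: str, spacer_length: int) -> bool:
    """Criterion (d), simplified: 12 or 23 bp spacer and a heptamer starting with CAC."""
    return spacer_length in (12, 23) and heptamer.upper().startswith("CAC")


def is_potentially_functional(coding_seq: str, intron: str,
                              residue_at: dict[int, str],
                              heptamer: str, spacer_length: int) -> bool:
    """A gene is flagged as potentially functional only if all four checks pass."""
    return (has_uninterrupted_orf(coding_seq)
            and has_consensus_splice_sites(intron)
            and has_conserved_residues(residue_at)
            and has_plausible_rss(heptamer, spacer_length))
```

Keeping the four checks as separate functions makes it easy to report which criterion a pseudogene candidate fails, for instance an in-frame stop codon versus a defective RSS.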
Multiple alignments of genomic sequences corresponding to regions spanning from FR1 up to but excluding CDR3  were performed using a global alignment strategy in the MAFFT package, version 6.603b . Evolutionary distances were computed and phylogenetic trees constructed in PHYLIP, version 3.67 , using the F84 model for nucleotide substitution and the neighbor-joining algorithm, respectively. The reliability of the tree topologies was evaluated using the bootstrap test (n = 1000) in PHYLIP. The consensus tree was calculated using majority rule in the Consense consensus tree program in PHYLIP.
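The resampling step that underlies a bootstrap test can be illustrated as follows. This is a minimal sketch of the general principle (alignment columns drawn with replacement to build pseudo-replicate alignments), not a reimplementation of the PHYLIP programs; the function names and the dictionary representation of an alignment are assumptions made for this example. Each replicate would then be passed through the same distance and tree-building steps, and the resulting trees summarised by majority rule.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Return one pseudo-replicate: columns sampled with replacement."""
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # The same sampled column indices are applied to every sequence,
    # so each resampled column is an intact column of the original alignment.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def bootstrap_replicates(alignment: dict[str, str], n: int = 1000,
                         seed: int = 0) -> list[dict[str, str]]:
    """Generate n pseudo-replicate alignments (n = 1000 in the text)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n)]
```

A fixed seed is used here only to make the illustration reproducible.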
Since the complete gene pool is not available, ad hoc gene names are used in this paper. The variable gene families or subgroups identified in cattle  and in sheep [12–15] are used where the phylogenetic analyses indicate a close relationship. Furthermore, a nucleotide sequence identity matrix for the gene region corresponding to FR1–FR3 (i.e., amino acids 1 to 104 in the IMGT numbering system ) was calculated from globally aligned sequences using the BioEdit Sequence Alignment Editor v. 7.0.9 . The truncated or incomplete genes IGLV59, IGLV61, IGLV62 and IGLV63 were excluded from the initial alignment. They were subsequently assigned to the respective subgroups by phylogenetic analysis in PHYLIP, based on alignments using the local alignment strategy in the MAFFT package (Additional file 1).
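A pairwise identity matrix of the kind described above can be computed from an existing global alignment with a few lines of code. The sketch below is an assumption-laden illustration and does not reproduce the exact calculation performed by BioEdit: here, identity is defined as the percentage of matching positions among all aligned columns in which at least one of the two sequences has a residue, so a gap opposite a nucleotide counts as a mismatch. The function names are hypothetical.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity between two rows of the same alignment."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap and y == gap:
            continue  # column is empty in both sequences: ignore it
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(alignment: dict[str, str]) -> dict[str, dict[str, float]]:
    """Symmetric matrix of pairwise percent identities, keyed by gene name."""
    names = list(alignment)
    return {p: {q: percent_identity(alignment[p], alignment[q]) for q in names}
            for p in names}
```

With this definition, `percent_identity("ACGT", "ACGA")` is 75.0. Other conventions, for example ignoring every gapped column, would give slightly different values for sequences of unequal length.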
Cloning and expression analysis of the surrogate light chain genes
Bovine fetal material was obtained from a local slaughterhouse. The use of animal tissues was approved by the local animal welfare authorities. Total RNA was isolated from muscle, thymus, liver, spleen, lymph node and bone marrow of fetuses at 135, 175, 190, 210 and 230 days of gestational age . 50–400 mg of frozen tissue was crushed with a mortar, suspended in Eurozol RNA extraction reagent (Euroclone) and homogenized using a Polytron PT1200 homogenizer (Kinematica AB) with a 5 mm cutter. The extraction procedure was carried out according to the manufacturer's instructions. RNA was further purified by precipitation with 2.5 M LiCl (Sigma) and dissolved in water. Prior to reverse transcription, RNA was treated with RQ1 DNase (Promega) to remove possible genomic DNA contamination. In the reverse transcription reaction, 20 pmol of oligo(dT) primer was added to 1 μg of total RNA, and RevertAid M-MuLV reverse transcriptase (Fermentas) was used according to the manufacturer's instructions. RiboLock ribonuclease inhibitor (Fermentas) was added to the reaction.
Gene specific primers used in this study
The expression of the VPREB1, VPREB3 and IGLL1 surrogate light chain genes was confirmed by RT-PCR using the following RNA preparations (age in gestational days): bone marrow (135d, 175d, 190d, 210d, 230d), liver (135d, 175d, 190d, 210d, 230d), lymph node (190d, 210d, 230d), muscle (135d, 190d, 210d, 230d), spleen (135d, 175d, 190d, 210d, 230d), and thymus (135d, 175d, 190d, 210d, 230d). Expression of the housekeeping gene GAPDH was used to monitor the variation in RNA quality and quantity. GAPDH-specific control RT-PCRs without reverse transcriptase did not yield any products (not shown). For primers, see table 2.
CDR: complementarity determining region
RSS: recombination signal sequence
IGLV: immunoglobulin lambda variable
IGLJ: immunoglobulin lambda joining
IGLC: immunoglobulin lambda constant
IGKV: immunoglobulin kappa variable
IGKJ: immunoglobulin kappa joining
IGKC: immunoglobulin kappa constant
VPREB: pre-B lymphocyte gene
IGLL: immunoglobulin lambda-like polypeptide
SLC: surrogate light chain
RAG: recombination activating gene
GAPDH: glyceraldehyde phosphate dehydrogenase
KDE: kappa deleting element
This study was supported by grants from The Academy of Finland (122540/2007 to AI), The Research Funds of The University of Helsinki (914/51/2006 to AI) and Finnish Ministry of Agriculture and Forestry (828/312/2009 to AI). The authors thank Kirsti Sihto, DVM for help in collecting the fetal material, and Kirsi Lahti and Tuire Pankasalo for expert technical assistance.
- Tonegawa S: Somatic generation of antibody diversity. Nature. 1983, 302: 575-581.
- Sekiguchi J, Alt F, Oettinger M: The mechanism of V(D)J recombination. Molecular biology of B cells. Edited by: Honjo T, Alt F, Neuberger M. 2003, Elsevier, 61-82.
- Kearney JF, Won WJ, Benedict C, Moratz C, Zimmer P, Oliver A, Martin F, Shu F: B cell development in mice. Int Rev Immunol. 1997, 15: 207-41.
- Weill JC, Reynaud CA: Galt versus bone marrow models of B cell ontogeny. Dev Comp Immunol. 1998, 22: 379-385.
- Ogawa M, ten Boekel E, Melchers F: Identification of CD19(-)B220(+)c-Kit(+)Flt3/Flk-2(+)cells as early B lymphoid precursors before pre-B-I cells in juvenile mouse bone marrow. Int Immunol. 2000, 12: 313-324.
- Kerr WG, Cooper MD, Feng L, Burrows PD, Hendershot LM: Mu heavy chains can associate with a pseudo-light chain complex (psi L) in human pre-B cell lines. Int Immunol. 1989, 1: 355-61.
- Kudo A, Melchers F: A second gene, VpreB in the lambda 5 locus of the mouse, which appears to be selectively expressed in pre-B lymphocytes. EMBO J. 1987, 6: 2267-72.
- Shirasawa T, Ohnishi K, Hagiwara S, Shigemoto K, Takebe Y, Rajewsky K, Takemori T: A novel gene product associated with mu chains in immature B cells. EMBO J. 1993, 12: 1827-1834.
- Shimizu T, Mundt C, Licence S, Melchers F, Mårtensson IL: VpreB1/VpreB2/lambda 5 triple-deficient mice show impaired B cell development but functional allelic exclusion of the IgH locus. J Immunol. 2002, 168: 6286-6293.
- The Bovine Genome Sequencing and Analysis Consortium: The Genome Sequence of Taurine Cattle: A window to ruminant biology and evolution. Science. 2009, 324: 522-
- Sinclair MC, Gilchrist J, Aitken R: Molecular characterization of bovine V lambda regions. J Immunol. 1995, 155: 3068-3078.
- Reynaud CA, Mackay CR, Müller RG, Weill JC: Somatic generation of diversity in a mammalian primary lymphoid organ: the sheep ileal Peyer's patches. Cell. 1991, 64: 995-1005.
- Reynaud CA, Garcia C, Hein WR, Weill JC: Hypermutation generating the sheep immunoglobulin repertoire is an antigen-independent process. Cell. 1995, 80: 115-125.
- Reynaud CA, Dufour V, Weill JC: Generation of diversity in mammalian gut-associated lymphoid tissues: restricted V gene usage does not preclude complex V gene organization. J Immunol. 1997, 159: 3093-3095.
- Hein W, Dudler L: Diversity of Ig light chain variable region gene expression in fetal lambs. Int Immunol. 1998, 10: 1251-1259.
- Chen L, Li M, Li Q, Yang X, An X, Chen Y: Characterization of the bovine immunoglobulin lambda light chain constant IGLC genes. Vet Immunol Immunopathol. 2008, 124: 284-294.
- Bovine Genome Project. [http://www.hgsc.bcm.tmc.edu/projects/bovine]
- Jenne CN, Kennedy LJ, McCullagh P, Reynolds JD: A new model of sheep Ig diversification: shifting the emphasis toward combinatorial mechanisms and away from hypermutation. J Immunol. 2003, 170: 3739-3750.
- Butler JE: Immunoglobulin gene organization and the mechanism of repertoire development. Scand J Immunol. 1997, 45: 455-462.
- The international ImMunoGeneTics information system®. [http://imgt.cines.fr]
- Lefranc MP, Pommié C, Ruiz M, Giudicelli V, Foulquier E, Truong L, Thouvenin-Contet V, Lefranc G: IMGT unique numbering for immunoglobulin and T cell receptor variable domains and Ig superfamily V like domains. Dev Comp Immunol. 2003, 27: 55-77.
- Al-Lazikani B, Lesk AM, Chothia C: Standard conformations for the canonical structures of immunoglobulins. J Mol Biol. 1997, 273: 927-948.
- RCSB Protein databank. [http://www.rcsb.org]
- Kawasaki K, Minoshima S, Nakato E, Shibuya K, Shintani A, Schmeits JL, Wang J, Shimizu N: One-megabase sequence analysis of the human immunoglobulin lambda gene locus. Genome Res. 1997, 7: 250-261.
- Lefranc M-P, Lefranc G: Immunoglobulin Lambda (IGL) Genes of Human and Mouse. Molecular Biology of B Cells. Edited by: Honjo T, Alt FW, Neuberger M. 2004, London. Elsevier Academic Press, 37-59.
- Kawasaki K, Minoshima S, Nakato E, Shibuya K, Shintani A, Asakawa S, Sasaki T, Klobeck HG, Combriato G, Zachau HG, Shimizu N: Evolutionary dynamics of the human immunoglobulin kappa locus and the germline repertoire of the Vkappa genes. Eur J Immunol. 2001, 31: 1017-1028.
- Zachau HG: The immunoglobulin κ genes and the κ locus of the mouse. [http://biochemie.web.med.uni-muenchen.de/zachau/kappa]
- Zachau HG: Immunoglobulin κ genes of Human and Mouse. Molecular Biology of B Cells. Edited by: Honjo T, Alt FW, Neuberger M. 2004, London. Elsevier Academic Press, 27-36.
- Röschenthaler F, Kirschbaum T, Heim V, Kirschbaum V, Schäble KF, Schwendinger J, Zocher I, Zachau HG: The 5' part of the mouse immunoglobulin kappa locus. Eur J Immunol. 1999, 29: 2065-2071.
- Thiebe R, Schäble KF, Bensch A, Brensing-Küppers J, Heim V, Kirschbaum T, Mitlöhner H, Ohnrich M, Pourrajabi S, Röschenthaler F, Schwendinger J, Wichelhaus D, Zocher I, Zachau HG: The variable genes and gene families of the mouse immunoglobulin kappa locus. Eur J Immunol. 1999, 29: 2072-2081.
- Klobeck HG, Zachau HG: The human CK gene segment and the kappa deleting element are closely linked. Nucleic Acids Res. 1986, 14: 4591-4603.
- Moore MW, Durdik J, Persiani DM, Selsing E: Deletions of kappa chain constant region genes in mouse lambda chain-producing B cells involve intrachromosomal DNA recombinations similar to V-J joining. Proc Natl Acad Sci USA. 1985, 82: 6211-6215.
- Siminovitch KA, Bakhshi A, Goldman P, Korsmeyer SJ: A uniform deleting element mediates the loss of kappa genes in human B cells. Nature. 1985, 316: 260-262.
- Mårtensson IL, Keenan RA, Licence S: The pre-B-cell receptor. Curr Opin Immunol. 2007, 19: 137-142.
- The Ensembl project. [http://www.ensembl.org]
- The National Center for Biotechnology Information. [http://www.ncbi.nlm.nih.gov]
- The Wellcome Trust Sanger Institute. [http://www.sanger.ac.uk]
- The Bovine Genome Database. [http://www.bovinegenome.org]
- Lewis SE, Searle SM, Harris N, Gibson M, Iyer V, Richter J, Wiel C, Bayraktaroglu L, Birney E, Crosby MA, Kaminker JS, Matthews BB, Prochnik SE, Smith CD, Tupy JL, Rubin GM, Misra S, Mungall CJ, Clamp ME: Apollo: a sequence annotation editor. Genome Biol. 2002, 3: RESEARCH0082-
- Searle SM, Gilbert J, Iyer V, Clamp M: The otter annotation system. Genome Res. 2004, 14: 963-970.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410.
- Rice P, Longden I, Bleasby A: EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 2000, 16: 276-277.
- Akamatsu Y, Tsurushita N, Nagawa F, Matsuoka M, Okazaki K, Imai M, Sakano H: Essential residues in V(D)J recombination signals. J Immunol. 1994, 153: 4520-4529.
- Katoh K, Misawa K, Kuma K, Miyata T: MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002, 30: 3059-3066.
- Felsenstein J: PHYLIP – Phylogeny Inference Package (Version 3.2). Cladistics. 1989, 5: 164-166.
- Hall T: BioEdit Sequence Alignment Editor. [http://www.mbio.ncsu.edu/BioEdit/bioedit.html]
- Rüsse I: Rind. Lehrbuch der Embryologie der Haustiere. Edited by: Rüsse I. 1991, Berlin: Parey, 159-168.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.