Conserved cryptic recombination signals in Vκ gene segments are cleaved in small pre-B cells

Background The cleavage of recombination signals (RS) at the boundaries of immunoglobulin V, D, and J gene segments initiates the somatic generation of the antigen receptor genes expressed by B lymphocytes. RS contain a conserved heptamer and nonamer motif separated by non-conserved spacers of 12 or 23 nucleotides. Under physiologic conditions, V(D)J recombination follows the "12/23 rule" to assemble functional antigen-receptor genes, i.e., cleavage and recombination occur only between RS with dissimilar spacer types. Functional, cryptic RS (cRS) have been identified in VH gene segments; these VH cRS were hypothesized to facilitate self-tolerance by mediating VH → VHDJH replacements. At the Igκ locus, however, secondary, de novo rearrangements can delete autoreactive VκJκ joins. Thus, under the hypothesis that V-embedded cRS are conserved to facilitate self-tolerance by mediating V-replacement rearrangements, there would be little selection for Vκ cRS. Recent studies have demonstrated that VH cRS cleavage is only modestly more efficient than V(D)J recombination in violation of the 12/23 rule and first occurs in pro-B cells unable to interact with exogenous antigens. These results are inconsistent with a model of cRS cleavage during autoreactivity-induced VH gene replacement. Results To test the hypothesis that cRS are absent from Vκ gene segments, a corollary of the hypothesis that the need for tolerizing VH replacements is responsible for the selection pressure to maintain VH cRS, we searched for cRS in mouse Vκ gene segments using a statistical model of RS. Scans of 135 mouse Vκ gene segments revealed highly conserved cRS that were shown to be cleaved in the 103/BCL2 cell line and mouse bone marrow B cells. Analogous to results for VH cRS, we find that Vκ cRS are conserved at multiple locations in Vκ gene segments and are cleaved in pre-B cells. 
Conclusion Our results, together with those for VH cRS, support a model of cRS cleavage in which cleavage is independent of BCR-specificity. Our results are inconsistent with the hypothesis that cRS are conserved solely to support receptor editing. The extent to which these sequences are conserved, and their pattern of conservation, suggest that they may serve an as yet unidentified purpose.


Background
The ability to mount specific immune responses depends on a highly diverse repertoire of T- and B-cell antigen-receptor molecules. The genetic diversity required for millions of distinct antigen receptors is created by the somatic recombination and fusion of individual variable (V), diversity (D), and joining (J) gene segments in a process known as V(D)J recombination. During V(D)J recombination, genomic DNA is cleaved at the boundaries of individual V, D, and J gene segments and the intervening DNA removed or inverted; subsequently, the newly apposed gene segments are ligated to form the variable region exon of one of the four types of antigen-receptor genes (reviewed in [1]). These recombination events are mediated by RAG-1 and RAG-2 in the form of a V(D)J recombinase holoenzyme that is directed to proper sites of cleavage by DNA motifs known as recombination signals (RS). RS are located at the boundaries of V, D, and J gene segments and defined by highly conserved heptamer and less well-conserved nonamer sequences that are separated by non-conserved spacer regions 12 or 23 base pairs (bp) in length [2][3][4][5]. Under physiologic conditions, V(D)J recombination follows the "12/23 rule" to assemble functional antigen-receptor genes, i.e., cleavage and recombination occur only between RS with dissimilar spacer types.
Previously, we conducted a global analysis of cRS across mouse V H gene segments using a computational algorithm to predict the location and functional activity of V H cRS; these predictions were then tested using ligation-mediated PCR (LM-PCR) to detect V H cRS cleavage in purified populations of mouse B-lineage cells recovered from murine bone marrow [4,25]. We discovered that not only are cRS conserved at sites distributed throughout V H gene segments but also that V H cRS are cleaved only during the pro-B cell stage of development [25]. Both results are inconsistent with the paradigmatic view that functional V H cRS are maintained to facilitate the rescue of autoreactive B cells that would otherwise be lost to the mechanisms of self-tolerance [7,9,22,23,24]. Our results suggested to us that V H cRS may be conserved for other reasons [25].
In contrast to receptor editing via V H replacement, receptor editing at the Igκ locus takes the form of either secondary, de novo Vκ → Jκ rearrangements that replace or invert primary VκJκ joins [26][27][28][29][30] or, more rarely, inactivating rearrangements with cRS that flank the Cκ exon [21]. Secondary, de novo rearrangements are not only possible at the Igκ locus but highly efficient because of the locus's organization: Vκ gene segments are associated with 12-RS while Jκ gene segments are associated with 23-RS, removing the need for a D gene segment and allowing repeated, direct VκJκ rearrangements; Vκ genes are present in both orientations, resulting in many inversion rearrangements and conserving Vκ gene segments that lie between the rearranging Vκ and Jκ gene segments for subsequent rearrangements; and the possibility of rearrangement at the Igλ locus further increases the opportunity for editing.
A corollary of the argument that V H cRS are conserved to provide a mechanism for secondary rearrangement at the Igh locus [7,9] is that cRS would not be conserved within Vκ gene segments. Thus far, however, there have been no systematic attempts to search for cRS within Vκ gene segments, to determine the extent of Vκ cRS conservation, or to determine whether they are functional. Previous work searched Vκ sequence alignments for partial heptamer motifs (CACA) at a location within Vκ orthologous to the location of the 3' V H cRS [8,9]. It was noted that 10% of the Vκ gene segments examined contain this partial heptamer motif [8]. We extend this study using a computational algorithm that allows for systematic scanning of the full length of Vκ gene segments for complete cRS [4,6] and by showing that conserved Vκ cRS are cleaved.
To test the hypothesis that functional cRS are not conserved in Vκ gene segments, we conducted a global examination of mouse Vκ segments using the computational and experimental methods of our earlier study of V H cRS [25]. As in our study of V H cRS, we find that Vκ cRS are present and cleaved at multiple, conserved locations in Vκ gene segments. These cRS are conserved across Vκ gene families and are cleaved during the small pre-B cell stage of B-cell development. This study is the first to show that cRS are conserved within Vκ gene segments, and that these cRS are cleaved in vivo. Our findings support the hypothesis [25] that cRS are conserved in Ig V gene segments for a purpose(s) unassociated with the maintenance of self-tolerance.

Identification of cRS embedded in Vκ gene segments
To identify cRS in Vκ gene segments, we applied a statistical model of mouse RS to the 135 mouse Vκ gene segments and alleles listed in the Immunogenetics Information System (IMGT) reference directory set [4,6,31]. We previously used this analytic method to identify cRS in mouse V H gene segments and in a 212-kb control region of mouse chromosome 8 (accession AC084823) not subject to physiologic V(D)J recombination [6]. Our statistical model assigns a recombination information content (RIC) score to any RS-length DNA sequence beginning with the nucleotides CA; such sequences are referred to as potential cRS. DNA sequences of length 28-bp are assigned RIC scores based on the RIC 12 model for RS with 12-bp spacers, while 39-bp sequences are assigned RIC scores based on the RIC 23 model. Higher RIC scores indicate higher sequence similarities to mouse RS and are predictive of higher recombination efficiencies [1,4,6,25].
We have previously determined a threshold for 28-bp cRS of RIC 12 ≥ -45 using the RIC score of the functional cRS embedded in the 3H9 V H transgene [6,9,25]. 39-bp RS have a lower RIC score than 28-bp RS (RIC 23 = -60 vs RIC 12 = -40, respectively), thus we set a correspondingly lower threshold for the detection of 39-bp cRS of RIC 23 ≥ -65 [25].
We scanned for potential cRS on both DNA strands of each Vκ gene segment. Potential cRS found on the sense strand, and thus in the orientation of physiologic RS, are referred to as being in orientation 1 (O1). Potential cRS found on the antisense strand, and thus opposite in orientation to physiologic RS, are defined to be in orientation 2 (O2). Both strands of sequence AC084823 were also scanned. cRS in the strand listed in NCBI were arbitrarily assigned the O1 orientation, and cRS in the inverse complement sequence assigned to the O2 orientation.
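The scanning step described above can be sketched as follows. This is a minimal illustration only: it enumerates the 28-bp and 39-bp windows beginning with CA on both strands and applies the thresholds quoted in the text. The RIC scoring model itself is not reproduced, and the function names (`potential_crs`, `is_crs`) are hypothetical. Reported start positions are 0-based offsets on the strand being scanned, not IMGT numbering.

```python
# RS-length windows: heptamer (7) + spacer (12 or 23) + nonamer (9).
RS_LENGTHS = {12: 28, 23: 39}
# cRS detection thresholds quoted in the text (RIC12 >= -45, RIC23 >= -65).
RIC_THRESHOLDS = {12: -45.0, 23: -65.0}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def potential_crs(seq):
    """Yield (orientation, spacer, start, window) for each potential cRS.

    O1 windows are read from the sense strand as given; O2 windows are
    read from its reverse complement. A window qualifies as a potential
    cRS only if it begins with the dinucleotide CA.
    """
    seq = seq.upper()
    strands = (("O1", seq), ("O2", reverse_complement(seq)))
    for orientation, strand in strands:
        for spacer, length in RS_LENGTHS.items():
            for start in range(len(strand) - length + 1):
                window = strand[start:start + length]
                if window.startswith("CA"):
                    yield orientation, spacer, start, window


def is_crs(spacer, ric_score):
    """Apply the spacer-specific threshold to an externally computed RIC score."""
    return ric_score >= RIC_THRESHOLDS[spacer]
```

In use, each window yielded by `potential_crs` would be passed to a RIC scoring routine (not shown here), and `is_crs` would then decide whether it counts as a cRS.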

Vκ cRS are conserved in O2
We compared the relative frequencies of 12- and 23-cRS in Vκ gene segments with those present in control sequence AC084823 [25] and found that the relative frequencies of 12- and 23-cRS in the O2 orientation of Vκ gene segments are significantly higher than in the AC084823 control (0.031 vs. 0.018; P = 10⁻⁵ and 0.091 vs. 0.048; P = 10⁻¹⁹, respectively). In contrast, the frequencies of Vκ cRS in O1 do not differ from those in AC084823 (0.02 vs. 0.017; P = 0.17 and 0.036 vs. 0.046; P = 0.013, 12-cRS and 23-cRS, respectively) (Table 1). These biases for cRS in Vκ gene segments are unlike those of V H gene segments, which contain significantly more O1 and O2 12-cRS and significantly fewer O1 and O2 23-cRS than AC084823 [25].
To examine further the differences between Vκ and V H cRS, we compared the distributions and orientations of Vκ cRS with those of the cRS present in V H gene segments [25]. Vκ gene segments exhibit significantly higher relative frequencies of 23-cRS in either O1 or O2 than do V H gene segments (0.036 vs. 0.016; P = 10⁻¹¹ and 0.091 vs. 0.037; P = 10⁻²⁶, O1 and O2, respectively), whereas the relative frequencies of O1 and O2 12-cRS are not different between Vκ and V H gene segments (Table 1).
Even though Vκ and V H gene segments and the AC084823 sequence exhibit similar relative frequencies of potential cRS, these frequencies diverge as RIC scores increase towards the threshold values associated with RS activity (Figure 1). O1 and O2 potential 12-cRS with RIC 12 ≥ -50 are more common in V H gene segments than in the AC084823 control (Figure 1A), while O1 and O2 potential 23-cRS with RIC 23 > -70 are less common in V H gene segments than in the AC084823 control (Figure 1B) [25]. In contrast, of Vκ potential 12-cRS with RIC 12 ≥ -50, only those in O2 are more common than in AC084823 (Figure 1A), and O2 potential 23-cRS with RIC 23 > -70 are more common in Vκ gene segments than in the control sequence (Figure 1B). As described above, at cRS RIC score thresholds, these differences are statistically significant.
Table 1. RIC 12 and RIC 23 were computed for all 28-bp and 39-bp sequences beginning with a CA-dinucleotide in both O1 and O2 in V H and Vκ gene segments, and in a 212-kb portion of chromosome 8 (accession number AC084823). The numbers of sequences with RIC above -45 and above -65 are shown, with the relative frequencies in parentheses.
Thus, while both V H and Vκ gene segments are significantly enriched for O2 12-cRS relative to the AC084823 control, V H gene segments appear to be selected for increased frequencies of 12-cRS and the suppression of 23-cRS, regardless of orientation, and Vκ gene segments appear to be under selection for O2 cRS, regardless of spacer length. These patterns of bias indicate that, relative to V H gene segments, Vκ segments are enriched for O2 23-cRS.
Given that the relative frequency of the 215 O2 23-cRS embedded within Vκ gene segments (0.091, Table 1) is much higher than the relative frequency of O2 23-cRS in V H gene segments or in AC084823, we examine the extent of Vκ O2 23-cRS conservation, explore whether their conservation can be explained by conservation of the encoded amino acid sequence, and determine whether they are cleaved.

Vκ 23-cRS in O2 are conserved at multiple locations within Vκ genes and across Vκ gene families
We first examined whether the locations of the 215 O2 23-cRS within Vκ gene segments were conserved across Vκ gene families. Indeed, a third (73/215; 33.95%) are located at nucleotide position 282 and one-fourth (53/215; 24.65%) at nucleotide position 238 in framework 3 (Figure 2). About 10% (22/215; 10.23%) of O2 23-cRS are located at nucleotide position 39 in framework 1, and the remaining 67 O2 23-cRS are distributed across 16 other locations (Figure 2). Importantly, only 3 cRS are located at nucleotide position 313, the position of the most highly conserved cRS in V H segments and of the cRS that mediates V H gene replacement [9,25].
Figure 1. The proportion of RS-length sequences with RIC scores above a given threshold. For increasing RIC score thresholds, the number of RS-length sequences beginning with a CA-dinucleotide and with a RIC score above the threshold was divided by the total number of RS-length sequences beginning with a CA-dinucleotide. Thresholds are shown on the X-axis, and the corresponding proportions are shown on the Y-axis. The proportions of above-threshold sequences are plotted for both RIC 12 (Figure 1A) and RIC 23 (Figure 1B).
In contrast, the observed DNA sequence diversity for nucleotide positions within cRS is much less than the maximum possible diversity that could be attained while conserving the amino acid sequences, indicating selection on the DNA to a greater extent within cRS than within FR and beyond that required to maintain the necessary amino acid sequence. Thus, we conclude that the O2 23-cRS embedded within Vκ gene segments are not present as an artifact of amino acid conservation.
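Per-position nucleotide diversity of the kind compared above can be measured with Shannon entropy, the measure this study uses for framework-region positions. The sketch below is illustrative and rests on assumptions: it takes pre-aligned sequences of equal length, ignores gaps and ambiguous bases, and fixes the free constant so that entropy is in bits. The function names are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column.

    Only A, C, G and T are counted; gaps and ambiguity codes are ignored.
    The probability of each nucleotide is estimated as its count divided
    by the number of counted nucleotides in the column.
    """
    counts = Counter(base for base in column.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(aligned_seqs):
    """Return the entropy of every column of an alignment (list of strings)."""
    return [column_entropy("".join(col)) for col in zip(*aligned_seqs)]
```

A fully conserved column scores 0 bits and a column with all four nucleotides at equal frequency scores 2 bits, so lower values within a cRS than at neighbouring framework positions would indicate stronger nucleotide-level conservation.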

Vκ cRS are cleaved in vivo
To determine if the Vκ O2 23-cRS identified by RIC scores are cleaved in vivo, we performed ligation-mediated PCR (LM-PCR) [33] to amplify Vκ cRS signal ends (SE) recovered from 103/BCL2 cells and small pre-B cells from the bone marrow of C57BL/6 mice (Figure 4). LM-PCR is a standard assay used to demonstrate RAG-mediated cleavage at RS and cRS heptamers [33]. RAG expression in 103/BCL2 cells is temperature dependent. At 34°C, 103/BCL2 cells proliferate, RAG1 and RAG2 proteins are minimally expressed, and Igκ rearrangements are undetectable [34]. At 39°C, 103/BCL2 cells enter growth arrest, RAG1 and RAG2 expression is upregulated, and Igκ rearrangements are induced [34]. To control for potential LM-PCR artifacts, we used genomic DNA from 103/BCL2 cells cultured at 34°C and 39°C as LM-PCR templates, in addition to DNA from sorted pre-B cells (Figure 4). To determine the extent of functional O2 cRS in gene segments from the Vκ2, Vκ5, Vκ6, Vκ8, and Vκ17 families, we designed a series of Vκ family-specific PCR primers and used a standard intronic LM-PCR [25] to detect primary Jκ SE as a positive control. The Vκ primers are designed such that only O2 cRS are detected.
LM-PCR amplicands representing RAG- and ligase-dependent Vκ cRS SE cleavage products were readily detected in both 103/BCL2 and small pre-B cells (Figure 4A). The dual products recovered from both 103/BCL2 and bone marrow cells using Vκ6 and Vκ8 family primers represent cleavage at nucleotide positions 282 (220-bp fragment) and 342 (280-bp fragment) in Vκ6 gene segments (Figure 4A) and at positions 288 and 327 in Vκ8 gene segments (data not shown). Similarly, LM-PCR amplifications of genomic DNA from 103/BCL2 cells using five sets of Vκ family-specific primers indicated that ≥ 1 cRS is present and cleaved in V gene segments belonging to the Vκ2, -5, -6, -8, and -17 gene families (Figure 4B), all of the families for which cleavage was assayed.
To ensure that these LM-PCR amplification products represented bona fide Vκ cRS SE, the LM-PCR amplicands were gel-purified, cloned, and sequenced (Tables 2 and 3). Of 101 sequences obtained, 82 represent Vκ gene segments ending precisely at blunt, double-strand ends (Tables 2 and 3). Although no primer set was used to amplify gene segments of the Vκ11 and Vκ14 families, the Vκ8 primer set matches both Vκ11 (17 of 18 nucleotides) and Vκ14 (15 of 18 nucleotides) genes and resulted in the amplification of a single Vκ11 cRS SE product and two Vκ14 cRS SE products (Table 2).
Figure 3. Vκ cRS are conserved at the DNA level.
LM-PCR products from 103/BCL2 cells and from pre-B cells isolated from RAG2:GFP knock-in mice and C57BL/6 mice were sequenced and aligned to sequences from the IMGT reference directory set to identify products from cleavage at Vκ-embedded cRS.
Figure 4. Vκ cRS cleavage is detected in C57BL/6 bone marrow pre-B cells and 103/BCL2 cells.
Six of the 10 Vκ-embedded cRS cleavage events were at nucleotide position 282 (Table 2), the most conserved location for O2 23-cRS identified by RIC. Cleavage at this cRS was identified in the Vκ6-32, Vκ5-39, and Vκ5-45 gene segments. The remaining 4 Vκ-embedded cRS cleavage events were distributed as follows: 1 at Vκ nucleotide position 232 in Vκ2-116, 1 at position 238 in Vκ17-127, and 2 at position 313 in the Vκ2-109 and Vκ2-116 gene segments (Table 2). Thus, we observe cleavage events occurring both at the same location across different Vκ gene families, and at different locations within the same gene family.
Vκ6 cRS SE and Jκ4 SE are approximately equally abundant in recombinationally active 103/BCL2 cells (Figure 4 and data not shown). Given that the Vκ6 family comprises eight or nine gene segments (IMGT database) and that each of these likely contains at least two functional cRS (Table 2), we estimate the rate of cleavage at an individual Vκ6 cRS to be 5%-13% of that at the Jκ4 RS.
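The arithmetic behind an estimate of this kind can be sketched as below. The derivation is an assumption, not a statement of how the estimate was actually obtained: it supposes that the pooled Vκ6 cRS SE signal equals the Jκ4 SE signal and is shared equally among all Vκ6-embedded cRS, with one or two cleavable cRS per gene segment. The function name and parameters are hypothetical.

```python
def per_crs_rate(n_segments, crs_per_segment, se_ratio=1.0):
    """Cleavage rate of a single cRS relative to the Jk4 RS.

    se_ratio is the pooled Vk6 cRS SE signal divided by the Jk4 SE signal
    (assumed to be about 1, i.e. approximately equal abundance), and the
    pooled signal is assumed to be split evenly over all cRS.
    """
    return se_ratio / (n_segments * crs_per_segment)


# Nine segments with two cRS each gives the lower bound.
low = per_crs_rate(n_segments=9, crs_per_segment=2)
# Eight segments with one cRS each gives the upper bound.
high = per_crs_rate(n_segments=8, crs_per_segment=1)
print(f"{low:.1%} to {high:.1%}")
```

Under these assumptions the bounds are 1/18 and 1/8, which bracket the range quoted in the text.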

Vκ cRS SE are detected only in pre-B cells
To identify the developmental stages in which Vκ cRS are cleaved, we isolated genomic DNA from highly enriched (>95%) populations of pro-B, pre-B, and immature B cells sorted from the bone marrow of C57BL/6 mice and congenic RAG2:GFP animals [25]. We previously demonstrated that V H cRS SE are present in pro-B cells but not in pre-B or immature B cells from the bone marrow of RAG2:GFP mice [25]. In that study, J H RS SE were detected only in pro-B cells, Jκ RS SE only in pre-B cells, and TCR Dβ RS SE were not detected in any B-cell population [25].
Figure 5. Vκ cRS cleavage is detected only in pre-B cells from RAG2:GFP knock-in mice. LM-PCR was conducted on pro-, pre-, and immature B cells from RAG2:GFP knock-in mice [25]. These sorted cells were shown in [25] to have the appropriate lineage and developmental restrictions of TdT and RAG1 expression, and of J H and Jκ SE. Samples from these same sorted cells were used to amplify SE in Vκ6 gene segments. Cleavage at Vκ6-embedded cRS was detected only in pre-B cells. CD14 amplification demonstrates the equivalence of genomic template.
LM-PCR products from 103/BCL2 cells and from pre-B cells isolated from RAG2:GFP knock-in mice or from C57BL/6 mice were sequenced and aligned to sequences from the IMGT reference directory set to identify the germline gene segment. Where matches to IMGT sequences were not found, the LM-PCR products were aligned to germline Vκ gene segments in NCBI (indicated in bold). The source, location, number of observations, cRS sequence, Vκ gene segment, and cRS sequence RIC score are shown for each independent cleavage event. cRS sequences are written in heptamer-to-nonamer orientation, and nucleotide positions using IMGT numbering indicate the location of the first heptamer nucleotide. * The LM-PCR product shows 2 mismatches to the genomic cRS sequence (cactGctggagaaTcgggatgggactccaggacgaagag), indicated with capital letters. We attribute this difference to sequencing error.
To determine if Vκ cRS are cleaved in vivo and to identify the developmental stage in which cleavage occurs, we isolated genomic DNA from the samples of bone marrow pro-, pre-, and immature B cells sorted in the previous study [25]. The genomic DNA was used as template for LM-PCR to detect cleavage of O2 Vκ cRS in the Vκ6-32 gene. We targeted this Vκ gene segment because it contains multiple cRS (Figure 4) with the highest RIC scores of the cRS for which SE were detected in 103/BCL2 cells (Table 3). Ligase-dependent Vκ6-32 cRS SE could be detected at nucleotide positions 282 and 342 (Figure 4 and Tables 1 and 2) in pre-B, but not pro-B or immature B cells (Figure 5). These LM-PCR products were validated as Vκ6-32 cRS SE by sequencing (Table 3). Thus, Vκ cRS appear to be cleaved in vivo during the developmental stage that is permissive for primary Igκ Vκ → Jκ rearrangements.

Conclusion
The adaptive immune system has evolved to generate a diverse antigen-receptor repertoire. One mechanism of somatic diversification is V(D)J recombination, a process that joins antigen-receptor V, D, and J gene segments by initiating double-strand breaks at RS flanking the gene segments (for a review, see [1]). RS at locations other than the boundaries of V, D, and J segments have been identified at both the Igh and Igκ loci [22,23]. Until recently, cRS in the Igh locus were thought to be limited to the 3' end of V H gene segments where cRS can mediate V H gene replacement [7,9,22,23,24]. V H gene replacement can participate in a form of receptor editing at the heavy chain locus, which otherwise is incapable of secondary rearrangements that follow the 12/23 rule [7]. It has been proposed that the utility of receptor editing is sufficient to drive the evolutionary conservation of V H cRS [7]. There is mounting evidence, however, that at least some receptor editing is antigen-independent, and that the conservation of Ig V H cRS may result from other selective pressures.
The earliest evidence that the regulation of V H replacement is independent of BCR-specificity came from studies [35][36][37] that demonstrated frequent V H replacement in mice transgenic for non-autoreactive heavy chains. These data suggested that selection for V H cRS includes the capacity for increasing BCR diversification, in addition to self-tolerance [8,35]. We subsequently showed that V H cRS SE were detected only in pro-B cells, including the pro-B cells of μMT mice, which cannot assemble functional BCR [25,38]. Together, these results support the notion that V H gene replacement may not be driven by the recognition of antigen.
Koralov et al. [39] demonstrated that, in transgenic mice homozygous for nonproductive heavy-chain rearrangements, V H replacement events are only three times more frequent than direct V H to J H joining, in violation of the 12/23 rule. These results demonstrate the inefficiency of cRS-mediated V H replacement and raise the question: How can such an inefficient mechanism for rescuing autoreactive B cells increase fitness sufficiently to maintain V H cRS conservation? If V H cRS are conserved to mediate V H replacement, should V H replacement at cRS not be much more efficient than rearrangements in violation of the 12/23 rule? The results of Koralov et al. [39] suggest that while V H replacement may be mediated by V H cRS, their conservation is unlikely to result only from their role in V H replacement.
Unlike the cRS associated with Igh, the cRS previously identified in Igκ loci were not embedded in Vκ gene segments but sited in the Jκ-Cκ intron and 3' of Cκ and mediated locus inactivation [11,17,20,21,22]. The cRS located in the Jκ-Cκ intron are known as IRS (IRS1 and IRS2), while the cRS found 3' of Cκ is named the kappa deleting element (kde) in humans and RS in mice. For clarity, we reserve 'RS' for signals adjacent to V, D, and J gene segments, and refer to the signal 3' of Cκ in mice as RS κ3 .
The structure of the Igκ locus allows for secondary Vκ → Jκ rearrangements. Thus, if antigen-driven receptor editing is the primary force behind conservation of V-gene cRS [7,9], Vκ gene segments should not be selected for embedded cRS. Fanning et al. [8] noted the presence of a partial heptamer motif (CACA) in Vκ gene segments at a location orthologous to the 3' V H cRS, but to date, there has been no systematic attempt to identify potential cRS at other sites within Vκ gene segments or to determine their function. The determination of cleaved cRS within Vκ gene segments is an important first step in identifying their physiologic role(s) and resolving the selective forces that maintain their conservation.
To determine whether the Igκ locus contains active cRS embedded in functional Vκ gene segments, we conducted a computational scan for cRS in Vκ gene segments and evaluated their functionality using LM-PCR. Our results indicate that, despite the capacity for repeated secondary Igκ rearrangements, functional Vκ cRS have been evolutionarily conserved. Vκ cRS are primarily conserved in an orientation (O2) opposite to physiologic Vκ 12-RS and have 23-bp spacers (Table 1 and Figure 1). This conserved orientation and spacer size mirrors our earlier demonstration that conserved V H cRS are oriented opposite to physiologic V H 23-RS and contain 12-bp spacers [25].
As with V H cRS, Vκ cRS are conserved at multiple sites in Vκ gene segments and across Vκ gene families. Although our genomic scan identified relatively few Vκ cRS at positions analogous to the 3' V H cRS (nucleotide position 313, IMGT numbering) that mediate V H replacement (Figure 1), we did observe two cRS SE at this location, both in Vκ2 gene segments (Table 2). Of the 10 unique cleavage events at Vκ-embedded cRS, 8 represent cRS SE ≥ 30 nucleotides upstream of complementarity determining region (CDR) 3 (Table 2). V gene replacement (Vκ → VκJκ) at one of these embedded cRS would result in a substantially lengthened variable-region product that would be unlikely to produce a typically folded L-chain protein. The conservation of functional cRS at such sites in Vκ gene segments, in a locus capable of secondary Vκ → Jκ rearrangements, implies a function distinct from immunological tolerance.
cRS previously identified at the Igκ locus (IRS1, IRS2 and kde/RS κ3 ) mediate rearrangement events that inactivate the locus and may serve to ensure Igκ allelic exclusion or activation of the Igλ loci (reviewed in [40]). Rearrangements between kde/RS κ3 and IRS result in the deletion of Cκ and rearrangements between kde/RS κ3 and Vκ RS result in the deletion of Jκ and Cκ [21,41]. It is possible that the O2 Vκ 23-cRS likewise participate in these inactivation rearrangements, as recombination between IRS and O2 Vκ 23-cRS would result in deletion or inversion of the Jκ gene segment cluster.
Inactivating rearrangements involving IRS and kde/RS κ3 have been implicated in antigen-induced receptor editing (reviewed in [22]), and Kiefer et al. [42] observed RS κ3 cleavage in IgM⁻ BM pre-B cells, IgM(low) immature BM B cells, and in IgM(low) IgD⁺ splenic T3/T3' B cells. Our results indicate that cleavage of O2 Vκ 23-cRS is confined to the IgM⁻, small pre-B compartment (Figures 4 and 5). We conclude that either Vκ cRS SE are rare relative to RS κ3 SE, or that Vκ cRS SE are not present in immature B cells (perhaps because the cRS themselves are not accessible) and, consequently, may be unrelated to antigen-driven receptor editing. In either case, despite their frequency and function, Vκ cRS appear to play a less significant role in antigen-driven genomic change than do IRS and kde/RS κ3 .
The similarities between the V H and Vκ cRS suggest that these DNA motifs are conserved for a common function. Both cRS types are conserved at multiple locations, and both are conserved with an orientation and spacer length opposite to the corresponding physiologic V-associated RS. Both sets of cRS are cleaved coincidentally with the physiologic RS in the same locus. That is, V H cRS are cleaved in pro-B cells and Vκ cRS are cleaved in pre-B cells. We consider below possible mechanisms for conservation of these V-gene cRS in the Igh and Igκ loci.
First, V H and Vκ cRS could be conserved to inactivate the Igh and Igκ loci. If so, this inactivation might help to ensure allelic exclusion, as evidence indicates that V H [25] and Vκ cRS SE ( Figure 5) do not depend on the generation of a functional B-cell receptor. Inactivation of the Igκ locus would increase the proportion of λ-expressing B cells and could act to increase the diversity of the BCR repertoire. A similar argument cannot be made for the Igh locus as there is no alternative locus. Furthermore, the frequency of IRS-to-kde/RS κ3 rearrangements mitigates any need for V-embedded cRS for inactivation at the κ locus. Thus, we doubt that the selection pressure resulting from locus inactivation via V cRS cleavage is sufficient to result in conservation of the cRS.
We previously suggested that V-embedded cRS could function to form hybrid V gene segments, thereby creating combinatorial diversity beyond that created through the combination of V, D, and J or V and J gene segments [25]. While the results are controversial, there is evidence for such hybrid heavy chain V genes [43,44].
An alternative hypothesis to the conservation of cRS for their recombinogenic potential is that the nucleotide sequences are conserved to maintain appropriate V region amino acid sequences, and the corresponding recombinogenic potential is a coincidence. We present evidence that the conservation of O2 cRS embedded in V H and Vκ is not explained by the need to maintain V region amino acid sequences ([25] and Figure 3). In V H gene segments, the second, third, and fourth nucleotides of the 3' cRS (...TGTG) encode the conserved cysteine at amino acid position 104 (Cys104), while the codon for the conserved cysteine at amino acid position 23 (Cys23) is not part of any known cRS. Cysteine is degenerately encoded, and we find that only 38% of Cys23 are encoded by TGT [25]. Ninety-eight percent of Cys104 are encoded by TGT, however, providing evidence for selection pressure to maintain the recombinogenic potential of the 3' V H cRS [25]. Similarly, analysis of FR codons in Vκ gene segments shows that codon diversity at cRS is reduced relative to the maximum possible to a significantly greater extent than at any other FR site (Figure 3), a finding that implies stringent selection against synonymous nucleotide substitutions in the cRS. The absence of synonymous mutations is important given that the predicted recombinogenic potential of most conserved (116/128) O2 Vκ 23-cRS could be eliminated by a single, synonymous nucleotide substitution (data not shown). Of the remaining 12 cRS, the recombinogenic potential for 10 of them would be significantly reduced (>90%) by one synonymous nucleotide substitution (data not shown).
Thus, while nucleotide substitutions in cRS motifs that eliminate efficient recombination without altering Vκ amino acid sequence are potentially frequent, they are rare or absent in the genome. We conclude that there is evolutionary selection for V H -and Vκ-embedded O2 cRS.
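The synonymous-substitution argument can be made concrete with a short sketch that lists, for a given codon, every single-nucleotide change that leaves the encoded amino acid unchanged. It assumes the standard genetic code; the function name is hypothetical, and the sketch does not re-score the mutated cRS with the RIC model, which would be the next step in an analysis of this kind.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def synonymous_variants(codon):
    """Return all codons one substitution away that encode the same amino acid."""
    codon = codon.upper()
    amino_acid = CODON_TABLE[codon]
    variants = []
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == amino_acid:
                variants.append(mutant)
    return variants
```

For the cysteine codon TGT the only synonymous alternative is TGC, so the strong preference for TGT at Cys104, where the codon overlaps the 3' V H cRS, is not something amino acid conservation alone would require.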
Another alternative hypothesis to the conservation of V-gene cRS for their recombinogenic potential is that the cRS nonamers are conserved for nucleosome positioning. Consensus RS nonamers may contribute to nucleosome positioning and influence RS accessibility to the V(D)J recombinase [45]. While the cRS nonamers may influence nucleosome positioning, this property is unlikely to explain conservation of V-gene cRS. First, RIC scores are based on the complete cRS sequence, and above-threshold RIC scores would not result from conserved nonamer motifs alone. Second, cleaved Vκ cRS (Table 2) do not contain consensus nonamers and lack the stretch of adenosine nucleotides thought to be responsible for nucleosome positioning [45,46]. Thus, it is unlikely that selection for nucleosome positioning motifs has resulted in the maintenance of functional Vκ cRS.
We provide the first exhaustive search using a rigorous method for cRS embedded in Vκ gene segments. We demonstrate not only that Vκ cRS are conserved, but also that they are cleaved in vivo. We show that the patterns of conservation for Vκ cRS are analogous to those for V H [25], namely that the V-embedded cRS are conserved with an orientation and spacer length opposite to that for V-associated RS in the same locus. We provide evidence that these V-embedded cRS are not conserved as a consequence of selection pressure to maintain V region amino acid sequence and explore several possible explanations for their conservation. While the role of these V-gene cRS is not yet clear, their conservation in both V H [25] and Vκ gene segments implies a substantial evolutionary benefit to their presence.

Identification of Vκ cRS
To identify cRS in Vκ gene segments, we computed the RS information content (RIC) score for 28- and 39-bp segments in the 135 mouse Vκ gene segments available in the Immunogenetics Information System (IMGT) reference directory set [47]. RIC is based on the position-specific nucleotide combinations present in a sequence and the relative frequency of these nucleotide combinations in the set of mouse physiologic RS; sequences with nucleotide combinations frequent in mouse physiologic RS have a high RIC score [4]. We have previously demonstrated that RIC scores can be used to identify RS and cRS and are predictive of recombination efficiency [4,6].
We used RIC scores to determine the location and number of 12- and 23-cRS in mouse Vκ gene segments in both orientations and compared the corresponding relative frequencies with those previously reported for cRS in mouse V H gene segments and in a 212-kb region of mouse chromosome 8 (NCBI accession AC084823). Statistical significance was determined using Chi-square tests.
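A comparison of two relative frequencies by a Chi-square test can be sketched as follows. This is a minimal, standard-library illustration of a 2x2 Pearson test with one degree of freedom and no continuity correction; whether a correction was applied in the study is not stated, so that choice is an assumption. The function name is hypothetical, and the counts in the example are made up for illustration rather than taken from Table 1.

```python
import math


def chi_square_2x2(hits_a, total_a, hits_b, total_b):
    """Pearson chi-square test comparing two proportions.

    hits_*  : number of above-threshold cRS in each sequence set
    total_* : number of CA-initiated RS-length windows in each set
    Returns (chi-square statistic, two-sided P value for 1 d.f.).
    Assumes every expected cell count is greater than zero.
    """
    observed = [
        [hits_a, total_a - hits_a],
        [hits_b, total_b - hits_b],
    ]
    n = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [hits_a + hits_b, n - hits_a - hits_b]
    chi2 = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_totals[r] * col_totals[c] / n
            chi2 += (observed[r][c] - expected) ** 2 / expected
    # With 1 d.f., the statistic is a squared standard normal deviate,
    # so the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts: 30 of 100 windows versus 10 of 100 windows.
chi2, p = chi_square_2x2(30, 100, 10, 100)
```

The relative frequencies being compared are simply `hits / total` for each sequence set, matching the definition used for the proportions in Figure 1.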

Estimation of nucleotide diversity
To estimate the nucleotide diversity at each Vκ framework region position, we computed the Shannon entropy [32] at position $i$,
$$H_i = -C \sum_{j} p_{i,j} \log p_{i,j},$$
where $p_{i,j}$ is the probability of nucleotide $j$ at position $i$ and $C$ is any constant. We estimated $p_{i,j}$ as $n_{i,j}/N_i$ where $n_{i,j}$ is the number of