Motivation and experimental setup. (A) The problem of undersampling in NGS antibody repertoire sequencing is most easily explained by a marble analogy. Assume an urn is filled with k marbles of different species (colors) in varying frequencies; the urn and its marbles represent the original antibody mixture. If only a sample of size n (n < k) is drawn, three qualitatively different sampling outcomes can arise: (1) If n is too small, species richness (the number of different colors in the urn) is not accurately determined, and consequently neither are species frequencies. (2) If n is larger, species richness is accurately represented, but species frequencies may still be inaccurate. (3) Only if n is large enough are both species richness and species frequencies accurately reflected in the sample. This study set out to determine which outcome best describes antibody repertoire NGS data from ASCs of immunized mice. (B) To address undersampling concerns, we explored two different scenarios of ASC diversity. Ten female BALB/c mice were immunized with NP-CGG and sacrificed 14 days post-injection. Subsequently, bone marrow plasma cells were isolated as described previously, as were CD138-positive splenocytes. ASCs of one mouse (1M) were pooled, as were those of nine mice (9M). RNA was isolated, followed by RT-PCR and Illumina MiSeq sequencing of triplicates (see Methods).
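The three sampling outcomes described in (A) can be illustrated with a small simulation. The following is a minimal sketch only: the urn composition, the species names, and the helper functions `build_urn` and `sample_summary` are hypothetical and chosen for illustration; they are not the study's data or its analysis method. For each sample size n, the sketch reports the observed species richness and the largest absolute gap between sample and urn frequencies.

```python
import random
from collections import Counter


def build_urn(counts):
    """Expand a {species: marble count} mapping into a flat list of marbles."""
    return [species for species, c in counts.items() for _ in range(c)]


def sample_summary(urn, n, rng):
    """Draw n marbles without replacement.

    Returns (observed richness, largest absolute difference between
    the sample frequency and the urn frequency of any species).
    """
    k = len(urn)
    true_freq = {s: c / k for s, c in Counter(urn).items()}
    drawn = Counter(rng.sample(urn, n))
    max_gap = max(abs(drawn.get(s, 0) / n - f) for s, f in true_freq.items())
    return len(drawn), max_gap


# Hypothetical toy urn: k = 1000 marbles, 5 species, skewed frequencies.
toy_counts = {"red": 500, "blue": 300, "green": 150, "yellow": 45, "black": 5}
urn = build_urn(toy_counts)
rng = random.Random(0)  # fixed seed so repeated runs give the same draws

for n in (5, 100, 900):  # small, intermediate, and large sample sizes (n < k)
    richness, max_gap = sample_summary(urn, n, rng)
    print(f"n={n:4d}  richness={richness}/{len(toy_counts)}  max freq gap={max_gap:.3f}")
```

With a small n, rare species such as "black" are likely to be missed entirely, so richness is underestimated; as n approaches k, both richness and frequencies converge to the urn's true values.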