Models and methods for mining B cell repertoire dynamics from next-generation sequencing studies
The ability of our immune system to recognize ever-evolving threats is critical to survival. Initial recognition of pathogens depends on generating a diverse repertoire of antibodies through recombination of gene segments. This naïve repertoire is dynamically modified as activated B cells undergo cycles of division, somatic hypermutation and affinity-dependent selection. This affinity maturation process produces expanded memory B cell clones expressing mutated antibodies with high affinity for the pathogen. Analyzing the collection of receptors expressed by naïve and memory B cells offers insights into the infection history of individuals. It can also teach us about fundamental immune processes and reveal dysregulation. The recent development of high-throughput sequencing brings exciting possibilities, allowing for large-scale characterization of antibody repertoires. However, the statistical methods and models needed to plan these high-throughput experiments and analyze their results are lacking. I will present several new computational tools designed to address three crucial steps in lymphocyte receptor repertoire analysis: processing raw data, quantifying affinity-dependent selection, and building a targeting model for the observed mutation spectrum. I will demonstrate the applicability of these tools through the analysis of several studies involving next-generation antibody sequencing datasets. Finally, I will share my view of the major obstacles that still need to be confronted before we can utilize lymphocyte receptor repertoire analysis for diagnosis and prognosis.