From: Systematic evaluation of B-cell clonal family inference approaches
 | Approach | VJ Partitioning | Region | Sequence Type | Identical Junction Length | Similarity Measure | Sequence Clustering |
---|---|---|---|---|---|---|---|
A1 | Unique junction (AA) | No | Junction | AA | Yes | Exact match | Dissimilarity = 0% |
A2 | Subclone (AA) | Yes | Junction | AA | Yes | Exact match | Dissimilarity = 0% |
A3 | Absolute threshold (AA) | Yes | Junction | AA | Yes | Hamming distance between junction regions | Dissimilarity <= 1 AA (absolute threshold) |
A4 | Relative threshold (AA) | Yes | Junction | AA | Yes | Length-normalized Hamming distance between junction regions | Dissimilarity <= 15% (relative threshold) |
A5 | Relative threshold (NT) | Yes | Junction | NT | Yes | Length-normalized Hamming distance between junction regions | Dissimilarity <= 15% (relative threshold) |
A6 | Change-O | Yes | Junction | NT | Yes | Length-normalized Hamming distance between junction regions | Sample-based dissimilarity threshold derived from the bimodal distance-to-nearest distribution |
A7 | SCOPer (junction) | Yes | Junction | NT | Yes | Kernel matrix (distance based on junction) | Unsupervised spectral clustering |
A8 | SCOPer (shared) | Yes | Junction + VJ sequence | NT | Yes | Kernel matrix (distance based on junction + shared mutations in VJ) | Unsupervised spectral clustering |
A9 | Partis | Yes | Full sequence | NT | No | Likelihood ratio to decide whether two sequences (or sets of sequences) were derived from the same ancestor, and Hamming distance between reconstructed germline sequences | Hamming dissimilarity <= 1.5% OR likelihood ratio <= variable threshold |
A10 | Alignment free | No | Full sequence | NT | No | Cosine distance calculated from the tf-idf statistic | Automatic determination of the clonal distance threshold by negation (threshold on the fraction of distances to negation sequences = 10%) |
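
The threshold-based rows (A3 to A5) share one recipe: partition sequences by V gene, J gene and junction length, then join sequences whose junction Hamming distance falls within an absolute or length-normalized cutoff. The following is a minimal sketch of that recipe, not the implementation used in the study. The `(v_gene, j_gene, junction)` tuple format, the function names, and the use of single-linkage grouping are assumptions made for illustration; the table does not specify a linkage rule.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def cluster_junctions(records, threshold, relative=True):
    """Group records into candidate clonal families.

    records   : list of (v_gene, j_gene, junction) tuples (hypothetical format).
    threshold : maximum dissimilarity; a fraction of junction length when
                relative=True (e.g. 0.15), a mismatch count otherwise (e.g. 1).
    Returns a sorted list of clusters, each a sorted list of record indices.
    """
    # Step 1: VJ partitioning plus identical junction length.
    partitions = defaultdict(list)
    for idx, (v_gene, j_gene, junction) in enumerate(records):
        partitions[(v_gene, j_gene, len(junction))].append(idx)

    # Step 2: single-linkage grouping inside each partition (assumed linkage),
    # implemented with a small union-find structure.
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for members in partitions.values():
        for a, b in combinations(members, 2):
            dist = hamming(records[a][2], records[b][2])
            if relative:
                # Guard against empty junctions to avoid division by zero.
                dist = dist / max(len(records[a][2]), 1)
            if dist <= threshold:
                parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for idx in range(len(records)):
        clusters[find(idx)].append(idx)
    return sorted(sorted(c) for c in clusters.values())
```

Calling `cluster_junctions(records, 0.15)` corresponds to the relative threshold of A4 or A5 (depending on whether the junctions are amino acid or nucleotide strings), `cluster_junctions(records, 1, relative=False)` to the absolute threshold of A3, and `cluster_junctions(records, 0)` to the exact-match rule of A2.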