Table 2. Selection of important statistical methods suitable for the analysis of immunological data.

From: A guide to modern statistical analysis of immunological data

| Example of research question | Type of data [D: dependent, I: independent] | Other data assumptions | Statistical method¹ |
| --- | --- | --- | --- |
| Univariate techniques | | | |
| Univariate group mean comparison techniques | | | |
| Compare expression of a cytokine between two independent groups (e.g. treatment vs. control) | D: continuous; I: categorical | Normal distribution, homogeneity of variances | t-test |
| | D: continuous or ordinal; I: categorical | | Mann-Whitney U test |
| Compare expression of a cytokine between two related groups (e.g. before and after treatment) | D: continuous; I: categorical | Normal distribution, homogeneity of variances | Paired t-test |
| | D: continuous or ordinal; I: categorical | | Wilcoxon signed-rank test |
| Compare expression of a cytokine between three or more independent groups defined by one factor (e.g. treatments A, B, C) | D: continuous; I: categorical | Normal distribution, homogeneity of variances | One-way analysis of variance |
| | D: continuous or ordinal; I: categorical | | Kruskal-Wallis H test |
| Compare expression of a cytokine between three or more related groups (e.g. measurements 1, 2, and 3 weeks after treatment) | D: continuous; I: categorical | Multivariate normal distribution, assumptions about covariance | Repeated measurements analysis of variance |
| | D: continuous or ordinal; I: categorical | | Friedman's ANOVA |
| Correlation and regression analysis | | | |
| Quantify association between two cytokines or a cytokine and another continuous variable | D: continuous; I: continuous | Linear relationship, normality | Pearson correlation coefficient |
| | D: continuous or ordinal; I: continuous or ordinal | Monotonic relationship | Spearman rank correlation coefficient |
| Predicting expression of a cytokine by a continuous independent variable | D: continuous; I: continuous | Specified relationship (e.g. linearity for linear regression), normal distribution (for parametric regression) | Univariate regression |
| Multivariate techniques | | | |
| Multivariate correlation and regression techniques | | | |
| Quantify associations between two cytokines adjusted for the effect of a third continuous variable | All variables: continuous | Linear relationship, normality | Partial correlation coefficient |
| Predicting a continuous outcome (e.g. a cytokine) by several continuous or categorical independent variables | D: continuous; I: continuous, ordinal or categorical | Specified relationship (e.g. linearity for linear regression), normal distribution (for parametric regression), no multicollinearity | Multiple regression |
| | | Specified relationship, multicollinearity | Partial least squares regression |
| Quantifying the magnitude of correlation between two groups of continuous variables (e.g. Th1 and Th2 related cytokines) | All variables: continuous | | Canonical correlation analysis |
| Multivariate group mean comparison procedures | | | |
| Compare cytokine expressions between three or more independent groups defined by two or more factors (e.g. treatment and gender) | D: continuous; I: categorical | Normal distribution, homogeneity of variances | Multi-way analysis of variance (ANOVA) |
| Simultaneously compare expressions of two or more cytokines between three or more independent groups defined by two or more factors | D: continuous; I: categorical | Multivariate normal distribution, homogeneity of covariance matrices | Multivariate analysis of variance (MANOVA) |
| Compare cytokine expressions between three or more related groups defined by two or more factors (e.g. measurements at different time points during a study and treatment) | D: continuous; I: categorical | Multivariate normality, homogeneity of covariance matrices | Multi-way repeated measurements analysis of variance |
| Grouping a set of correlated cytokines into summary variables ("principal components") | All variables: continuous | High degree of multicollinearity | Factor analysis/Principal components analysis |
| Grouping subjects into homogeneous subgroups according to similar expression levels of two or more cytokines | All variables: continuous | Low degree of multicollinearity | Cluster analysis |
| Classification procedures | | | |
| Explaining or predicting group membership of two or more independent groups by cytokine levels | D: categorical; I: continuous | Multivariate normal distribution, equal covariance matrices, low degree of multicollinearity | Linear discriminant analysis |
| Explaining or predicting group membership of two independent groups by cytokine levels | D: categorical; I: continuous, ordinal or categorical | | Logistic regression |
| Explaining or predicting group membership of three or more groups by cytokine levels | D: categorical; I: continuous, ordinal or categorical | | Multinomial logistic regression |
| Advanced techniques for multiple relationships | | | |
| Modelling multiple relationships between several immunological parameters and one or more outcome variables | All variables: categorical, ordinal or continuous data | Conceptual framework specifying the multiple relationships among the study variables | Path analysis/Structural equation modelling |

¹ All univariate and multivariate statistical approaches listed above can be implemented in general-purpose statistical packages such as S-PLUS® (Insightful Corporation, Seattle, WA, USA), SAS® (SAS Institute, Cary, NC, USA), SPSS® (SPSS Inc., Chicago, IL, USA) or Stata® (StataCorp LP, College Station, TX, USA). Path analysis/structural equation modelling can be implemented in Stata and SPSS, which provide the extension modules GLLAMM and AMOS, respectively, as well as in several special-purpose software packages such as LISREL® (Scientific Software International, Inc., IL, USA) or MX® (MCV, Department of Psychiatry, Richmond, VA, USA).
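
As an illustration of how the choice between a parametric and a rank-based method from the table can matter, the following minimal sketch computes the Pearson and Spearman correlation coefficients with the Python standard library only. It is not taken from the article: the function names (`pearson`, `average_ranks`, `spearman`) and the cytokine values are hypothetical and chosen purely for demonstration. Spearman's coefficient is computed here as the Pearson coefficient of the ranks, with ties given their average rank.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient: strength of a linear association."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example data (assumption, not from the article):
# the second cytokine rises monotonically but non-linearly with the first.
cytokine_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cytokine_b = [v ** 3 for v in cytokine_a]

r = pearson(cytokine_a, cytokine_b)
rho = spearman(cytokine_a, cytokine_b)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

With these made-up values the ordering of the two variables is identical, so the rank-based coefficient reaches its maximum, while the Pearson coefficient is lower because the relationship is not linear. In practice a statistical package such as those named in the footnote would be used, which also provides p-values and confidence intervals.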