How do you perform a linear discriminant analysis?
Linear discriminant analysis was formulated by Ronald A. Fisher in 1936 and quickly found practical use as a classifier. Fisher developed the method for the two-class problem; in 1948, C. R. Rao generalized his work to multi-class linear discriminant analysis, also called multiple discriminant analysis.
The general approach of LDA is simple and can be broken into three steps, each discussed below:
- Calculating the ‘separability’ between classes
- Computing the within-class variance
- Constructing a lower-dimensional space
The ‘separability’ between classes is also known as the between-class variance: the distance between the means of the different classes. It lets the algorithm quantify how difficult the problem is; the closer the class means are, the harder the separation. The between-class variance is usually kept in a scatter matrix known as the ‘between-class scatter matrix.’
The within-class variance is the distance between each class’s mean and that class’s samples. It is the other factor in the difficulty of separation: if the variance within a class is high, a clean separation is extremely difficult.
The final step constructs a lower-dimensional space that maximizes the between-class variance (‘separability’) while minimizing the within-class variance. The projection can be computed with an eigenvalue decomposition, the least-squares method, or a singular value decomposition.
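The eigenvalue route can be sketched in a few lines of NumPy. The snippet below is a minimal illustration of the three steps on a toy dataset, not a production implementation; the function name and the toy data are only for demonstration, and in practice you would normally rely on a library such as scikit-learn.

```python
import numpy as np

def lda_projection(X, y, n_components=1):
    """Minimal LDA sketch: between/within-class scatter + eigendecomposition."""
    classes = np.unique(y)
    n_features = X.shape[1]
    overall_mean = X.mean(axis=0)

    S_b = np.zeros((n_features, n_features))  # between-class scatter
    S_w = np.zeros((n_features, n_features))  # within-class scatter
    for c in classes:
        X_c = X[y == c]
        mean_c = X_c.mean(axis=0)
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_b += X_c.shape[0] * diff @ diff.T           # separability between classes
        S_w += (X_c - mean_c).T @ (X_c - mean_c)      # variance within each class

    # Directions that maximize between-class and minimize within-class variance
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1]
    W = eigvecs[:, order[:n_components]].real
    return X @ W  # data projected onto the lower-dimensional space

# Toy example: two Gaussian blobs in 2-D projected onto one discriminant axis
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
X_projected = lda_projection(X, y, n_components=1)
```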
How to prepare data for linear discriminant analysis
You should make sure that your data is fit for use before performing LDA. We recommend that you consider this list of suggestions when you are preparing your data for use with LDA:
- Classification problems
Linear discriminant analysis only caters to classification problems, i.e. problems with a categorical output variable; it will not work if the target is not categorical. LDA can be used for both binary and multi-class classification.
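As a quick illustration, scikit-learn's LinearDiscriminantAnalysis handles binary and multi-class targets in the same way; the snippet below, using the bundled Iris dataset (three classes), is just one possible setup.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

# Iris has three classes, so this is a multi-class classification problem
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LinearDiscriminantAnalysis()
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))
```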
- Gaussian distribution
The standard implementation of LDA assumes that the input variables have a Gaussian distribution. Review the univariate distribution of each attribute and, where needed, apply transforms to make the variables more Gaussian-looking, for example a Box-Cox transform for skewed distributions or log and root transforms for exponential distributions.
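For instance, scipy's boxcox and a plain log transform can be applied per attribute. The sketch below assumes a strictly positive feature matrix X (a requirement of Box-Cox); the generated skewed data is only for demonstration.

```python
import numpy as np
from scipy.stats import boxcox

# Deliberately skewed, strictly positive data standing in for real features
rng = np.random.default_rng(0)
X = rng.lognormal(mean=0.0, sigma=1.0, size=(100, 2))

# Box-Cox transform for a skewed attribute (returns transformed values and lambda)
x0_transformed, lam = boxcox(X[:, 0])

# Log transform for an attribute with an exponential-looking distribution
x1_transformed = np.log(X[:, 1])
```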
- Removing outliers
Your data should be free of outliers. Consider removing them because they skew basic statistics such as the mean and standard deviation, which LDA relies on to separate classes.
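One simple, commonly used approach is to drop rows whose z-score on any attribute exceeds a threshold; the cut-off of 3 below is an arbitrary illustrative choice, and the injected outlier row exists only to show the filter working.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[0] = [15.0, -12.0, 20.0]  # an obvious outlier

# Keep only rows whose z-score is within 3 standard deviations on every attribute
z_scores = np.abs((X - X.mean(axis=0)) / X.std(axis=0))
X_clean = X[(z_scores < 3).all(axis=1)]
```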
- Same variance for each input variable
LDA assumes that each input variable has the same variance. For this reason, it is good practice to standardize your data before using LDA so that every variable has a mean of 0 and a standard deviation of 1.
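Standardizing with scikit-learn's StandardScaler (or equivalently subtracting the mean and dividing by the standard deviation) achieves this; the snippet below, on made-up data, is one way to do it.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(loc=10.0, scale=5.0, size=(100, 4))

# Rescale each input variable to mean 0 and standard deviation 1 before LDA
X_std = StandardScaler().fit_transform(X)
print(X_std.mean(axis=0).round(2), X_std.std(axis=0).round(2))
```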
Extending Linear Discriminant Analysis
We mentioned at the beginning of this article that LDA is a simple but effective method for classification. Because of this, several variations and extensions of the method have been developed. Some of the popular extensions of LDA include:
- Quadratic Discriminant Analysis (QDA)
In this type of analysis, every class uses its own estimate of variance (or its own covariance matrix when there is more than one input variable), rather than a single estimate shared across classes.
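For comparison, scikit-learn exposes this variant as QuadraticDiscriminantAnalysis, which fits a separate covariance estimate per class; the snippet below mirrors the earlier LDA example and is only meant as a sketch.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# QDA estimates a covariance matrix for every class instead of a shared one
qda = QuadraticDiscriminantAnalysis()
qda.fit(X_train, y_train)
print("QDA test accuracy:", qda.score(X_test, y_test))
```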
- Flexible Discriminant Analysis (FDA)
FDA uses non-linear combinations of the inputs, such as splines.
- Regularized Discriminant Analysis (RDA)
RDA moderates the influence of different variables on LDA by introducing regularization into the estimate of the variance.
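scikit-learn does not ship a class literally named RDA, but the same idea of regularizing the covariance estimate is available as the shrinkage option of LinearDiscriminantAnalysis (and as reg_param in QuadraticDiscriminantAnalysis). The snippet below is a sketch of that shrinkage-based, RDA-like setup, not the classical RDA formulation.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 'auto' chooses the shrinkage strength analytically (Ledoit-Wolf),
# regularizing the covariance estimate in the spirit of RDA
rda_like = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
rda_like.fit(X_train, y_train)
print("Shrinkage-LDA test accuracy:", rda_like.score(X_test, y_test))
```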