Regularization
What can be done to reduce the amount of data needed to estimate the covariance
matrix? Basically, there are only a few options.
- Assume that the features are statistically independent.
This results in a diagonal covariance matrix, C0 = diag(v1,
v2, ..., vd ), where vk is the variance
for Feature k. The individual variances are easy to estimate reliably. Unfortunately,
the assumption of independence is likely to be a poor one.
- Assume that the covariance matrix is the same for all of the
classes. This allows us to pool the data from all the classes.
It is an attractive option if the number of classes is large.
- Regularize the estimated covariance matrix. One way
to do this is to form a convex combination of the estimated covariance matrix
C and C0, the estimate obtained by assuming statistical independence:
a C + (1 - a) C0. Here a is a parameter between zero and one
that you have to adjust experimentally to get the best results. This brings
us to the topic of validation.
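The third option can be sketched in a few lines of NumPy. The function below (a hypothetical helper, not from any particular library) shrinks the full sample covariance C toward its own diagonal C0 using the mixing parameter a described above.

```python
import numpy as np

def regularized_covariance(X, a):
    """Shrink the sample covariance toward its diagonal.

    X : (n, d) array, one sample per row.
    a : parameter in [0, 1]; a = 1 keeps the full sample
        covariance C, a = 0 keeps only the diagonal C0.
    """
    C = np.cov(X, rowvar=False)   # full sample covariance
    C0 = np.diag(np.diag(C))      # diagonal matrix of variances (independence)
    return a * C + (1 - a) * C0   # convex combination a C + (1 - a) C0
```

Note that the diagonal entries are unchanged for any choice of a; only the off-diagonal covariances are scaled down by the factor a. The best a is found by validation, as discussed next.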
Back to Covariance
On to Validation
Up to Learn