A more general linear transformation rotates as well as stretches the
coordinates. Clusters of data points that were originally spherical get
transformed into ellipsoids whose axes are rotated relative to the coordinate
axes. Such a transformation introduces a nonzero covariance between the
components of the feature vector.
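
The effect can be checked numerically. The sketch below is only an illustration: the stretch factor of 3, the 30-degree rotation, the sample size, and the names x, S, R, A, and y are all assumptions chosen for the example. It draws a spherical two-dimensional cluster, applies a stretch followed by a rotation, and compares the sample covariance before and after.

```python
import numpy as np

rng = np.random.default_rng(0)

# Spherical cluster: 2-D points with unit variance and no covariance
x = rng.standard_normal((5000, 2))

# Assumed transformation: stretch the first coordinate by 3,
# then rotate the result by 30 degrees
theta = np.deg2rad(30.0)
S = np.diag([3.0, 1.0])
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
A = R @ S

# Apply the transformation to every point (each row becomes A @ x_i)
y = x @ A.T

# Sample covariance matrices, with one row per observation
cov_x = np.cov(x, rowvar=False)
cov_y = np.cov(y, rowvar=False)

print("covariance before:\n", cov_x)
print("covariance after:\n", cov_y)
```

Before the transformation the off-diagonal entry is close to zero. Afterwards it is clearly nonzero, and the sample covariance of the transformed points equals A times the original sample covariance times the transpose of A.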

It is hard to imagine why we would purposely want to convert spherical
clusters into ellipsoidal clusters. However, we very well might want to
convert ellipsoidal clusters into spherical clusters. To do this, we need
to understand more about the covariance.
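
As a preview of where this leads, here is one possible way to make an ellipsoidal cluster spherical once its covariance matrix is known. This is a minimal sketch, not necessarily the method developed in this tutorial: it assumes the covariance is estimated from the samples and is positive definite, and the matrix true_cov and the names W and z are hypothetical choices for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Ellipsoidal cluster: correlated 2-D samples (illustrative covariance)
true_cov = np.array([[4.0, 1.5],
                     [1.5, 1.0]])
y = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=5000)

# Estimate the covariance and take its eigendecomposition
cov_y = np.cov(y, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov_y)

# Rotate onto the principal axes, then rescale each axis to unit variance
W = np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T

# Center the data and apply the transformation to every point
z = (y - y.mean(axis=0)) @ W.T
cov_z = np.cov(z, rowvar=False)

print("covariance after the transformation:\n", cov_z)
```

The covariance of the transformed points is the identity matrix up to rounding error, which is what a spherical cluster with unit variance looks like.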
