Exhaustive Methods
Here is a brute-force procedure for finding the best subset of m features,
where m < d: 
While this may be feasible when d is small, there are some obvious problems
with this approach: 
- There are very many ways to select m features out of d, namely
d!/(m!(d-m)!). For example, if d = 100 and m = 10, we must try over
10^13 subsets.

- If we repeatedly use the same test data, we may obtain features that
are well suited for that particular test set, but that are not the best
in general.

- The results will depend on m. We may have to repeat the process
for various values of m to make an informed choice.
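
To make the combinatorial cost concrete, here is a minimal Python sketch of the count d!/(m!(d-m)!) and of an exhaustive search over all m-element subsets. The names num_subsets, best_subset, and score are hypothetical and are not part of these notes; score stands in for whatever criterion is used to rate a subset (for instance, accuracy on held-out data), with higher values assumed to be better.

```python
from itertools import combinations
from math import comb


def num_subsets(d, m):
    """Number of ways to choose m features out of d: d!/(m!(d-m)!)."""
    return comb(d, m)


def best_subset(d, m, score):
    """Try every m-element subset of the feature indices 0..d-1.

    `score` is a hypothetical callable mapping a tuple of feature
    indices to a number; higher is assumed to be better.
    """
    best, best_score = None, float("-inf")
    for subset in combinations(range(d), m):
        s = score(subset)
        if s > best_score:
            best, best_score = subset, s
    return best, best_score


# The example from the text: d = 100, m = 10.
print(num_subsets(100, 10))  # 17310309456440, i.e. about 1.7 * 10^13

# Toy usage with d = 5, m = 2 and a made-up additive score.
weights = [1, 9, 3, 7, 2]
print(best_subset(5, 2, lambda subset: sum(weights[i] for i in subset)))
# ((1, 3), 16)
```

The loop calls score once per subset, so the running time grows with the count above. With d = 5 and m = 2 that is only 10 evaluations, but for d = 100 and m = 10 it is out of reach. Trying several values of m means repeating the whole search for each one.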