Class ValueIter

java.lang.Object
  extended by ValueIter

public class ValueIter
extends java.lang.Object

This is a template for a class that runs value iteration on a given MDP to compute the optimal policy, which is stored in the public policy field. The computed optimal utility is stored in the public utility field. You need to fill in the constructor. You may add other fields holding useful information that you want this class to expose (for instance, the number of iterations before convergence).


Field Summary
 int[] policy
          the computed optimal policy for the given MDP
static double precision
          the precision used to determine when to stop iterating (called epsilon in lecture)
 double[] utility
          the computed optimal utility for the given MDP
 
Constructor Summary
ValueIter(Mdp mdp, double discount)
          The constructor for this class.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

policy

public int[] policy
the computed optimal policy for the given MDP


utility

public double[] utility
the computed optimal utility for the given MDP


precision

public static double precision
the precision used to determine when to stop iterating (called epsilon in lecture)
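
For orientation, here is one common way a precision value such as epsilon enters value iteration. The exact stopping test used in lecture is not reproduced on this page, so treat this as an assumed convention rather than the required one. The symbols are assumptions as well: U_i is the utility estimate after i sweeps, R(s) a per-state reward, P(s' | s, a) the transition probability, and gamma the discount factor with 0 < gamma < 1.

```latex
U_{i+1}(s) = R(s) + \gamma \max_{a} \sum_{s'} P(s' \mid s, a)\, U_i(s'),
\qquad
\text{stop when } \max_{s} \bigl| U_{i+1}(s) - U_i(s) \bigr| \le \frac{\epsilon\,(1 - \gamma)}{\gamma}
```

Under this convention, the utilities at termination are within epsilon of the optimal utilities in every state.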

Constructor Detail

ValueIter

public ValueIter(Mdp mdp,
                 double discount)
The constructor for this class. Computes the optimal policy for the given mdp with the given discount factor, and stores the result in policy. Also stores the optimal utility in utility.
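
The following is a minimal sketch of what such a constructor might do, not the reference solution. Because the Mdp class is not documented on this page, the sketch does not use it: the class name ValueIterSketch, the transition and reward arrays, the iterations field, and the default precision of 1e-10 are all hypothetical stand-ins. It also assumes rewards depend only on the state, that every state has at least one action, and that the loop stops once the largest per-sweep change is at most precision * (1 - discount) / discount, which is one common convention.

```java
/** Hypothetical stand-in for ValueIter that takes plain arrays instead of an Mdp. */
class ValueIterSketch {
    /** Stopping precision (epsilon); the default value here is an arbitrary assumption. */
    public static double precision = 1e-10;

    public int[] policy;      // policy[s] = index of the chosen action in state s
    public double[] utility;  // utility[s] = estimated optimal utility of state s
    public int iterations;    // extra field: number of sweeps performed

    /**
     * Assumed inputs: transition[s][a][t] = P(t | s, a), reward[s] = R(s).
     * Every state is assumed to have at least one action.
     */
    public ValueIterSketch(double[][][] transition, double[] reward, double discount) {
        int numStates = reward.length;
        utility = new double[numStates];  // start from all-zero utilities
        policy = new int[numStates];

        // Assumed convention: scale epsilon by (1 - gamma) / gamma when 0 < gamma < 1.
        double threshold = (discount > 0.0 && discount < 1.0)
                ? precision * (1.0 - discount) / discount
                : precision;

        double delta;
        do {
            double[] next = new double[numStates];
            delta = 0.0;
            for (int s = 0; s < numStates; s++) {
                double best = Double.NEGATIVE_INFINITY;
                for (int a = 0; a < transition[s].length; a++) {
                    // Expected utility of taking action a in state s.
                    double expected = 0.0;
                    for (int t = 0; t < numStates; t++) {
                        expected += transition[s][a][t] * utility[t];
                    }
                    if (expected > best) {
                        best = expected;
                        policy[s] = a;  // greedy with respect to the previous sweep's utilities
                    }
                }
                // Bellman update for state s.
                next[s] = reward[s] + discount * best;
                delta = Math.max(delta, Math.abs(next[s] - utility[s]));
            }
            utility = next;
            iterations++;
        } while (delta > threshold);  // may not terminate if discount >= 1
    }
}
```

As a quick check, take two states where state 0 has reward 0 and can either stay put (action 0) or move to state 1 (action 1), and state 1 has reward 1 and a single action that stays put. With discount 0.5 the sketch converges to utilities of approximately 1 and 2, and picks action 1 in state 0. A real solution would read the same quantities from the Mdp object instead of raw arrays.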