COS 402: Artificial Intelligence

Written Exercises W3, HP Problems

W3 HP Problems

Fall 2013

Due: Thursday, November 7


There are 9 problems that are worth a total of 20 HP. Be sure to show your work and justify all of your answers.  Turn these problems in together with your W3 homework. See the course home page for information on when and where to submit written exercises, and grading criteria


In this HP homework we explore what information the structure of a Bayesian Network gives about the properties of the joint distribution it represents. First, we will explore how information "flows" through the Bayesian Network graph. Intuitively, there is information flowing between two nodes (variables) X and Y if finding out something about one, provides additional information about the other. In other words information flows between X and Y if they are not independent.

In general, if there is a path through the Bayesian Network between two nodes (irrespective of the directionality of the arcs) then information will flow through it. There are, however, exceptions. In some cases the information flow through the path can be blocked by "colliders", or "V-structures". These are arcs on the path that point to the same node (collide head to head). For instance in the example below, B together with the arcs pointing to it forms a V-structure and blocks the information flow between X and Y. Note that A does not block the information flow between X and Y because, even though there are two arcs pointing to A, one of the arcs is not part of the path between X and Y.

[2 HP] Problem 1: Prove that in any joint distribution that can be represented by a Bayesian Network with the structure given by figure above, variables X and Y must be independent (i.e. there is no information flowing between X and Y).

Information flow can also be blocked by "evidence". That is, if the value of one or more variables {Z1,Z2,...,Zk} are known, X does not give any more information about Y that was not already provided by knowing the values of variables {Z1,Z2,...,Zk}. In other words, X and Y are conditionally independent given {Z1,Z2,...,Zk}. An example of this kind of blocking appears in the example below:

[2 HP] Problem 2: Prove that in any joint distribution that can be represented by a Bayesian Network with the structure given by figure above, variables X and Y must be conditionally independent given variable Z (i.e. the information flow that would have flown on the path from X to Y is blocked by Z)

To make things more confusing, evidence does not always block the information flow. Sometimes evidence "unblocks" it.

[2 HP] Problem 3: Give an example of a joint distribution that can be represented by a Bayesian Network with the structure given by the figure above, where X and Y are NOT conditionally independent given Z.

Now we have enough to give a general characterization when information flows between two variables in a Bayesian Network:

The significance of the results above is that conditional independence relationships can be simply "read" from the graph structure. If you are still confused about d-separation you can find a nice explanation here.

[1 HP] Problem 4: Find 10 conditional independence relationships that hold for the Bayesian Network with the structure given by the figure above. (i.e. specify which variables are conditionally independent given which variables). No partial credit for this problem (you must get all 10 conditional independence relationships correct).

[3 HP] Problem 5: Give efficient algorithm to find all nodes that are d-separated from node X given a set of evidence nodes {Z1,Z2,...,Zk}.

[2 HP] Problem 6: How can the algorithm above be applied to inference problems in Bayesian Networks?

In the second part we analyze what happens when the modeler fails to take into account all relevant variables. So far in class we have assumed that the modeler is able to identify all the relevant variables. However, more often than not, this is not the case.

[3 HP] Problem 7: Let the true joint distribution be represented by a Bayesian Network with the structure above. How would the graph change if variable A would be missing? That is, what is a graph structure of the Bayesian Network that represents the joint distribution of the remaining variables {B,...,J}. Give the structure that is most similar to the original graph structure. Draw the resulting structure and justify your answer.

[3 HP] Problem 8: Give an algorithm for "removing" a variable from a general Bayesian Network. (An algorithm to transform a Bayesian Network structure with n variables into one with n-1 nodes where a given variable is missing.

[2 HP] Problem 9: Give an example where missing a relevant variable is really bad. Explain why it is really bad. Give an example where missing a relevant variable would NOT be problematic.