Describe a procedure for converting a Boolean formula in CNF (n variables, m clauses) into an equivalent backprop network. How many hidden units does it have?
Suppose in the CNF we have n variables and m clauses. For each variable, we make an input node. With the unit input node, we have n+1 input nodes. For each clause, we make a hidden unit to simulate the value of this clause(thus, the number of hidden units would be m). Finally, we direct the output of those hidden units to a output node to simulate the "and" state of the CNF.
The weights of the net would be set in the way given in the lecture.
A key issue in LSI is picking "k", the number of dimensions. Let's say we had a set of 10,000 passages. Explain how we could combine the idea of cross validation and autoassociation to select a good value for k.
First we divide the 10,000 passages into a two parts: training data and cross validation data, say 2000 for the cross validation. Then we pick a value for the number of hidden units to get a neural net. Train this net using the traning data and evaluate the net using the cross validation data. Then we adjust the number of hidden units to minimize the error on the cross validation data. In this way, we will not end up memorizing the entire data and will get a general autoassiciation network.