COS 226 Programming Assignment Checklist: WordNet


Frequently Asked Questions

Is a vertex considered an ancestor of itself? Yes, this is the typical convention.

Can I assume the id numbers will be integers in a small range? Yes, if there are V synsets, the ids will be numbered 1 through V (sorry, not the usual 0 through V-1).

Should my program work on datasets other than WordNet? Absolutely. It should work on any dataset in the appropriate format.

Should SAP work if the digraph is not a DAG? Yes, the definition still applies in the presence of cycles.
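The two answers above (a vertex is its own ancestor, and the definition still holds with cycles) can be illustrated with a minimal sketch. It is not the reference solution. It assumes a hypothetical adjacency-list representation with vertices numbered 0 through V-1, and the names SapSketch, bfs, and length are made up for illustration. It runs one breadth-first search from each vertex and takes the common ancestor with the smallest total distance.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class SapSketch {
    // Distance from source to every vertex reachable along directed edges; -1 means unreachable.
    static int[] bfs(List<List<Integer>> adj, int source) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);
        dist[source] = 0;                 // a vertex is its own ancestor, at distance 0
        Queue<Integer> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            for (int w : adj.get(v)) {
                if (dist[w] == -1) {      // each vertex is visited once, so cycles cause no trouble
                    dist[w] = dist[v] + 1;
                    queue.add(w);
                }
            }
        }
        return dist;
    }

    // Length of a shortest ancestral path between v and w, or -1 if they share no ancestor.
    static int length(List<List<Integer>> adj, int v, int w) {
        int[] fromV = bfs(adj, v);
        int[] fromW = bfs(adj, w);
        int best = -1;
        for (int x = 0; x < adj.size(); x++) {
            if (fromV[x] != -1 && fromW[x] != -1) {   // x is an ancestor of both v and w
                int total = fromV[x] + fromW[x];
                if (best == -1 || total < best) best = total;
            }
        }
        return best;
    }
}
```

With edges 0->1, 1->2, 2->0 (a cycle) and 3->1, calling length on vertices 3 and 0 finds the common ancestor 1, and calling it on a vertex and itself gives 0.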

Some of the glosses have example sentences at the end. What are these? They are just part of the gloss.

Any advice on how to read in and parse the data files? Use the readLine() method in our In library to read in the data, one line at a time. Use the split() method in Java's String library to divide a line into fields. Use Integer.parseInt() to convert string id numbers into integers.
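The parsing advice above might look roughly like the sketch below. It assumes the comma-separated layout of the data files (a synsets.txt line is id, space-separated nouns, gloss; a hypernyms.txt line is a synset id followed by its hypernym ids), which this section does not spell out. The class and method names are hypothetical. Lines are passed in as plain strings so the sketch runs without the course library; in your program each line would come from readLine().

```java
public class ParseSketch {
    // Parses one hypernyms.txt line: the first id is the synset, the rest are its hypernyms.
    static int[] parseIds(String line) {
        String[] fields = line.split(",");
        int[] ids = new int[fields.length];
        for (int i = 0; i < fields.length; i++) {
            ids[i] = Integer.parseInt(fields[i].trim());
        }
        return ids;
    }

    // Returns the id field of one synsets.txt line (assumed layout: id,nouns,gloss).
    static int parseSynsetId(String line) {
        String[] fields = line.split(",", 3);   // limit of 3 leaves any commas in the gloss alone
        return Integer.parseInt(fields[0].trim());
    }

    // Returns the nouns of one synsets.txt line; nouns are separated by single spaces.
    static String[] parseNouns(String line) {
        String[] fields = line.split(",", 3);
        return fields[1].split(" ");
    }
}
```

Passing a limit to split() is a defensive choice here: it keeps the gloss in one field even if it happens to contain a comma.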

I'm an ontologist and I noticed that your hypernyms.txt file contains both is-a and is-instance-of relationships. Yes, you caught us. This ensures that every noun (except entity) has a hypernym. Here is an article on the subtle distinction.

Input, Output, and Testing

Input and output. We encourage you to create your own (possibly pathological) inputs to help test your program. If your datasets create problems for other programs (or ours!), we'll award extra credit. The input should be very small, and it should expose a flaw that other programs are likely to have. In your readme.txt, you should describe what the input is testing.

Extra credit. Submit an interesting example (or corner case) that you used to test your code, preferably one that arises in the WordNet digraph and uses everyday words. You can also make up your own small synsets.txt and hypernyms.txt files.

Submission and readme

Here is a template readme.txt file. It should contain the following information:

Possible progress steps

Optional Optimizations

There are a few things you can do to speed up a sequence of SAP computations on the same digraph.