Programming Assignment 5 Checklist: Kd-Trees

Frequently Asked Questions

I'm nervous about writing recursive search tree code. How do I even start on KdTreeST.java? You can use BST.java as a guide. The trickiest part is understanding how the put() method works. You do not need to include code that involves storing the subtree sizes (since this is used only for ordered symbol table operations).

What makes KdTree a hard assignment? How do I make the best use of my time? Debugging performance errors is very hard on this assignment. It is very important that you understand and implement the crucial optimizations listed in the assignment text.

Do not start range search or nearest neighbor until you understand these rules.

Is a point on the boundary of a rectangle considered inside it? Do two rectangles intersect if they have just one point in common? Yes and yes (which is consistent with the implementation of RectHV.java).
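The boundary convention above can be sketched with inclusive comparisons. This is only an illustration: ClosedRect is a hypothetical stand-in written for this example, not the actual RectHV.java source.

```java
// Hypothetical stand-in for RectHV: the closed rectangle [xmin, xmax] x [ymin, ymax].
final class ClosedRect {
    final double xmin, ymin, xmax, ymax;

    ClosedRect(double xmin, double ymin, double xmax, double ymax) {
        this.xmin = xmin;
        this.ymin = ymin;
        this.xmax = xmax;
        this.ymax = ymax;
    }

    // Inclusive comparisons: a point on the boundary counts as inside.
    boolean contains(double x, double y) {
        return x >= xmin && x <= xmax && y >= ymin && y <= ymax;
    }

    // Inclusive comparisons: rectangles that share a single point still intersect.
    boolean intersects(ClosedRect that) {
        return this.xmax >= that.xmin && that.xmax >= this.xmin
            && this.ymax >= that.ymin && that.ymax >= this.ymin;
    }

    public static void main(String[] args) {
        ClosedRect a = new ClosedRect(0.0, 0.0, 1.0, 1.0);
        ClosedRect b = new ClosedRect(1.0, 1.0, 2.0, 2.0);  // touches a only at (1, 1)
        System.out.println(a.contains(1.0, 0.5));  // true: point lies on the right edge
        System.out.println(a.intersects(b));       // true: one common corner point
    }
}
```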

Can I use the distanceTo() method in Point2D and RectHV? No, you may use only the subset of the methods listed. You should be able to accomplish the same result by using distanceSquaredTo() instead of distanceTo().
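Squared distances work because squaring preserves the ordering of nonnegative numbers, so "closer" comparisons come out the same without a square root. A minimal sketch on raw coordinates follows; the class and method names are made up for this example and are not part of Point2D.

```java
final class SquaredDistanceDemo {
    // Squared Euclidean distance between (x1, y1) and (x2, y2); no square root needed.
    static double distanceSquared(double x1, double y1, double x2, double y2) {
        double dx = x1 - x2;
        double dy = y1 - y2;
        return dx * dx + dy * dy;
    }

    // True if the candidate (cx, cy) is strictly closer to the query (qx, qy)
    // than the best squared distance found so far.
    static boolean isCloser(double qx, double qy, double cx, double cy, double bestDistSquared) {
        return distanceSquared(qx, qy, cx, cy) < bestDistSquared;
    }

    public static void main(String[] args) {
        double best = distanceSquared(0.0, 0.0, 3.0, 4.0);         // 25.0, i.e. distance 5
        System.out.println(isCloser(0.0, 0.0, 1.0, 1.0, best));    // true: 2.0 < 25.0
    }
}
```

The same pattern applies when both sides of the comparison are squared distances; just avoid mixing a squared value with an unsquared one.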

What should I do if a point has the same x-coordinate as the point in a node when inserting or searching in a 2d-tree? Go to the right subtree as specified on the assignment page under Search and insert.

What should I do if a point is inserted twice in the data structure? The data structure represents a symbol table, so you should replace the old value with the new value.
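The two rules above (ties go right, duplicates replace the value) can be sketched as follows. This is a simplified illustration, not the required design: it stores raw double coordinates instead of Point2D, omits the per-node rectangle, and assumes the root splits on the x-coordinate with the orientation alternating at each level. TwoDTreeSketch and its members are hypothetical names.

```java
final class TwoDTreeSketch<Value> {
    private Node root;
    private int size;

    private class Node {
        final double x, y;   // the point
        Value value;         // the value associated with the point
        Node lb, rt;         // left/bottom and right/top subtrees

        Node(double x, double y, Value value) {
            this.x = x;
            this.y = y;
            this.value = value;
        }
    }

    int size() {
        return size;
    }

    void put(double x, double y, Value value) {
        root = put(root, x, y, value, true);   // assumption: the root compares x-coordinates
    }

    private Node put(Node h, double x, double y, Value value, boolean vertical) {
        if (h == null) {
            size++;
            return new Node(x, y, value);
        }
        if (h.x == x && h.y == y) {            // same point inserted again: replace the value
            h.value = value;
            return h;
        }
        double key = vertical ? x : y;
        double nodeKey = vertical ? h.x : h.y;
        if (key < nodeKey) h.lb = put(h.lb, x, y, value, !vertical);
        else               h.rt = put(h.rt, x, y, value, !vertical);  // ties go right/top
        return h;
    }

    Value get(double x, double y) {
        Node h = root;
        boolean vertical = true;
        while (h != null) {
            if (h.x == x && h.y == y) return h.value;
            double key = vertical ? x : y;
            double nodeKey = vertical ? h.x : h.y;
            h = (key < nodeKey) ? h.lb : h.rt;  // must mirror the rule used by put()
            vertical = !vertical;
        }
        return null;
    }
}
```

Whatever tie-breaking rule put() uses, get() has to follow the same one, otherwise points with equal coordinates become unreachable.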

What should points() return if there are no points in the data structure? What should range() return if there are no points in the range? An Iterable<Point2D> object with zero points.
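One way to satisfy this is to build the result in a collection and return it even when nothing was added, rather than returning null. The sketch below uses a brute-force scan over coordinate pairs purely for illustration; EmptyResultDemo and its range() signature are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class EmptyResultDemo {
    // Hypothetical brute-force range query over {x, y} pairs, with inclusive bounds.
    static Iterable<double[]> range(double[][] points,
                                    double xmin, double ymin, double xmax, double ymax) {
        List<double[]> hits = new ArrayList<>();
        for (double[] p : points) {
            if (p[0] >= xmin && p[0] <= xmax && p[1] >= ymin && p[1] <= ymax) {
                hits.add(p);
            }
        }
        return hits;   // an empty list when nothing matches, never null
    }

    public static void main(String[] args) {
        double[][] none = new double[0][];
        // The loop body never runs, but the for-each itself is safe.
        for (double[] p : range(none, 0.0, 0.0, 1.0, 1.0)) {
            System.out.println(p[0] + ", " + p[1]);
        }
    }
}
```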

What should nearest() return if there are two (or more) nearest points? Any one.

How much memory does a Point2D object use? For simplicity, assume that each Point2D object uses 32 bytes. In reality, it uses a bit more because of the Comparator instance variables.

How much memory does a RectHV object use? You should look at the code and analyze its memory usage.

I run out of memory when running some of the large sample files. What should I do? Be sure to ask Java for additional memory, e.g., java -Xmx1600m RangeSearchVisualizer input1M.txt.

Testing

Testing. A good way to test KdTreeST is to perform the same sequence of operations on both the PointST and KdTreeST data types and identify any discrepancies. The sample clients RangeSearchVisualizer.java and NearestNeighborVisualizer.java take this approach.
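The same idea can be automated with random inputs. The sketch below is a hypothetical harness, not one of the sample clients: NearestFinder is a made-up interface standing in for "some implementation of nearest()", and bruteForce plays the role of the trusted reference. Results are compared by squared distance rather than by point, since either of two equally near points is an acceptable answer.

```java
import java.util.Random;

final class DiscrepancyCheck {
    // Hypothetical interface: returns the {x, y} pair nearest to the query (qx, qy).
    interface NearestFinder {
        double[] nearest(double[][] points, double qx, double qy);
    }

    static double distSq(double[] p, double qx, double qy) {
        double dx = p[0] - qx;
        double dy = p[1] - qy;
        return dx * dx + dy * dy;
    }

    // Reference implementation: scan every point and keep the closest one.
    static double[] bruteForce(double[][] points, double qx, double qy) {
        double[] best = null;
        for (double[] p : points) {
            if (best == null || distSq(p, qx, qy) < distSq(best, qx, qy)) best = p;
        }
        return best;
    }

    // Runs the same random queries against both implementations and
    // counts the queries on which their answers differ in distance.
    static int countDiscrepancies(NearestFinder reference, NearestFinder candidate,
                                  int n, int queries, long seed) {
        Random rnd = new Random(seed);
        double[][] points = new double[n][2];
        for (double[] p : points) {
            p[0] = rnd.nextDouble();
            p[1] = rnd.nextDouble();
        }
        int bad = 0;
        for (int i = 0; i < queries; i++) {
            double qx = rnd.nextDouble();
            double qy = rnd.nextDouble();
            double[] a = reference.nearest(points, qx, qy);
            double[] b = candidate.nearest(points, qx, qy);
            if (distSq(a, qx, qy) != distSq(b, qx, qy)) bad++;
        }
        return bad;
    }
}
```

A fixed seed makes any discrepancy reproducible, which helps when stepping through a failing query in a debugger.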

Sample input files. The directory kdtree contains some sample input files in the specified format.

Possible Progress Steps

These are purely suggestions for how you might make progress on KdTreeST.java. You do not have to follow these steps.

  1. Complete the KdTree worksheet. Here is a set of practice problems for the core kd-tree methods. Here are the answers.

  2. Node data type. There are several reasonable ways to represent a node in a 2d-tree. One approach is to include the point, a link to the left/bottom subtree, a link to the right/top subtree, and an axis-aligned rectangle corresponding to the node.
    private class Node {
       private Point2D p;      // the point
       private Value value;    // the symbol table maps the point to this value
       private RectHV rect;    // the axis-aligned rectangle corresponding to this node
       private Node lb;        // the left/bottom subtree
       private Node rt;        // the right/top subtree
    }
    
    Since we don't need to implement the rank and select operations, there is no need to store the subtree size.

  3. Writing KdTreeST.
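If you store a rectangle in each node as described in step 2, each child's rectangle is the parent's rectangle cut by the splitting line through the parent's point. The following sketch shows that computation under some assumptions made for illustration only: rectangles are plain arrays {xmin, ymin, xmax, ymax} instead of RectHV objects, the unit square is used as an example rectangle, and ChildRects is a hypothetical helper name.

```java
final class ChildRects {
    // rect = {xmin, ymin, xmax, ymax}; (px, py) is the point stored in the parent node.
    // vertical == true means the parent splits on its x-coordinate.

    // Rectangle for the left (vertical split) or bottom (horizontal split) child.
    static double[] leftOrBottom(double[] rect, double px, double py, boolean vertical) {
        return vertical
            ? new double[] { rect[0], rect[1], px, rect[3] }    // keep the part with x <= px
            : new double[] { rect[0], rect[1], rect[2], py };   // keep the part with y <= py
    }

    // Rectangle for the right (vertical split) or top (horizontal split) child.
    static double[] rightOrTop(double[] rect, double px, double py, boolean vertical) {
        return vertical
            ? new double[] { px, rect[1], rect[2], rect[3] }    // keep the part with x >= px
            : new double[] { rect[0], py, rect[2], rect[3] };   // keep the part with y >= py
    }

    public static void main(String[] args) {
        double[] parent = { 0.0, 0.0, 1.0, 1.0 };
        double[] left = leftOrBottom(parent, 0.7, 0.2, true);
        System.out.println(java.util.Arrays.toString(left));    // [0.0, 0.0, 0.7, 1.0]
    }
}
```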

Optimizations

There are many ways to improve the performance of your 2d-tree. Here are some ideas.