COS 226 Programming Assignment Checklist: Kd-Trees

Frequently Asked Questions

Is a point on the boundary of a rectangle considered inside it? Do two rectangle intersect if they have just one point in common? Yes, and yes.

Can I add extra public methods to the Point and RectHV APIs? You are permitted (and encouraged) to add a main() test client, but you may not add other public methods.

What should I do if a point is inserted twice in the data structure? The data structure represents a set of points, so you should only keep one copy.

What should I do if a point has the same x-coordinate as the point in a node when inserting / searching in a 2d-tree? Go the right subtree as specified.

Can I assume that all x- or y-coordinates of points inserted into the KdTree will be between 0 and 1? Yes.

How should I scale the coordinate system when drawing? Don't, please keep the default range of 0 to 1.

How should I set the size and color of the points and rectangles when drawing? Your Point and RectHV data types should draw points and lines using StdDraw.point() and StdDraw.line() as usual. They should not call StdDraw.setPenColor() or StdDraw.setPenRadius(). In the client code, use StdDraw.setPenColor() before drawing the points and rectangles. Also use StdDraw.setPenRadius(.01) before drawing the points and StdDraw.setPenRadius() before drawing the lines.

My implementation of distanceTo() in RectHV has numerous cases. Is this the correct approach? This is acceptable provided you handle every possible case. Alternatively, there is another approach that requires only about 6 lines of code.

Am I required to explicitly store a RectHV in each node in my 2d-tree? No, this is not a requirement, but it is probably wise in your first version.

What should be the format of the toString() methods in Point and RectHV? These methods are primarily for debuginng, so any reasonable format is fine. We recommend (x, y) for points and [xmin, xmax] x [ymin, ymax] for rectangles.

Testing and Submitting

Testing.   We highly recommend testing each method in each data type. Catching mistakes in Point and RectHV early will save you significant time when debugging later. In general your testing code will have more cases than the methods to be tested.

RectHV. To be more specific, when testing the method contains() in RectHV, you will likely have 4 boolean expressions but you will want to test points that are NW, N, NE, W, inside, E, SW, S, and SE relative to the rectangle. You want to also have just as many test cases for boundary conditions. With intersects() you want to be careful to test all possible overlapping rectangles as well as the case where one rectangle is contained within the other. With distanceTo() you want to test in the same style as with contains().

KdTree. A good way to test this data type is to perform the same sequence of operations on both the PointSET and KdTree data types and identify any discrepancies.

Sample input files.   We provide some input files for testing

Possible Progress Steps

These are purely suggestions for how you might make progress. You do not have to follow these steps.

  1. Getting started. Download the directory kdtree to your system. It contains some sample input files and test clients that you will use.

  2. Node data type. There are several reasonable ways to represent a node in a 2d-tree. One approach is to include the point, a link to the left/bottom subtree, a link to the right/top subtree, and an axis-aligned rectangle corresponding to the node.
    private class Node {
       private Point p;        // the point
       private RectHV rect;    // the axis-aligned rectangle corresponding to this node
       private Node lb;        // the left/bottom subtree
       private Node rt;        // the right/top subtree
    }
    

Optimizations

These are some ways to speed up your 2d-tree.


Kevin Wayne