Programming Assignment 5: Kd-Trees


Write a symbol table data type that provides the ability to map from Point2D objects to arbitrary values. Use a 2d-tree to support efficient range search (find all of the points contained in a query rectangle) and nearest neighbor search (find a closest point to a query point). 2d-trees have numerous applications, ranging from classifying astronomical objects to computer animation to speeding up neural networks to mining data to image retrieval.

[Figure: range search and k-nearest neighbor]


Geometric primitives. To get started, use the following geometric primitives for points and axis-aligned rectangles in the plane.

[Figure: geometric primitives]

Use the immutable data type Point2D.java (part of algs4.jar) for points in the plane. Here is the subset of its API that you may use:

public class Point2D implements Comparable<Point2D> {
   public Point2D(double x, double y)              // construct the point (x, y)
   public  double x()                              // x-coordinate 
   public  double y()                              // y-coordinate 
   public  double distanceSquaredTo(Point2D that)  // square of Euclidean distance between two points 
   public     int compareTo(Point2D that)          // for use in an ordered symbol table 
   public boolean equals(Object that)              // does this point equal that object? 
   public    void draw()                           // draw to standard draw 
   public  String toString()                       // string representation 
}

Use the immutable data type RectHV.java (not part of algs4.jar) for axis-aligned rectangles. Here is the subset of its API that you may use:

public class RectHV {
   public    RectHV(double xmin, double ymin,      // construct the rectangle [xmin, xmax] x [ymin, ymax] 
                    double xmax, double ymax)      
   public  double xmin()                           // minimum x-coordinate of rectangle 
   public  double ymin()                           // minimum y-coordinate of rectangle 
   public  double xmax()                           // maximum x-coordinate of rectangle 
   public  double ymax()                           // maximum y-coordinate of rectangle 
   public boolean contains(Point2D p)              // does this rectangle contain the point p (either inside or on boundary)? 
   public boolean intersects(RectHV that)          // does this rectangle intersect that rectangle (at one or more points)? 
   public  double distanceSquaredTo(Point2D p)     // square of Euclidean distance from point p to closest point in rectangle 
   public boolean equals(Object that)              // does this rectangle equal that object? 
   public    void draw()                           // draw to standard draw 
   public  String toString()                       // string representation 
}
Do not modify these data types.
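
To make the documented semantics of contains(), intersects(), and distanceSquaredTo() concrete, here is a minimal sketch. The classes Pt and Rect are hypothetical stand-ins written only for illustration; they are not the supplied Point2D and RectHV, and their internals are an assumption rather than the course implementation.

```java
class PrimitivesSketch {
    // Hypothetical stand-in for a point in the plane (not the course's Point2D).
    static final class Pt {
        final double x, y;
        Pt(double x, double y) { this.x = x; this.y = y; }

        // square of the Euclidean distance to that point
        double distanceSquaredTo(Pt that) {
            double dx = x - that.x, dy = y - that.y;
            return dx * dx + dy * dy;
        }
    }

    // Hypothetical stand-in for the rectangle [xmin, xmax] x [ymin, ymax] (not the course's RectHV).
    static final class Rect {
        final double xmin, ymin, xmax, ymax;
        Rect(double xmin, double ymin, double xmax, double ymax) {
            this.xmin = xmin; this.ymin = ymin; this.xmax = xmax; this.ymax = ymax;
        }

        // true if p is inside the rectangle or on its boundary
        boolean contains(Pt p) {
            return p.x >= xmin && p.x <= xmax && p.y >= ymin && p.y <= ymax;
        }

        // true if the rectangles share at least one point (touching boundaries count)
        boolean intersects(Rect that) {
            return xmax >= that.xmin && ymax >= that.ymin
                && that.xmax >= xmin && that.ymax >= ymin;
        }

        // squared distance from p to the closest point of the rectangle (0 if p is contained)
        double distanceSquaredTo(Pt p) {
            double dx = Math.max(0.0, Math.max(xmin - p.x, p.x - xmax));
            double dy = Math.max(0.0, Math.max(ymin - p.y, p.y - ymax));
            return dx * dx + dy * dy;
        }
    }
}
```

Note that a point on the boundary is contained, and two rectangles that touch only at a corner still intersect.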

Brute-force implementation. Write a mutable data type PointST.java that maps Point2D objects to generic values. Implement the following API by using a red-black BST (either RedBlackBST from algs4.jar or java.util.TreeMap); do not implement your own red-black BST.

public class PointST<Value> {
   public         PointST()                                // construct an empty symbol table of points 
   public           boolean isEmpty()                      // is the symbol table empty? 
   public               int size()                         // number of points 
   public              void insert(Point2D p, Value val)   // associate the value val with point p
   public             Value get(Point2D p)                 // value associated with point p 
   public           boolean contains(Point2D p)            // does the symbol table contain point p? 
   public              void draw()                         // draw all points to standard draw 
   public Iterable<Point2D> range(RectHV rect)             // all points that are inside the rectangle 
   public           Point2D nearest(Point2D p)             // a nearest neighbor to point p; null if the symbol table is empty 
   public static void main(String[] args)                  // unit testing of the methods (not graded) 
}

Your implementation should support insert(), get(), and contains() in time proportional to the logarithm of the number of points in the symbol table in the worst case; it should support nearest() and range() in time proportional to the number of points in the symbol table. You may assume that clients will not pass key or value arguments equal to null.
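
The performance requirements above suggest a simple shape for the brute-force version: delegate insert() and get() to the balanced tree, and scan every key for range() and nearest(). The following is a minimal sketch under stated assumptions, not a reference solution. The class Pt is a hypothetical stand-in for Point2D, the query rectangle is passed as four doubles instead of a RectHV, and java.util.TreeMap plays the role of the red-black BST.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class BruteForceSketch {
    // Hypothetical stand-in for Point2D; any consistent total order works as the key order.
    static final class Pt implements Comparable<Pt> {
        final double x, y;
        Pt(double x, double y) { this.x = x; this.y = y; }

        public int compareTo(Pt that) {
            int c = Double.compare(x, that.x);
            return c != 0 ? c : Double.compare(y, that.y);
        }

        double distanceSquaredTo(Pt that) {
            double dx = x - that.x, dy = y - that.y;
            return dx * dx + dy * dy;
        }
    }

    static final class BrutePointST<Value> {
        private final TreeMap<Pt, Value> map = new TreeMap<>();

        boolean isEmpty()             { return map.isEmpty(); }
        int size()                    { return map.size(); }
        void insert(Pt p, Value val)  { map.put(p, val); }         // logarithmic
        Value get(Pt p)               { return map.get(p); }       // logarithmic
        boolean contains(Pt p)        { return map.containsKey(p); }

        // Examine every key: time proportional to the number of points.
        List<Pt> range(double xmin, double ymin, double xmax, double ymax) {
            List<Pt> hits = new ArrayList<>();
            for (Pt p : map.keySet())
                if (p.x >= xmin && p.x <= xmax && p.y >= ymin && p.y <= ymax)
                    hits.add(p);
            return hits;
        }

        // Examine every key and keep the closest one; null if the table is empty.
        Pt nearest(Pt query) {
            Pt best = null;
            double bestDist = Double.POSITIVE_INFINITY;
            for (Pt p : map.keySet()) {
                double d = p.distanceSquaredTo(query);
                if (d < bestDist) { bestDist = d; best = p; }
            }
            return best;
        }
    }
}
```

Comparing squared distances avoids computing square roots and gives the same ordering as comparing the distances themselves.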

2d-tree implementation. Write a mutable data type KdTreeST.java that uses a 2d-tree to implement the same API (but replace PointST with KdTreeST). A 2d-tree is a generalization of a BST to two-dimensional keys. The idea is to build a BST with points in the nodes, using the x- and y-coordinates of the points as keys in strictly alternating sequence, starting with the x-coordinates.

[Figure: a 2d-tree built by inserting the points (0.7, 0.2), (0.5, 0.4), (0.2, 0.3), (0.4, 0.7), and (0.9, 0.6), in that order]
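
The alternating-key insertion described above can be sketched as follows. This is an illustrative sketch, not the required KdTreeST API: it takes raw coordinates instead of Point2D objects, the class and field names are hypothetical, and sending ties to the right subtree is an assumption that the text above does not specify.

```java
class KdInsertSketch<Value> {
    // One tree node: a point, its value, and the two subtrees.
    static final class Node<V> {
        final double x, y;
        V val;
        Node<V> left, right;
        Node(double x, double y, V val) { this.x = x; this.y = y; this.val = val; }
    }

    Node<Value> root;
    private int n;

    int size() { return n; }

    void insert(double x, double y, Value val) {
        root = insert(root, x, y, val, true);   // the root compares x-coordinates
    }

    private Node<Value> insert(Node<Value> h, double x, double y, Value val, boolean useX) {
        if (h == null) { n++; return new Node<>(x, y, val); }
        if (h.x == x && h.y == y) { h.val = val; return h; }   // same point: replace the value
        boolean goLeft = useX ? x < h.x : y < h.y;             // assumption: ties go right
        if (goLeft) h.left  = insert(h.left,  x, y, val, !useX);   // flip the key at each level
        else        h.right = insert(h.right, x, y, val, !useX);
        return h;
    }

    // Follows the same alternating comparisons as insert(); null if the point is absent.
    Value get(double x, double y) {
        Node<Value> h = root;
        boolean useX = true;
        while (h != null) {
            if (h.x == x && h.y == y) return h.val;
            boolean goLeft = useX ? x < h.x : y < h.y;
            h = goLeft ? h.left : h.right;
            useX = !useX;
        }
        return null;
    }
}
```

Inserting the five points in the order listed puts (0.7, 0.2) at the root, (0.5, 0.4) as its left child, (0.2, 0.3) and (0.4, 0.7) as the left and right children of (0.5, 0.4), and (0.9, 0.6) as the right child of the root.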

The prime advantage of a 2d-tree over a BST is that it supports efficient implementation of range search and nearest neighbor search. Each node corresponds to an axis-aligned rectangle, which encloses all of the points in its subtree. The root corresponds to the infinitely large square [(-∞, -∞), (+∞, +∞)]; the left and right children of the root correspond to the two rectangles split by the x-coordinate of the point at the root; and so forth.
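
One common way to exploit the per-node rectangles is to skip any subtree whose rectangle does not intersect the query rectangle. The sketch below illustrates that idea under stated assumptions: it recomputes each node's rectangle on the way down rather than storing it, it represents rectangles as {xmin, ymin, xmax, ymax} arrays instead of RectHV objects, and the visited counter is a hypothetical addition that exists only to make the pruning observable.

```java
import java.util.ArrayList;
import java.util.List;

class KdRangeSketch {
    static final class Node {
        final double x, y;
        Node left, right;
        Node(double x, double y) { this.x = x; this.y = y; }
    }

    Node root;
    int visited;   // nodes examined by the most recent range() call (illustration only)

    void insert(double x, double y) { root = insert(root, x, y, true); }

    private Node insert(Node h, double x, double y, boolean useX) {
        if (h == null) return new Node(x, y);
        if (h.x == x && h.y == y) return h;
        boolean goLeft = useX ? x < h.x : y < h.y;   // assumption: ties go right
        if (goLeft) h.left  = insert(h.left,  x, y, !useX);
        else        h.right = insert(h.right, x, y, !useX);
        return h;
    }

    // All points inside the query rectangle [xmin, xmax] x [ymin, ymax], boundary included.
    List<double[]> range(double xmin, double ymin, double xmax, double ymax) {
        List<double[]> hits = new ArrayList<>();
        visited = 0;
        double inf = Double.POSITIVE_INFINITY;
        double[] whole = {-inf, -inf, inf, inf};     // the root's rectangle is the whole plane
        range(root, true, whole, new double[] {xmin, ymin, xmax, ymax}, hits);
        return hits;
    }

    // r is the rectangle of node h; q is the query. Both are {xmin, ymin, xmax, ymax}.
    private void range(Node h, boolean useX, double[] r, double[] q, List<double[]> hits) {
        if (h == null) return;
        // Prune: nothing in this subtree can lie in q if the rectangles do not intersect.
        if (r[2] < q[0] || q[2] < r[0] || r[3] < q[1] || q[3] < r[1]) return;
        visited++;
        if (h.x >= q[0] && h.x <= q[2] && h.y >= q[1] && h.y <= q[3])
            hits.add(new double[] {h.x, h.y});
        if (useX) {   // a vertical line through h splits r into left and right halves
            range(h.left,  false, new double[] {r[0], r[1], h.x, r[3]}, q, hits);
            range(h.right, false, new double[] {h.x, r[1], r[2], r[3]}, q, hits);
        } else {      // a horizontal line through h splits r into bottom and top halves
            range(h.left,  true, new double[] {r[0], r[1], r[2], h.y}, q, hits);
            range(h.right, true, new double[] {r[0], h.y, r[2], r[3]}, q, hits);
        }
    }
}
```

With the five example points and the query rectangle [0.1, 0.6] x [0.1, 0.5], the search never enters the right subtree of the root, because that subtree's rectangle starts at x = 0.7.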

Clients.  You may use the following interactive client programs to test and debug your code.

Analysis of running time and memory usage. Analyze the effectiveness of your approach to this problem by giving estimates of its time and space requirements.

Extra credit.  For one point of extra credit, create a new data type EnhancedKdTreeST.java that contains one additional method:

public Iterable<Point2D> nearest(Point2D p, int k)

This method should return the k points that are closest to the query point (in any order); return all N points in the data structure if N ≤ k. It should do this efficiently, i.e., by using the technique from kd-tree nearest neighbor search rather than brute force. Once you've completed this class, you'll be able to run BoidSimulator.java (which depends upon both Boid.java and Hawk.java). Behold their flocking majesty.
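
For the k-nearest method described above, one possible approach is to keep the k best candidates seen so far in a bounded max-heap and to skip a subtree once it cannot contain anything closer than the current k-th best. The sketch below is one such approach under stated assumptions, not the expected solution: it works on raw coordinates instead of Point2D, all names are hypothetical, and it prunes using the distance to the splitting line, which is a simpler and slightly weaker test than the distance to the node's full rectangle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class KNearestSketch {
    static final class Node {
        final double x, y;
        Node left, right;
        Node(double x, double y) { this.x = x; this.y = y; }
    }

    Node root;

    void insert(double x, double y) { root = insert(root, x, y, true); }

    private Node insert(Node h, double x, double y, boolean useX) {
        if (h == null) return new Node(x, y);
        if (h.x == x && h.y == y) return h;
        boolean goLeft = useX ? x < h.x : y < h.y;   // assumption: ties go right
        if (goLeft) h.left  = insert(h.left,  x, y, !useX);
        else        h.right = insert(h.right, x, y, !useX);
        return h;
    }

    // The k points closest to (qx, qy), in no particular order; all points if there are at most k.
    List<double[]> nearest(double qx, double qy, int k) {
        // Max-heap on squared distance: the head is the worst of the best candidates so far.
        // Each entry is {x, y, squared distance to the query}.
        PriorityQueue<double[]> best = new PriorityQueue<>((a, b) -> Double.compare(b[2], a[2]));
        if (k > 0) search(root, true, qx, qy, k, best);
        List<double[]> out = new ArrayList<>();
        for (double[] e : best) out.add(new double[] {e[0], e[1]});
        return out;
    }

    private void search(Node h, boolean useX, double qx, double qy, int k,
                        PriorityQueue<double[]> best) {
        if (h == null) return;
        double dx = qx - h.x, dy = qy - h.y;
        double d = dx * dx + dy * dy;
        if (best.size() < k) {
            best.add(new double[] {h.x, h.y, d});
        } else if (d < best.peek()[2]) {
            best.poll();                              // evict the current k-th best
            best.add(new double[] {h.x, h.y, d});
        }
        double delta = useX ? dx : dy;                // signed offset from the splitting line
        Node near = delta < 0 ? h.left : h.right;     // the side that contains the query
        Node far  = delta < 0 ? h.right : h.left;
        search(near, !useX, qx, qy, k, best);
        // Every point on the far side is at least |delta| away, so skip it when that
        // already exceeds the k-th best distance and the heap is full.
        if (best.size() < k || delta * delta < best.peek()[2])
            search(far, !useX, qx, qy, k, best);
    }
}
```

Searching the near side first tends to fill the heap with good candidates early, which makes the pruning test succeed more often on the far side.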

Challenge for the bored.  Make an interesting improvement to the boid simulator. This can involve interactivity, improvement of flocking behavior, the ability to zoom in or out, or creation of a better physics model (e.g., right now the hawk has a fixed amount of directional thrust). Josh is the only one who will provide support for this optional challenge. Particularly good submissions will be worth up to 2 bonus points (on top of the 1 point for extra credit).

KdTree competition.  Optionally, you may submit your code to the KdTree competition. Your program will be timed and the results displayed in this public leaderboard. You should submit all of your java files, as well as nickname.txt. Whatever you put in nickname.txt will be used as your name in the leaderboard. There is no official reward for doing well in the competition. You may submit any number of times. You are welcome to use this test as a guide for evaluating your program's performance.

Submission.  Submit only PointST.java and KdTreeST.java. Each of the two data types should include its own main() that thoroughly tests the associated operations. We will supply Point2D.java, RectHV.java, stdlib.jar, and algs4.jar. You may not call any library functions other than those in java.lang, java.util, stdlib.jar, and algs4.jar. Finally, submit a readme.txt file and answer the questions.

This assignment was developed by Kevin Wayne, with boid simulation by Josh Hug.