Programming Assignment Checklist: Traveling Salesperson Problem

Frequently Asked Questions

What are the main goals of this assignment? You should (i) learn about the notorious traveling salesperson problem, (ii) learn to use linked lists, and (iii) get more practice with data types.

What files do I need to write? Point and Tour. The Tour data type is the most interesting one because it involves using linked structures.

How do I represent infinity in Java? Use Double.POSITIVE_INFINITY.

How long should my programs take to execute? It should take less than a minute for the 13,509 city example (substantially less if you have a fast computer), unless you're animating the results, in which case it will take considerably longer. If your code takes much much longer, try to discover why (think analysis of algorithms), and explain it in your readme file.

What is the file Tour$Node.class? When you declare a nested class like Node, the Java compiler uses the $ symbol to mark its name.

What is a NullPointerException? You can get one by initializing a variable of type Node, say x, to null and then accessing x.next or x.p. This can happen in your insertion routine if you inadvertently break up the circular linked list.

When should I create a new linked list node with the keyword new? To create a tour with N elements, you should use new exactly N times with Node, once per invocation of insert. It is unnecessary (and bad style) to use new with your list traversal variables since this allocates memory that you never use.

How can I produce an animation of the heuristic in action? It's easy and instructive - just redraw the tour after each insertion. See the instructions in SmallestInsertion.java. It could take a while on a big input file, so you might want to modify it so that it redraws only after every 20 insertions or so.

Can I use Java's built in LinkedList class? Absolutely not! One of the main goals of this assignment is to gain experience writing and using linked structures. The Java libraries can only take you so far, and you will quickly discover applications which cannot be solved without devising your own linked structures. Stay tuned.

Is this the same type of approach used in MapQuest? Similar, but not quite the same. Roughly speaking, MapQuest's problem is to find a shortest path between two distinguished points in a network where only certain pairs of points are connected by edges (roads). This problem is much better understood than the TSP,and there are remarkably efficient algorithms for solving it.

For the extra credit, how fast should my solution be? Obviously it depends on the speed of your computer, but say around 5 minutes for usa13509.txt. If your program is substantially slower, consult a preceptor for suggestions on speeding it up.

Do I have to used a linked list for the extra credit? No.

Input, Output, and Testing

Input. There are many sample data files (with extension .txt) available in /u/cs126/files/tsp/. Most are taken from TSPLIB.

Debugging. A good debugging strategy for most programs is to test your code on inputs that you can easily solve by hand. Start with 1 and 2 city problems. Then, do the 4 city problem. Choose the data so that it is easy to work through the code by hand. Draw pictures. If your code does not do exactly what your hand calculations indicate, determine where they differ. Use the System.out.print method.

Compilation. Your programs must be named Tour.java and Point.java. Don't forget to hit the Run Script button on the submission system to test that it compiles cleanly.

Execution. Use the following command to execute the client NearestInsertion.java

java NearestInsertion < tsp10.txt

Or, if your program has alot of output, use

java NearestInsertion < tsp1000.txt | more

Checking your work. For usa13509.txt we get distances of 50538.33716519955 and 29710.412348094807 for nearest insertion and smallest insertion, respectively. For circuit1290.txt we get 25029.790452731024 (or 25091.6899) and 14596.097124575306. Note that there two possible correct answers with the nearest insertion heuristic because after inserting the first 3 points, there are two distinct tours that have the same length: A->B->C and A->C->B. Depending on which one your program chooses, you can end up with a different tour length (possibly better, possibly worse) down the road.

Timing. Use the client program TSPTimer.java to help you estimate the running time as a function of the input size N. It takes one command line input N, runs the two heuristics, and prints out how long each took.

Submission

Submission. Submit the following files: readme.txt, Tour.java, Point.java.

readme.txt. Use the following readme file template. You will lose points if you don't address these questions.

Possible Progress Steps

These are purely suggestions for how you might make progress. You do not have to follow these steps.

Download the directory tsp to your system. It contains sample data files and sample clients.
First, implement the Point data type. This will be very similar to the program Point.java from precept and the booksite Section 3.3. Be sure to acknowledge the source of the code. Test your Point data type by reading in a TSP data file and plotting the points using the client program PointTest.java.

Create a file Tour.java. Include the standard linked list data type Node. Include an instance variable, say first, of type Node that is a reference to the the "first" node of the circular linked list. Here's a template Tour.java file. For debugging purposes only, have the constructor create a small circular linked list of points.

public Tour() {
    Node a = new Node();
    Node b = new Node();
    Node c = new Node();
    Node d = new Node();
    a.p = new Point(100.0, 100.0);
    b.p = new Point(500.0, 100.0);
    c.p = new Point(500.0, 500.0);
    d.p = new Point(100.0, 500.0);
    a.next = b;
    b.next = c;
    c.next = d;
    d.next = a;
 
    first = a;
}

Implement the method show(). It should traverse each Node in the circular linked list, starting at tour, and print each Point using System.out.println. This method requires only a few lines of code, but it is important to think about it carefully, because debugging linked-list code is notoriously difficult and frustrating. Start by just printing out the first Point. With circular linked-lists the last node on the list points back to the first node, so watch out for infinite loops. Test your method by writing a main function that creates a new Tour object and calls its show method. If your function is working properly, you will get the following output for 4 city problem above.
```
(100.0, 100.0)
(500.0, 100.0)
(500.0, 500.0)
(100.0, 500.0)
```
Test your show method on inputs with 0, 1 and 2 points to check that it still works. You can create such instances by modifying the 4-node debugging constructor.
Implement the method distance. It is very similar to show, except that you will need to invoke the method distanceTo in the Point data type. The four point example has distance 1600.0.
Implement the method draw. It is also very similar to show, except that you will need to invoke the methods drawTo in the Point data type. The four point example above should produce a square.

If you've gotten to this point, you should feel a sense of accomplishment, as working with linked list is quite difficult at first!

For this crucial step, you must write a method to insert one point p into the tour. First, get rid of the debugging constructor (or you'll have 4 extra nodes in the tour). As a warmup, implement a simpler version named insertInorder to insert a point into the tour after the point that was added last. To do this, write a loop to find the last point, and insert it after it. It's ok (and expected) that you'll have a special case for the very first point ever inserted, but you shouldn't need any other special cases.
Implement insertNearest. To determine which node to insert the point p after, compute the Euclidean distance between each point in the tour and p by traversing the circular linked list. As you proceed, store the node containing the closest point and its distance to p. After you have found the closest node, create a node containing p, and insert it after the closest node. This involves changing the next field of both the newly created node and the closest node. As a check, the resulting tour for the 10 city problem has distance 1566.136. Note that the optimal tour has distance 1552.961 so this rule does not, in general, yield the best tour.
After doing the nearest insertion heuristc, you should be able write the method insertSmallest by yourself, without any hints. The only difference is that you want to insert the point p where it will result in the least possible increase in the total tour length. As a check, the resulting tour and has distance 1655.746. In the case, the smallest insertion heuristic actually does worse than the nearest insertion heuristic (although this is not typical).

Extra Credit

Possibilities for improvement are endless. Here are some ideas. If you are shooting for an A or A+, this is a good opportunity to impress us. However, be warned, this is a difficult extra credit. If you complete the extra credit, submit a program ExtraCredit.java along with any accompanying files. You need not reuse the Tour data type or linked lists, but you should exercise good modular design.

Enrichment

Here's a 13,509 city problem that contains each of the 13,509 cities in the continental US with population over 500. The optimal solution was discovered in 1998 by Applegate, Bixby, Chvatal and Cook using theoretical ideas from linear and integer programming. The following 15,112 city problem was solved to optimality in April, 2001, and is the current world record. It took 585,936,700 CPU seconds (along with alot of cleverness) to find the optimal tour through 15,112 cities in Germany.
Here's a nice pictorial survey of the history of the TSP problem.
Some folks even use the TSP to create and sell art. Check out Julian Lethbridge's page.

Kevin Wayne