3.5   Case Study:   Purple America


This section under major construction.

Layers of abstraction.

When specifying the set of values for a Java class, we can include data members which are either primitive types or reference types. This enables us to build objects from other objects, and build layers of abstractions. In literature, we have letters, words, sentences, paragraphs, chapters, and books. In music we have notes, chords, measures, phrases, parts (musical event for one performer), movements, symphonies. Each note encapsulates its pitch, dynamic, and duration. In chemistry we have atoms, molecules, compounds. In a poker game, we have cards, the deck, hands, the dealer, and players. Classic example = networking (http, tcp, ip, etc.) Use abstraction to understand and decompose complex systems.

Modules. Decompose problem into individual re-usable data types. Each data type has its own well-specified interface, which makes it suitable for use as a black-box component in more complex system. Composition: combine data types to build more complex data types. Intuition: Although commercial aircraft are sold by one company, no company manufactures all of the parts and components to build it. The airframe, engine, and avionics are all made by different companies. In order for them to interact properly, a system architect must completely spec out the interfaces of each part. Commodity PCs work the same way: video card, Ethernet adapter, mouse, monitor, operating system are typically made by different companies.

Modules in Java. In Section 3.2 we built user-defined types from primitive types, strings, and arrays. In this section we will also build user-defined data types from other user-defined data types. Offers many advantages for building software systems.

Computational geometry.

We create a number of data types for geometric objects in the plane. Although we will only introduce a tiny set of computational geometry primitives, they arise in numerous applications, including computer graphics (movies, games, virtual reality), computer vision, data mining, VLSI design, mathematical models of the physical world (maps, architecture, medical imaging), astronomical simulations, scientific computing, and and geographic information systems.

Purple America.

(as created and popularized by Robert J. Vanderbei)   We compose a program to read in countywide election data and visualize the results.

Program InteractiveElectionMap.java adds user input - when the user clicks a region, the program outputs the name of the region and the vote tallies for that region.

A card game.

We will create a card game application that shuffles and deals out a deck of 52 cards to four players, as in bridge or hearts. We divide up our problem by creating objects for each of the main components in our card game. The physical game has playing cards (Queen of spades, four of clubs, etc), a deck (of 52 cards), and four players (each of which is dealt 13 cards). We build up our program in small modules by creating data types for each of these entities.

Program Card.java implements a data type for playing cards. Each object represents one of the 52 cards in a deck and supports the following operations: toString, less, and draw. The less method compares two cards based on their suit, then on their rank. It is useful to sort a hand of cards by suit, then by rank, as you would do in the card games bridge or hearts. We use StdDraw.java to display graphical representations of the playing cards. The archive file cards.jar contains a library of GIF images of playing cards. (As per the GPL license, here is the original source of the images). The images named 0.gif, 1.gif, through 51.gif are GIF images of the two of clubs, three of clubs, through the Ace of spades. The file back.gif represents the reverse side of the playing card. To use the library, put the file in your current directory, compile as usual, and execute with

java -classpath .:cards.jar Card.java

On Windows, replace the colon with a semicolon.

BlackJack


The data type Deck.java is a data type to represent a deck of 52 playing cards. The constructor initializes a new deck of 52 playing cards. The method shuffle rearrange the cards in random order using the algorithm from Section 2.5. The method dealFrom deletes one card from the deck and returns it. The method isEmpty is useful to check if the deck is empty. As an example, if we execute the following code

Deck deck = new Deck();
System.out.println(deck);
deck.shuffle();
System.out.println(deck);
System.out.println(deck.dealFrom());
System.out.println(deck);

then we might get the following output

Deck  2C 3C 4C 5C 6C 7C 8C 9C ... JS QS KS AS 
Deck  4H 7H JH 9D 5D JD KC 3D ... 5H TC 7S 2S 
4H
Deck  7H JH 9D 5D JD KC 3D ... 5H TC 7S 2S

The data type Player.java is a data type for representing a player in a card game of bridge or hearts. A player consists of a name, a location (for drawing), and a hand of cards. The method dealTo takes a Card as an argument and inserts that card into the player's hand. The internal method sort() is used to sort the cards by suit and rank, before converting to a string representation for use with toString or plotting using standard draw.

The data type Game.java represents the beginning of a card game application. It creates a deck of cards and four players (North, East, South, and West). It shuffles the 52 cards and deals them out to the four players. The cards for each player are printed to standard output

North  4C 6C 7C TC 5D 5H 8H 3S 4S 5S 8S TS KS
East   3C 8C QC 3D 4D 8D TD JD TH QH AH 2S QS
South  5C JC KC AC QD KD 2H 3H 4H 7H JH KH AS
West   2C 9C 2D 6D 7D 9D AD 6H 9H 6S 7S 9S JS

and then plotting using standard draw. Note: picture below does not correspond to hands above, and it is sorted in "lefty" order.

Bridge Hands

Here's how you might do it Java 1.5 with enum


Q + A

Q. If I have a large project, do I need to javac each .java file separately?

A. Not necessarily. If A.java depends on B.java and B.java depends on C.java and all three are out of date (date of .java after date of .class file) , then compiling A.java will automatically recompile B.java and C.java. On the other hand, if A.java and C.java are out of date, but not B.java, then C.java will not get automatically recompiled when you compile A.java.

Exercises

  1. Write a program that generates N circles at random in the unit box of diameter 0.05, draws them to the screen, and highlights all those pairs that overlap.
  2. What does the following code fragment print out?
    public void mystery(Point p, Point q) {
       p.x = 17;
       p.y = 17;
       Point temp = p;
       p = q;
       q = p;
    }
    
    public static void main(String [] args) {
       Point a = new Point(0, 0);
       Point b = new Point(0, 0);
    
       System.out.println("a = " + a);
       System.out.println("b = " + b);
       System.out.println();
    
       mystery(a, b);
    
       System.out.println("a = " + a);
       System.out.println("b = " + b);
       System.out.println();
    }
    
    Answer: it makes a reference the point (17, 17), but does not swap a and b.
  3. Create a data type LineSegment that represents a line segment in the plane. Represent a line segment using two Point instance variables.
  4. Add a method centroid() to Polygon that returns the centroid as a Point. The centroid (or center of mass) of a non self-intersecting polygon is given by the point (x, y) where x and y are defined by

    Centroid of a polygon
  5. Create a data type Line that represents a line in the plane. Represent a line as a Point and a slope.
  6. Write a method distanceTo() that computes the shortest (Euclidean) distance from a point (x, y) to a Line from (x1, y1) to (x2, y2). The formula is

    Distance from a point to a line

    Note that the denominator is the distance between the two points on the line so you should reuse the distanceTo() method in Point. The numerator is signed, so this returns the signed distance - you may wish to take absolute values.

  7. Write a method intersects() so that s1.intersects(s2) returns true if the two line segments intersect and false otherwise. To determine if two line segments a-b and c-d intersect, compute

    Intersection of two line segments

    If 0 ≤ r ≤ 1 and 0 ≤ s &le 1, then the line segments intersect. Moreover, the point of intersection is (x, y) = (ax + r(bx-ax), ay + r(by-ay)). If the denominator in equation 1 is zero, a-b and c-d are parallel. If the numerator in equation 1 is also zero, then a-b and c-d are collinear.

  8. Compute the area of Texas using the method in Polygon. Try to convert to square miles - the coordinates of the Texas polygon in USA.txt are given as latitude / longitude pairs.
  9. Add a method isSelfIntersecting() to Polygon that returns true if the polygon intersects itself, and false otherwise. Use the line segment intersection routine from the previous exercise and test all N choose 2 line segments.
  10. Create a static method ccw() that takes three Point inputs a, b, and c and returns +1 if a-b-c is a counterclockwise angle, -1 if a-b-c is a clockwise angle, and 0 if a-b-c are collinear. Use the following determinant formula which gives twice the (signed) area of the triangle a-b-c. If the area is positive, then a-b-c is counterclockwise; if the area is negative, then a-b-c is counterclockwise; if the area is zero then a-b-c are collinear.

    ccw test
  11. Interval. Create a data type Interval.java that represents a closed interval [a, b] on the real line. Include methods for returning the length of the interval, checking whether two intervals intersect, returning a new Interval that is the intersection of two intervals (largest interval contained in both intervals), and returning a new Interval that is the union of two intervals (smallest interval that contains both intervals).
  12. Bounding box (2D interval). Create a data type BoundingBox.java that represents a bounding box. A bounding box is an axis-aligned rectangle. In addition to a constructor, it should have a method union() so that a.union(b) initializes returns a new bounding box that is the smallest bounding box that contains both a and b. Add methods to determine whether a Point is inside a bounding box or whether two bounding boxes intersect, area, perimeter. Include a constructor should take an array of points, and return the smallest bounding box containing each point.

    Hint: implement it using two Interval instance variables, one for the x-axis and one for the y-axis.

  13. Create a data type Triangle that represents a triangle. It should have a constructor which take references to three Point objects as inputs and create a triangle with those endpoints. It should have methods area(), perimeter(), isIsosceles(), isRight(), isEquilateral(), and isScalene().
  14. Write a client program Pi.java that estimates the mathematical constant π using Monte Carlo simulation. Create a circle centered at (1/2, 1/2) of radius 1/2. Then repeatedly generate random points in the unit square and check if they are inside the circle. Print out the fraction of points that fall inside the circle. The true answer estimates π / 4 since the ratio of the area of the circle to the area of the square is π / 4.
  15. Modify Card.java to add a more user-friendly constructor so the following code works
    Card c = new Card(Card.KING, Card.HEARTS);
    

    Add classwide variables to Card to achieve this effect.

  16. Write a program to print the 52 cards in sorted order and then again in shuffled order, as in the picture below. Add a helper method draw to Deck.java.

    Sorted and Shuffled Deck of Cards


Creative Exercises

  1. Bullseye. Create several concentric circles and associate a value with each circle. Generate a point at random (using some distribution). The number of points you score is the value of the smallest enclosing circle.
  2. Random sphere packing. Generate random disks of diameter d in the unit square and repeatedly add each new one as long as it doesn't intersect an existing disk. Stop trying when 100 attempts in-a-row fail. How many disks on average are placed? Use the Circle.java data type.
  3. 2-point correlation function. Given N points and a parameter r, the 2-point correlation function is the number of pairs of points that lie within distance r of each other. The 2-point correlation function roughly measures the clumpiness of a set of points. It is a fundamentally important statistic in astrophysics. Write a program that takes a command line parameter N, reads in N points, and calculates the number of points within a radius of r of each other for various values of r.
  4. Traveling salesperson problem. Write a program that takes a command line input N, generates N random Point objects in the unit square, a finds the shortest tour that visits each point exactly once. Base your algorithm on Program Permutations.java which generates all N! permutations on N elements. What is the biggest value of N for which your program finds an answer is less than 5 minutes?
  5. Rational roots. If a polynomial with integer coefficients has a rational root p/q (reduced) and a0 != 0, then p is a divisor of a0 and q is a divisor of a_n. Can enumerate all possibilities by factoring.
  6. Bridge. One of the best ways to determine the strength of a bridge hand is to count up the number of high cards. The Milton Work Point Count assigns a numerical score to each hand by adding 4 points for each Ace, 3 for each King, 2 for each Queen, and 1 for each Jack. Deal out a million hands a compute a histogram of the number of points.

    Answer: Here is the exact distribution of each of the high card point counts. Use the helper data type Histogram.java, which relies on Draw.java.

  7. Bridge. It is advantageous in bridge to have suits with a large or small number of cards. Modify the program in the previous exercise to assign a bonus of 2 for each suit in which the hand has no cards of the given suit (void) and a bonus of 1 for each suit in which the hand has only a single card (singleton). Name your program BridgeExperiment.java. Use Histogram.java to create an animated histogram of the results.
  8. Crazy 8s. In the card game Crazy 8s, each player typically sorts their cards according to rank (instead of suit as in Bridge). Modify the less method in Card so that the sort method in Player sorts in the appropriate order.
    public boolean less(Card other) { return rank < other.rank; }
    
  9. I Doubt It. I Doubt It is a multiplayer card game in which it is convenient for each player to sort their cards in an interesting order...
  10. Concentration. Implement the card game concentration.
  11. Texas Hold 'Em. Each player is dealt two cards. The two players share five community cards and try to make the best hand from the 7 cards. Given the hole cards of two players, determine the probability that each will win after the community cards are revealed.
  12. Texas Hold' Em intransitivity. Find two card hands so that A beats B, B beats C, and C beats A, where "beats" means that you have a higher probability of winning when the community cards are revealed.

    Solution: AS TD, QH JH, 2C 2S. AS TD beats QH JH roughly 57%, QH JH beats 2C 2S roughly 53%, 2C 2S beats AS TD roughly 50%.

  13. k-means clustering. Cluster objects (in a vector space) into k partitions.
    • Partition objects at random into k sets.
    • Define the k centers as the centroids of each set.
    • Assign each object to its nearest center.
    • Repeat the last two steps until the process converges (the assignment doesn't change).
    Fast convergence in practice. Convergence is guaranteed in theory, but can be exponentially slow. No guarantee of global optimum. Can repeat several times.

    Here's a visual application for clustering colors in an image. Cluster pixels in an image according to k colors, e.g., to reduce the number of colors in a picture from thousands to 32.

  14. Zip code lookup. Given a zip code, identify it on a map. The file zips2000.csv from the Tiger database contains a list of US zipcodes followed by the longitude and latitude of their centroid.
  15. Albers equal-area conic projection. The Albers equal-area conic projection maps latitude and longitude (φ, λ) to Catesian coordinates (x, y). It gives it the curvy surface we are accustomed to seeing. Used by the U.S. Geological Survey.

    This code is slightly adapted from Visualizing Data by Ben Fry.

    double phi0    = 0.0;
    double lambda0 = Math.toRadians(-96.0);   // central meridan = 96 W
    double phi1    = Math.toRadians( 29.5);   // standard parallel 1 = 29.5 N
    double phi2    = Math.toRadians( 45.5);   // standard parallel 2 = 45.5 N
    double phi     = Math.toRadians(latitude);
    double lambda  = Math.toRadians(longitude);
    double n       = 0.5 * (Math.sin(phi1) + Math.sin(phi2));
    double theta   = n * (lambda - lambda0);
    double C       = (Math.cos(phi1) * Math.cos(phi1)) + (2 * n * Math.sin(phi1));
    double rho     = Math.sqrt(C - 2 * n * Math.sin(phi))  / n;
    double rho0    = Math.sqrt(C - 2 * n * Math.sin(phi0)) / n;
    double x       = rho * Math.sin(theta);
    double y       = rho0 - rho*Math.cos(theta);
    
  16. Azimuthal projection. Instead of plotting (latitude, longitude), use some kind of map projection, either conical or azimuthal (cylindrical is bad for large countries like US that have long east-west extent).

    Recommended: Lambert conformal conical with center meridian at 95°W. Two parallels 33°N and 45°N.

    Let LAT1 and LAT2 be two latitudes and let CLON be longitude. Compute cone constant (between 0 and 1). For US (with LAT1 = 33 and LAT2 = 45), the cone constant is .6304776973.

        CONE = COS(90 - LAT1)
    
               LOG (COS(LAT1)) - LOG (COS(LAT2))
        CONE = -------------------------------------------------  // two latitudes
               LOG (TAN (45-LAT1/2)) - LOG (TAN (45-LAT2/2))
    
    To project (RLAT, RLON), compute the formulas
        R = (TAN(45 - RLAT/2))**CONE
    
        U =  R * SIN(CONE*(RLON-CLON))
        V = -R * COS(CONE*(RLON-CLON))
    
    reference.
  17. Random point in a polygon. Given a simple polygon, design a method to choose a random point in its interior. Add a method to the polygon ADT that returns a bounding box. Then use the rejection method to find a point.
  18. Dot plots. Instead of plotting the color of a polygon using a shade of purple, plot red, blue, and green dots in the region at random. Make each dot represent x hundred votes.
  19. Poorest counties. Plot the 100 poorest counties in the US in blue on the county map and the 100 richest counties in the US in red. (Ignore counties in Alaska or Hawaii.)
  20. Demographics. Use the county maps to create data visualizations for other demographics.