**Algorithm Development.**
Developing a good algorithm is an iterative process. We create a model of the problem, develop
an algorithm, and revise the algorithm until its performance meets our needs.

**Union-Find.**
The ultimate goal is to develop a data type that supports the following operations
on a fixed number *N* of objects:

- `union(int p, int q)`
- `connected(int p, int q)`
- `find(int p)`

The `find()` method is defined so that `find(p) == find(q)`
iff `connected(p, q)`.

**Key observation: connectedness is an equivalence relation.**
Saying that two objects are connected is the same as saying they are in the same equivalence class.
This is just fancy math talk for saying "every object is in exactly one bucket, and we want to
know if two objects are in the same bucket". When you union two objects, you're basically just
pouring everything from one bucket into another.

**Quick find.**
This is the most natural solution, where each object is given an explicit bucket number. It uses an
array `id[]` of length *N*, where `id[i]` is the bucket number of object `i` (which is
returned by `find(i)`). To union two objects `p` and `q`, we set every object
in `p`'s bucket to have `q`'s number.

- *Union*: May require many changes to `id[]`. Takes *N* time in the worst case (to union large sets).
- *Connected* (and *find*): Takes constant time.
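The quick-find description above can be sketched in Java roughly as follows. This is a minimal illustration, not the booksite code: the class name `QuickFindUF` and the constructor argument `n` are assumptions made here, and input validation is omitted.

```java
// Quick-find sketch: id[i] holds the bucket number of object i.
// The class name and constructor are illustrative assumptions.
class QuickFindUF {
    private final int[] id;

    QuickFindUF(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;  // each object starts in its own bucket
        }
    }

    // Constant time: just read the bucket number.
    int find(int p) {
        return id[p];
    }

    boolean connected(int p, int q) {
        return find(p) == find(q);
    }

    // Linear time: every object in p's bucket gets q's bucket number.
    void union(int p, int q) {
        int pid = id[p];  // save first, since id[p] changes during the loop
        int qid = id[q];
        if (pid == qid) return;
        for (int i = 0; i < id.length; i++) {
            if (id[i] == pid) id[i] = qid;
        }
    }
}
```

Saving `id[p]` in a local variable before the loop matters: comparing against `id[p]` directly would stop matching as soon as `p` itself is relabeled.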

**Quadratic algorithms don't scale.**
Given an *N* times larger problem on an *N* times faster computer,
a quadratic algorithm takes *N* times as long to run.
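The arithmetic behind this claim can be written out as a short sketch. It assumes the running time has the form $c\,n^2/s$ for problem size $n$ and machine speed $s$. The symbols $c$, $n$, $s$ and the scale factor $k$ are introduced here for illustration, with $k$ playing the role of the "*N* times" factor in the text.

$$
T = \frac{c\,n^2}{s}
\qquad\Longrightarrow\qquad
T' = \frac{c\,(k n)^2}{k\,s} = k \cdot \frac{c\,n^2}{s} = k\,T
$$

Under the same assumptions, a linear algorithm would give $T' = c\,(k n)/(k\,s) = T$, so its running time would be unchanged.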

**Quick union.**
`id[i]` is the parent object of object `i`.
An object can be its own parent. The `find()` method climbs the ladder of
parents until it reaches the root (an object whose parent is itself).
To union `p` and `q`, we set the root of `p` to point to the root of `q`.

- *Union*: Requires only one change to `id[]`, but also requires root finding (worst case *N* time).
- *Connected* (and *find*): Requires root finding (worst case *N* time).
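A rough Java sketch of quick union, following the description above. The class name `QuickUnionUF` is an assumption for illustration, and the code does no bounds checking.

```java
// Quick-union sketch: id[i] is the parent of object i; a root is its own parent.
// The class name is an illustrative assumption.
class QuickUnionUF {
    private final int[] id;

    QuickUnionUF(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;  // every object starts as the root of its own tree
        }
    }

    // Climb parent links until reaching a root.
    int find(int p) {
        while (p != id[p]) {
            p = id[p];
        }
        return p;
    }

    boolean connected(int p, int q) {
        return find(p) == find(q);
    }

    // One array write, after two root searches.
    void union(int p, int q) {
        int rootP = find(p);
        int rootQ = find(q);
        if (rootP == rootQ) return;
        id[rootP] = rootQ;  // root of p now points to root of q
    }
}
```

Because `union` always hangs `p`'s root under `q`'s root regardless of tree shape, a sequence such as `union(0, 1)`, `union(1, 2)`, `union(2, 3)` builds a chain, which is where the worst case of *N* time for root finding comes from.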

**Weighted quick union.**
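Rather than `union(p, q)` always making the root of `p` point to the root of `q`,
we instead make the root of the smaller tree point to the root of the larger one.
A tree's *size* is its *number* of nodes, not its height.
This results in tree heights of at most lg *N*: a node's depth grows by one only when its tree is
linked under a tree at least as large, so the size of the tree containing it at least doubles each
time, and that can happen at most lg *N* times (you should understand this proof).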

- *Union*: Requires only one change to `id[]`, but also requires root finding (worst case log *N* time).
- *Connected* (and *find*): Requires root finding (worst case log *N* time).
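The weighting rule can be sketched in Java as below. The class name `WeightedQuickUnionUF` and the auxiliary `size[]` array are assumptions made for this illustration. When the two trees have equal size, this sketch links `q`'s root under `p`'s root, which is an arbitrary choice.

```java
// Weighted quick-union sketch: link the root of the smaller tree under the larger.
// The class name and the size[] array are illustrative assumptions.
class WeightedQuickUnionUF {
    private final int[] id;    // id[i] = parent of i
    private final int[] size;  // size[r] = number of nodes in the tree rooted at r

    WeightedQuickUnionUF(int n) {
        id = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;
            size[i] = 1;  // every tree starts with a single node
        }
    }

    int find(int p) {
        while (p != id[p]) {
            p = id[p];
        }
        return p;
    }

    boolean connected(int p, int q) {
        return find(p) == find(q);
    }

    void union(int p, int q) {
        int rootP = find(p);
        int rootQ = find(q);
        if (rootP == rootQ) return;
        // Compare node counts (not heights) and keep the larger tree's root on top.
        if (size[rootP] < size[rootQ]) {
            id[rootP] = rootQ;
            size[rootQ] += size[rootP];
        } else {
            id[rootQ] = rootP;
            size[rootP] += size[rootQ];
        }
    }
}
```

Only entries of `size[]` at roots are meaningful. Entries for non-root nodes go stale after a link and are never read again.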

**Weighted quick union with path compression.**
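When `find()` is called, the nodes along the path it examines are relinked to point at (or closer
to) the root, which results in nearly flat trees.
Making *M* calls to union and find with *N* objects results in no more
than *M* log\*(*N*) array accesses, where log\* is the iterated logarithm (the number of times lg
must be applied before the result drops to 1 or below). For any conceivable value of *N*
in this universe, log\*(*N*) is at most 5.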
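One possible way to add path compression is sketched below, using a two-pass `find()` that first locates the root and then points every node on the path directly at it. The class name `CompressedUF` is hypothetical. Other variants, such as pointing each node at its grandparent in a single pass, are equally valid, so treat this as one illustration rather than the reference implementation.

```java
// Weighted quick-union with two-pass path compression (one variant among several).
// The class name and the size[] array are illustrative assumptions.
class CompressedUF {
    private final int[] id;    // id[i] = parent of i
    private final int[] size;  // size[r] = number of nodes in the tree rooted at r

    CompressedUF(int n) {
        id = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;
            size[i] = 1;
        }
    }

    int find(int p) {
        // Pass 1: climb to the root.
        int root = p;
        while (root != id[root]) {
            root = id[root];
        }
        // Pass 2: point every node on the path directly at the root.
        while (p != root) {
            int next = id[p];
            id[p] = root;
            p = next;
        }
        return root;
    }

    boolean connected(int p, int q) {
        return find(p) == find(q);
    }

    void union(int p, int q) {
        int rootP = find(p);
        int rootQ = find(q);
        if (rootP == rootQ) return;
        if (size[rootP] < size[rootQ]) {
            id[rootP] = rootQ;
            size[rootQ] += size[rootP];
        } else {
            id[rootQ] = rootP;
            size[rootP] += size[rootQ];
        }
    }
}
```

Compression never changes which root a node belongs to, so `connected()` gives the same answers as in the uncompressed version; only the lengths of later root searches change.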

- What are the best-case and worst-case tree heights for weighted quick-union and weighted quick-union with path compression? Give your answers in terms of order of growth.
- Textbook: 1.5.1, 1.5.2, 1.5.3

- Fall 11 Midterm, #1
- Fall 12 Midterm, #1
- Textbook: 1.5.8
- Textbook: 1.5.9

- Textbook: 1.5.10
- If we're concerned about tree height, why don't we use height for deciding tree size instead of weight? What is the worst-case tree height for weighted-quick-union vs. heighted-quick-union? What is the average tree height?
- Try writing weighted-quick-union-with-path-compression without looking at the code on the booksite. You may look at the API. Compare your resulting code with the booksite code.