Math 176 - Data Structures
Programming Assignment #3
Connected Components (Union-Find)
Due date: Friday, November 10 at 6:00 PM.
Total points for this assignment: 75 points.
In this homework assignment, you will write routines for keeping track of the connected components of an undirected graph with online (real-time, interactive) methods. This is essentially the same as a union-find algorithm, using union by size and path compression. One new feature of the connected components routines you will write is that you have to keep track of the minimum numbered node in each connected component.
You will write a class ConnectedComponents whose functionality exactly matches the routines described in the online documentation for ConnectedComponents.
This includes the following routines:
addEdge(i,j) - adds an edge from vertex number i to vertex number j. The connected component information must be updated accordingly. If the two vertices were not already in the same connected component, then this routine must return the number, N, of the smallest numbered vertex in the new enlarged connected component. It returns the negative number -(N+1) if the two vertices were already in the same connected component before the new edge was added.
minConnected(i) - returns the number of the smallest numbered vertex that is in the same connected component as vertex number i.
areConnected(i,j) - returns true if vertices i and j are in the same connected component of the graph.
In addition, your routines must keep statistics on how many pointers are traversed in the routines minConnected, areConnected and addEdge.
For example, in minConnected, if vertex i is the root vertex of the tree storing the elements of its connected component, then zero pointers are traversed. On the other hand, if it is not the root vertex in its connected component, then at least one pointer will be traversed. The routine
getNumPointerTraversals() - returns the total number of pointer traversals.
By the inverse Ackermann upper bound, you should expect that only a very small number of extra pointers need to be traversed per operation. I will provide you with a main program, called CcStatistics.java, that runs trials for you to gather statistics about numbers of pointer traversals. The main program will allow you to gather the following kinds of statistics:
Gather statistics for several values of N. Use N=100, 1000, 10000, 100000, 1000000 (if your computer cannot go as high as one million, use a large value for N close to the maximum attainable). As usual, remember to increase the heap size with the -X command line options.
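To make the routine descriptions and the pointer-counting rule concrete, here is a minimal sketch of union-find with union by size, path compression, and a per-root minimum. It is not the required ConnectedComponents class: the class name UnionFindSketch, the constructor taking the number of vertices, the long return type of getNumPointerTraversals, and the choice to count only the parent pointers followed while walking up to the root (not the path-compression pass) are all assumptions made for illustration. Check the online documentation for the actual signatures and counting rule.

```java
// Hypothetical sketch; names and counting convention are assumptions.
public class UnionFindSketch {
    private final int[] parent; // parent[v] == v means v is a root
    private final int[] size;   // component size, meaningful only at roots
    private final int[] min;    // smallest vertex in component, meaningful only at roots
    private long traversals = 0;

    // Assumed constructor: vertices are numbered 0 .. n-1, each in its own component.
    public UnionFindSketch(int n) {
        parent = new int[n];
        size = new int[n];
        min = new int[n];
        for (int v = 0; v < n; v++) {
            parent[v] = v;
            size[v] = 1;
            min[v] = v;
        }
    }

    // Walk up to the root, counting one traversal per parent pointer followed,
    // then compress the path so every visited node points directly at the root.
    private int find(int v) {
        int root = v;
        while (parent[root] != root) {
            root = parent[root];
            traversals++;
        }
        while (parent[v] != root) {
            int next = parent[v];
            parent[v] = root;
            v = next;
        }
        return root;
    }

    public int addEdge(int i, int j) {
        int ri = find(i);
        int rj = find(j);
        if (ri == rj) {
            return -(min[ri] + 1); // already connected
        }
        if (size[ri] < size[rj]) { // union by size: smaller tree goes under larger
            int tmp = ri;
            ri = rj;
            rj = tmp;
        }
        parent[rj] = ri;
        size[ri] += size[rj];
        min[ri] = Math.min(min[ri], min[rj]);
        return min[ri];
    }

    public int minConnected(int i) {
        return min[find(i)];
    }

    public boolean areConnected(int i, int j) {
        return find(i) == find(j);
    }

    public long getNumPointerTraversals() {
        return traversals;
    }
}
```

Storing the minimum only at the root keeps each union to a constant amount of extra work, since merging two components only requires comparing the two stored minima.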
Please note that this main program will not test the accuracy of your code. It will only gather statistics. You are responsible for ensuring the accuracy of your code with your own test programs.
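One possible way to build your own test program is to compare your class against a deliberately slow but obviously correct reference. The sketch below is such a reference, under the same assumptions as before (vertices numbered from 0, a constructor taking the vertex count); the class name NaiveComponents is hypothetical and is not part of the provided materials. It labels every vertex with the smallest vertex in its component and relabels a whole component on each merge.

```java
// Hypothetical slow reference for cross-checking; not part of the assignment files.
public class NaiveComponents {
    // label[v] is the smallest numbered vertex in v's component.
    private final int[] label;

    // Assumed constructor: vertices are numbered 0 .. n-1.
    public NaiveComponents(int n) {
        label = new int[n];
        for (int v = 0; v < n; v++) {
            label[v] = v;
        }
    }

    // Same return convention as the assignment's addEdge:
    // N if two components were merged, -(N+1) if already connected.
    public int addEdge(int i, int j) {
        int a = label[i];
        int b = label[j];
        if (a == b) {
            return -(a + 1);
        }
        int lo = Math.min(a, b);
        int hi = Math.max(a, b);
        // Relabel every vertex of the component with the larger label.
        for (int v = 0; v < label.length; v++) {
            if (label[v] == hi) {
                label[v] = lo;
            }
        }
        return lo;
    }

    public int minConnected(int i) {
        return label[i];
    }

    public boolean areConnected(int i, int j) {
        return label[i] == label[j];
    }
}
```

A test program could then apply the same sequence of random addEdge, minConnected and areConnected calls to both this reference and your ConnectedComponents object for a small N, and report the first call where the results differ.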
Source Materials: You should look for the HTML documentation on CcStatistics.java and ConnectedComponents.java.
You will find the source code for CcStatistics.java and for HashSetLinear.java in the directory ProgHomework3 on ieng9. You should get a copy of the source code files and compile them yourself. The version of HashSetLinear is a special one that includes a routine getRandomElement, which is used by the CcStatistics program.
There is no full tester for the ConnectedComponents class. However, some kinds of errors in ConnectedComponents will cause errors to occur in CcStatistics, so this will give you a partial test of your program.
Turnin materials: You must turn in the following:
This programming assignment is covered by the usual Academic Integrity Guidelines for programming assignments.