Math 176 - Data Structures
Programming Assignment #3
Connected Components (Union-Find)
Due date: Friday, November 10 at 6:00 PM.
Total points for this assignment: 75 points.
In this homework assignment, you will write routines for keeping track of the connected components of an undirected graph with online (real-time, interactive) methods. This is essentially the same as a union-find algorithm, using union by size and path compression. One new feature of the connected components routines you will write is that you have to keep track of the minimum-numbered node in each connected component.
You will write a class ConnectedComponents whose functionality exactly matches the routines documented here: online
This includes the following routines:
addEdge(i,j)- adds an edge from vertex number i to vertex number j. The connected component information must be updated accordingly. If the two vertices were not already in the same connected component, then this routine must return the number, N, of the smallest-numbered vertex in the new enlarged connected component. It returns the negative number -(N+1) if the two vertices were already in the same connected component before the new edge was added.
minConnected(i)- returns the number of the smallest-numbered vertex that is in the same connected component as vertex number i.
areConnected(i,j)- returns true if vertices i and j are in the same connected component of the graph.
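The three routines above can be sketched as follows. This is a minimal illustration, not the documented ConnectedComponents API: the class name UnionFindSketch, the constructor taking the number of vertices, the array names, and the assumption that vertices are numbered from 0 are all hypothetical. It uses union by size and path compression, and stores the smallest vertex number of each component at the root of its tree.

```java
// Illustrative sketch only. The class name, constructor, and fields are
// assumptions; they are not taken from the assignment's documentation.
class UnionFindSketch {
    private final int[] parent; // parent[v] == v means v is a root
    private final int[] size;   // size[r] = number of vertices in root r's tree
    private final int[] min;    // min[r] = smallest vertex number in root r's tree

    UnionFindSketch(int n) {
        parent = new int[n];
        size = new int[n];
        min = new int[n];
        for (int v = 0; v < n; v++) {
            parent[v] = v;
            size[v] = 1;
            min[v] = v;
        }
    }

    // Find the root of v, then point every vertex on the path at the root.
    private int find(int v) {
        int root = v;
        while (parent[root] != root) {
            root = parent[root];
        }
        while (parent[v] != root) {
            int next = parent[v];
            parent[v] = root;
            v = next;
        }
        return root;
    }

    // Returns N if two components were merged, or -(N+1) if i and j were
    // already connected, where N is the smallest vertex in the component.
    int addEdge(int i, int j) {
        int ri = find(i);
        int rj = find(j);
        if (ri == rj) {
            return -(min[ri] + 1);
        }
        if (size[ri] < size[rj]) { // union by size: ri becomes the larger tree
            int t = ri;
            ri = rj;
            rj = t;
        }
        parent[rj] = ri;
        size[ri] += size[rj];
        min[ri] = Math.min(min[ri], min[rj]);
        return min[ri];
    }

    int minConnected(int i) {
        return min[find(i)];
    }

    boolean areConnected(int i, int j) {
        return find(i) == find(j);
    }
}
```

If vertices are numbered from 0, the -(N+1) encoding keeps the "already connected" result strictly negative even when N is 0, so a caller can recover N as -(result+1).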
In addition, your routines must keep statistics on how many pointers are traversed. For example, in minConnected, if vertex i is the root vertex of the tree storing the elements of vertex i's connected component, then zero pointers are traversed. On the other hand, if it is not the root vertex in its connected component, then at least one pointer will be traversed.
getNumPointerTraversals()- returns the total number of pointer traversals.
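One way to keep this count is to increment a counter each time a parent pointer is followed while searching for the root. The sketch below is an assumption about how the counting might be done, not the required method: the class name CountingFindSketch, the constructor taking a ready-made parent array, and the long return type are hypothetical, and the choice not to count the pointer rewrites made during path compression is a guess that should be checked against the online documentation.

```java
// Illustrative sketch only. Names and the exact counting rule are assumptions.
class CountingFindSketch {
    private final int[] parent;         // parent[v] == v means v is a root
    private long pointerTraversals = 0; // parent pointers followed so far

    CountingFindSketch(int[] initialParents) {
        this.parent = initialParents.clone();
    }

    int find(int v) {
        int root = v;
        // Each step up the tree follows one parent pointer.
        // A root follows none, so it adds zero to the count.
        while (parent[root] != root) {
            root = parent[root];
            pointerTraversals++;
        }
        // Path compression pass; not counted in this sketch (an assumption).
        while (parent[v] != root) {
            int next = parent[v];
            parent[v] = root;
            v = next;
        }
        return root;
    }

    long getNumPointerTraversals() {
        return pointerTraversals;
    }
}
```

With the chain 3 -> 2 -> 1 -> 0, the first find(3) follows three pointers. After compression every vertex on that path points directly at 0, so a second find(3) follows only one.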
By the inverse Ackermann upper bound, you should expect that only a very small number of extra pointers need to be traversed per operation. I will provide you with a main program, called CcStatistics.java, that runs trials for you to gather statistics about numbers of pointer traversals. The main program will allow you to gather the following kinds of statistics:
Gather statistics for several values of N. Use N=100, 1000, 10000, 100000, 1000000 (if your computer cannot go as high as one million, use a large value for N close to the maximum attainable). As usual, remember to increase the heap size with the -X command line options.
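To judge whether the heap is large enough for a given N, it can help to print the maximum heap size the JVM is running with. The sketch below is only a rough aid: the class name HeapCheckSketch is hypothetical, and the estimate assumes three int arrays of length N, which is one possible layout and ignores whatever memory CcStatistics and HashSetLinear need.

```java
// Illustrative sketch only. The three-int-array layout is an assumption.
class HeapCheckSketch {
    // Rough lower bound on array storage: three int arrays, 4 bytes per int.
    static long estimatedArrayBytes(long n) {
        return 3L * 4L * n;
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        // maxMemory() reports the most memory the JVM will attempt to use.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap (bytes): " + maxHeap);
        System.out.println("Estimated array storage for N=" + n
                + " (bytes): " + estimatedArrayBytes(n));
    }
}
```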
Please note that this main program will not test the accuracy of your code. It will only gather statistics. You are responsible for ensuring the accuracy of your code with your own test programs.
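One possible approach to writing your own test program is to compare your class against a slow but obviously correct reference on the same sequence of edges. The sketch below is such a reference under stated assumptions: the class name NaiveComponentsSketch and its constructor are hypothetical, vertices are assumed to be numbered from 0, and it deliberately uses a simple relabeling scheme rather than union-find.

```java
// Illustrative sketch only. A slow reference for checking results; it is
// not union-find and is not part of the provided source code.
class NaiveComponentsSketch {
    private final int[] label; // label[v] = smallest vertex in v's component

    NaiveComponentsSketch(int n) {
        label = new int[n];
        for (int v = 0; v < n; v++) {
            label[v] = v;
        }
    }

    // Same return convention as addEdge in the assignment:
    // N on a merge, -(N+1) if i and j were already connected.
    int addEdge(int i, int j) {
        int a = label[i];
        int b = label[j];
        if (a == b) {
            return -(a + 1);
        }
        int lo = Math.min(a, b);
        int hi = Math.max(a, b);
        // Relabel every vertex of the component labeled hi. Linear time per merge.
        for (int v = 0; v < label.length; v++) {
            if (label[v] == hi) {
                label[v] = lo;
            }
        }
        return lo;
    }

    int minConnected(int i) {
        return label[i];
    }

    boolean areConnected(int i, int j) {
        return label[i] == label[j];
    }
}
```

A test program could then feed the same random edges to both this reference and your ConnectedComponents object, and report the first call whose return values differ.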
Source Materials: You should look for the HTML documentation on CcStatistics.java and ConnectedComponents.java.
You will find the source code for CcStatistics.java and for HashSetLinear.java in the directory ProgHomework3 on ieng9. You should get a copy of the source code files and compile them yourself. The version of HashSetLinear is a special one that includes a routine getRandomElement, which is used by the CcStatistics program.
There is no full tester for the ConnectedComponents class. However, some kinds of errors in ConnectedComponents will cause errors to occur in CcStatistics, so this will give you a partial test of your program.
Turnin materials: You must turn in the following:
This programming assignment is covered by the usual Academic Integrity Guidelines for programming assignments.