In this project, you will experiment with and observe the performance of various sorting algorithms applied to data in different states of organization. Two of the data files will be organized randomly (two different sizes), one will be already sorted, and one will be sorted in exactly in the opposite order of the desired order. You will learn that different sort algorithms have different performance in these different situations

Download and unzip the ZIP file indicated above. It contains the L&C example code for the class "SortingAndSearching" which has multiple sorting methods each of which uses a different algorithm. It contains code for a SortCounter class with a main method that reads a file and sets up its data in an array for sorting by each of the five methods. There are a number of data files as samples for sorting -- the names should be self-explanatory except for mystery.txt.

SortCounter class:

NOTE: DO NOT CHANGE THE OPERATOR INTERFACE IMPLEMENTED IN THIS CLASS. IT MUST BE KEPT AS-IS SO THAT THE GRADER CAN USE A COMMON SCRIPT FOR TESTING EACH STUDENT'S CODE.

Modifications to SortingAndSearching class:

There are attributes and methods to maintain an integer attribute compareCount for the number of comparisons that the sort algorithm makes on the data in each file while sorting it. It also has a memoryCount and maxMemoryCount attributes for the amount of extra memory used by mergeSort. A constructor initializes all attributes to zero. A getCompareCount() method reads the compare counter value (after performing a sort experiment on a file). A resetCompareCount()method resets the compare count to zero prior to doing the next sort experiment.  A getMemoryCount method reads the maximum memory counter after that merge sort method is called. None of the code needed to generate data in these attributes is provided.

You must "instrument" the code of the "SortingAndSearching" class to generate the correct data. Study the code for each sorting method. Identify all places where the code calls compareTo on the data values that are being sorted and add code to increment the compareCount value at those places. Also, add code to the mergeSort method to generate the memoryCount and maxMemoryCount data. Now, run experiments using the supplied data files to characterize the performance of each sort algorithm. Copy and paste the results of your experiments to your report (in memo.txt) to record the data as a baseline.

Study the code for the bubble sort method further. You should have already run experiments that show how well the bubble sort code performs. Now, implement an improvement in the code so that the inner loop keeps track of if a swap is made and the outer loop terminates if the inner loop makes no swaps. That would imply that the data is already sorted. Run experiments that show how well it performs after you make the change. Explain the observed differences from the baseline in your report.

Study the code for the quick sort method further. You should have already run experiments that show how well the quick sort code performs. Now, implement the suggestion made in the lecture notes to select the middle element instead of the first element as the partition element. Hint: Swap the middle element and the first element before doing the rest of the sort. Run experiments that show how well it performs after you make the change. Explain the observed differences from the baseline in your report.

Study the code for the merge sort method further. You should have already run experiments that show how well the merge sort code performs and display the amount of extra memory it used. Now, implement the suggestion made in the lecture notes to use less memory by allocating memory for the merge process after the recursive calls instead of before the recursive calls. Don't forget to move the code for measuring the memory use data. Explain the observed differences from the baseline in your report.

Experimenting and Reporting:

The zip file also contains some test data files that present different situations for sorting. Any file named random contains data that is not in any sorted order. Any file named sorted contains data that has previously been sorted in the same order that the algorithms will be trying to sort the data. Any file named reverse contains data that has previously been sorted in exactly the opposite order from the order that the algorithms will be trying to sort the data. Sort the file mystery.txt and try to figure out what problem it causes for one of the algorithms. Include your conclusion in your report.

Gather your experimental test results and discuss your observations/conclusions in your memo.txt report file. You must discuss your observations relative to the expected theoretical behavior of each algorithm based on Big-O notation. Include copy and paste results of your runs from the DrJava interactions window into memo.txt to provide supporting evidence for your discussion and your conclusions.