CS 110: Introduction to Computing with Java
Lab 9
Pre-Lab
1. Do a Google search on the three-word phrase “Word Frequency Analysis”. Study the topic and its applications.
Lab
Introduction
This lab has you finish a utility that counts words in a text file. Word frequency analysis is used by linguists and cryptographers in various applications.
We use the following algorithm. We maintain a Map of words and their frequencies in the analyzed text. A Map is a table that connects a key to a value.
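The Map-based counting idea described above can be sketched as follows. This is only an illustration, not the code in Lab9.zip: the class name WordFrequencySketch, the method countWords, and the rule for cleaning up words (lower-casing and dropping non-letters) are all assumptions made for the example.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

// Hypothetical example class; not part of the lab's project files.
public class WordFrequencySketch {

    // Builds a Map from each word (the key) to how often it occurs (the value).
    // Assumption: words are lower-cased and stripped of non-letter characters.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> frequencies = new TreeMap<>();
        Scanner in = new Scanner(text); // scans the characters of the String itself
        while (in.hasNext()) {
            String word = in.next().toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");
            if (word.isEmpty()) {
                continue; // the token contained no letters, e.g. "--" or "42"
            }
            Integer old = frequencies.get(word);
            frequencies.put(word, old == null ? 1 : old + 1);
        }
        in.close();
        return frequencies;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("The cat saw the other cat.");
        System.out.println(counts); // prints {cat=2, other=1, saw=1, the=2}
    }
}
```

A TreeMap is used here only so that the printed result comes out in alphabetical order; a HashMap would count the words just as well.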
Purpose
This lab gives you some experience working with file I/O, exceptions, and Java Collections.
Activities
1. Copy the project. A nearly complete set of Java files for the word count application can be found in Lab9.zip. Extract these files to a folder of your choosing on the student drive. As usual, open DrJava, create a new project, and open the files you just unzipped.
2. Go to the DrJava Edit -> Preferences -> Compiler Options menu and uncheck the box labeled “Show Unchecked Warnings”.
3. Look at the code. You can make the code work by completing one (or maybe two) lines of code in the WordCount class. This code must create a Scanner object from a String containing the file name. NOTE: You invoke this program in the DrJava Interactions Pane with two command line arguments: one for the file name and one for the minimum frequency count.
4. Write the line of code and run the application. We have provided three text files that you can use as sample data. These sample files contain the Declaration of Independence (doi.txt), Homer’s Odyssey (odyssey.txt), and Dickens’s Great Expectations (ge.txt). Use the application to find the words that occur more than 100 times in the Odyssey. Compare with the sample output below.
5. Test exceptional conditions. In particular, what happens when you give an in
6. Modify the application. In particular, modify the code so the case where an in
7. Test your changes.
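As a hedged illustration of the ideas in the activities above (creating a Scanner from a file name, reading two command line arguments, and reacting to bad input), here is a minimal sketch. It is not the WordCount class from Lab9.zip; the class name ScannerFromFileSketch, its methods, and its messages are made up for the example. One point worth noticing: new Scanner(fileName) would scan the characters of the name itself, so the String is wrapped in a File first.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

// Hypothetical example class; not part of the lab's project files.
public class ScannerFromFileSketch {

    // Opens a Scanner that reads the contents of the named file.
    static Scanner openFile(String fileName) throws FileNotFoundException {
        return new Scanner(new File(fileName));
    }

    // Counts the whitespace-separated tokens in the named file.
    static int countTokens(String fileName) throws FileNotFoundException {
        Scanner in = openFile(fileName);
        try {
            int n = 0;
            while (in.hasNext()) {
                in.next();
                n++;
            }
            return n;
        } finally {
            in.close(); // always release the file, even if reading fails
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("usage: java ScannerFromFileSketch <file> <minimum count>");
            return;
        }
        try {
            int minimum = Integer.parseInt(args[1]); // may throw NumberFormatException
            int tokens = countTokens(args[0]);       // may throw FileNotFoundException
            System.out.println(tokens + " tokens read; minimum count is " + minimum);
        } catch (NumberFormatException e) {
            System.out.println("Not a whole number: " + args[1]);
        } catch (FileNotFoundException e) {
            System.out.println("Cannot open file: " + args[0]);
        }
    }
}
```

FileNotFoundException is a subclass of IOException, so a catch block for IOException would also handle a missing file. Whether to print a message, ask again, or stop the program is a design choice the sketch does not settle.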
Sample Output
You should see the following when you run the WordCount class with application parameters “odyssey.txt” and “1000”:
> java WordCount odyssey.txt 1000
the 5,846
and 5,036
to 3,196
of 3,058
you 1,906
i 1,869
he 1,823
a 1,812
in 1,627
for 1,296
his 1,277
as 1,189
with 1,134
it 1,123
that 1,115
him 1,059
>
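One way a listing in this shape could be produced from a Map of frequencies is sketched below. This is an assumption about the approach, not the lab's actual code: the class name FrequencyReportSketch and the method report are hypothetical, the test "count greater than the minimum" is a guess at the intended rule, and the entry for "rare" is made-up data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical example class; not part of the lab's project files.
public class FrequencyReportSketch {

    // Returns one "word count" line for each word whose count exceeds minimum,
    // most frequent first, with comma grouping in the counts.
    static List<String> report(Map<String, Integer> frequencies, int minimum) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>();
        for (Map.Entry<String, Integer> e : frequencies.entrySet()) {
            if (e.getValue() > minimum) { // assumption: strictly greater than
                entries.add(e);
            }
        }
        // Sort by descending count.
        Collections.sort(entries, new Comparator<Map.Entry<String, Integer>>() {
            public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b) {
                return b.getValue().compareTo(a.getValue());
            }
        });
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            lines.add(String.format(Locale.US, "%s %,d", e.getKey(), e.getValue()));
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, Integer> sample = new HashMap<>();
        sample.put("the", 5846);
        sample.put("and", 5036);
        sample.put("him", 1059);
        sample.put("rare", 12); // made-up entry that falls below the minimum
        for (String line : report(sample, 1000)) {
            System.out.println(line);
        }
        // prints:
        // the 5,846
        // and 5,036
        // him 1,059
    }
}
```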
Before you leave, have your TA check off that you completed the lab. Make sure each person saves a copy of your work.
Lab Report
Write a document describing your experiences. Your lab report must be printed (not handwritten).
Answer the following questions related to what you did in this week’s lab. You may complete the code on your own, but the TA must certify that most of your work was done in the lab.
1. Answer each of the following questions about the application:
a) In the original application, in what different ways are NumberFormatException and IOException handled?
b) Find all Java Collection objects used in this application. What is the role of each object?
2. Describe what you learned doing this lab. Explain what was difficult and what was easy.
3. Attach a listing of your completed WordCount classes.
Note: You should work alone on writing the lab report.
Note: The assignment is due at the BEGINNING of your next lab. No late assignments will be accepted. Emailed assignments will not be accepted. If you are not going to be in lab on the due date, you can turn the assignment in ahead of time at the CS110 TA box in the CS department office.