cs634 hw1
CS634 – Homework 1
Released Jan. 24, Due Mon., Feb. 5
Relationships, Creating and Loading Databases, Generated
PKs (15 points)
Recent edits are marked by yellow
highlight
Instructions: The homework is due BEFORE CLASS on the duedate.
Please hand in paper copies (either word-processor or hand-written
copies are fine, as long as the hand writing is clear). You also need to
create files which you must place in your UNIX course directory for
cs634 in directory hw1.
Problem 1
- Copy the tables directory from $cs634 (/data/htdocs/cs634) to your
cs634/hw1 directory, ending up with directory cs634/hw1/tables. Read
its README and *.sh. Use its contents to load the sample databases
from the book into your Oracle and mysql accounts. Note that the
Report any problems (other than the expected ones involving .bad
files) as soon as possible.
- Determine why Oracle fails to load the three rows shown in the .bad
files. Explain both the catalog and suppliers rows—there
are different reasons. Examine how mysql handles these three
rows: any output related to them in the load? What do they look
like when loaded? Briefly summarize the root cause of the
difference between how the two databases handle these loads.
- Edit createdb.sql by changing one digit of one number so that the
loads on Oracle and mysql work without error and load the same data.
Explain in your homework what that edit was.
Problem 2
- Draw an E-R diagram for emp-dept-works tables of createdb.sql and
explain why the diagram should be drawn that way.
- Draw an E-R diagram for the Yelp database. See a table-level
diagram at https://www.yelp.com/dataset/documentation/sql
and schema linked from there. First figure out which table(s) qualify
as relationship tables. Also show the E-R diagram with review as an
entity.
Problem 3
We looked at the flights table in class and determined it wasn’t
normalized.
- Draw the E-R diagram of flights-aircraft-emplyees-certified as
supplied.
- Normalize the database and draw the new E-R diagram. Show the
changed/new create table statements.
Problem 4 Review of SQL
a. (Review of cs630, OK to reuse old homework paper) Exercise 5.2 in
the textbook, except use the table schemas in createdb.sql, slightly
different. Give the quereis in relational algebra and SQL. Try the
SQL queries out on Oracle and mysql
by writing a file
queries4a.sql and then executing it on both databases, leaving their
output in queries4a-oracle.txt and queries4a-mysql.txt in your cs634/hw1
directory, along with queries4a.sql of course
. Show queries4a.sql
in your homework paper. Note that some of these queries need relational
division, very tricky to do in SQL. See an example of division in SQL in
Q9, pg 150. Specifically, do parts 1, 3, 4, 5, 7, and 10 (all but #3 done
in hw2 last term),
b. Queries over the Yelp database (following the logic of Exercise 5.2
queries, and numbered accordingly) for queries4b.sql. Show the SQL queries
in your homework paper. Note that the user table has been renamed to yuser
to avoid using the Oracle reserved word user. The Oracle tables exist but
are empty, while the mysql tables have data. Columns originally named
"date" have been renamed as well. Sample query: select
count(*) from yelp_db.yuser; Note that output is not required for
these queries because many of them take over 5 min. to run. Please abort
queries that take more than 30 seconds, unless you are running after
midnight.
1. Find the business names for which there is some review (no duplicate
names).
3. Find the names of users who have reviewed every business in Clark,
NV.
4. Find the businesses by name which are reviewed by Macaroni and noone
else.
6. For each business, find the user (by name) who rates it highest
(highest stars value), including ties.
7. Find the ids of users who review only businesses in zipcode
02138 (no such rows exist in the data)
10. For every business that is reviewed only by users having
"cool count" value at least 10, print the business name and the
total number of reviews they have.
Problem 5 Generating Ids
Create a table with abstract schema:
Employee (eid:integer, ename:string, age:integer,
salary:USmoney)
a. The exact types used are up to you,
as long as they can do the job and are portable. The eid should be
generated by auto_increment on mysql and by a sequence on Oracle.
Note that neither of these features are covered by SQL-92, so
portability isn't expected. The create table statement is different in
the two cases. Write create_employee_oracle.sql and
create_employee_mysql.sql for the two cases, where the Oracle file
creates the sequence as well as the table. Create one data
file employees.txt that works for both cases with 3 employees.
Write a shell file employees_oracle.sh and employees_mysql.sh that
builds and loads the table (just
use inserts to load the table), and drop_employees_oracle.sql
and drop_employees_mysql.sql that drop the table and sequence. All the
files should be in your hw1 directory. Show the create_employee_* and
employee_* files in your homework paper, along with a script of runs
that shows the loaded table.
b. Discuss how to do the id generation portably, within SQL-92. You
need another table. We use this way in cs636.
Files for delivery in hw1
tables directory, with *.bad, edited createdb.sql
queries4a.sql, queries4a-oracle.txt, queries4a-mysql.txt
queries4b.sql, queries4a-oracle.txt, queries4b-mysql.txt
create_employee_oracle.sql, create_employee_mysql.sql
employees.txt, employees_oracle.sh, employees_mysql.sh
drop_employees_oracle.sql, drop_employees_mysql.sql