CS634 – Homework 3
Released Feb 25, Due Wed., March 9
Being a DBA, Indexing, Intro Query Evaluation
Problem 1
a. Follow the instructions in mysql_install.html to bring up your own mysql installation on topcat. Show the "ls -l" output showing the contents of your data directory (like slide 6 of class08, but no sudo is necessary since you own the directory.)
b. Create a user named by your Linux username, and a database name <username>db, with the same user privileges as your same-named user on topcat's main mysql installation. Show the command(s) you used to get your old privileges and set up the new user and database on your own installation. Show the "ls -l" output showing the contents of your data directory after this.
Problem 2
a. Load the book's tables into your new installation using $cs634/tables, as in hw1. You'll need to edit ~/.my.cnf for this. Explain what else you needed to edit.
b. Although these tables are too small to really merit indexing, pretend they are larger for now. Study the queries of Exercise 5.2 (as on hw1, the ones not needing relational division) and determine what indexes could speed these queries up. Any column used in a join is a good candidate. Also any column involved in a selection with equality or range condition. Create files add_indexes.sql and drop_indexes.sql to do these actions, and show them in your paper.
c. Run queries.sql (now in the tables directory) before and after creating the indexes and running "analyze table" on the affected tables. Bring down the server and restart it before running the queries to start with an empty buffer pool. Don't worry if it doesn't show improvement, since these tables are so small. Describe your timing results. You can use "mysql> set profiling=1" and after the queries "show profiles" to get the timings of individual queries.
Problem 3
Problem 4
Exercise 11.8 Assume 40 duplicates of each R.a value, = 10, 20, ..., 250,000 (25,000 values with 40 duplicates each for 1,000,000 rows total, as before)
Problem 5
Exercise 12.4, parts 3, 4
Problem 6
Exercise 13.4 parts 1, 2a, 2d (2a uses single-block i/o, 2d uses multi-block i/o)