IT 244: Introduction to Linux/Unix

Class 12

Microphone

Quiz 4

I have posted the answers to Quiz 4 here.

Homework 6

I have posted homework 6 here.

It is due this coming Sunday at 11:59 PM.

Midterm

The Midterm exam for this course will be held on Tuesday, March 25th.

That is the first Tuesday after the Spring Break

It will be given in this room.

It will consist of 25 questions like those on the quizzes.

60% of the questions will come from the Weekly Graded Quizzes.

There is a link to the answers to the graded quizzes on the class web page.

The other 40% will be questions that I create specially for this exam.

For these questions you will have to know

Absolute and relative pathnames
The PATH system variable
Access permissions
Redirection and pipes
grep
Utilities

The last class before the exam, Thursday, March 13th, will be a review session.

On the exam, you will only be responsible for the material in the Class Notes for that class.

You will find the Midterm review Class Notes here.

If for some reason you cannot take the exam on the date mentioned above you must contact me to make alternate arrangements.

The Midterm is given on paper.

I scan each exam paper and upload the scans to Gradescope.

I score the exam on Gradescope.

You will get an email from Gradescope with your score when I am done.

The Midterm is a closed book exam.

You are not allowed to use any resource, other than what is in your head, while taking the exam.

Cheating on the exam will result in a score of 0 and will be reported to the Administration.

Remember your Oath of Honesty.

To prevent cheating, certain rules will be enforced during the exam.

Remember, the Midterm and Final determine 50% of your grade.

Questions

Are there any questions before I begin?

Tips and Examples

Viewing Directory Permissions

Running ls -l on a directory will show the permissions of everything inside that directory

	$ ls -l tmp
	total 8
	-rw-r--r-- 2 ghoffman faculty 22 Jun 19 14:15 lines.txt
	-rw-r--r-- 2 ghoffman faculty 22 Jun 19 14:15 test.txt

What if you wanted to see the permissions on the directory itself?
You have two options

You can run ls -l on the parent directory

	$ ls -l ~
	total 80
	drwxr-xr-x  4 ghoffman grad       4096 Oct 15  2016 bin
	drwxr-xr-x  6 ghoffman faculty    4096 Jan 20 14:44 code
	drwxr-xr-x  6 ghoffman faculty    4096 Sep  9  2016 course_files
	...
	drwxr--r--  2 ghoffman faculty    4096 Jun 19 14:15 tmp

Or you can run ls -ld on the directory itself

	$ ls -ld tmp
	drwxr--r-- 2 ghoffman faculty 4096 Jun 19 14:15 tmp

The -d option tells ls to show information on the directory
Not the things inside the directory

Review

Running a Unix Command

You can run a Unix command by entering the name of the command on the command line
The name of the command is really the name of the executable file ...
that contains the binary code for the command
It turns out that there is another way you can run a Unix command
You can use a pathname for the executable file
For example, we can use which to find the absolute pathname of ls
```
$ which ls
/usr/bin/ls
```

I can run ls using this absolute pathname

$ /usr/bin/ls /
bin      data          groups     lib32       media     proc  snap     swap.img  users
boot     dev           home       lib64       mnt       root  sources  sys       usr
cdrom    etc           home.ORIG  libx32      nobackup  run   spool    tmp       var
courses  etc.ORIG.tar  lib        lost+found  opt       sbin  srv      tools

I can also use a relative pathname

$ ../../usr/bin/ls /
bin      data          groups     lib32       media     proc  snap     swap.img  users
boot     dev           home       lib64       mnt       root  sources  sys       usr
cdrom    etc           home.ORIG  libx32      nobackup  run   spool    tmp       var
courses  etc.ORIG.tar  lib        lost+found  opt       sbin  srv      tools

So there are two ways you can run a Unix command
- Using the name of the executable file for the command
- Using a pathname for the executable file

Syntax of the Command Line

A command typed at the command line has the following format
```
COMMAND [OPTIONS] [ARG1] [ARG2] ... [ARGn]
```
The brackets indicate that the contents are optional
Commands vary in the number of options and arguments they accept
Some accept none
Others require a specific number of arguments
Still others accept a variable number of arguments
Arguments must be separated by one or more spaces

Command Options

Options modify the behavior of the command
Options are usually preceded by one or two dashes, -
GNU programs have options that are preceded by two dashes, --
But they usually retain the single letter options from the original Unix commands
The two-dash options in GNU utilities are whole words
The options in the original Unix utilities were a single letter
Single dash, -, options allow a combination of options
An example of this is ls -ltr
Options using two dashes, -- cannot be combined
Each option must be written separately and preceded by two dashes
Sometimes the option can have its own argument
Utilities that report the size of files usually do so in bytes
Such utilities often have a -h, or --human-readable, option
With this option, the file size will be displayed in kilobytes, megabytes or gigabytes, as appropriate
Many commands display a help message when run with the --help option
Most GNU utilities accept this option
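
Here is a small sketch of combining single-letter options
It assumes Bash and uses a scratch directory made with mktemp
The variable names are only for illustration

```shell
# Scratch directory so no real files are touched
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt"

# Single-letter options combined after one dash ...
combined=$(ls -ltr "$demo_dir")

# ... mean the same as the options written separately
separate=$(ls -l -t -r "$demo_dir")

echo "$combined"

# Without -h the size is shown in bytes
ls -ld /usr/bin

# With -h the size is shown in K, M or G as appropriate
# GNU ls also accepts the long form --human-readable
ls -ldh /usr/bin
```

Both forms of the ls command produce the same listing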

Device Drivers

All operations that need to access any hardware on the machine ...
must do so through the kernel
The kernel controls access to
- RAM (short term memory)
- disk (long term memory)
- CPU
- Other connected devices (e.g. printers)
Every manufacturer of a device must provide software ...
that allows the hardware to talk to the kernel
This software is called a device driver

tty

In the very early days of Unix people used a machine called a teletype ...
instead of a screen and keyboard
It consisted of a keyboard and a printer
A user would type a command on the keyboard
The output of the command would be printed on a continuous flow of paper
The name of this device was soon shortened to tty
The printer part of these devices was soon replaced with a video monitor
Any device that allows you to send text to a computer and see the output ...
is called a terminal
But Unix still refers to it as tty

The Unix tty Device Driver

The device driver for tty is built into the kernel
Otherwise you would not be able to talk to the machine
As you type, your keystrokes are collected by the tty device driver
This program looks at each character as you type ...
and takes appropriate action
Most of the time, it places the character in a buffer
A buffer is a space in RAM that holds data for later processing
But the tty device driver responds differently to certain special characters
When the character you type is the backspace
tty erases the previous character from the buffer
When the character is the Control U something different happens
tty erases the buffer from the current insertion point to the beginning of the line
tty is responsible for all command line editing
When the tty gets a newline character
It passes the contents of the buffer to the shell
Newline is the character you get from hitting Enter on a Windows machine ...
or Return on a Mac

Parsing the Command Line

The shell takes the contents of the buffer and breaks it up into tokens
Tokens are the strings of text separated by spaces
This action is called parsing
Parsing is the act of making a list of all the strings on the command line
Next, the shell looks for the name of the command
Usually, the command name is the first string on the command line
The command can be specified by a simple filename
```
ls
```
Or by using a pathname to the executable file
```
/usr/bin/ls
```
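
You can watch the shell break a command line into tokens with a small function
This is only a sketch, assuming Bash
show_tokens and count_tokens are made-up names, not standard commands

```shell
# Hypothetical helper: print each token it receives on its own line
show_tokens() {
    for token in "$@"; do
        echo "[$token]"
    done
}

# Extra spaces between tokens make no difference
show_tokens ls    -l     /tmp

# $# holds the number of arguments the shell passed along
count_tokens() {
    echo "$#"
}

token_count=$(count_tokens ls    -l     /tmp)
echo "Number of tokens after the command name: $token_count"
```

The function receives three tokens no matter how many spaces separate them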

The PATH System Variable

When you run a program using a pathname
The shell knows where to find the executable file
For example, /usr/bin/ls tells the shell exactly where the executable file ls is located
But when we run ls we usually type
```
ls
```
Not
```
/usr/bin/ls
```
Programs are executable files that can be stored anywhere in the filesystem
So how does the shell find the correct file?
The shell checks a system variable called PATH
PATH contains a list of directories to search for an executable file
The shell checks each directory in PATH for a file with the right name
It checks each directory in order and stops when it finds the first match
PATH always has a default value which is created when the system is installed

Here is the default value on our system

$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/lib/oracle/12.1/client64/bin

The absolute pathname of each directory is separated from the next by a colon, :
If the shell can't find the file it prints an error message
You will also get an error message if you don't have permission to run the file
You can modify the PATH variable in your own Unix environment
We'll see how to do this in a few classes
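
The search through PATH can be sketched as a shell function
This is not how the shell is actually written, just an illustration of the idea
find_in_path is a made-up name, and the sketch assumes Bash and a PATH with no empty entries

```shell
# Hypothetical helper: look in each PATH directory, in order,
# and print the first executable file with the right name
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:                      # split PATH at the colons
    for dir in $PATH; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            IFS=$old_ifs
            echo "$dir/$name"
            return 0           # stop at the first match
        fi
    done
    IFS=$old_ifs
    return 1                   # no directory in PATH had the file
}

found=$(find_in_path ls)
echo "$found"
```

This should print the same pathname that which gives for ls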

Running a Program in the Current Directory

You should never put the current directory, . , in the PATH list
That would be a security vulnerability
But what do you do if you want to run an executable file ...
in your current directory?
When that directory is not in PATH ...
you can do this using the following construction
```
./PROGRAM_NAME
```
This will always work regardless of the contents of PATH
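
Here is a short sketch of this, assuming Bash
hello.sh is a made-up script created in a scratch directory

```shell
# Create a tiny script in a scratch directory and make it executable
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from hello.sh"\n' > "$demo_dir/hello.sh"
chmod +x "$demo_dir/hello.sh"

# Move into that directory and run the script with ./
# The commands inside $( ) run in a subshell,
# so your own current directory does not change
result=$(cd "$demo_dir" && ./hello.sh)
echo "$result"
```

The scratch directory is not in PATH, but ./hello.sh still runs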

Running the Command Entered on the Command Line

When the shell gets the command line from tty
It uses PATH to find the location of the file
The shell then asks the kernel to start a process for that program
A process is a running program ...
and it needs resources to do its job
- Memory (RAM)
- Access to files
- Time on the CPU
Each process has RAM space allocated to it ...
that it alone can use
This prevents one program from interfering with another
The shell also gives the program the list of tokens from the command line
- The name used to call the program
- The options used
- The arguments used
The shell does not check the options or arguments
While the program is running the shell goes into an inactive state known as "sleep"
When the program finishes it must send an exit status to the shell
The exit status is an integer that must be 0 or greater
An exit status of 0 means that the program was able to do its work without error
Any exit status greater than zero indicates an error
A program can issue different error status values for different types of errors
You can see the exit status of the last program ...
by looking at the value of the system variable ?

$ cat foo
cat: foo: No such file or directory

$ echo $?
1
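
A script can save the exit status in a variable
The sketch below assumes Bash
ok_status and fail_status are made-up variable names

```shell
# true is a command that does nothing and always succeeds
true
ok_status=$?
echo "true exited with $ok_status"

# cat fails when the file does not exist
# The part after || runs only if cat fails,
# and saves the exit status right away
fail_status=0
cat /no/such/file || fail_status=$?
echo "cat exited with $fail_status"
```

You must read ? immediately, because the next command replaces its value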

Attendance

New Material

Using Computers More Efficiently

In the 50s computers were big expensive machines
They were so expensive that people who used them were charged for their time
Writing programs and getting them to work took a long time
You wrote the program and ran it
The code usually did not work the first time
So you had to look at the code to find the bugs
This was not easy
Back then only one person could use the computer at one time
Someone would book an hour on the computer to work on a program
But little of that time was spent running the program
Most of the time the programmer would be thinking about the code
So the expensive computer time was unused
Other people could not use the computer while the programmer was thinking
There was a need for a better way to use these expensive machines

Computers are Fast But People are Slow

Most of the time we spend in front of a machine
We do one of two things
Typing or thinking
Thinking requires nothing of the machine
Typing may take us several minutes
But reading what we type takes the machine microseconds
So most of the time we are using a machine ...
the CPU has nothing to do
The only time we make it work is when we run a program
Engineers realized that while we were thinking ...
the machine could do work for other users
It could work for one user for a while ...
then stop and save what it had done
before moving on to the next person
The amount of time given to one user was called a time slice
During this time, the CPU can execute many instructions
Machines are so fast that the users would never notice ...
when the CPU was not running their code
The computer would still be working with one person at a time
But it didn't seem that way to the users
It seemed to them that many people were using the machine at the same time
To make this work a new type of operating system was needed
A multiuser operating system

The Birth of Unix

Creating such an operating system would not be easy
When the computer is running a program it stores things in RAM
So when the machine finished giving time to one user ...
it had to store the contents of RAM somewhere ...
before moving on to the next user
This had to be done quickly
Which required special hardware ...
and software
But multiuser operating systems promised to be very efficient
So there was a big push to make this happen
This led to the creation of a project called Multics
It was a huge project involving GE, MIT and Bell Labs
The project was so big and unwieldy that it fell far behind schedule
Bell Labs eventually decided to pull out of the Multics project
But the Labs had two very talented programmers
Their names were Ken Thompson and Dennis Ritchie
They looked at the Multics project ...
and decided that it was trying to do too much
They simplified the specification inherited from Multics
And made some very clever design decisions
When they implemented this new design, they created Unix
One of their key ideas involved data streams

Data Streams

Computers work with information
They take information in and they send information out
We can think of these flows of information as data streams
When we type into a word processor, characters flow into the program
This is an input stream
When we open a document, characters flow from the disk to RAM
This is also an input stream
Running a program usually produces a flow of characters to the screen
This is an output stream
When saving a document, characters flow from RAM to the disk
This is also an output stream

The Monitor and Keyboard

We tend to think of the screen and the keyboard as separate things
After all, we can buy and replace them separately
But this is a recent development
In the 1970s, when Unix was created
People used a single device to communicate with the machine
At first this was a teletype
Eventually these were replaced with a terminal ...
that was a screen with a built-in keyboard
In both cases a single device handled input and output
The situation on a multiuser machine can be drawn like this

The terminal is a device so it needs a device driver
That device driver is tty
It handles both input and output to the terminal
We seldom use physical terminals connected to a Unix machine now
We mostly use ssh to talk to them from our laptops
But tty is still used ...
to allow the kernel to communicate with the keyboard and screen
Even when the keyboard and screen are on a remote laptop ...
connected through ssh
tty comes standard with every Unix kernel

Standard Input, Standard Output and Standard Error

Every Unix process always has access to three different data streams
- Standard Input
- Standard Output
- Standard Error

The person who writes the program does not have to use these streams
But they are always given to a process
Standard input is where the program normally gets its input
By default, standard input is the keyboard
Standard output is where the program normally sends the results
By default, standard output is the screen
Standard error is where the program normally sends error messages
By default, standard error is the same as standard output
The screen
The end point of each of these data streams can be changed by the user
This is done using a Unix feature called redirection

The Keyboard and Screen as Standard Input and Standard Output

By default, standard input is taken from the keyboard
By default, standard output goes to the screen
By default, standard error also goes to the screen
The cat utility expects you to give it a filename as an argument
What happens when you don't give an argument?
In this case, cat will accept input from standard input ...
which by default is the keyboard

Redirection

Redirection is when we send a data stream ...
to something other than its usual target ...
or take data from a different source
When we redirect standard output ...
the output goes to something other than the screen
When we redirect standard input ...
data comes from something other than the keyboard
Redirection makes pipes possible
In a pipe, standard output of the first command does not go to the screen
It goes to the second command
The second command does not take input from a file or the keyboard
It takes it from standard input ...
which is connected to the standard output of the previous command
Redirection is one of the features that makes Unix flexible
It allows you to take input from, or send output to, any file or device
You can take input from something other than the keyboard
Like a file
You can send output to something other than the screen
Like a file or a device

Redirecting Standard Output

To redirect output use the greater than symbol, > ...
followed by a filename
This tells Unix to send the output from the command to the file or device ...
that appears after the symbol >
The format for output redirection is
```
COMMAND [ARGUMENTS] > FILENAME
```

For example, to save a list of everyone currently logged on, you could use

$ who > current_logins.txt

$ cat current_logins.txt 
bmt11989  pts/1        2011-10-02 16:43 (c-24-147-18-10.hsd1.ma.comcast.net)
vtran     pts/0        2012-09-26 17:34 (c-76-119-98-173.hsd1.ma.comcast.net)
abutawha  pts/1        2012-09-26 17:36 (158.121.234.175)
ghoffman  pts/2        2012-09-26 18:19 (dsl092-066-161.bos1.dsl.speakeasy.net)

If you redirect to a file that does not exist ...
the file will be created for you
If you redirect to a file that already exists ...
the contents of that file will be replaced
The new contents of the file ...
will be the output of the command

Redirecting Standard Input

Redirecting standard output sends output to something ...
other than the screen
Redirecting standard input takes input from something ...
other than the keyboard
To do this, we use the less than symbol, <
Here is the format
```
COMMAND [ARGUMENTS] < FILENAME
```

repeat.sh is a shell script that repeats the text the user enters

$ ./repeat.sh 
Enter line 1: 1
Enter line 2: 2
Enter line 3: 3
Enter line 4: 4
Enter line 5: 5

You entered
-----------
1
2
3
4
5

But I can also take input from a file by redirecting standard input

$ cat five_lines.txt 
Line 1
Line 2
Line 3
Line 4
Line 5

$ ./repeat.sh < five_lines.txt 

You entered
-----------
Line 1
Line 2
Line 3
Line 4
Line 5
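
Here is another sketch that you can run without repeat.sh
It assumes Bash and uses a scratch file made with mktemp

```shell
# Put one line of text in a scratch file
demo_file=$(mktemp)
echo "hello world" > "$demo_file"

# tr reads only from standard input, it does not accept a filename
# Redirecting standard input lets it read from the file
upper=$(tr 'a-z' 'A-Z' < "$demo_file")
echo "$upper"
```

tr never learns the name of the file, it just reads standard input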

Redirecting Standard Output Can Destroy a File

If you redirect standard output to a file that already exists ...
you will overwrite the contents of that file
You will replace the original contents of the file ...
with the output of the new command
There is a "noclobber" option in Bash to prevent this from happening
But it is best to simply be careful about the file ...
to which you redirect standard output
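
Here is a sketch of what noclobber does, assuming Bash
The scratch file and variable names are only for illustration

```shell
# Create a scratch file with some contents
demo_file=$(mktemp)
echo "original contents" > "$demo_file"

# Turn on noclobber: > will now refuse to overwrite an existing file
set -o noclobber
echo "new contents" > "$demo_file" || echo "Bash refused to overwrite the file"

# Turn noclobber back off
set +o noclobber

kept=$(cat "$demo_file")
echo "$kept"
```

The file still holds its original contents because the redirection was refused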

Adding Output to an Existing File

If you redirect standard output to a file that already exists ...
you will lose the original contents of that file
But Unix allows you to add something to the bottom of a file
This is called appending
The append symbol is two greater than symbols
With no space in between, >>
The format is
```
COMMAND [ARGUMENTS] >> FILENAME
```

For example

$ echo foo > test.txt

$ cat test.txt 
foo

$ echo bar >> test.txt 

$ cat test.txt 
foo
bar

Notice that "foo" is still in the file ...
and "bar" is on the following line

/dev/null

Sometimes a program will do something useful ...
but produce output you don't want
For situations like this, Unix provides /dev/null
Any output you send to /dev/null will disappear
It will never appear on the screen
If you redirect input to come from /dev/null
The command will see end of file right away, as if the input were empty
/dev/null is most useful when dealing with error messages
Since error messages normally go to the screen ...
they will be mixed up with the regular output
Redirecting standard error to /dev/null will prevent this from happening
You can use /dev/null to test your scripts for errors

The following script has a command that will cause an error

$ cat script_with_error.sh 
# this script has an error

cd XXXXXX # line with error

ls -l /

But if you run it, you probably won't spot the problem ...
because the error message scrolls off the top of the screen

$ ./script_with_error.sh 
./script_with_error.sh: line 3: cd: XXXXXX: No such file or directory
total 73732
drwxr-xr-x    2 root root     4096 Oct 15 06:45 bin
drwxr-xr-x    3 root root     4096 Oct 15 06:45 boot
-rw-------    1 root root 75390976 Oct  8 00:06 core
drwxr-xr-x  102 root root        0 Oct  6 07:06 courses
drwxr-xr-x   46 root root        0 Oct  6 07:06 data
drwxr-xr-x   15 root root     4160 Oct  6 07:06 dev
drwxr-xr-x  119 root root    12288 Oct 15 13:32 etc
drwxr-xr-x    8 root root        0 Oct  6 07:06 groups
drwxr-xr-x 1943 root root        0 Oct 14 14:42 home
drwxr-xr-x    3 root root     4096 Aug 25 16:18 home.ORIG
lrwxrwxrwx    1 root root       33 Sep 29 06:40 initrd.img -> boot/initrd.img-3.16.0-50-generic
lrwxrwxrwx    1 root root       33 Sep 11 06:49 initrd.img.old -> boot/initrd.img-3.16.0-49-generic
drwxr-xr-x   21 root root     4096 Aug 26 22:46 lib
drwxr-xr-x    2 root root     4096 Aug 26 22:46 lib32
drwxr-xr-x    2 root root     4096 Aug 26 06:50 lib64
drwxr-xr-x    2 root root     4096 Aug 26 22:46 libx32
drwx------    2 root root    16384 Aug 25 16:11 lost+found
drwxr-xr-x    3 root root     4096 Aug 25 16:12 media
drwxr-xr-x    2 root root     4096 Apr 10  2014 mnt
drwxr-xr-x   11 root root        0 Oct  6 07:06 nobackup
drwxr-xr-x    2 root root     4096 Feb 18  2015 opt
dr-xr-xr-x  522 root root        0 Oct  6 07:06 proc
drwx------    4 root root     4096 Sep  9 14:22 root
drwxr-xr-x   19 root root     1020 Oct 15 13:31 run
drwxr-xr-x    2 root root    12288 Oct 15 06:45 sbin
drwxr-xr-x  174 root root        0 Oct  6 07:06 sources
drwxr-xr-x    5 root root        0 Oct  6 07:06 spool
drwxr-xr-x    2 root root     4096 Feb 18  2015 srv
dr-xr-xr-x   13 root root        0 Oct  6 22:29 sys
drwxrwxrwt   40 root root     4096 Oct 15 13:32 tmp
drwxr-xr-x    2 root root     4096 Sep 21 16:28 TMP
drwxr-xr-x  269 root root        0 Oct  6 07:06 tools
drwxr-sr-x    2 root root     4096 Aug 26 23:30 users
drwxr-xr-x   12 root root     4096 Aug 26 22:46 usr
drwxr-xr-x   12 root root     4096 Oct 15 05:45 var
lrwxrwxrwx    1 root root       30 Sep 29 06:40 vmlinuz -> boot/vmlinuz-3.16.0-50-generic
lrwxrwxrwx    1 root root       30 Sep 11 06:49 vmlinuz.old -> boot/vmlinuz-3.16.0-49-generic

But if you run the script and redirect standard output to /dev/null ...
all you will see are the error messages

$ ./script_with_error.sh > /dev/null
./script_with_error.sh: line 3: cd: XXXXXX: No such file or directory

I use this trick in my testing scripts to check your Class Exercises
You should use this trick to test your homework scripts
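
Standard error can be sent to /dev/null too
In Bash this is written 2> because standard error is stream number 2
The sketch below uses a made-up function, noisy, that writes to both streams

```shell
# Hypothetical function that writes one line to each output stream
noisy() {
    echo "regular output"
    echo "error message" >&2     # >&2 sends this line to standard error
}

# Discard standard output: only the error message reaches the screen
noisy > /dev/null

# Discard standard error: only the regular output reaches the screen
noisy 2> /dev/null

# Reading from /dev/null gives end of file immediately,
# so wc counts 0 lines
line_count=$(wc -l < /dev/null)
echo "Lines read from /dev/null: $line_count"
```

The first call shows only the error and the second shows only the regular output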

Tricks with `cat` and Redirection

Unix commands that normally take input from a file ...
will also take it from standard input
cat takes as its argument a file ...
which it uses as input
If you don't give cat an argument ...
it takes input from standard input ...
which is the keyboard by default
If you run cat without specifying a file ...
it will simply echo what you type
```
$ cat 
foo
foo
bar
bar
bletch
bletch
^C
```
cat normally sends its output to the screen
This is the basis of a Unix trick
The trick allows you to create a small file without bothering to run an editor

You can do something like this

$ cat > lines.txt [Enter]
Line 1 [Enter]
Line 2 [Enter]
Line 3 [Enter]
[Control D]

If you then run cat on lines.txt you will see the text you entered
```
$ cat lines.txt
Line 1
Line 2
Line 3
```
This trick allows you to use cat as a simple text editor
But it won't allow you to go back and fix a line after you hit Enter
This is an example of redirection
By adding
```
> lines.txt
```
to the command line, we change where standard output goes
It does not go to the screen as it usually does
Instead it goes to the file lines.txt
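
Input and output redirection can be used on the same command line
Here is a sketch, assuming Bash, that uses cat to copy a file
The scratch directory and copy.txt are made up for the example

```shell
# Create a three line file in a scratch directory
demo_dir=$(mktemp -d)
printf 'Line 1\nLine 2\nLine 3\n' > "$demo_dir/lines.txt"

# Input comes from lines.txt and output goes to copy.txt
cat < "$demo_dir/lines.txt" > "$demo_dir/copy.txt"

# With >> cat adds a second copy of the text to the bottom of copy.txt
cat < "$demo_dir/lines.txt" >> "$demo_dir/copy.txt"

line_count=$(wc -l < "$demo_dir/copy.txt")
echo "copy.txt has $line_count lines"
```

copy.txt ends up with six lines, the three original lines twice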

Devices

Devices are electronic equipment connected to the computer
They communicate with the kernel through data streams
The device drivers are software supplied by the manufacturer
They allow the device to talk to the kernel ...
and the kernel to talk to the device
The device driver is specific to the hardware ...
and the operating system
Every Unix or Linux machine can talk to the terminal ...
for both input and output ...
right out of the box
It does this using tty ...
which is built into Unix or Linux
Otherwise you would not be able to configure the operating system
Every Unix or Linux machine can also talk to a hard drive
Again because the device driver for the drive is built into Unix
But when you add another piece of hardware like a printer ...
you need to install a driver for it

Unix Devices Are Files

Unix is powerful because its design is simple and elegant
A design is a series of decisions about how things should behave
One such decision was how Unix treats devices
Unix treats a device the same way it treats files on disk
You can both read from and write to a file on disk
You can do the same with devices
You can read from them and write to them
You write to a file using the name of the file
You write to a device using the filename of its device driver
To take input from a device redirect standard input ...
from the filename of the device driver
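
Here is a sketch using two devices found on Linux systems, /dev/null and /dev/zero
It assumes Bash and a version of head that accepts the -c option

```shell
# -c tests whether a name refers to a character device
if [ -c /dev/null ]; then
    echo "/dev/null is a device"
fi

# Write to a device with output redirection, just like a file
echo "this text goes to the device" > /dev/null

# Read from a device by redirecting standard input
# /dev/zero supplies an endless stream of zero bytes
# head -c 8 stops after reading 8 of them
byte_count=$(head -c 8 < /dev/zero | wc -c)
echo "Read $byte_count bytes from /dev/zero"
```

The same < and > symbols work whether the name belongs to a file or a device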

`ssh` and Pseudo-terminals

Our computers have both input and output
We type something on our keyboard ...
and the results appear on our screens
This is the same thing that happened with terminals ...
from the early days of Unix
Whenever we run an ssh client to talk to a Unix machine ...
we have both input and output streams
We type something at our keyboard ...
and the ssh client sends it to the Unix machine
The Unix machine produces output ...
which is sent to our ssh client ...
and it appears on our screens
To make this work, Unix uses an abstraction called a pseudo-terminal
A pseudo-terminal pretends to be a connection to a physical terminal
All pseudo-terminals appear as files in the directory /dev/pts

The Unix command tty will show you your pseudo-terminal

$ tty
/dev/pts/31

$ ls -l /dev/pts/31
crw--w---- 1 ghoffman tty 136, 31 Mar  9 13:09 /dev/pts/31


Studying

Class Exercise

Class Quiz
