IT 116: Introduction to Scripting
Class 19
Tips and Examples
Review
New Material
Microphone
Graded Quiz
You can connect to Gradescope to take the weekly graded quiz
today during the last 15 minutes of the class.
Once you start the quiz you have 15 minutes to finish it.
You can only take this quiz today.
There is no makeup for the weekly quiz because Gradescope does not permit it.
Solution to Homework 7
I have posted a solution to homework 7
here.
Let's take a look.
Homework 9
I have posted homework 9
here.
It is due this coming Sunday at 11:59 PM.
Questions
Are there any questions before I begin?
Tips and Examples
A Simple Function
Don't Use readline() in a for Loop
Review
Getting Rid of Newline Characters
- Lines in a text file have a newline, \n, at the end of each line
- You must remove this newline character when printing the line ...
- or there will be a blank line after each printed line
- As you will see in a few weeks, strings are objects
- Like most objects, they have methods
- strip is a string method that removes whitespace at both ends of a string
- You use it like this
line = line.strip()
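- Here is a small sketch of what strip does to a line read from a file
- The string below is made up for illustration; repr is used only to make the \n visible

```python
# A line as it would come from a text file, newline included
line = "Hello world!\n"

# strip returns a new string; the original string is not changed
stripped = line.strip()

print(repr(line))      # the \n at the end is visible
print(repr(stripped))  # the \n is gone
```

- Note that strip also removes spaces and tabs at the start of the line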
Writing and Reading Numbers
- In this course we will only be dealing with text files
- Any numbers you read from a file will be strings
- You can't perform arithmetic on strings
- You need to convert them into numbers using either int or float
- When writing a number to a file, you must convert it into a string
- You can do this with the str conversion function
- If you try to write a number to a file you have opened with "w" ...
- you will get an error
>>> file = open("numbers.txt","w")
>>> file.write(5)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: write() argument must be str, not int
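- A minimal sketch of writing numbers with str and reading them back with int and float
- The file name and the temporary directory are assumptions, chosen so no real file is overwritten

```python
import os
import tempfile

# Hypothetical file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "numbers.txt")

file = open(path, "w")
file.write(str(5) + "\n")      # convert the number to a string before writing
file.write(str(2.5) + "\n")
file.close()

file = open(path, "r")
first = int(file.readline())     # int ignores the newline at the end
second = float(file.readline())  # so does float
file.close()

print(first + second)
```

- Without the calls to str, the write statements would raise the TypeError shown above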
Appending Data to an Existing File
- If you write something to a file that already exists ...
- it will erase what was in the file
- To avoid this, use the append access mode, "a"
- Writing to a file opened in this mode adds text to the bottom of the file
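- The difference between the two access modes can be sketched like this
- The file name log.txt is hypothetical and lives in a temporary directory

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log.txt")  # hypothetical file

file = open(path, "w")   # "w" creates the file, or erases it if it exists
file.write("first line\n")
file.close()

file = open(path, "a")   # "a" keeps what is there and adds to the end
file.write("second line\n")
file.close()

file = open(path, "r")
contents = file.read()
file.close()
print(contents)
```

- If the second open had used "w", only the second line would be left in the file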
Reading Files with a for Loop
Attendance
New Material
Running a Script Using python3
- To write a program in the computer language C you create a text file ...
- that contains statements in the C language
- This file is called the source code
- You cannot run this file
- You have to run the source code through a program called a compiler
- The compiler creates a file of binary instructions ...
- that the CPU understands
- These binary instructions are called machine language
- Our scripts are text files containing Python statements
- The computer does not understand Python text
- Only the Python interpreter understands Python statements
- That means to run a Python script we need two things in RAM
- The text of the script
- The binary code for the Python interpreter
- The picture in memory looks like this
- The Python interpreter was written in a language like C ...
- and compiled into an executable file
- When we type python3 at the command line ...
- we are running Python in interactive mode
$ python3
Python 3.5.2 (default, Apr 16 2020, 17:47:17)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
- To run a script we can type python3 followed by the name of the script
- The interpreter then reads each Python statement ...
- translates it into machine language ...
- and runs the translated statements
- If I have the Python script hello_1.py
# prints a friendly message
print("Hello world!")
- I can run it on the Unix command line like this
$ python3 hello_1.py
Hello world!
Making a Script Executable
- Unix gives us another way to run a Python script
- Look at the Bash script hello_1.sh
# prints a friendly message
echo Hello world!
- I can run this script using the bash command
$ bash hello_1.sh
Hello world!
- The program bash can run a Bash script
- Just like the program python3 runs a Python script
- But if I change the access permission on this Bash script ...
- using the Unix command chmod ...
- and give everyone read and execute permission
$ chmod 755 hello_1.sh
- I can now run the script directly without using bash
$ ./hello_1.sh
Hello world!
- In order to run a script in my current directory ...
- I have to put ./ before the script name
- The reason for this has to do with Unix security
- If I give the same permissions to a Python script hello_1.py
$ chmod 755 hello_1.py
- And try to run it, I will get an error
$ ./hello_1.py
./hello_1.py: line 3: syntax error near unexpected token `"Hello world!"
./hello_1.py: line 3: `print("Hello world!")'
- You get this error because Unix knows the script contains text ...
- not machine language instructions
- Whenever you ask Unix to run a text file ...
- it assumes that the file contains Unix commands
- So the binary code it loads into the process RAM ...
- is the binary code for Bash
- But a Python script does not contain Unix commands ...
- so we get an error
- Unix has a special feature which we can use here
- This feature allows us to tell Unix to run python3 on the file ...
- instead of running bash
- To do this we must add a line with a special format ...
- as the first line of the script
- The first two characters must be #!
- Followed by the absolute pathname of the program to run the script ...
- instead of bash
- An absolute pathname holds the name and location of a file ...
- and can be used inside any directory
- On our Unix systems we use /usr/bin/python3 to run Python scripts
- To make Python scripts executable on Linux ...
- we first give the file read and execute permission
- On the Unix command line this is done using chmod
chmod 755 FILENAME
- It can also be done in FileZilla
- But then we must add the special line I mentioned above
- Here is the line for our Unix system
#! /usr/bin/python3
- The script now looks like this
#! /usr/bin/python3
# prints a friendly message
print("Hello world!")
- I can now run the script without running python3
$ ./hello_2.py
Hello world!
- This special line is called the hashbang line
- Since ! on Unix is often called "bang"
- This is often referred to as shebang
- For some reason, this word makes my skin crawl
- So I use hashbang
- This line must be the first line in the script
- And the first two characters must be #!
- If the script were written in Perl we would need a different pathname
- We would have to use the absolute pathname of the Perl interpreter
- We would have to write
#! /usr/bin/perl
- From now on all your scripts must be executable
- Or you will lose points
- I have changed the Homework Script Rules to add this requirement
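- To make the format of the hashbang line concrete, here is a small sketch
- The function hashbang_interpreter is hypothetical, written only for this illustration
- It returns whatever text follows #! on the first line of a script, or None if the line is not a hashbang line

```python
def hashbang_interpreter(first_line):
    """Return the text after #! on a hashbang line, or None if there is none."""
    # The first two characters must be #!
    if not first_line.startswith("#!"):
        return None
    # Drop the two special characters and any whitespace around the pathname
    return first_line[2:].strip()


print(hashbang_interpreter("#! /usr/bin/python3\n"))
print(hashbang_interpreter("#! /usr/bin/perl\n"))
print(hashbang_interpreter("# prints a friendly message\n"))
```

- The last call prints None because an ordinary comment starts with # but not with #!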
Why Make a Script Executable?
Making Scripts Executable with FileZilla
- All work you create for this course must be copied to pe15.cs.umb.edu
- Which you can do using FileZilla
- But now you need to make your Python scripts executable
- You can also do this with FileZilla
- Right-click on the file and select "Permissions" from the menu
- Enter 755 in the box provided
- This will make the script executable
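- If you are curious what the number 755 stands for, Python's stat module can spell it out
- This is only a sketch for illustration; the variable name mode is an assumption and no file is changed

```python
import stat

# 0o755 is the octal number 755; S_IFREG marks it as a regular file
mode = stat.S_IFREG | 0o755

# Show the permissions the way ls -l would
print(stat.filemode(mode))   # -rwxr-xr-x
```

- The owner gets read, write and execute permission (rwx)
- The group and everyone else get read and execute permission (r-x)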
Averaging the Numbers in a File
- Let's use a for loop to average the numbers in a file
- Say we have a text file containing integers, one to a line
$ cat numbs.txt
1
2
3
4
5
6
7
8
9
10
- We need to create a file object, so let's ask the user for the filename
- We can use this to create a file object for reading
filename = input("Filename: ")
file = open(filename, "r")
- We will need to total the numbers
- But we also need to count them
- So we must initialize two accumulators to 0
total = 0
count = 0
- Now we need a for loop that will read each line
- In the body of the loop we do two things
- First we increment the count
count += 1
- Then we convert the line to an integer ...
- and add it to the total
total += int(line)
- Finally we calculate the average and print it
average = round(total/count, 2)
print("Average:", average)
- This gives us the following script
#! /usr/bin/python3
# reads numbers from a file and averages them
filename = input("Filename: ")
file = open(filename, "r")
total = 0
count = 0
for line in file:
    count += 1
    total += int(line)
average = round(total/count, 2)
print("Average:", average)
- The script has a hashbang line and I have made it executable
- So I can run the script like this
$ ./for_average.py
Filename: numbs.txt
Average: 5.5
- Here numbs.txt is the text typed by the user
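- The script above assumes every line holds a number and that the file is not empty
- Here is a hedged variant that skips blank lines and avoids dividing by zero
- The function name average_of_lines and the sample data are made up for this sketch

```python
def average_of_lines(lines):
    """Average the integers in a sequence of text lines, skipping blank lines."""
    total = 0
    count = 0
    for line in lines:
        line = line.strip()
        if line != "":
            count += 1
            total += int(line)
    # An empty file has no average
    if count == 0:
        return None
    return round(total / count, 2)


# A list of strings stands in for the lines of a file
sample = ["1\n", "2\n", "\n", "3\n", "4\n"]
print("Average:", average_of_lines(sample))
```

- A file object can be passed to the function in place of the list, since both give one line at a time in a for loop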
Records
Reading Records from a File
- How can we capture two pieces of information from a single line?
- To do this we need to know where one field ends ...
- and the next begins
- The characters used to separate one field value from another ...
- are called delimiters
- In temps.txt the delimiter is a space
- When you export data from Excel you can choose many formats
- Very often people choose CSV
- Which stands for Comma Separated Values
- In these files a comma separates the field values
- We know how to read in a text file line by line
- But how do we get each field?
- We can use a string method called split
- split breaks up the line into individual fields
>>> line = '2017-06-01 67'
>>> line.split()
['2017-06-01', '67']
- Now we can use multiple assignment ...
- to give the value of each field to a variable
>>> date, temp = line.split()
- With this approach we can read each record in temps.txt
$ cat temps_print.py
#! /usr/bin/python3
# prints each field in temps.txt
file = open('temps.txt', 'r')
for line in file:
    date, temp = line.split()
    print(date, temp)
$ ./temps_print.py
2017-06-01 67
2017-06-02 71
2017-06-03 69
2017-06-04 88
2017-06-05 74
...
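- The same approach works for a CSV record if we tell split which delimiter to use
- The record below is made up; it is a sketch, not a line from a real file

```python
line = "2017-06-01,67\n"   # made-up CSV record

# With an explicit delimiter, split does not remove the newline,
# so we strip the line first
date, temp = line.strip().split(",")
temp = int(temp)

print(date, temp)
```

- This simple split is only safe when no field contains a comma
- Python's csv module handles the harder cases, such as quoted fields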
Finding the Average From a File of Records
- We read a file of records to do some processing of the data
- Like calculating the average
- We can do this by making a copy of temps_print.py ...
- and modifying the code
- First we need to add two accumulators
count = 0
total = 0
- The for loop header stays the same
- But we need to add a statement to count lines
count += 1
- And we need to update total
- But before we do this we must turn temp into an integer
temp = int(temp)
total += temp
- Outside the loop we need to calculate and print the average
average = round(total/count, 2)
print('Average:', average)
- Putting this all together we get
$ cat temps_average.py
#! /usr/bin/python3
# calculates the average temperature in temps.txt
file = open('temps.txt', 'r')
count = 0
total = 0
for line in file:
    count += 1
    date, temp = line.split()
    temp = int(temp)
    total += temp
average = round(total/count, 2)
print('Average:', average)
- The added statements set up the accumulators, update them inside the loop, and compute the average after it
- Now we run it
$ ./temps_average.py
Average: 77.27
Additional Processing of Records
- In the code above we only calculated the average
- But we can do more than that while processing the file
- Why not compute the highest temperature and the lowest?
- To compute the highest temperature we use the following algorithm
set max to the lowest possible temperature
for each temperature in the file
    if the temperature is greater than max
        set the max to this temperature
- Why do we have to set the variable to a low value?
- To make sure it is replaced by a real value ...
- we encounter in the loop
- If we set the variable to 100
- It would never be replaced by any value in temps.txt
- We need to set this variable before entering the loop
max = -100
- Then we need to add an if statement to the loop
if temp > max:
    max = temp
- A similar algorithm will calculate the lowest temperature
set min to the highest possible temperature
for each temperature in the file
    if the temperature is less than min
        set min to this temperature
- Once again we need to set the variable before entering the loop
min = 200
- And update this value inside the loop
if temp < min:
    min = temp
- Here is the script with these additions
$ cat temps_max_min_average.py
#! /usr/bin/python3
# calculates the average, maximum and minimum in temps.txt
file = open('temps.txt', 'r')
count = 0
total = 0
max = -100
min = 200
for line in file:
    count += 1
    date, temp = line.split()
    temp = int(temp)
    total += temp
    if temp > max:
        max = temp
    if temp < min:
        min = temp
average = round(total/count, 2)
print('Average:', average)
print('Maximum:', max)
print('Minimum:', min)
- When we run this we get
$ ./temps_max_min_average.py
Average: 77.27
Maximum: 89
Minimum: 66
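- The starting values -100 and 200 work because we know roughly what temperatures to expect
- When we don't know the range, one alternative is to start with None and take the first value we see
- This is a sketch of that variant, not the script above; the list holds the first five temperatures printed earlier
- The names highest and lowest are used so that Python's built-in max and min functions stay available

```python
temps = [67, 71, 69, 88, 74]   # first five temperatures shown earlier

highest = None
lowest = None
for temp in temps:
    # The first temperature always replaces None
    if highest is None or temp > highest:
        highest = temp
    if lowest is None or temp < lowest:
        lowest = temp

print("Maximum:", highest)
print("Minimum:", lowest)

# The built-in functions give the same answers for a list
print(max(temps), min(temps))
```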
Closing a File Object
- All objects take up space in RAM
- For small programs like ours with only 1 or 2 objects ...
- this is not something we have to worry about
- But in big programs with lots of objects ...
- the space they take up can be significant
- This can make the script run more slowly ...
- or even cause the script to crash
- In big programs like this you would want to get rid of an object ...
- when you are done using it
- To get rid of a file object use the close method
- Like this
file.close()
- For the small programs we write we don't have to worry about this
- The interpreter will close the file object for you ...
- when the script ends
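- Python also has a with statement that closes the file object for you
- It is not used elsewhere in these notes, so treat this as a sketch of an alternative
- The file name and its two records are made up and written to a temporary directory

```python
import os
import tempfile

# Hypothetical file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "temps_sample.txt")

with open(path, "w") as file:
    file.write("2017-06-01 67\n")
    file.write("2017-06-02 71\n")

lines = []
with open(path, "r") as file:
    for line in file:
        lines.append(line.strip())

# Leaving the with block closed the file object
print(file.closed)
print(lines)
```

- The file is closed as soon as the indented block ends, even if an error occurs inside it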
Looping Through a File More Than Once
- How would we calculate the number of days the temperature was above average?
- We need to loop through the file more than once to do this
- We loop through it once to get the average
- Then we loop through it again ...
- to count the days with temperatures above average
- We read a text file using sequential access
- That means we can only read a file in one direction
- Once we reach the end of a file ...
- we cannot go back to the beginning
- Instead we have to create a new file object ...
- for a second run through the file
- The algorithm we need is similar to that for calculating a maximum
set days_above to 0
for each temperature in the file
    if the temperature is greater than the average
        add 1 to the days_above
- We make a copy of temps_average.py ...
- and add the following code
file = open('temps.txt', 'r')
days_above = 0
for line in file:
    date, temp = line.split()
    temp = int(temp)
    if temp > average:
        days_above += 1
print("Days above average:", days_above)
- When we run this we get
$ ./temps_days_above_average.py
Days above average: 13
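- Here are both passes through the file put together in one sketch
- The function name days_above_average and the five-record sample file are assumptions made for this example
- The sample is not the full temps.txt, so the count it prints differs from the run above

```python
import os
import tempfile


def days_above_average(filename):
    """Count the records whose temperature is above the file's average."""
    # First pass: compute the average
    file = open(filename, "r")
    count = 0
    total = 0
    for line in file:
        date, temp = line.split()
        count += 1
        total += int(temp)
    file.close()
    average = total / count

    # Second pass: a new file object starts again at the first line
    file = open(filename, "r")
    days_above = 0
    for line in file:
        date, temp = line.split()
        if int(temp) > average:
            days_above += 1
    file.close()
    return days_above


# Made-up sample file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "sample_temps.txt")
file = open(path, "w")
file.write("2017-06-01 67\n2017-06-02 71\n2017-06-03 69\n")
file.write("2017-06-04 88\n2017-06-05 74\n")
file.close()

print("Days above average:", days_above_average(path))
```

- The average of the five sample temperatures is 73.8, so only 88 and 74 are counted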
The ACM Code of Ethics and Professional Conduct
Class Exercise
Class Quiz