IT 116: Introduction to Scripting

Class 11

Today's Topics

Review

for Loops
The range Function
Setting the First and Last Values with range
Setting the Increment with range
Reversing the range Values
Why Does Python Have the range Function?
Nested Loops

New Material

Calculating a Running Total
Augmented Assignment Operators
Calculating Averages
Sentinels
Functions
Using Functions in Writing Programs
Two Types of Functions
Function Names
Defining a Function
Calling a Function
Layout Rules for Functions
Functions in Scripts
Formatting Functions for This Course

Homework 6

I have posted homework 6 here.

It is due this coming Sunday at 11:59 PM.

Mid-term

The mid-term exam will be given on Tuesday, March 23rd.

It will consist of questions like those on the quizzes along with questions asking you to write short segments of Python code.

60% of the points on this exam will consist of questions from the Ungraded Class Quizzes.

The other 40% will come from four questions that ask you to write a short segment of code.

The last class before the exam, Thursday, March 18th, will be a review session.

You will only be responsible for the material in the Class Notes for that class on the exam.

The Mid-term is a closed book exam.

To prevent cheating, certain rules will be enforced during the exam.

Review

`for` Loops

The for loop in Python has the following format

for VARIABLE_NAME in LIST_OF_VALUES:
    STATEMENT
    STATEMENT
    ...

Python's for loop works differently from for loops in other computer languages
The for keyword is followed by a variable
Next comes the in keyword followed by a list of values
The first value in the list is assigned to the variable
It keeps that value while passing through the code block
When execution of the code block ends, the next value in the list is assigned to the variable
This continues until all values are used at which point the loop ends

Here is a for loop that prints the powers of 2

$ cat powers_of_2.py 
# this program uses a for loop to print the powers of 2
# from 1 to 10

for number in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]:
    value = 2 ** number
    print("2 to the power", number, "is", value)

$ python3 powers_of_2.py 
2 to the power 1 is 2
2 to the power 2 is 4
2 to the power 3 is 8
2 to the power 4 is 16
2 to the power 5 is 32
2 to the power 6 is 64
2 to the power 7 is 128
2 to the power 8 is 256
2 to the power 9 is 512
2 to the power 10 is 1024

You can do everything with a Python for loop that you can do with this loop in other languages
But the Python for loop makes some things easier
In many other languages the loop variable is usually an integer counter
But not in Python
In Python the loop variable can be of any data type
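
Here is a minimal sketch of loop variables that are not integers
The course names and prices are made up for illustration

```python
# loop over strings (hypothetical course names)
character_count = 0
for course in ["IT 116", "IT 117", "CS 110"]:
    print(course, "has", len(course), "characters")
    character_count = character_count + len(course)

# loop over floats (hypothetical prices)
price_total = 0
for price in [1.5, 2.25, 10.0]:
    print("price:", price)
    price_total = price_total + price
```

In the first loop the variable course holds a string on each pass, in the second the variable price holds a float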

The `range` Function

Python provides the range function to make it easy to create a list of integers
The range function creates a special kind of list that consists of a sequence of integers
```
>>> for number in range(10):
...   print(number)
... 
0
1
2
3
4
5
6
7
8
9
```
Notice that the numbers started with 0 and ended with one less than the value of the argument
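
One way to look at the values range produces without writing a loop is to pass the result to the built-in list function
This is only a small sketch, and the variable names are my own

```python
# convert the values from range into an ordinary list
values = list(range(5))
print(values)

# the argument is also the number of values created
count = len(values)
print(count)
```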

Setting the First and Last Values with `range`

The range function can be run with two arguments
The first argument is the first value in the list
The second argument is again one more than the last number in the list

So if I wanted to print the numbers from 1 to 10 I would write

>>> for number in range(1, 11):
...     print(number)
... 
1
2
3
4
5
6
7
8
9
10

Setting the Increment with `range`

If you call the range function with one or two arguments, each value it creates is 1 more than the previous value
You can change this by giving range a third argument
The third argument is called the step value
It tells range how much to add to each value to get the next number in the sequence

We can use this feature to print the even numbers up to 10

>>> for number in range(2, 11, 2):
...  print(number)
... 
2
4
6
8
10

Reversing the `range` Values

What if we wanted the loop values to decrease with each pass through the loop?
We can do this by using -1 as the 3rd argument
But when we do we have to be careful about the 2nd argument
To see what I mean let's use range to print the numbers from 10 down to 1
The 1st argument is 10 and the 3rd is -1

We want the last number to be 1 so normally the 2nd argument would be 2

>>> for number in range(10, 2, -1):
...     print(number)
... 
10
9
8
7
6
5
4
3

Whoops
In truth, the value of the second argument is more complicated
The second argument should be what you get if you added the value of the step to the last number you wanted
Normally the step value is 1, but here it is -1
So the 2nd argument should be the last number we want, 1, plus the step value, -1, which gives 0
```
>>> for number in range(10, 0, -1):
...   print(number)
... 
10
9
8
7
6
5
4
3
2
1
```
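
The rule for the 2nd argument can be sketched in a few lines
The variable names first, last, step and stop are my own, and the sketch assumes the last value can be reached from the first in whole steps

```python
# counting down from 10 to 1
first = 10
last = 1
step = -1
stop = last + step          # the 2nd argument: last value plus the step
countdown = list(range(first, stop, step))
print(countdown)

# the same rule works when counting up by 2
first = 2
last = 10
step = 2
stop = last + step
evens = list(range(first, stop, step))
print(evens)
```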

Why Does Python Have the `range` Function?

The for loop in Python is more powerful than the for loop in other languages
That's because it can loop over any kind of value
But it will not automatically create a list of integers
Since we often want the for loop to work on such a list we need a separate function to create it
That function is range

Nested Loops

Just as you can have an if statement inside another if statement you can have a loop inside another loop
When you have one loop inside another it is called a nested loop
When you nest for loops you must be sure that you use different names for each loop variable
If you don't you will get very strange results

Let's write a script to create a times table

$ cat times_table.py 
# This script creates a times table using nested for loops

for row in range(1, 11):
    for column in range(1, 11):
        entry = row * column
        print(entry, end="\t")
    print()

$ python3 times_table.py 
1   2   3   4   5   6   7   8   9   10  
2   4   6   8   10  12  14  16  18  20  
3   6   9   12  15  18  21  24  27  30  
4   8   12  16  20  24  28  32  36  40  
5   10  15  20  25  30  35  40  45  50  
6   12  18  24  30  36  42  48  54  60  
7   14  21  28  35  42  49  56  63  70  
8   16  24  32  40  48  56  64  72  80  
9   18  27  36  45  54  63  72  81  90  
10  20  30  40  50  60  70  80  90  100

With each pass through the outer loop the loop variable row gets a new value
It keeps that value while the inner loop variable column changes from 1 to 10
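
To see how often the inner code block runs, here is a smaller sketch of the same idea
The size of 3 and the counter variable passes are assumptions chosen to keep the output short

```python
size = 3        # assumed table size, smaller than the script above
passes = 0      # counts how many times the inner code block runs
for row in range(1, size + 1):
    for column in range(1, size + 1):
        print(row * column, end="\t")
        passes = passes + 1
    print()
print("The inner block ran", passes, "times")
```

The inner block runs once for every combination of row and column, so a 3 by 3 table needs 9 passes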

New Material

Calculating a Running Total

One of the most basic things we can ask a computer to do is to add a group of numbers

To add a series of numbers we would use the following algorithm

set a variable to 0
for number in the range start value to finish value
    add the number to the variable
print the result

Here is code to add the numbers from 1 to 10

>>> total = 0
>>> for number in range(1, 11):
...     total = total + number
... 
>>> print(total)
55

Each time we go through the loop a new value is added to total
A total that is updated as new numbers are encountered is called a running total
A good example of a running total in everyday life is a bank balance
Your bank balance is the running total of all deposits and withdrawals from your account
In programming, running totals are calculated using two features
- A variable that holds the running total
- A loop adding each new number
The variable that holds the running total is called an accumulator
Before you enter the loop, the accumulator must be set to 0
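
The bank balance example can be sketched like this
The transaction amounts are made up, with deposits as positive numbers and withdrawals as negative numbers

```python
# hypothetical deposits (positive) and withdrawals (negative)
transactions = [100, -20, 50, -30]
balance = 0                     # the accumulator starts at 0
for amount in transactions:
    balance = balance + amount
    print("Balance is now", balance)
```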
If the numbers are entered by a user we have to know when the user is finished
There are many ways to get this information

One way is to ask the user before entering numbers how many numbers are to be added

$ cat add_many_numbers.py 
# adds a series of numbers entered by the user
# uses a for loop to do this after asking the user
# for the number of entries to be added

entries = int(input("How many entries? "))
total = 0
for entry_no in range(entries):
    number = int(input("number: "))
    total = total + number
print("Total", total)

$ python3 add_many_numbers.py 
How many entries? 5
number: 34
number: 54
number: 123
number: 345
number: 55
Total 611

Notice that I did not use the value of the loop variable entry_no
I used range to give me a list of values
But what I really needed was a list of the right size
So the loop ran as many times as was needed

Augmented Assignment Operators

The program above contains the line
```
total = total + number
```
This statement does the following
- Gets the current value of the variable total
- Adds the value of number to it
- Assigns this new value to total
This sort of thing happens so often that most languages have a special operator +=
It is used like this
```
total += number
```
This operator does three things
- Gets the current value of the variable on the left
- Adds the value of the expression on the right to it
- Assigns this new value to the variable on the left

Using this new operator we can rewrite add_many_numbers.py like this

$ cat add_many_numbers_2.py 
# adds a series of numbers entered by the user
# uses a for loop to do this after asking the user
# for the number of entries to be added

entries = int(input("How many entries? "))
total = 0
for entry_no in range(entries):
    number = int(input("number: "))
    total += number
print("Total", total)

$ python3 add_many_numbers_2.py 
How many entries? 5
number: 48
number: 243
number: 53
number: 175
number: 65
Total 584

The += operator makes the code shorter which reduces the risk of errors
An operator like this is called an augmented assignment operator
Because it does more than assign a value to a variable

Python has an augmented assignment operator for every arithmetic operator

Operator	Example	Equivalent To
+=	num += 5	num = num + 5
-=	num -= 5	num = num - 5
*=	num *= 5	num = num * 5
/=	num /= 5	num = num / 5
//=	num //= 5	num = num // 5
%=	num %= 5	num = num % 5
**=	num **= 5	num = num ** 5

Here are some examples

>>> number = 5
>>> number += 1
>>> number
6
>>> number -= 1
>>> number
5
>>> number *= 3
>>> number 
15
>>> number = 7
>>> number /= 2
>>> number
3.5
>>> number = 7
>>> number //= 2
>>> number
3
>>> number = 7
>>> number %= 2
>>> number
1
>>> number = 2
>>> number **= 3
>>> number
8

Calculating Averages

We use a running total when calculating averages
The average is the total divided by the number of values

Here is an example

$ cat average.py 
# this program asks the user how many numbers
# they have to enter, then performs a running
# total and computes the average

entries = int(input("How many entries? "))
total = 0
for entry_no in range(entries):
    number = int(input("number: "))
    total += number
average = total / entries
print("Average", average)

$ python3 average.py 
How many entries? 10
number: 10
number: 9
number: 8
number: 7
number: 6
number: 5
number: 4
number: 3
number: 2
number: 1
Average 5.5
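
The script above stops with a ZeroDivisionError if the user answers 0 to the first question
Here is a sketch of one way to guard against that
To keep it short it assumes the numbers are already in a list instead of being read with input

```python
# same values as the run above, stored in a list for this sketch
numbers = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
total = 0
for number in numbers:
    total += number
entries = len(numbers)
# only divide when there is at least one entry
if entries > 0:
    average = total / entries
    print("Average", average)
else:
    print("No numbers to average")
```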

Sentinels

In the script above we had to ask the user for the number of values
This forces the user to count the values before entering them
In other words, we are asking the user to do extra work
But computers are supposed to make our work easier
A better way to know when to stop is to use a special value
The program keeps adding entries to the total until the user enters this special value
This special value is called a sentinel

Here is a program to calculate the average using 0 as a sentinel

$ cat average_3.py 
# this program averages a series of numbers entered
# by the user using a sentinel to indicate the
# end of input

sentinel = 0
total    = 0
entries  = 0
print("Enter numbers when prompted")
print("When you are done, enter 0")
number = int(input("number: "))
while number != sentinel:
    total   += number
    entries += 1
    number   = int(input("number: "))
average = total / entries
print("Average", average)

$ python3 average_3.py
Enter numbers when prompted
When you are done, enter 0
number: 5
number: 4
number: 3
number: 2
number: 1
number: 0
Average 3.0

We had to make a number of modifications to make this change
1. Tell the user the value to be used as a sentinel
2. Change the for loop to a while loop
3. Create the variable entries initialized to 0
4. Increment entries each time through the loop
5. Ask for number before entering the loop
To initialize a variable is to give it its first value
But we have to be careful when choosing a sentinel value
The value cannot be one of the numbers we are averaging
This could happen if we are calculating average temperatures
Because 0 is a perfectly good value for temperature
We need a number that would never occur in our list of values
We can do this with temperatures
No Celsius temperature can be below -273.15 degrees, which is absolute zero, so we could use -500 as a sentinel value
Why use -500 when -274 would do?
Because it is easier to remember 500 than 274
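
Here is a sketch of the temperature case using -500 as the sentinel
The readings are made up and are stored in a list, as if the user had typed them, and the sketch assumes the list ends with the sentinel
It also checks that at least one temperature was entered before dividing

```python
sentinel = -500
# hypothetical Celsius readings, ending with the sentinel
readings = [12, 0, -3, 7, -500]
total = 0
entries = 0
position = 0                    # which reading comes next
number = readings[position]
while number != sentinel:
    total += number
    entries += 1
    position += 1
    number = readings[position]
# avoid dividing by 0 when the sentinel is the first value
if entries > 0:
    average = total / entries
    print("Average", average)
else:
    print("No temperatures entered")
```

Notice that the reading of 0 is counted as a temperature because it is not the sentinel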

Functions

A function is a group of statements that has a name and performs a specific task
The Python interpreter has a number of functions that are always available
You don't have to do something special to use them
They are called built-in functions because they are actually written inside the Python interpreter
But you can also define your own Python functions
Most Python scripts contain functions that break up the work
You can also create Python files that only contain functions and variables
These files are called modules
Modules are not scripts
But they contain things that other scripts can use
To use the functions of a module in your script you use an import statement
```
import MODULE_NAME
```
The module name is the file name without the .py extension
The import statement loads the module code into memory
Some modules are already on your machine
They were installed along with Python
The interpreter knows where to find these modules
If you create your own modules you have to let the interpreter know where to find them
On Mac and Unix you do this by setting the PYTHONPATH environment variable
This variable tells the interpreter where to look for the modules you create
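
As a small sketch, here is how a script might use math, one of the modules that is installed with Python
After the import, a function in the module is called by writing the module name, a dot, and the function name

```python
import math

# sqrt is a function defined inside the math module
root = math.sqrt(16)
print(root)

# modules can also contain variables, such as pi
print(math.pi)
```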

Using Functions in Writing Programs

What is the first thing you should do if you are given a big job?
You should break it down into smaller tasks and then do each one separately
The book gives an example of a program to print a check
This job breaks down into a number of smaller tasks
- Get the hours worked
- Get the employee's hourly rate
- Calculate the gross pay
- Calculate the overtime pay
- Calculate the withholdings
- Calculate the net pay
- Print the check
The idea of breaking a big job into small jobs is one of the most important ideas in engineering
The formal name for this process is modularization
But it is often referred to as "divide and conquer"
Without this we would never be able to do big jobs like build a bridge
When we break things up into smaller tasks we can assign the work to different groups
Each of which may have its own specialty
In programming, each specific task should have its own function
This approach has many advantages
Programs that are broken up into functions are easier to read
Since each function does a specific job it is easier to understand what it does
This is particularly true if the function has a well chosen name
Modularized programs are usually shorter because there is less repeated code
You can use the same function over and over again
You only have to write the function once
But you can call the function as many times as you need
Modularization makes programs easier to test
Because you can test each function individually
Modularization also makes it easier to work in teams
If each function has a simple job to do one team member can work on it alone
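
Here is a rough sketch of how part of the check printing job might be split into functions
It is not the book's program
The function names, the 40 hour limit, the 1.5 overtime multiplier and the flat 20 percent withholding are all assumptions made for illustration
It also uses parameters and return statements, which these notes have only mentioned so far

```python
# all rules and rates below are made up for illustration

# returns the pay for the first 40 hours worked
def calculate_regular_pay(hours, rate):
    return min(hours, 40) * rate

# returns the pay for hours over 40, at 1.5 times the rate
def calculate_overtime_pay(hours, rate):
    return max(hours - 40, 0) * rate * 1.5

# returns the amount withheld, assuming a flat 20 percent
def calculate_withholdings(gross_pay):
    return gross_pay * 0.20

# returns the pay left after withholdings
def calculate_net_pay(hours, rate):
    gross_pay = calculate_regular_pay(hours, rate) + calculate_overtime_pay(hours, rate)
    return gross_pay - calculate_withholdings(gross_pay)

net_pay = calculate_net_pay(45, 20)
print("Net pay", net_pay)
```

Each function does one small job, so each can be tested on its own before they are combined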

Two Types of Functions

There are two types of functions
- Functions that return a value
- Functions that do not return a value
The textbook calls the second type of function a void function

Conversion functions like str, int and float are examples of functions that return a value

>>> result = str(5)
>>> result
'5'
>>> result = int('5')
>>> result
5
>>> result = float('5')
>>> result
5.0

An example of a function that does not return a value is print
It causes something to be printed, but it doesn't return a value
```
>>> result = print("Hello world!")
Hello world!
>>> result
>>>
```
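
The same difference shows up in functions you write yourself
In this sketch the names square and print_square are made up
A function with no return statement gives back the special value None, which is why the interpreter showed nothing for result above

```python
# returns the square of a number
def square(number):
    return number * number

# prints the square of a number but returns no value
def print_square(number):
    print(number * number)

returned = square(4)
nothing = print_square(4)
print(returned)
print(nothing is None)
```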

Function Names

Function names must abide by the same rules as variable names
- You cannot use keywords as variable names
- Variable names cannot have spaces
- The first character of a variable name must be a letter, a-z and A-Z, or an underscore, _
- After the first character you can use letters, digits or the underscore
- Uppercase and lowercase letters are distinct
You can use several words in a function name
But you have to separate each word with an underscore, _
A function name should describe what the function does
Since functions do some work, most function names use verbs

calculate_gross_pay
get_hourly_rate
calculate_overtime

Defining a Function

A function definition has the following format

def FUNCTION_NAME([PARAMETER][...]):
    STATEMENT
    STATEMENT
    ...

A function definition is a statement that contains other statements
So a function definition is a compound statement
The first line of a function definition is called the function header
There are four parts to the function header
- The keyword def
- The function name
- The parameter list enclosed in parentheses
- A colon, :, at the end of the line
Parameters are variables defined and used inside a function
Even if the function has no parameters it must still have parentheses
Following the function header is a series of indented statements that form a code block

Here is a function that prints the UMB address

# prints the address of UMB
def print_umb_address():
    print("University of Massachusetts at Boston")
    print("100 Morrissey Boulevard")
    print("Boston, Massachusetts   02125-3393")

This function takes no arguments
It is also a void function since it returns no value
Notice that I put a comment before the function definition that describes what the function does
All the functions you write for this course must have such a comment
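
For contrast, here is a sketch of a function whose header has one parameter
The function name print_greeting and the parameter name are made up for illustration

```python
# prints a greeting for the name it is given
def print_greeting(name):
    print("Hello,", name)

print_greeting("Ada")
print_greeting("Grace")
```

Each call gives the parameter name a different value, so the same function prints a different greeting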

Calling a Function

To run a function you write a function call
A function call has two parts
- The function name
- A list of arguments enclosed in parentheses
You must follow the function name with parentheses even if the function takes no arguments
Here is how we would call the print_umb_address function
```
print_umb_address()
```

Here is a script that defines the function and calls it

$ cat -n umb.py 
     1  # this program contains a function that 
     2  # prints the address of our campus
     3
     4  # prints the address of UMB
     5  def print_umb_address():
     6      print("University of Massachusetts at Boston")
     7      print("100 Morrissey Boulevard")
     8      print("Boston, Massachusetts   02125-3393")
     9      
    10  print("I teach at UMass/Boston")
    11  print()
    12  print_umb_address()
    13  print()
    14  print("I am the IT Program Director in the Computer Science Department")

$ python3 umb.py 
I teach at UMass/Boston

University of Massachusetts at Boston
100 Morrissey Boulevard
Boston, Massachusetts   02125-3393

I am the IT Program Director in the Computer Science Department

Let's look at what the Python interpreter does as it runs the script
The first two lines begin with # which means they are comments
The interpreter ignores comments
It skips the blank line on line 3 and goes to line 4
Line 4 is also a comment and is ignored
Line 5 is the function header which names the function and lists the parameters it needs
The interpreter has to do something special when it comes across a function
The interpreter processes a script statement by statement
Once it has processed a statement it forgets all about it
Functions can be called many times as the script runs
So when the interpreter sees a function definition it has to store it
It is stored inside a special place in the memory for the script
Line 9 is another blank line and is skipped
When the interpreter gets to line 10 it notices that the statement is not indented
This means that the function definition has ended
The interpreter can now execute each statement it reads and forget about it after it is done
The interpreter executes the print statements on lines 10 and 11
Line 12 is a function call
The interpreter jumps to the function it previously stored in memory
It runs each statement in the function body until it gets to the end
Now it jumps back to the statement after the function call
Lines 13 and 14 are print statements
print is built into the interpreter
So the interpreter doesn't have to jump to another place in the script to run it
Then the interpreter quits
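
The jump into the function and back can be seen by recording each step in a list
This is only a sketch, and the names steps and record_body are my own

```python
steps = []      # holds a note for each step, in the order it happens

# records that the function body ran
def record_body():
    steps.append("function body")

steps.append("before the call")
record_body()
steps.append("after the call")
print(steps)
```

The note from the function body lands between the other two, even though the definition appears first in the file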

Layout Rules for Functions

A layout is the arrangement of information in a document
When writing a script that has functions inside it Python has only two layout rules
- The function definition must come before the code that calls it
- The body of the function must be indented
These are not arbitrary
They are the way code must be laid out for the interpreter to do its job
Function definitions must come before the code that calls them so the interpreter can store them in memory
The interpreter processes the script statement by statement
Once the interpreter has processed a statement it forgets about it
Since a function can be called many times, it must be stored in memory
The body of a function is the code block that comes after the header
The indentation tells the interpreter which statements belong to the function
And which are part of the rest of the script
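
Here is a sketch of what happens when the first rule is broken
It deliberately puts a call before the definition, which you should not do in your own scripts
It uses try and except, which these notes have not covered, only so the sketch can keep running after the error
The function name say_goodbye is made up

```python
# calling a function before its definition has been processed fails
try:
    say_goodbye()
    outcome = "the call worked"
except NameError:
    outcome = "NameError: say_goodbye is not defined yet"
print(outcome)

# prints a farewell
def say_goodbye():
    print("Goodbye")

# the call works now that the definition has been stored
say_goodbye()
```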

Functions in Scripts

Most scripts are composed of two sections
A section at the top consisting of function definitions
Followed by code that makes the script do its work and calls the functions

In the script below, this second part is the five unindented statements at the end, starting with print("Some information about me")

# this program prints some information about me
# and the address of our campus

# prints the address of UMB
def print_umb_address():
    print("University of Massachusetts at Boston")
    print("100 Morrissey Boulevard")
    print("Boston, Massachusetts   02125-3393")

# prints some information about me
def print_personal_info():
    print('Glenn Hoffman')
    print('Information Technology Program Director')
    print('Computer Science Department')
    print('University of Massachusetts at Boston')
    print('Glenn.Hoffman@umb.edu')
    print('McCormack 3-0201-22')

print("Some information about me")
print()
print_personal_info()
print()
print_umb_address()

This second part is sometimes called the driver or the main body of the code
Some programmers like to put this code in a special function called main
In the code below, main is the last function defined

The call to main is the single statement at the end of the file

# this program prints some information about me
# and the address of our campus

# prints the address of UMB
def print_umb_address():
    print("University of Massachusetts at Boston")
    print("100 Morrissey Boulevard")
    print("Boston, Massachusetts   02125-3393")

# prints some information about me
def print_personal_info():
    print('Glenn Hoffman')
    print('Information Technology Program Director')
    print('Computer Science Department')
    print('University of Massachusetts at Boston')
    print('Glenn.Hoffman@umb.edu')
    print('McCormack 3-0201-22')

# prints all the information by calling the other functions
def main():
    print("Some information about me")
    print()
    print_personal_info()
    print()
    print_umb_address()

main()

There is nothing wrong with this approach, though I do not use it
Notice that this shows that one function can call another

Formatting Functions for This Course

The way information is presented affects how easily it can be read
Here is another version of the script above

It contains the same statements as the original, but looks very different

def print_umb_address():
    print("University of Massachusetts at Boston")
    print("100 Morrissey Boulevard")
    print("Boston, Massachusetts   02125-3393")
def print_personal_info():
    print('Glenn Hoffman')
    print('Information Technology Program Director')
    print('Computer Science Department')
    print('University of Massachusetts at Boston')
    print('Glenn.Hoffman@umb.edu')
    print('McCormack 3-0201-22')
print("Some information about me")
print()
print_personal_info()
print()
print_umb_address()

Although this code works, it is hard to read
Writing code that works is not enough
You must also write code that can be understood
In most large programming companies there are coding standards
The standards specify the format for code
These standards are meant to improve consistency and readability
I want you to learn good habits when writing code
So all homework scripts written for this course must follow these rules
If you do not, I will deduct points
All scripts must have a comment at the top of the script
This comment must describe what the script does
It does not have to be very long
All functions in homework scripts must appear at the top of the file before the main body of the code
Python only requires that the function definition comes before the first call to the function

This means that the following code will run properly even though it is very hard to read

print("Some information about me")
print()
def print_umb_address():
    print("University of Massachusetts at Boston")
    print("100 Morrissey Boulevard")
    print("Boston, Massachusetts   02125-3393")
print_umb_address()
print()
def print_personal_info():
    print('Glenn Hoffman')
    print('Information Technology Program Director')
    print('Computer Science Department')
    print('University of Massachusetts at Boston')
    print('Glenn.Hoffman@umb.edu')
    print('McCormack 3-0201-22')
print_personal_info()

This is not what I want
Each function must be preceded by a comment that tells what the function does
These comments must be written using #s
Do not use triple quotes
There should be no blank line after the comment

Function comments should look like this

# prints the address of UMB
def print_umb_address():

not like this

# prints the address of UMB

def print_umb_address():

There should be no blank lines inside the function body

Don't do this

def print_umb_address():
    print("University of Massachusetts at Boston")

    print("100 Morrissey Boulevard")

    print("Boston, Massachusetts   02125-3393")

Do this instead

def print_umb_address():
    print("University of Massachusetts at Boston")
    print("100 Morrissey Boulevard")
    print("Boston, Massachusetts   02125-3393")

There must be a blank line between the end of one function and the start of the next

In other words I should see this

# prints the address of UMB
def print_umb_address():
    print("University of Massachusetts at Boston")
    print("100 Morrissey Boulevard")
    print("Boston, Massachusetts   02125-3393")

# prints some information about me
def print_personal_info():
    print('Glenn Hoffman')
    print('Information Technology Program Director')
    print('Computer Science Department')
    print('University of Massachusetts at Boston')
    print('Glenn.Hoffman@umb.edu')
    print('McCormack 3-0201-22')

Not this

# prints the address of UMB
def print_umb_address():
    print("University of Massachusetts at Boston")
    print("100 Morrissey Boulevard")
    print("Boston, Massachusetts   02125-3393")
# prints some information about me
def print_personal_info():
    print('Glenn Hoffman')
    print('Information Technology Program Director')
    print('Computer Science Department')
    print('University of Massachusetts at Boston')
    print('Glenn.Hoffman@umb.edu')
    print('McCormack 3-0201-22')

There must also be a blank line between the last function definition and the rest of the script