COS 109 Lab 5: Python!


Thu Oct 26 11:48:39 EDT 2023: Minor clarifications added to requirements for both GPA and majors computations

Due Sunday Oct 29 at midnight

This lab is meant to help you learn the rudiments of Python by writing some small Python programs. You will probably also begin to appreciate that programming requires orderly thinking and meticulous attention to detail. On the other hand, it's a great feeling when your program works. Enjoy.

Read these instructions through before beginning the lab.
Follow the instructions about program format, variable names, etc.
Pay attention to syntax and grammar of language constructs.
Read and understand the Python examples here.

One other bit of advice: it often takes longer than you expect to write a program, so don't leave this to the last minute.

Part 1: Introduction
Part 2: The Python Language
Part 3: What's My GPA?
Part 4: How Many Majors?
Part 5: Finishing Up

Part 1: Introduction

Programming languages provide a way to express computer algorithms in a form that is convenient for humans and can be translated into a computer's instruction repertoire. Programming languages are the way that we tell computers how to perform a task.

In class, we have studied the very low-level instructions that the computer itself understands (for example, the Toy), and talked about a variety of programming languages, easier for people to use, that are translated into native machine instructions by programs like compilers and assemblers. There are many such languages, each with its own good and bad points, and often with noisy adherents and detractors.

Python, the topic of this lab and the next, is one of the most widely used programming languages. It's easy to learn, easy to use, and it comes with an enormous library of useful code for almost any application area you might think of. We don't expect you to become a full-fledged Python programmer, but you will do enough in this lab and the next to get some understanding of what programs look like and what is involved in converting a process into a program.

There is an enormous amount of Python information on the Web, and thousands of books. You might start by searching for "python tutorial" or take a look at Codecademy, which has an interactive walk through Python basics. W3schools is good for syntax and small examples that you can experiment with. Python.org is the definitive source.

Python has three major components:

the Python language itself
building blocks that you can use to create your program
methods that let your Python program interact with its environment

In this lab, we're going to concentrate on programming, not on fancy input and output. This will mean using variables, arithmetic, loops and decisions, and functions, while doing only minimal interaction with the environment, just enough to get data from a user and to display results. The focus is on figuring out what the pieces are and how to combine them into logic that does a computational task.

Although a programming language provides a way for you to tell a computer what to do, you have to spell out the steps in precise detail. You have at your disposal statements for:

reading input data and displaying output
storing values in variables and doing arithmetic
testing conditions and deciding what to do next
looping and repeating groups of statements
calling existing functions to do part of the job
defining your own functions to organize your computation

Programming languages provide a number of piece-parts that you can use to help create a program quickly; in fact, you couldn't do much without some of these, and it would be very time-consuming to create others from scratch. In effect, these parts are like prefabricated components for houses -- windows, doors, kitchen cupboards -- that can be used even by beginners to create polished products.

Some of the components are basic computational units: functions for mathematical operations (computing a square root or generating a random number, for example), or processing strings of characters (like finding or replacing a word within a sentence, or converting between upper and lower case).
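As a small illustration of these computational building blocks, here is a sketch that uses the standard math and random modules and a few string methods. The variable names and the sample sentence are made up for the example.

```python
import math
import random

root = math.sqrt(16)             # square root of 16
roll = random.randint(1, 6)      # random integer from 1 to 6, inclusive

sentence = "Python is hard"
fixed = sentence.replace("hard", "fun")   # replace a word within a sentence
position = sentence.find("is")            # index where "is" first appears
shout = sentence.upper()                  # convert to upper case

print(root, roll)
print(fixed, position, shout)
```

The value of roll will differ from run to run; everything else is the same each time.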

Other building blocks let you interact with a user, with the environment on your computer, and with the Internet.

Colab

Normally you would run Python on your own computer, and if that's possible for you, it's good experience. But for this lab we will use Google's Colab, which provides an interactive web-based environment for running Python. When Python runs within the confines of a browser, there are some limitations on what you can do. For example, it may not be possible to read information directly from the file system on your computer or to store information there, but you can usually access anything on the Internet.

To get started, go to the Colab web site, select File, then New notebook, then type your program in the "+ Code" box; you can use a "+ Text" box to add explanatory comments, or use Python comments. Here's the first example:

Clicking the triangle icon will compile and run the program, producing this:

You can add as much code as you like; add more sections as you evolve a system, perhaps interspersed with Text sections to explain what you're doing.

As you type, or if you hover over something, Colab will show you how to use a statement or possible continuations of what you are typing, like this:

These popups provide an online equivalent of a manual, and you can always get more by searching, including the official Python documentation and general web searches for things like mysterious error messages. You can learn more by reading, but right now just start playing.

Advice: Do your programming in small steps, and make sure that each step works before moving on to the next. Pay attention to error messages from the Python compiler; they may be cryptic and unfamiliar, but they do point to something that's wrong.

Part 2: The Python Language

Here we will provide a brief overview of some of the commands and syntax available in Python. This is by no means a complete reference, however, so you will want to look at some of the web sites we suggested above, and review what was covered in class. We are using Python 3. Version 2 is still around (it seems to be the default on macOS) but it doesn't do what we want; in particular the input function that we use for fetching input from a user doesn't work right, and print now requires parentheses, so the code in the xkcd cartoon above doesn't run.

We have set up a Colab notebook with some of the material already in place:

https://colab.research.google.com/drive/1heF1dV8lxJLbhieBJ7nKF-mjeX2Nh3W7?usp=sharing

Start by making a copy of this notebook, and do your work in the copy. At the end of the lab, you will save the notebook in a file on your own computer, and upload it to Gradescope. To save typing and prevent errors, you can copy the text from the pink blocks directly into Colab.

Displaying text with `print`

Step 1 is to create and run a minimal program to be sure that you can create and run programs; this is traditionally called Hello World. Here's the simplest version, which calls the function print to display text:

print("Hello, world")

This is the basic framework that you will now flesh out -- the single print will gradually be replaced by more and more statements.

Input with the `input` function

The function input is a sibling to print; it displays a message and waits for you to type a value. Whatever you type is returned as a sequence of characters that can be used in the rest of the program.

reply = input("Type your name: ")
print("Hello", reply)

The first statement calls the function input and stores the resulting value (what the user typed) in the variable reply. The second statement calls print to display Hello and the reply text, separated by a space.

If you press Return or Enter without typing anything, input returns an empty (zero-length) sequence of characters, which is represented in Python as "". Otherwise, input returns the characters that the user typed. This sequence tests whether the reply was empty:

reply = input("Type your name: ")
if reply == "":
   print("Hello, you nameless person")
else:
   print("Hello", reply)

There's more info on if-else below.

Variables

A variable is a name that refers to a place in your program where information is stored; it ultimately corresponds to a location in memory. Variables can hold different types of information, but you don't generally have to worry much about that -- Python handles the details for you. The main types are integers (like 123), floating point numbers (with a decimal point and optional fractional part, like 3.14159), and strings (sequences of characters, like "hello, world" or "3.1415" or "").

Before you can use a variable in a Python program, you must assign a value to it. For example:

name = "Joe College"
year = 2024
GPA = 2.1

Python distinguishes between upper case and lower case, so gpa and GPA are two different and unrelated variables. Names must begin with a letter or underscore, followed by zero or more letters, digits and underscores.
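A short sketch of these rules, using made-up names and values: gpa and GPA hold different values at the same time, and type shows which kind of value each variable currently holds.

```python
gpa = 3.5            # lower-case name
GPA = 2.1            # a different, unrelated variable
course_count = 4     # letters and underscores are fine in names
first_name = "Joe"

print(gpa, GPA)
print(type(course_count), type(GPA), type(first_name))
```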

Quotes and quoting

If you need a literal string of characters in your program, enclose it in either single ' quotes or double " quotes. If the string contains one kind of quote, you can quote it with the other, like "Don't do that!". If you inadvertently omit a quote or don't balance quotes properly, the Python compiler will complain.

If you are creating your Python with a text editor instead of Colab, make sure that the editor (e.g., Notepad, TextEdit) preserves the ASCII quote characters (the regular single- and double-quote characters on the keyboard); word processing programs think they are doing you a favor by using so-called "smart" quotes. The single quote in Don't above is the proper character.
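Here is a brief sketch of the quoting rules. The third line uses a backslash to escape a quote inside a string quoted with the same character; that technique is not described above, but it is standard Python.

```python
a = "Don't do that!"         # single quote inside double quotes
b = 'She said "hello"'       # double quotes inside single quotes
c = 'Don\'t do that!'        # backslash lets ' appear inside '...'

print(a)
print(b)
print(a == c)                # the two spellings produce the same string
```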

Arithmetic Operators, Expressions and Statements

Operators are special symbols like the + and - of ordinary arithmetic. An operator acts on values and variables to compute a new value. An expression is a legal combination of values, operators and parentheses, like (my_height+your_height)/2 or pi * r * r. These are the most important Python operators:

Operator	Description
`=`	Assigns the value on right to the variable on left (e.g., `next_year = year + 1`)
`+ - * / % **`	Arithmetic: add, subtract, multiply, divide, remainder, exponentiation
	(`//` is integer division: no remainder)
`+`	Strings: make a new string by joining the two operands
`< <= > >=`	Comparisons: less than, less than or equals, greater than, greater than or equals
`== !=`	More comparisons: equals, doesn't equal
`and`	Conditional AND (true if left and right expressions are both true)
`or`	Conditional OR (true if either left or right expression is true)
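To see the operators in action, here is a small sketch with made-up values. Note the difference between / (which always produces a floating point result) and // (which discards the remainder).

```python
year = 2023
next_year = year + 1          # assignment: next_year is now 2024

quotient = 7 / 2              # ordinary division: 3.5
whole = 7 // 2                # integer division: 3
remainder = 7 % 2             # remainder: 1
power = 2 ** 10               # exponentiation: 1024

full_name = "Joe" + " " + "College"   # + joins strings

temp = 70
in_between = temp >= 32 and temp <= 212   # true only if both comparisons are true
extreme = temp < 32 or temp > 212         # true if either comparison is true

print(next_year, quotient, whole, remainder, power)
print(full_name, in_between, extreme)
```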

A statement computes a value or does some other operation. For our purposes here, the main kinds of statements are function calls like print() and input(), assignments like n = 1, and the control-flow statements described below.

Control-Flow Statements

Control-flow statements like if-else, while, and for determine what the program will do next, depending on the evaluation of some expression. They are the Python versions of statements like goto and ifzero in the Toy computer, but easier to work with.

The `if-else` Statement

The if statement is used to select which one of two groups of statements to do next:

if it's raining:
   I will stay home
else:
   I will go to class

or in code:

if condition:
   statements 
else: 
   other_statements

If the condition is true, then the statements are performed, otherwise the other_statements are performed. Both groups of statements can be one or more lines.

The statements controlled by if and else must be indented consistently, as shown. Python uses indentation to group statements, and will not compile a program that is not indented properly.

The else part is optional; sometimes the logic doesn't require it and you can omit it. You can also string together a sequence of tests with elif clauses:

temp = int(input("What's the temperature? "))
if temp > 212:
   print(temp, "is boiling")
elif temp < 32:
   print(temp, "is freezing")
else:
   print(temp, "is in between")

Make sure you understand how the control flow in this code works. How would you modify it to work with Celsius temperatures?

int() and str()

The result that comes back from calling input is a Python string, that is, a sequence of characters. Even if that is a sequence of digits, it's not interpreted as a numeric value; it's just characters.

That means that if you want to do arithmetic on it, or compare it to other numeric values, you have to convert it into an integer. The conversion is done with the int function, which converts a string of ASCII digits into its internal representation as a binary integer. (You can also use float() if you want to handle non-integral values.)

Try something like this:

s = "123"
print(s)
n = int(s)
print(n)
print(type(s), type(n))

If you get complaints about strings and integers when you run your Python code, it's likely because you didn't handle the types correctly.
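The following sketch shows why the conversions matter. The same + operator joins strings but adds numbers, and str goes in the other direction, turning a number into a string so that it can be joined to other text. The values are made up for illustration.

```python
s = "12"
joined = s + "3"          # both are strings, so + joins them: "123"
total = int(s) + 3        # int converts "12" to the number 12, so + adds: 15
half = float("3.5") * 2   # float handles non-integral values: 7.0

count = 6
message = "courses " + str(count)   # str converts 6 to "6" before joining

print(joined, total, half)
print(message)
```

Writing "courses " + count without str would fail, because Python will not join a string and an integer.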

Other functions

There are a lot of built-in functions for processing strings of characters (see this list). Among the more useful:

s = s.upper()		# convert a string s to upper case
s = s.lower()		# convert to lower case
s = s.strip()		# remove spaces and tabs from both ends; also lstrip, rstrip
a = s.split()		# split a string into a list; also variations like split(',') 
a = s.upper().split()	# convert a string to upper case and then split it

These functions do not change the string they are applied to; each returns a new value (a string, or a list in the case of split). The original variable can then be overwritten, as in the first three examples above.
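A quick way to convince yourself of this is to keep the original string and look at it afterwards. In this sketch (sample text made up), s is untouched even after strip, upper and split have been applied.

```python
s = "  a B- c+  "
t = s.strip().upper()        # new string with the ends trimmed, in upper case
parts = t.split()            # list of the pieces separated by spaces
fields = "x,y,z".split(',')  # split on commas instead of spaces

print(repr(s))               # the original still has its spaces and lower case
print(t)
print(parts)
print(fields)
```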

Comments

Comments are annotations added by programmers but ignored by Python when it is compiling and running your program. Comments are used to make the code more readable for other people that may have to work on it, or to remind you of what you were thinking when you wrote a particular line of code. Python comments start with # and extend until the end of a line. It's a good idea to include comments that explain unobvious code, for yourself and others.

cummings = "e e cummings"  # his preferred spelling
Cummings = "E E Cummings"  # upper-case alternative

The `while` Loop

The while statement is a loop that causes a group of statements to be repeated while some condition is true:

while my room is a mess:
   pick up junk
   throw it out

In Python, this would look like

while condition:
   statements

As long as the condition is true, then the statements will be repeated. (That is, the condition is tested, and if it is true, then the entire loop body is executed, after which the condition is evaluated again.) As with if-else, the statements of the loop must be indented.

So, for example, you might print a sequence of temperature conversions like this:

fahr = 0
while fahr <= 212:
   celsius = 5 / 9 * (fahr - 32)
   print(fahr, "F is", celsius, "C")
   fahr = fahr + 10

When you use while, watch out for infinite loops. Make sure that you increment or decrement the loop variable, or in some other way ensure that the loop eventually ends.
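The loop variable doesn't have to go up by a fixed amount; what matters is that each pass moves it closer to making the condition false. As a sketch with made-up values, this loop halves n until it reaches 1 and counts how many passes that takes.

```python
n = 100
steps = 0
while n > 1:
   n = n // 2         # n shrinks on every pass, so the loop must end
   steps = steps + 1
print(steps)
```

If the line n = n // 2 were left out, n would stay at 100 forever and the loop would never stop.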

The `for` Loop and Ranges

The function range produces a sequence of integer values. It's most commonly used as range(0,N) in a for loop:

for i in range(0,10):  # i is 0, 1, 2, ..., 9
  print(i)

Lists (Arrays) and Dictionaries

Python provides two mechanisms for storing a set of data items with similar properties as a group: lists (arrays) and dictionaries.

A list (often called an array in other languages) is a set of values of any types, stored as a single variable. If the list name is A, the values are accessed as A[0], A[1], and so on.

The elements of the list are numbered from 0 to the length minus 1: A[0] through A[len(A)-1]. You can build up a list one element at a time with a while loop:

A = []   # create a list A with no elements
i = 0
while i < 10:
  A.append(i)  # append i to the growing list; i ranges from 0 to 9
  i = i + 1
print(A)

or more compactly with a for loop and a range:

A = []   # create a list A with no elements
for i in range(0, 10):
  A.append(i)   # append i to the growing list
for i in range(0, len(A)):
  print(i, A[i]) # print subscript and value

The first line A = [] creates an empty list; the loop appends values to it.

The range function is more general: it can produce sequences of integers with arbitrary starting, ending and step values.
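Here is a sketch of the more general forms of range. Wrapping a range in list is just a convenient way to see all of its values at once; the ending value itself is never included.

```python
tens = list(range(0, 10))         # 0, 1, ..., 9
same = list(range(10))            # the start defaults to 0
by_fifty = list(range(0, 212, 50))   # third value is the step
down = list(range(10, 0, -2))     # a negative step counts down

print(tens)
print(by_fifty)
print(down)
```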

A dictionary is different from a list: its subscripts or indices can be arbitrary strings instead of numbers, so they can take on any values at all. The elements of the dictionary can take on any values, though numbers and strings are most common. Here's an example that initializes a dictionary with the weights for each letter grade, and then prints them:

weight = {
  'A+': 4.0, 'A': 4.0, 'A-': 3.7,
  'B+': 3.3, 'B': 3.0, 'B-': 2.7,
  'C+': 2.3, 'C': 2.0, 'C-': 1.7,
  'D': 1.0, 'F': 0, 'P': 0
}

for i in weight:
  print(i, weight[i])

A+ 4.0
A 4.0
A- 3.7
B+ 3.3
B 3.0
B- 2.7
C+ 2.3
C 2.0
C- 1.7
D 1.0
F 0
P 0
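Besides looping over a dictionary, you can look up a single element, add a new one, and test whether a particular subscript is present. This sketch uses a shortened, made-up table called small so that it doesn't interfere with weight.

```python
small = {'A': 4.0, 'B': 3.0, 'C': 2.0}   # shortened table, for illustration only

b_weight = small['B']        # look up one element by its subscript
small['D'] = 1.0             # add a new element
has_d = 'D' in small         # True: 'D' is now a subscript
has_z = 'Z' in small         # False: there is no such subscript

print(b_weight, has_d, has_z, len(small))
```

Asking for small['Z'] directly would stop the program with a KeyError, so the in test is one way to check for invalid input first.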

Functions

A function is a chunk of program that can be used or called from other places in a program to do an operation or compute a value; it's a way to wrap up a computation in a package so it can be easily used just by referring to its name. The Python library includes functions that have already been written for you -- mathematical functions like square root (which is called sqrt in Python) are good examples -- and you can write your own too.

For example, you could write a function to do the Fahrenheit to Celsius temperature conversion; putting the computation in a function makes it easier to change if necessary, and reduces clutter and confusion in the rest of the program. Here is such a function, called ftoc:

def ftoc(fahr):  # convert Fahrenheit to Celsius
   return 5 / 9 * (fahr - 32)

The return statement transfers control back to the function's caller, and sends a value back too if there is one. Notice that the statements of the function are indented.

Functions usually have parameters, so that they can operate on different data values, like print and input do. Thus ftoc has a single parameter, the temperature to be converted, which will be referred to as fahr within the function.

Now you could use ftoc in a loop like this:

def show_temps(low, high, step):
   f = low
   while f <= high:
      print(f, "F is", ftoc(f), "C")
      f = f + step

and then call the function in a statement like this:

show_temps(0, 212, 10)

Variables created within a function are only accessible within that function; variables created outside of any function are accessible anywhere in the program.
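The sketch below illustrates that rule with a variation on ftoc. The names scale and offset are made up for this example: scale is created outside any function, so the function can use it, while offset exists only while ftoc is running.

```python
scale = 5 / 9      # created outside any function: usable everywhere

def ftoc(fahr):    # convert Fahrenheit to Celsius
   offset = fahr - 32      # created inside ftoc: only usable inside ftoc
   return scale * offset

print(ftoc(32))
print(ftoc(212))
# print(offset)    # would fail with a NameError: offset doesn't exist out here
```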

You won't need to write any functions in this lab, but you will be using functions that others have written for you, so it's important to understand the idea.

Work through the examples in the Colab notebook, trying variations and experiments to be sure that you understand the basics.

Part 3: What's My GPA?

In Lab 4, you created a spreadsheet to compute GPA for an individual. What if you had to compute GPAs for 6,000 undergrads? For that, you need a program.

Your assignment is to implement a Python version of the GPA exercise of the previous lab, as part of the Colab page that you have been using. A no-frills version is only about 15 lines of code. If your program is getting much bigger than that, you're probably off on the wrong track and should review the notes and the Python examples.

All of the information you need to write this program is in the previous sections and the Colab page.

Pay attention to syntax and grammar and language constructs. You can't just throw code fragments around randomly and expect them to work. Read the instructions. All the information you need is right here.

Your program should do the following:

Read a single line of input from the user, containing a set of grades in any order and in any combination of upper and lower case, separated by spaces, like this:
a a- b- c A+ F p
Print the total number of courses (excluding any F) and the GPA. For this specific set of grades, your program should print
courses 6 GPA 3.28
Be sure you implement the proper definition of GPA, especially if you got that wrong in the previous lab.

This is a simplified version of this calculation from the registrar's site.

Copy these lines into your Python code. They create a dictionary that contains the numeric weight of each potential grade.

weight = {
  'A+': 4.0, 'A': 4.0, 'A-': 3.7,
  'B+': 3.3, 'B': 3.0, 'B-': 2.7,
  'C+': 2.3, 'C': 2.0, 'C-': 1.7,
  'D': 1.0, 'F': 0, 'P': 0
}

Now write the rest of the program:

Read a single line of input, convert it to upper case, and split it into a list of grades called grades.
Initialize variables for total score, number of courses, and number of P grades.
Loop through each item in grades. If it's not an F, add one to the number of courses. If it's a P, increment that count. Add the weight of that particular grade to the total score.
At the end, compute and print the course count and the GPA. It will be helpful to also print the total score and the number of P's. (I fixed my buggy first version by doing that.)
(If you want to format the GPA a little better, you can use something like print("%.3f" % GPA) to trim the output to only three digits after the decimal point.)
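Here is a sketch of how that formatting behaves, using a made-up value rather than a real GPA. The number after the dot controls how many digits appear after the decimal point. The last line shows an f-string, a newer alternative that is not otherwise used in this lab.

```python
value = 10 / 3                    # made-up number with many digits

three = "%.3f" % value            # three digits after the decimal point
two = "%.2f" % value              # two digits after the decimal point
also_two = f"{value:.2f}"         # f-string version of the same thing

print(value)
print(three, two, also_two)
```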

You can add any other embellishments you like as long as the core functionality works. For example, you don't have to do any error checking, but a real program would have to, so you could add a test to detect invalid grades.

One Step at a Time

If this is your first Python program you are liable to make mistakes. Finding the bugs (debugging) will not be too hard if, as we suggested above, you write a small section at a time, and test as you go. This approach makes it easier to determine what section of your code has the error: it's probably the one you just added.

Use print statements to display results as your computation proceeds, or to verify that your program is doing what you think it is. These are easy to add and help narrow down problems quickly.

For testing, use the simplest possible inputs, where you know the right answer, before trying more complicated inputs. For example, do you get the right answer for a single grade like A or P? If that doesn't work, it's unlikely that longer inputs will work. And you can be sure that this is how we will test your submission.

When you prepare your lab for submission, you should leave any debugging statements in the program source code, but "commented out" so they can be easily restored if something goes wrong.

Part 4: How Many Majors?

In this section, we'll do a bit of programming to process information about the number of majors over the past few years. This will involve reading an Excel file in a text format called CSV or comma-separated values, and doing some simple processing on the data.

As the name suggests, CSV format separates the values in each field in a row by a comma, with some additional rules for quoting in case a field contains a comma. You can create a CSV file from Excel or Google Sheets or the like by saving in .csv format. The file majors.csv contains the same information as the majors.xlsx from the last lab, but I have removed the header line and the totals. Here are the first few lines:

African American Studies,12,23,25,16,20
Anthropology,45,67,75,60,72
Architecture,24,25,19,22,22
...

The first step is to read a file from a Python program. There are plenty of ways to do this, but since we are going to be processing a CSV file, it's easiest to use a method suggested in the Python manual. The minimal program to read and print looks like this:

import csv

with open('majors.csv', newline='') as f:
  for r in csv.reader(f):
    print(r)
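Each r that comes back from csv.reader is a list of strings, one per field, so the numbers have to be converted with int before you can do arithmetic on them. The sketch below shows this without needing the file: it uses io.StringIO (an assumption made only to keep the example self-contained) to treat the three sample lines above as if they were a file.

```python
import csv
import io

# Stand-in for the real file: the three sample lines shown earlier.
sample = io.StringIO(
    "African American Studies,12,23,25,16,20\n"
    "Anthropology,45,67,75,60,72\n"
    "Architecture,24,25,19,22,22\n"
)

rows = []
for r in csv.reader(sample):
  rows.append(r)            # collect each row in a list

first = rows[0]
print(len(rows))            # number of rows read
print(first[0], first[1])   # name and first number, both still strings
print(type(first[1]))
print(int(first[1]) + int(first[2]))   # convert before adding
```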

Before you can use this in Colab, you have to upload the file majors.csv. Click the file folder icon at the left side:

This opens a new area on the left that lists some files and shows an upload icon near the top left (to the right of the magnifying glass):

Click on the upload icon to bring up a file upload dialog box:

Select the file you want (majors.csv) and miraculously the file will appear among the other files:

Now you can use the file in your Python program. Make this a new Code section in your Colab notebook.

Your Assignment for this Part

The Colab page has a sequence of things for you to try.

How many departments are there?
How many departments have more than say 50 majors in the most recent year? Make "50" be a parameter, so your code first asks what the threshold is.
Print name and 2022-23 enrollment for each department.
Print the info for the largest department.
Many departments had a dip during 2020-21 due to Covid, but others did not. Print all departments where the Covid year 2020-21 was higher than 2022-23.
How did enrollments change from 2020-21 to 2022-23? Print the percentage change from 2020-21 to 2022-23 for each department.
Hint: the built-in max function is your friend, and slices like [1:5] are the next best thing to sliced bread. You will also likely need to use the int() function.
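As a sketch of how slices and the built-in max function behave, here is an example on a single made-up row (not taken from majors.csv). A slice [1:5] starts at item 1 and stops just before item 5; leaving out the second number runs to the end of the list.

```python
row = ["Example Dept", "10", "30", "20", "40", "25"]   # made-up data

first_four = row[1:5]       # items 1, 2, 3 and 4
all_counts = row[1:]        # everything after the name

numbers = []
for field in all_counts:
  numbers.append(int(field))    # convert each string to an integer

biggest = max(numbers)          # largest of the numbers
text_max = max(["9", "10"])     # strings compare character by character

print(first_four)
print(numbers, biggest)
print(text_max)
```

The last line is a warning: without int, "9" counts as bigger than "10" because the character 9 comes after the character 1.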

Go do them.

Part 5: Finishing Up

Make sure that you have implemented what we asked for.

Save your Python code in a text file called lab5.py on your computer. You can use File / Download / Download .py. You might have to rename the file once it has been downloaded.

Upload lab5.py to Gradescope for Lab 5. Do not upload anything to cpanel for this lab.
