Lecture 2: Lists, Tuples, Dictionaries, and Debugging#

Topics#

This is the lecture note for the second day of INF80054 Data Science Fundamentals. The topics to be covered include:

  • Extreme values, Escape sequence, Raw string, docstring

  • Built-in string methods

  • Lists, tuples, and dictionaries

  • Testing, debugging, and exceptions

  • Custom functions (intro.)

Extreme values int and float#

In Python, the largest value of an int is limited only by the amount of memory available on your computer. For float, the largest possible value is approximately \(1.8 \times 10^{308}\). This is the standard 64-bit “double precision” maximum value. A float literal greater than this is stored as the special float value inf (infinity).

The smallest positive float, that is, the closest a non-zero float can get to zero, is approximately \(5.0 \times 10^{-324}\). A number much closer to zero than that is rounded to 0.0.

# Python ints can be arbitrarily large (limited by memory)
big_int = 10**1000  # 1 followed by 1000 zeros
print("A very large int:", big_int)

# But floats have a maximum value
big_float = 1.79e308
print("Largest float:", big_float)

# Going beyond float's limit results in 'inf'
too_big_float = 1e309
print("Too big float:", too_big_float)

# Smallest positive float (closest to zero)
small_float = 5e-324
print("Smallest positive float:", small_float)

# Going smaller results in 0.0
too_small_float = 1e-325
print("Too small float:", too_small_float)

# But int can go negative as well
very_negative_int = -10**1000
print("A very negative int:", very_negative_int)
A very large int: 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Largest float: 1.79e+308
Too big float: inf
Smallest positive float: 5e-324
Too small float: 0.0
A very negative int: -10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
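
The limits above can also be read from Python itself rather than typed in by hand. The following is a minimal sketch using the standard sys and math modules; the variable names are chosen only for illustration.

```python
import math
import sys

largest_float = sys.float_info.max      # largest finite float
smallest_normal = sys.float_info.min    # smallest positive "normal" float
smallest_subnormal = 5e-324             # smallest positive float overall

overflowed = largest_float * 10         # too large to represent: becomes inf
underflowed = smallest_subnormal / 2    # too small to represent: becomes 0.0

print("Largest finite float:", largest_float)
print("Overflow gives:", overflowed, "of type", type(overflowed).__name__)
print("Is it infinite?", math.isinf(overflowed))
print("Smallest normal float:", smallest_normal)
print("Underflow gives:", underflowed)
```

Note that inf is still a float, not a string, so it can be compared with other numbers.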

string and escape sequence \#

A string is a sequence of characters enclosed within single quotes ('<string>') or double quotes ("<string>"). Q: What if the string itself contains single or double quotes?

We will receive a SyntaxError message unless we do one of the following:

  • Define the string using the other type of quote (for example, use double quotes around a string that contains a single quote)

  • Use the backslash (\) as the escape character

# Syntax error when a string defined with a single quote contains a single quote

print('This string contains a single (') character')
  Cell In[2], line 3
    print('This string contains a single (') character')
                                                      ^
SyntaxError: unterminated string literal (detected at line 3)
# First solution, enclose the string in double quotes 
print("This string contains a single quote (') character")

# Second solution, use the escape sequence character '\' (the backward-slash character) 
print('This string contains a single quote (\') character')
This string contains a single quote (') character
This string contains a single quote (') character

Raw string#

A raw string is defined by preceding the string with the character r or R. This tells Python to interpret the string literally, so a backslash is kept as an ordinary character instead of starting an escape sequence.

Note

The \n character combination is interpreted by Python as a <newline> character.

# \n is interpreted as new line
print('foobar')
print('foo\nbar')

# prefixing the string with `r` tells Python to read it literally
print(r'foo\nbar')

# to include a backslash in an ordinary string, the backslash itself needs to be "escaped"
print("foo\\bar")

# again, prefixing with "r" or "R" tells python to read the string literally
print(R"foo\\bar")
foobar
foo
bar
foo\nbar
foo\bar
foo\\bar

docstring (triple-quoted string)#

Triple-Quoted Strings are delimited by matching groups of three single quotes or three double quotes:

'''<string>'''
"""<string>"""

The string can contain single and/or double quotes, and it can span multiple lines. Escape sequences still work.

We usually use a docstring to document a custom function. When a triple-quoted string is the first statement in a function, Python stores it as the function’s documentation, which tools such as help() and many code editors then display as quick information about the function.
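
As a minimal sketch of a docstring attached to a custom function (the function rectangle_area is made up for illustration and is not part of the lecture material):

```python
def rectangle_area(width, height):
    """Return the area of a rectangle.

    width and height are assumed to be non-negative numbers.
    """
    return width * height


# The docstring is stored on the function and can be printed directly
print(rectangle_area.__doc__)

# help() shows the function signature together with the docstring
help(rectangle_area)

area = rectangle_area(3, 4)
print("Area:", area)
```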

# example of a docstring
print(""" This is a multi line string containing quote (" and ' or "" and '')
          and escape sequence still works too such as to include a single
          backslash \\.          
      """)
 This is a multi line string containing quote (" and ' or "" and '')
          and escape sequence still works too such as to include a single
          backslash \.          
      

The dot operator and built-in string methods#

The dot operator is used to invoke an object’s built-in properties and methods.

Syntax: (to invoke the built-in method of an object)

myobject.mymethod(<args>)

where <args> specifies the arguments passed to the method (if any).

Like many other objects, the string object also has many built-in methods as listed and illustrated in the examples below. For more examples, see online resources such as w3schools:

  • s.capitalize(): returns a copy of s with the first character converted to uppercase and all other characters converted to lowercase. Non-alphabetic characters are unchanged.

  • s.lower(): returns a copy of s with all alphabetic chars in lowercase

  • s.swapcase(): returns a copy of s with uppercase alphabetic chars converted to lowercase and vice versa

  • s.title(): returns a copy of s with the first letter of each word in uppercase and remaining letters are lowercase

s = 'foO BaR BAZ# quX123'
print("original:", s, "\ns.capitalize() output:", s.capitalize())
print("original:", s, "\ns.lower() output:", s.lower())
print("original:", s, "\ns.swapcase() output:", s.swapcase())
print("original:", s, "\ns.title() output:", s.title())
print("what's happened to ted's IBM stock?".title())
original: foO BaR BAZ# quX123 
s.capitalize() output: Foo bar baz# qux123
original: foO BaR BAZ# quX123 
s.lower() output: foo bar baz# qux123
original: foO BaR BAZ# quX123 
s.swapcase() output: FOo bAr baz# QUx123
original: foO BaR BAZ# quX123 
s.title() output: Foo Bar Baz# Qux123
What'S Happened To Ted'S Ibm Stock?
  • s.upper(): returns a copy of s with all alphabetic chars as uppercase

  • s.isalnum(): returns True if s is nonempty and all its characters are alphanumeric (either a letter or a number), and False otherwise.

  • s.isalpha(): returns True if s is nonempty and all its characters are alphabetic and False otherwise.

s = 'foO BaR BAZ# quX123'
print("original:", s, "\ns.upper() output:", s.upper())

s = 'abc123' 
if s.isalnum():
    print(s + ' consists of only alphanumeric character')
else:
    print(s + ' contains non-alphanumeric character')
    
s = 'abc$123' 
if s.isalnum():
    print(s + ' consists of only alphanumeric character')
else:
    print(s + ' contains non-alphanumeric character')

s = 'abc$123' 
if s.isalpha():
    print(s + ' consists of only alphabetic character')
else:
    print(s + ' contains non-alphabetic character')

s = 'ABCabc' 
if s.isalpha():
    print(s + ' consists of only alphabetic character')
else:
    print(s + ' contains non-alphabetic character')
original: foO BaR BAZ# quX123 
s.upper() output: FOO BAR BAZ# QUX123
abc123 consists of only alphanumeric character
abc$123 contains non-alphanumeric character
abc$123 contains non-alphabetic character
ABCabc consists of only alphabetic character

Lists, Tuples, and Dictionaries#

Python lists#

In Python, a list is a collection of objects. We can think of it as a dynamic, flexible array.

Lists are defined with square brackets: mylist = [<sequence of objects>]

For example:

mylist = ['Roses', 'Orchids', 'Lilies', 'Daisies']

A singleton list is a list with only one element.

The elements inside a list:

  • Are ordered.

  • Can be any arbitrary objects.

  • Can be accessed by index.

  • Do not have to be unique

The number of elements of a list is only limited by your computer memory.

Lists can be nested to arbitrary depth.

Lists are mutable and dynamic.

# Lists are ordered

listA = ['donut', 'croissant', 'burger', 'cookie']
listB = ['cookie', 'burger', 'croissant', 'donut']

if listA == listB:
    print("Lists are NOT ordered")
else:
    print("Lists are ordered")
    
if [1, 2, 3, 4] == [4, 1, 3, 2]:
    print("Lists are NOT ordered")
else:
    print("Lists are ordered")
Lists are ordered
Lists are ordered
# Lists can contain any arbitrary object

#The elements of a list can all be the same type:
numlist = [2, 4, 6, 8]
print("The element of numlist: ", numlist)

#or varying types:
mixlist = [19, 'tea', 1, 200, 'bubble', False, 3.14]
print("The element of mixlist: ", mixlist)

#Lists can even contain complex objects (functions, classes, and modules)
complexlist = [int, float, len]
print("The element of complexlist: ", complexlist)

#List elements do not have to be unique
multilist = ['Toyota', 99, 'Nissan', 'Goodyear', 'Nissan']
print("The element of multilist: ", multilist)
The element of numlist:  [2, 4, 6, 8]
The element of mixlist:  [19, 'tea', 1, 200, 'bubble', False, 3.14]
The element of complexlist:  [<class 'int'>, <class 'float'>, <built-in function len>]
The element of multilist:  ['Toyota', 99, 'Nissan', 'Goodyear', 'Nissan']

Working with list element index#

Similar to the case of string, we can work with the elements of a list using the built-in indexing. In the example below, if we define:

mylist = ['foo', 'bar', 'baz', 'qux', 'quux', 'corge']

Then, the index to the elements of mylist is shown in the following diagram:

List element indexing

So, in this example, mylist[0] evaluates to foo and mylist[3] evaluates to qux.

As with strings, we can also use negative indexing of the list elements, as shown in the diagram below.

List element negative indexing

Therefore, in this example, mylist[-5] evaluates to bar and mylist[-6] evaluates to foo.

mylist = ['foo', 'bar', 'baz', 'qux', 'quux', 'corge']

print(mylist[0])
print(mylist[3])

for i in range(1,3):
    print("Element number ", i, "is:", mylist[i])

print("Negative indexing")
print(mylist[-5])
for i in range(-6,-1,2):
    print("Element number ", i, "is:", mylist[i])
foo
qux
Element number  1 is: bar
Element number  2 is: baz
Negative indexing
bar
Element number  -6 is: foo
Element number  -4 is: baz
Element number  -2 is: quux
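
The relationship between positive and negative indices can be checked with a short sketch. It assumes the same mylist as above and uses a try ... except block only to show the error without stopping the program.

```python
mylist = ['foo', 'bar', 'baz', 'qux', 'quux', 'corge']
n = len(mylist)

# mylist[-k] refers to the same element as mylist[n - k]
same_element = True
for k in range(1, n + 1):
    if mylist[-k] != mylist[n - k]:
        same_element = False
print("mylist[-k] equals mylist[len(mylist) - k] for every valid k:", same_element)

# Valid indices run from -n to n - 1; anything outside raises an IndexError
out_of_range = False
try:
    mylist[n]
except IndexError as err:
    out_of_range = True
    print("IndexError:", err)
```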

List slicing#

mylist[i:j] returns a slice of the elements from index \(i\) (inclusive) to, but not including, index \(j\)

mylist[i:] returns a slice of the elements from index \(i\) (inclusive) to the end of the list

mylist[:j] returns a slice of the elements from start of the list to, but not including, \(j\)

mylist = ['foo', 'bar', 'baz', 'qux', 'quux', 'corge']

print(f"mylist contains: {mylist}")
print("Slicing element 2 to 4:", mylist[2:5])
print("Slicing element -5 to -3", mylist[-5:-2])
print(mylist[-5:-2] == mylist[1:4])

#Omitting the first index starts the slice at the beginning of the list
firstpart = mylist[:4]
print("First part:", firstpart)

#Omitting the second index extends the slice to the end of the list:
secondpart = mylist[4:]
print("Second part:", secondpart)

# Addition operator on a list
print("Added together:", firstpart + secondpart)
print((mylist[:4] + mylist[4:]) == mylist)
mylist contains: ['foo', 'bar', 'baz', 'qux', 'quux', 'corge']
Slicing element 2 to 4: ['baz', 'qux', 'quux']
Slicing element -5 to -3 ['bar', 'baz', 'qux']
True
First part: ['foo', 'bar', 'baz', 'qux']
Second part: ['quux', 'corge']
Added together: ['foo', 'bar', 'baz', 'qux', 'quux', 'corge']
True

Slicing [:] with string vs list#

The [:] slicing operation works slightly differently between string and list objects.

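  • if s is a string, then s[:] returns a reference (i.e. pointer) to s. In this case, s and s[:] are the same object. This is how the standard CPython interpreter behaves; it can safely reuse the object because strings are immutable.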

  • if m is a list, then m[:] returns a copy (or duplicate) of m. In this case, m and m[:] are two different objects.

# For an s string, s and s[:] are the same object
s = "My string"
print(s is s[:])

# For a list m, m and m[:] are two objects. This is because list slicing creates duplicates
m = ['foo', 'bar', 'baz', 'qux', 'quux', 'corge']
print(m is m[:])
True
False
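
Why the difference matters for lists can be seen in the sketch below, which compares a plain assignment (a second name for the same list) with a [:] copy. The variable names are illustrative only.

```python
m = ['foo', 'bar', 'baz']
alias_of_m = m      # a second name for the same list object
copy_of_m = m[:]    # a new list object holding the same elements

alias_of_m[0] = 'changed via alias'   # also changes m
copy_of_m[1] = 'changed in copy'      # does not change m

print("m:", m)
print("copy_of_m:", copy_of_m)
print("alias_of_m is m:", alias_of_m is m)
print("copy_of_m is m:", copy_of_m is m)
```

The copy made by [:] is shallow: if the list contains nested lists, those inner lists are still shared between the original and the copy.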

List operator and built-in methods#

Similar to string, there are operators and built-in methods associated with lists.

  • in and not in operators

car = ['Toyota', 'Nissan', 'Volvo', 'Holden']
if 'donut' in car:
    print("donut is a car")
else:
    print("donut is not a car") 
print("Nissan is not a car. Is this statement true?", "Nissan" not in car)
donut is not a car
Nissan is not a car. Is this statement true? False
  • The * operator creates a new list that repeats the elements of the original list

car = ['Toyota', 'Nissan', 'Volvo', 'Holden']
print(car * 3)
['Toyota', 'Nissan', 'Volvo', 'Holden', 'Toyota', 'Nissan', 'Volvo', 'Holden', 'Toyota', 'Nissan', 'Volvo', 'Holden']
  • Built-in functions that work on lists: len(), min(), max()

(for more details and examples on built-in list methods see W3 schools).

food = ['donut', 'croissant', 'burger', 'cookie']
print("Number of elements:", len(food))
print("The minimum element in food list:", min(food))
print("The maximum element in food list:", max(food))

numlist = [8, 4, -1, 7]
print("The minimum element in numlist:", min(numlist))
print("The maximum element in numlist", max(numlist))
Number of elements: 4
The minimum element in food list: burger
The maximum element in food list: donut
The minimum element in numlist: -1
The maximum element in numlist 8

Nested lists#

A nested list is a list that has other lists as its elements, as shown in the example below.

Nested list

In this case, the usual indexing and slicing syntax applies to the sublist within the nested list.

Note

Functions and operators do not apply recursively through a nested list.

# indexing and slicing with nested list
x = ['a',['bb',['ccc','ddd'],'ee','ff'],'g',['hh', 'ii'],'j']

print("x[0] is", x[0])
print("x[2] is", x[2])
print("x[4] is", x[4])

print("x[1][1] is", x[1][1])
print("x[1][1][0] is", x[1][1][0])

print("x[1][1][-1] is :", x[1][1][-1])
print("x[1][1:3] is:", x[1][1:3])
print("x[3][::-1] is:", x[3][::-1])

# function and operator does not work recursively through nested lists
print("Number of elements in x is", len(x))

print('ddd is in x?', 'ddd' in x)
print('ddd is in x[1]?', 'ddd' in x[1])
print('ddd is in x[1][1]?', 'ddd' in x[1][1])
x[0] is a
x[2] is g
x[4] is j
x[1][1] is ['ccc', 'ddd']
x[1][1][0] is ccc
x[1][1][-1] is : ddd
x[1][1:3] is: [['ccc', 'ddd'], 'ee']
x[3][::-1] is: ['ii', 'hh']
Number of elements in x is 5
ddd is in x? False
ddd is in x[1]? False
ddd is in x[1][1]? True
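
Because the in operator only looks at the top level, searching every level needs a helper of our own. The following is one possible sketch; contains_nested is a hypothetical helper written for this note, not a built-in function.

```python
def contains_nested(items, target):
    """Return True if target appears anywhere inside a (possibly nested) list."""
    for item in items:
        if item == target:
            return True
        # If the element is itself a list, search inside it as well
        if isinstance(item, list) and contains_nested(item, target):
            return True
    return False


x = ['a', ['bb', ['ccc', 'ddd'], 'ee', 'ff'], 'g', ['hh', 'ii'], 'j']

print("'ddd' in x:", 'ddd' in x)
print("contains_nested(x, 'ddd'):", contains_nested(x, 'ddd'))
print("contains_nested(x, 'zzz'):", contains_nested(x, 'zzz'))
```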

Lists are mutable#

Immutable means the value of an object cannot be changed in place; we can only assign a whole new value to the variable. In contrast, the value of a mutable object can be changed in place. Since lists are mutable, we can add, delete, shift, and move their elements around (without needing to make a “whole” reassignment/rebinding).

Recall that scalar/atomic objects (int and float) cannot be divided further. They are immutable, and we change the value of a variable that holds them through assignment. Strings are non-scalar/composite objects (reducible to smaller components, i.e., chars), but they too are immutable. This is why we cannot directly change any element of a string via indexing or slicing. Instead, with a string, we need to do a reassignment/rebinding.

# string is immutable
s = "melbourne"
s[0] = "M"

print(s)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[36], line 3
      1 # string is immutable
      2 s = "melbourne"
----> 3 s[0] = "M"
      5 print(s)

TypeError: 'str' object does not support item assignment
# lists are mutable, item assignment is OK

car = ['Toyota', 'Nissan', 'Volvo', 'Holden']
car[0] = 'Honda'
print("After changing 1st element to Honda:", car)

# we can even delete an element of a list
del(car[2])
print("After deleting 3rd element", car)
After changing 1st element to Honda: ['Honda', 'Nissan', 'Volvo', 'Holden']
After deleting 3rd element ['Honda', 'Nissan', 'Holden']
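
For strings, the reassignment/rebinding mentioned above can be sketched as follows: we build a new string and bind the same variable name to it.

```python
s = "melbourne"

# Build a new string from "M" plus the rest of the old string, then rebind s
s = "M" + s[1:]
print(s)

# A string method that returns a modified copy achieves the same result
city = "melbourne".capitalize()
print(city)
```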

Slice assignment and iterable object#

Iterable objects are objects whose elements can be visited one at a time, for example in a for-loop. Strings and lists are examples of iterable objects.

A slice assignment changes multiple elements of a list at once.

Given a list x, we can replace the elements from index m up to, but not including, index n with the elements of an iterable using the following syntax:

x[m:n] = <iterable>

See the following examples.

# slice assignment: changing multiple values of a list

food = ['donut', 'croissant', 'burger', 'cookie']
print("Original list", food)

print("Replacing 2nd element with iterables ['bread','chips']")
food[1:2] = ['bread', 'chips']
print("The modified list is:", food)

#zero length slice [n:n]: inserting elements into a list without removing anything
car = ['Toyota', 'Nissan', 'Volvo', 'Holden']
car[4:4] = ['Honda', 'Mazda', 'Porsche', 'Ford']
print("After adding four cars:", car)

#empty slice assignment results in deletion
car[0:2] = []
print("After deleting the first two elements", car)
Original list ['donut', 'croissant', 'burger', 'cookie']
Replacing 2nd element with iterables ['bread','chips']
The modified list is: ['donut', 'bread', 'chips', 'burger', 'cookie']
After adding four cars: ['Toyota', 'Nissan', 'Volvo', 'Holden', 'Honda', 'Mazda', 'Porsche', 'Ford']
After deleting the first two elements ['Volvo', 'Holden', 'Honda', 'Mazda', 'Porsche', 'Ford']
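
It is easy to confuse slice assignment with ordinary item assignment. The sketch below contrasts the two on copies of the same list; the names by_index and by_slice are illustrative only.

```python
food = ['donut', 'croissant', 'burger', 'cookie']

# Item assignment: the single element at index 1 becomes a nested list
by_index = food[:]
by_index[1] = ['bread', 'chips']
print("Item assignment: ", by_index, "length", len(by_index))

# Slice assignment: the elements of the iterable are spliced into the list
by_slice = food[:]
by_slice[1:2] = ['bread', 'chips']
print("Slice assignment:", by_slice, "length", len(by_slice))
```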

Prepending and appending list#

We can use the concatenation operator (+) to add elements to a list, but the added elements must themselves be in a list. Therefore, even if we only want to add one element, we need to put the new element in a singleton list first, and then concatenate it with the existing list. Note that + creates a new list; the original list is unchanged unless we assign the result back to it.

# Prepending and appending a list

food = ['donut', 'croissant', 'burger', 'cookie']
print("Original list", food)
print("After prepending with 'bread': ", ['bread'] + food) # note ['bread'] is a singleton list
print("After appending with 'pie' and 'muffin': ", food + ['pie', 'muffin'])
Original list ['donut', 'croissant', 'burger', 'cookie']
After prepending with 'bread':  ['bread', 'donut', 'croissant', 'burger', 'cookie']
After appending with 'pie' and 'muffin':  ['donut', 'croissant', 'burger', 'cookie', 'pie', 'muffin']
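
Besides the + operator, lists also have built-in methods that add elements in place. A minimal sketch using the standard list methods append(), extend(), and insert():

```python
food = ['donut', 'croissant']

# + builds a new list and leaves food unchanged
longer = food + ['pie']
print("food after +:", food)

# The methods below modify food itself
food.append('cookie')            # add one element at the end
food.extend(['pie', 'muffin'])   # add every element of another list at the end
food.insert(0, 'bread')          # add one element at index 0 (prepend)
print("food after append, extend, insert:", food)
```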

Converting a string to a list#

The string’s built-in method .split() allows us to split a string into a list of the words forming the string. Here, words are defined as anything in the string separated by whitespace (spaces, tabs, or newlines).

# From string to list
mystring = "At 170cm high, he is not particularly tall"
mywordlist = mystring.split()
print(mystring)
print(mywordlist) # notice 'high,' with a comma at the end is a single "word"
At 170cm high, he is not particularly tall
['At', '170cm', 'high,', 'he', 'is', 'not', 'particularly', 'tall']
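
The .split() method also accepts a separator, and the string method .join() goes in the opposite direction, from a list back to a string. A short sketch with made-up example strings:

```python
# Splitting on a specific separator instead of whitespace
record = "VIC,Melbourne,3000"
fields = record.split(",")
print(fields)

# With no argument, runs of whitespace count as a single separator
words = "  too   many    spaces  ".split()
print(words)

# join() glues the elements of a list of strings back together
rejoined = " ".join(words)
print(rejoined)
```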

List comprehension#

A list comprehension is a concise, Pythonic way of building a new list from the elements of an iterable, instead of using an explicit for-loop.

# Suppose we want to build up a list using a for-loop as follows
iterable = range(5)
print(iterable)
# Create an empty list
new_list = []
for i in iterable:
    new_list += [i*4]  # Append i*4 to the end of new_list
print(new_list)

# The same list can be created using list comprehension
new_list2 = [i*4 for i in iterable]
print(new_list2)
range(0, 5)
[0, 4, 8, 12, 16]
[0, 4, 8, 12, 16]
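
A list comprehension can also filter elements with an if clause, or apply a function to each element. The following sketch uses made-up data for illustration:

```python
numbers = range(10)

# Keep only the even numbers and square them
even_squares = [n**2 for n in numbers if n % 2 == 0]
print(even_squares)

# Apply a function (here len) to every element of a list
words = ['At', '170cm', 'high,', 'he']
lengths = [len(w) for w in words]
print(lengths)
```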

Tuples#

Python Tuples are ordered iterables very similar to Python Lists. However, unlike lists, tuples are immutable.

We define a tuple by enclosing its elements in parentheses: ().

Note

Even though tuples are defined using parentheses, to index and slice tuples we still need to use square brackets (just like the case of strings and lists).

Why do we want to use a tuple instead of a list?

  • Tuple manipulation is often slightly faster to execute.

  • Sometimes you don’t want the data to be modified (immutability is desired).

  • A tuple (of immutable elements) can be used as a dictionary key, whereas a list cannot, because dictionary keys must be immutable.

We can work with tuples using indexing too.

# examples of working with tuples

atuple = (5, 6, 12, -1)
print(atuple)

mytuple = ('chicken', 'dog', 'cat', 'bird', 'frog', 'fox')
print(mytuple[-1]) # notice indexing is using square brackets [] like in list

# getting every 2nd element of the tuple, starting from index 1
x = mytuple[1::2]
print(x)

# reordering the element backward
print(x[::-1])
(5, 6, 12, -1)
fox
('dog', 'bird', 'fox')
('fox', 'bird', 'dog')
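
Two of the points above, immutability and use as a dictionary key, can be illustrated with a short sketch. The data is made up, and try ... except is used only so that the expected errors do not stop the program.

```python
point = (3, 4)

# Tuples are immutable: item assignment raises a TypeError
modified = True
try:
    point[0] = 10
except TypeError as err:
    modified = False
    print("TypeError:", err)

# A tuple can serve as a dictionary key
distance_from_origin = {(0, 0): 0.0, (3, 4): 5.0}
print("Distance for", point, "is", distance_from_origin[point])

# A list cannot serve as a dictionary key
list_key_allowed = True
try:
    bad = {[3, 4]: 5.0}
except TypeError as err:
    list_key_allowed = False
    print("TypeError:", err)
```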

Tuple assignment: packing and unpacking#

Packing is when we assign a tuple to a variable.

t = ('foo', 'bar', 'baz', 'qux')

packing

Unpacking is when we assign the elements of a tuple to a tuple of variable(s). The number of variables on the left must match the number of elements in the tuple.

t = ('foo', 'bar', 'baz', 'qux')
(s1, s2, s3, s4) = t
# tuple assignment (packing and unpacking)

t = ('foo', 'bar', 'baz', 'qux')
for i in range(0,4):
    print("t["+str(i)+"] is", t[i])

(s1, s2, s3, s4) = t

print("s1 =", s1, "; s2= ",s2, "; s3 =", s3, "; s4 =", s4)
t[0] is foo
t[1] is bar
t[2] is baz
t[3] is qux
s1 = foo ; s2=  bar ; s3 = baz ; s4 = qux
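
Packing and unpacking have a few handy uses worth seeing in code. The sketch below shows swapping two variables, collecting the remaining elements with a starred name, and the error raised when the counts do not match. The variable names are illustrative only.

```python
# Swapping two variables with one tuple assignment
a, b = 1, 2
a, b = b, a
print("a =", a, "; b =", b)

# A starred name collects the remaining elements into a list
first, *rest = ('foo', 'bar', 'baz', 'qux')
print("first =", first, "; rest =", rest)

# Unpacking into the wrong number of variables raises a ValueError
mismatch_detected = False
try:
    s1, s2 = ('foo', 'bar', 'baz')
except ValueError as err:
    mismatch_detected = True
    print("ValueError:", err)
```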

Casting: tuple <-> list#

We can cast a list object to a tuple object and vice versa using the list() and tuple() casting “functions”.

# From tuple to list and vice versa

mytuple = ('foo', 'bar', 'baz', 'qux')
print(mytuple)
print(f"the object type of mytuple is {type(mytuple)}")
print(f"after a list() casting, the object type of mytuple is: {type(list(mytuple))}")
print(mytuple) # we have not replaced mytuple with the list() version

mylist = ['foo', 'bar', 'baz', 'qux']
print(type(mylist))
print(type(tuple(mylist)))
print(mylist) # the original mylist has not been replaced with tuple(mylist)

mynewtuple = tuple(mylist)
print(mynewtuple) # mynewtuple is a tuple built from mylist; mylist itself is still a list
('foo', 'bar', 'baz', 'qux')
the object type of mytuple is <class 'tuple'>
after a list() casting, the object type of mytuple is: <class 'list'>
('foo', 'bar', 'baz', 'qux')
<class 'list'>
<class 'tuple'>
['foo', 'bar', 'baz', 'qux']
('foo', 'bar', 'baz', 'qux')

Python Dictionaries#

A dictionary is a comma-separated list of <key>:<value> pairs enclosed in curly braces {}:

d = {<key>:<value>, <key>:<value>, ..., <key>:<value>}

Note: the <key>s must be immutable and unique. If the same key is given more than once, only the last value is kept.

Dictionaries and lists share the following characteristics:

  • Mutable.

  • Dynamic.

  • Nestable.

Dictionaries differ from lists primarily in how elements are accessed:

  • The value of list elements is accessed via indexing.

  • The value of dictionary elements is accessed via the associated <key>.

# Defining a dictionary, an example
capital = {'VIC' : 'Melbourne',
           'NSW' : 'Sydney',
           'QLD' : 'Brisbane',
           'ACT' : 'Canberra',
           'SA'  : 'Adelaide',
           'WA'  : 'Perth',
           'NT'  : 'Darwin'
}
print(capital)
print(type(capital))
{'VIC': 'Melbourne', 'NSW': 'Sydney', 'QLD': 'Brisbane', 'ACT': 'Canberra', 'SA': 'Adelaide', 'WA': 'Perth', 'NT': 'Darwin'}
<class 'dict'>

Dictionary constructor function: dict()#

We can also define a dictionary using the constructor function dict(). In this case, the input is a list of (<key>, <value>) tuples:

d = dict([(<key>, <value>), (<key>, <value>), ... ])
# Defining a dictionary using the constructor function dict()
capital2 = dict([('VIC','Melbourne'),
                ('NSW','Sydney'),
                ('QLD','Brisbane'),
                ('ACT','Canberra'),
                ('SA','Adelaide'),
                ('WA','Perth'),
                ('NT','Darwin')])
print(capital2)
print(type(capital2))

# if the keys are simple strings (valid Python names), they can be specified as keyword arguments
capital3 = dict(VIC='Melbourne',
                NSW='Sydney',
                QLD='Brisbane',
                ACT='Canberra',
                SA='Adelaide',
                WA='Perth',
                NT='Darwin')
print(capital3)
print(type(capital3))
{'VIC': 'Melbourne', 'NSW': 'Sydney', 'QLD': 'Brisbane', 'ACT': 'Canberra', 'SA': 'Adelaide', 'WA': 'Perth', 'NT': 'Darwin'}
<class 'dict'>
{'VIC': 'Melbourne', 'NSW': 'Sydney', 'QLD': 'Brisbane', 'ACT': 'Canberra', 'SA': 'Adelaide', 'WA': 'Perth', 'NT': 'Darwin'}
<class 'dict'>

Accessing dictionary element#

We can access the value of an element of dictionary by specifying its key in square brackets [].

d = dict([(<key1>, <value>), (<key2>, <value>), ... ])
x = d[<key2>]
#Retrieving an element value by specifying its key in square brackets ([]):
capital = {'VIC' : 'Melbourne',
           'NSW' : 'Sydney',
           'QLD' : 'Brisbane',
           'ACT' : 'Canberra',
           'SA'  : 'Adelaide',
           'WA'  : 'Perth',
           'NT'  : 'Darwin'
}
print(capital['VIC'])

#Referring to non-existent key raises a KeyError exception
print(capital['TAS'])
Melbourne
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[51], line 13
     10 print(capital['VIC'])
     12 #Referring to non-existent key raises a KeyError exception
---> 13 print(capital['TAS'])

KeyError: 'TAS'
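
To avoid the KeyError, we can check for the key first with the in operator, or use the dictionary method .get(), which returns None (or a default of our choice) when the key is missing. A minimal sketch using a shortened version of the capital dictionary:

```python
capital = {'VIC': 'Melbourne', 'NSW': 'Sydney'}

# Check whether a key exists before using it
has_tas = 'TAS' in capital
print("Is TAS a key?", has_tas)

# .get() returns None instead of raising KeyError
tas_capital = capital.get('TAS')
print("capital.get('TAS'):", tas_capital)

# A second argument supplies the value to return when the key is missing
tas_with_default = capital.get('TAS', 'unknown')
print("capital.get('TAS', 'unknown'):", tas_with_default)

# Assigning to a new key adds the pair to the dictionary
capital['TAS'] = 'Hobart'
print(capital['TAS'])
```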

An empty dictionary and how to fill it up#

We can specify an empty dictionary which can then be filled up.

d = {}
# Empty dictionary and filling it incrementally

# an empty dictionary called person
person = {}
print(type(person))

# building up the dictionary
person['firstname'] = 'Joe'
person['lastname'] = 'Fonebone'
person['age'] = 51
person['spouse'] = 'Edna'
person['children'] = ['Ralph', 'Betty', 'Joey']
person['pets'] = {'dog': 'Fido', 'cat': 'Sox'}

# accessing the element values
print(person)
print(person['firstname'])
print(person['age'])
print(person['children'])
<class 'dict'>
{'firstname': 'Joe', 'lastname': 'Fonebone', 'age': 51, 'spouse': 'Edna', 'children': ['Ralph', 'Betty', 'Joey'], 'pets': {'dog': 'Fido', 'cat': 'Sox'}}
Joe
51
['Ralph', 'Betty', 'Joey']

sublist and subdictionary#

When we define the person dictionary as follows, we notice that the value of person['children'] is a list and the value of person['pets'] is a dictionary.

person = {}
person['firstname'] = 'Joe'
person['lastname'] = 'Fonebone'
person['age'] = 51
person['spouse'] = 'Edna'
person['children'] = ['Ralph', 'Betty', 'Joey']
person['pets'] = {'dog': 'Fido', 'cat': 'Sox'}

To access the elements of this sublist and subdictionary, we add a further index or key as appropriate.

#  accessing sublist/subdictionary
person = {}
person['firstname'] = 'Joe'
person['lastname'] = 'Fonebone'
person['age'] = 51
person['spouse'] = 'Edna'
person['children'] = ['Ralph', 'Betty', 'Joey']
person['pets'] = {'dog': 'Fido', 'cat': 'Sox'}

#since 'children' refers to a sublist element
print(person['children'][-1])

#since 'pets' refers to a subdictionary element
print(person['pets']['cat'])
Joey
Sox

Testing, Debugging, and Exception#

Defensive Programming#

In defensive programming, our objective is to eliminate bugs from our program. How can we achieve this? By adopting a programming style that makes the program easy to test and debug. For example, we need to:

  • Modularise the program (use functions with specific purposes instead of a single long program with many purposes)

  • Write specifications for the functions

  • Check conditions on inputs and output (that is, use assertions)
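
The three points above can be combined in one small function. The following is a sketch only; count_vowels is a hypothetical example function, with its specification written as a docstring and its input and output conditions checked with assert.

```python
def count_vowels(text):
    """Return the number of vowels (a, e, i, o, u) in text, ignoring case.

    Assumes: text is a string.
    """
    # Check the condition on the input
    assert isinstance(text, str), "text must be a string"

    count = 0
    for char in text.lower():
        if char in 'aeiou':
            count += 1

    # Check the condition on the output
    assert 0 <= count <= len(text), "count must be between 0 and len(text)"
    return count


print(count_vowels("Melbourne"))
```

If an assertion fails, Python raises an AssertionError with the given message, which points us directly at the violated condition.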

Testing and validation refer to the step where we compare input/output pairs against the program/function specifications. The aim is to find out when the program fails to work, by asking how we can break our program.

Debugging is the study/analysis of the events that lead up to a program error, with the intention of finding out why the program is not working and how to fix it.

Steps for easy testing and debugging#

From the beginning, we need to design our code to make testing and debugging easier. For example, we need to break the program up into modules that can be tested and debugged individually. We need to document the constraints that apply to each module, to state what the expected inputs and outputs are and to record any assumptions behind the code design.

When should we test?

  • When we want to ensure that the code runs. Usually this is done by testing to remove syntax errors or static semantic errors. Note that these errors can usually be detected by the Python interpreter.

  • When we want to ensure that our program produces the expected results. To do this, we need a set of test inputs and, for each input, the expected output. This test cannot be done by the Python interpreter alone.

Depending on the scope, we can define three different types of test:

  • Unit testing

    • Validating each piece of the program

    • Testing each function separately

  • Regression testing

    • Adding tests for bugs as you find them

    • Catching any reintroduced error previously fixed

  • Integration testing

    • Does the overall program work?

We tend to rush into integration testing. We may also need to repeat each of these tests, incorporating the findings of the earlier test cycles.
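
A unit test and a regression test can be as simple as a few assert statements that compare a function’s output with the expected output. In the sketch below, last_word and the empty-sentence bug it guards against are hypothetical examples made up for illustration.

```python
def last_word(sentence):
    """Return the last whitespace-separated word of sentence, or '' if there is none."""
    words = sentence.split()
    if not words:   # guard for the (hypothetical) empty-sentence bug
        return ''
    return words[-1]


def test_last_word():
    # Unit tests: typical inputs paired with their expected outputs
    assert last_word("data science fundamentals") == "fundamentals"
    assert last_word("single") == "single"
    # Regression tests: inputs that (hypothetically) once crashed the function
    assert last_word("") == ""
    assert last_word("   ") == ""
    return True


tests_passed = test_last_word()
print("All tests passed:", tests_passed)
```

Keeping such tests and re-running them after every change is what catches a previously fixed error if it is reintroduced.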

Debugging#

As said before, in defensive programming, our goal is to have a bug-free program, so debugging is often essential. However, debugging can be difficult for beginners because of its steep learning curve. To make it easier, there are some debugging tools and techniques, including:

  • pdb module (part of the Python standard library, so it is available in IDLE and Anaconda)

  • Python Tutor (https://pythontutor.com/)

  • Use lots of print statements

  • Use your brain (be systematic in your hunt for bugs)

Debugging with print() statements is often a good way to test a hypothesis. There are several places where we can put print statements for debugging purposes:

  • Entering a function or loop

  • Parameters

  • Function results

We can also apply the bisection method: put a print statement halfway through the code and then, depending on the values printed, decide which half of the program the bug is likely to be in.
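
As a sketch of where such print statements might go, the made-up function below prints its parameters on entry, the running value inside the loop, and the result before returning. The function total_price and the DEBUG prefix are illustrative choices only.

```python
def total_price(prices, discount):
    # On entering the function: show the parameters
    print("DEBUG entering total_price; prices =", prices, "discount =", discount)

    total = 0
    for price in prices:
        total += price
        # Inside the loop: show how the running total develops
        print("DEBUG inside loop; price =", price, "running total =", total)

    result = total * (1 - discount)
    # Before returning: show the function result
    print("DEBUG result =", result)
    return result


final = total_price([10, 20, 30], 0.5)
print("Final price:", final)
```

Once the bug has been found, the debugging print statements should be removed or commented out.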

Understanding Python’s error messages, such as those listed below, is crucial for successful debugging.

  • IndexError: Accessing beyond the limits of a list

  • TypeError: Converting an inappropriate type

  • NameError: Referencing a non-existent variable

  • TypeError: Mixing data types without appropriate coercion

  • SyntaxError: Forgetting to close a parenthesis, quotation mark, etc.
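To make these messages concrete, the sketch below deliberately triggers four of the errors listed above and prints the message Python produces for each. The function names are hypothetical, and the try ... except block used to keep the program running is covered later in this lecture.

```python
# Each small function deliberately triggers one of the errors listed above
def index_error():
    items = [1, 2, 3]
    return items[3]            # valid indices are 0, 1 and 2

def type_error_conversion():
    return int([1, 2])         # a list cannot be converted to an int

def name_error():
    return undefined_name      # this variable was never defined

def type_error_mixing():
    return "Total: " + 5       # str + int without coercion such as str(5)

caught = []
for trigger in (index_error, type_error_conversion, name_error, type_error_mixing):
    try:
        trigger()
    except (IndexError, TypeError, NameError) as err:
        # Record the name of the exception and show Python's own message
        caught.append(type(err).__name__)
        print(f"{type(err).__name__}: {err}")
```

SyntaxError is left out of this sketch because Python reports it while parsing the code, before any statement is executed.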

However, logic errors are usually much harder to find and solve. Hence, it is advisable to:

  • Think carefully before writing new code by ensuring that our pseudocode works. Drawing a logic flow diagram can be very helpful.

  • Take a break. Don’t introduce new, complex code when you are already tired.

  • Try to explain the logic of your code to someone else (a co-pilot) or to a non-expert.

The DON’Ts and DOs of debugging include:

| DON’T                    | DO                          |
|--------------------------|-----------------------------|
| Write the entire program | Write a function            |
| Test the entire program  | Test a function             |
| Debug the entire program | Debug a function            |
|                          | Then do integration testing |

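| DON’T                                              | DO                                        |
|----------------------------------------------------|-------------------------------------------|
| Change code                                        | Back up code                              |
| Memorise where the bug was                         | Change code                               |
| Test code                                          | Write down the potential bug in a comment |
| Forget where the bug was or what changes were made | Test code                                 |
| Panic                                              | Compare the new version with the old      |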

Python Exception#

When our code hits an unexpected condition, we may get an exception (that is, an exception to what was expected). Some common Python built-in exceptions were listed earlier and are repeated below:

  • IndexError: accessing an element beyond the list’s boundaries

  • SyntaxError: Python can’t parse the program

  • NameError: local or global name not found

  • AttributeError: attribute reference fails

  • TypeError: operand doesn’t have correct type

  • ValueError: operand type okay, but value is illegal

  • IOError: IO system reports malfunction (e.g. file not found); in Python 3, IOError is an alias of OSError

The SyntaxError exception is the easiest to detect and fix if we use a syntax-aware IDE such as Spyder, as shown in the screenshot below:

[Screenshot: a SyntaxError flagged in Spyder]

# SyntaxError from forgetting to close a string
mystring = 'This string has no trailing apostrophe
print(mystring)
  Cell In[55], line 2
    mystring = 'This string has no trailing apostrophe
                                                      ^
SyntaxError: EOL while scanning string literal

try ... except block#

If an error occurs at run time, our program will stop or crash unless we have prepared instructions to handle the error. Python provides ways to handle exceptions, such as the try ... except block shown in the following diagram.

[Diagram: flow of a try ... except block]

This way, an exception raised by any statement in the body of try is handled by the except clause: the remaining statements in the try body are skipped, and execution continues with the body of the except clause.

For example, in the following code block, if the user provides an invalid number as input (say, 0 as the second number), then a/b will not be printed because a ZeroDivisionError exception is raised. Instead, “Bug in user input.” will be printed. Of course, we may do things other than printing “Bug in user input.” in the except block. For example, if we are in a loop, we can break out of the loop when an exception is detected in the try block.

# Handling exceptions using the "try ...except" block
try:
    a = int(input("Tell me one number:"))
    b = int(input("Tell me another number:"))
    print(a/b)
except:
    print("Bug in user input.")

# Outside the try block, the same statement is not protected:
# if it raises an exception, Python stops and shows the exception message
print(a/b)
Bug in user input.

Given the many possible causes of errors, and that the Python interpreter can provide various exception messages, we often need to handle different exceptions differently. In the example below, we distinguish ValueError, raised when the user enters a non-numerical value, from ZeroDivisionError, raised when the user enters numbers that result in division by zero. For each case we provide a tailored message. Lastly, we capture any other exception and provide a message about it.

Let’s look at the try-block:

a = int(input("Tell me one number: ")) 
b = int(input("Tell me another number: ")) 
print(a/b)

The int() call in the first statement (a=...) raises a ValueError exception when the input cannot be converted to an integer. If no ValueError is raised, the next statement (b=...) is executed, and it can raise ValueError in the same way. If it does not, the third statement (print(a/b)) is executed. In particular, if b=0 then a/b raises a ZeroDivisionError exception.

#%% Handling specific exceptions
try:
    a = int(input("Tell me one number: "))
    b = int(input("Tell me another number: "))
    print("a/b = ", a/b)
    print("a+b = ", a+b)
# if a ValueError exception is raised, give an appropriate message
except ValueError:
    print("Could not convert to a number.")
# if a ZeroDivisionError exception is raised
except ZeroDivisionError:
    print("Can't divide by zero")
# if any other type of exception is raised
except Exception:
    print("Something went very wrong.")
a/b =  1.25
a+b =  9
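To see Python’s own exception message while still handling the error, the exception can be bound to a name with as. The following is a minimal sketch; describe_division is a hypothetical helper that takes two strings instead of calling input(), so that it can be run without user interaction.

```python
def describe_division(a_text, b_text):
    """Try to compute a/b from two strings and report what happened."""
    try:
        a = int(a_text)
        b = int(b_text)
        return f"a/b = {a / b}"
    except ValueError as err:
        # err holds the exception object; printing it gives Python's message
        return f"Could not convert to a number: {err}"
    except ZeroDivisionError as err:
        return f"Can't divide by zero: {err}"


print(describe_division("5", "4"))     # valid input
print(describe_division("five", "4"))  # triggers ValueError
print(describe_division("5", "0"))     # triggers ZeroDivisionError
```

Returning the message instead of printing it inside the function also makes each case easy to check in a test.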

try ... except ... else block#

Sometimes we want to execute different statements depending on whether or not an exception occurs. This can be accomplished by using the try...except...else block. First, within the try block we put the statements that we believe could raise exceptions. Then, these possible exceptions are handled within the except blocks. Finally, if no exception was raised, the statements in the else block are executed.

[Diagram: flow of a try ... except ... else block]

In the example below, we know that converting the input to int and computing the division could raise exceptions. Hence, we put these statements within the try block. We then handle three possible types of exceptions. If there is no exception, we can be sure that the division and addition were computed correctly, and we therefore display the results.

# try ... except ... else
try:
    a = int(input("Tell me one number: "))
    b = int(input("Tell me another number: "))
    division = a/b
    addition = a+b
# if a ValueError exception is raised, give an appropriate message
except ValueError:
    print("Could not convert to a number.")
# if a ZeroDivisionError exception is raised
except ZeroDivisionError:
    print("Can't divide by zero")
# if any other type of exception is raised
except Exception:
    print("Something went very wrong.")
else:
    print("a/b =", division)
    print("a+b =", addition)

try ... except ... else ... finally block#

If we want to execute some statements regardless of whether an exception occurs, we can use the finally block, whose statements are always executed.

[Diagram: flow of a try ... except ... else ... finally block]

# try ... except ... else ... finally
again = True
while again:
    try:
        a = int(input("Tell me one number: "))
        b = int(input("Tell me another number: "))
        division = a/b
        addition = a+b
    # if a ValueError exception is raised, give an appropriate message
    except ValueError:
        print("Could not convert to a number.")
    # if a ZeroDivisionError exception is raised
    except ZeroDivisionError:
        print("Can't divide by zero")
    # if any other type of exception is raised
    except Exception:
        print("Something went very wrong.")
    # only when no exception was raised
    else:
        print("a/b =", division)
        print("a+b =", addition)
    # always executed, with or without an exception
    finally:
        again = ('yes' == input("Do you want to try again (yes or no)? "))
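The order in which the blocks run can be checked with a small sketch. The hypothetical function trace_blocks below (an assumption for illustration) records the name of each block it passes through while computing a / b.

```python
def trace_blocks(a, b):
    """Return the list of blocks that ran while computing a / b."""
    ran = ["try"]
    try:
        a / b
    except ZeroDivisionError:
        ran.append("except")    # only when the division failed
    else:
        ran.append("else")      # only when no exception was raised
    finally:
        ran.append("finally")   # always, with or without an exception
    return ran


print(trace_blocks(5, 4))   # ['try', 'else', 'finally']
print(trace_blocks(5, 0))   # ['try', 'except', 'finally']
```

In both cases the finally block runs last, which is why it is a suitable place for statements that must always be executed.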

Custom function (A quick intro)#

In our previous discussion of the DOs and DON’Ts of defensive programming, we emphasised the use of functions, each accomplishing a specific task, instead of one long program that tries to achieve many different tasks.

In Python, we can define a custom function as follows:

def function_name(argument1, argument2, ...):

    <code that performs specific function>

    return <values-to-return>

A function definition starts with the keyword def, followed by function_name and a set of arguments in parentheses (), then a colon (:). The body of the function is indented (4 spaces by convention) and usually ends with a return statement. If there is no value to return, the return statement can be omitted, in which case the function returns None.

In the example below we define a custom function to compute the sum of squares of two input numbers. Note that the example also shows how a function should be documented using a docstring. Lastly, the example shows how to invoke the function.

#%% Defining and invoking a custom function

# Defining my custom function
# Note: Spyder will automatically prefill your docstring
#       template as soon as you type the triple quotes
def sum_of_squares(x, y):
    """
    
    Parameters
    ----------
    x : TYPE
        DESCRIPTION.
    y : TYPE
        DESCRIPTION.

    Returns
    -------
    result : TYPE
        DESCRIPTION.

    """
    result = x**2 + y**2
    return result

print(f"Given (x,y) = (5,5), then x^2 + y^2 = {sum_of_squares(5,5)}")

print(f"Given (x,y) = (2,3), then x^2 + y^2 = {sum_of_squares(2,3)}")
Given (x,y) = (5,5), then x^2 + y^2 = 50
Given (x,y) = (2,3), then x^2 + y^2 = 13
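The docstring template above still contains the TYPE and DESCRIPTION placeholders. One possible way to fill it in is sketched below; the wording and the types "int or float" are assumptions for illustration, not the lecture’s official documentation.

```python
def sum_of_squares(x, y):
    """
    Compute the sum of the squares of two numbers.

    Parameters
    ----------
    x : int or float
        The first number.
    y : int or float
        The second number.

    Returns
    -------
    result : int or float
        The value of x**2 + y**2.

    """
    result = x**2 + y**2
    return result


# The docstring is stored on the function object and is displayed by help()
print(sum_of_squares.__doc__)
help(sum_of_squares)
```

A filled-in docstring means that anyone who calls help(sum_of_squares) can see how to use the function without reading its body.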

The next custom function checks whether a given string consists only of alphanumeric characters (that is, letters and digits). The function uses the string’s built-in method isalnum() (see W3Schools) to do the checking. What our custom function adds is printing an appropriate message.

def myisalnum(s):
    # Print a message saying whether s consists only of letters and digits
    if s.isalnum():
        print(s + ' consists of only alphanumeric characters')
    else:
        print(s + ' contains non-alphanumeric characters')

# Invoke the function with two test strings
myisalnum('abc123')
myisalnum('abc$123')
abc123 consists of only alphanumeric characters
abc$123 contains non-alphanumeric characters
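A function that only prints has no return statement, so calling it gives back None. The sketch below contrasts the two styles using two hypothetical functions, describe_alnum and print_only (assumptions for illustration): the first returns its message so that the caller can store or test it, the second only prints.

```python
def describe_alnum(s):
    """Return a message saying whether s is purely alphanumeric."""
    if s.isalnum():
        return s + ' is alphanumeric'
    return s + ' is not alphanumeric'


def print_only(s):
    print(s)                   # no return statement in this function


message = describe_alnum('abc123')   # the returned string can be stored
print(message)

nothing = print_only('abc$123')      # prints the string, returns None
print(nothing)
```

Returning a value rather than printing it usually makes a function easier to test, because the result can be compared directly with the expected output.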