16/03/2026

#EECS#Python#Intro

← Back to MCEA: Mathematical Thinking and Python

Video

Lesson assets

alternate video

Alternate video: freeCodeCamp Python basics

A longer outside walkthrough if you want the same foundations taught with a different pacing and style.

recitation

Recitation 2 PDF

Homework 2 PDF

Thanks Spivak for the Exercise List.

Bases, Branching, Division, and Strings

Number Bases

This morning, we constructed the number systems $\mathbb{N}, \mathbb{Z}, \mathbb{Q}, \mathbb{R}$ and $\mathbb{C}$ , building each from the insufficiency of the last. Throughout that construction, every number was written in its familiar decimal form. But the string " $2812$ " is a representation of a number, not the number itself. Each digit occupies a position whose value is determined by successive powers of ten:

2812 = 2 \cdot 10^3 + 8 \cdot 10^2 + 1 \cdot 10^1 + 2 \cdot 10^0

The rightmost digit sits in the units place ( $10^0 = 1$ ), and moving left, each position is worth ten times the previous. The leftmost " $2$ " in $2812$ therefore represents two thousand, while the rightmost " $2$ " represents two. This is a positional numeral system: the meaning of a digit depends entirely on where it appears.

The choice of ten as the base is a biological convention, not a mathematical one, corresponding with the fact that most humans have ten fingers. For many purposes, it is inconvenient. Ten can only be divided evenly by four numbers ( $1, 2, 5$ and $10$ ), which limits the ease of performing fractional arithmetic. A base built on twelve, which can be divided evenly six ways ( $1, 2, 3, 4, 6$ and $12$ ), would be considerably more convenient. More consequentially for this course, digital circuits operate in two voltage states (high and low), making base two the natural language of computation. In binary, arithmetic can be carried out directly by sequences of logic gates in an electrical circuit.

It is therefore worthwhile to understand positional numeral systems in bases other than ten.

Definition 33 (Base-

b

Expansion)

Let $b$ be a natural number with $b > 1$ . The base- $b$ expansion of a natural number $n$ is the string of digits $d_r d_{r-1} \ldots d_0$ satisfying:

(i) $n = d_r \cdot b^r + d_{r-1} \cdot b^{r-1} + \cdots + d_0 \cdot b^0$ ;

(ii) Each digit satisfies $0 \leqslant d_i < b$ ; and

(iii) If $n > 0$ , then the leading digit $d_r \neq 0$ . The expansion of zero is $0$ in every base.

The definition is compact. Before working through examples, let us examine what each condition demands.

Condition (i) states that the digits encode how many copies of each power of $b$ sum to give $n$ . In decimal, the positions from right to left correspond to units, tens, hundreds, thousands, and so on. In binary, they correspond to $1, 2, 4, 8, 16, \ldots$ .

Condition (ii) restricts the available digits to $0, 1, \ldots, b - 1$ . In base four, the permissible digits are $0, 1, 2, 3$ . Permitting a digit with value $b$ would destroy uniqueness: if " $\text{X}$ " were a base-ten digit representing ten, then both " $\text{X}2$ " and " $102$ " would represent one hundred and two.

Condition (iii) forbids leading zeros among positive numbers. Without it, " $01423$ " and " $1423$ " would be distinct strings for the same number.

Remark

Certain bases carry standard names: the base- $2$ , $3$ , $8$ , $10$ and $16$ expansions are called binary, ternary, octal, decimal and hexadecimal, respectively.

Example 23

Consider the number $1023$ . Its decimal expansion is simply $1023$ :

1023 = 1 \cdot 10^3 + 0 \cdot 10^2 + 2 \cdot 10^1 + 3 \cdot 10^0

Since $1023 = 2^{10} - 1$ , every power of $2$ from $2^0$ to $2^9$ appears exactly once, giving the binary expansion $1111111111$ :

1023 = 1 \cdot 2^9 + 1 \cdot 2^8 + 1 \cdot 2^7 + 1 \cdot 2^6 + 1 \cdot 2^5 + 1 \cdot 2^4 + 1 \cdot 2^3 + 1 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0

For bases larger than ten, the ten decimal digits $0$ through $9$ are insufficient. The convention is to extend the digit set with letters: $\text{A} = 10, \text{B} = 11, \ldots, \text{Z} = 35$ . In base $36$ , the expansion of $1023$ is $\text{SF}$ :

1023 = 28 \cdot 36^1 + 15 \cdot 36^0 = \text{S} \cdot 36 + \text{F} \cdot 1

When writing a number by its base- $b$ expansion, we need notation to indicate which base is intended.

Definition 34 (Base-

b

Notation)

Let $b > 1$ and let $d_r, d_{r-1}, \ldots, d_0$ be base- $b$ digits. We write

(d_r d_{r-1} \ldots d_0)_b = d_r \cdot b^r + d_{r-1} \cdot b^{r-1} + \cdots + d_0 \cdot b^0

for the natural number with this base- $b$ expansion. When no subscript is present and a base is not otherwise specified, the expansion is assumed to be decimal.

Example 24

Applying this notation, we can express $1023$ simultaneously in several bases:

1023 = (1111111111)_2 = (1101220)_3 = (1777)_8 = (1023)_{10} = (3\text{FF})_{16} = (\text{SF})_{36}

Python provides built-in functions for converting between the most common bases. The functions bin, oct and hex each return a string prefixed with 0b, 0o or 0x to indicate the base:

n = 1023

bin(n)   # '0b1111111111'
oct(n)   # '0o1777'
hex(n)   # '0x3ff'

These prefixes also serve as integer literals, so base conversions can be verified directly:

a = 0b1111111111   # binary
b = 0o1777         # octal
c = 0x3FF          # hexadecimal

a == b == c == 1023   # True

To convert a string written in an arbitrary base back to an integer, int accepts an optional second argument specifying the base:

int('1101220', 3)   # 1023
int('sf', 36)       # 1023

Problem 28

Find the binary, ternary, octal, hexadecimal and base- $36$ expansions of the number $1729$ by hand, using letters $\text{A}$ through $\text{F}$ as additional hexadecimal digits and $\text{A}$ through $\text{Z}$ for base $36$ . Verify your answers in Python.

Problem 29

The octal expansion of a certain natural number is $(2745)_8$ . Determine its decimal value and its hexadecimal expansion.

Problem 30

A three-digit number in base $b$ has the form $(d_2 d_1 d_0)_b$ where $d_2 \neq 0$ . What are the smallest and largest possible values of such a number, expressed in terms of $b$ ?

Branching Programs and Control Flow

The algorithms and computational sequences formulated thus far possess a strictly sequential topology. They are straight-line programmes, executing statements continuously in the exact order they are defined until the instructions are exhausted. While mathematically predictable, straight-line programmes are fundamentally limited. Because every instruction is executed exactly once, the maximum temporal duration of the computation is bounded entirely by the static length of the source code.

To construct more sophisticated imperative knowledge, we must introduce mechanisms that allow the interpreter to dynamically alter its execution path based on the state of the machine. We achieve this via branching programmes, which utilise conditional statements to dictate control flow.

Conditional Statements and Semantic Indentation

A conditional statement evaluates a proposition (Definition 1) and diverges the execution path based on whether the result is True ( $\top$ ) or False ( $\bot$ ). The most fundamental branching structure is the if-else construct.

This structure consists of:

A logical test evaluating strictly to True or False.
A primary block of instructions executed if and only if the test evaluates to True.
An optional secondary block (the else clause) executed if and only if the test evaluates to False.

Computationally, Python enforces this topological structure through strict semantic indentation. While formal mathematical texts use formatting for human readability, Python interpreters rely on whitespace to define the boundaries of logical blocks.

x = 14

if x % 2 == 0:
    print("The integer is even.")
else:
    print("The integer is odd.")

print("Branching complete.")

The expression x % 2 computes the remainder when x is divided by $2$ . Equivalently, it extracts the last digit of the binary expansion of x (Definition 33): it is $0$ when x is even and $1$ when x is odd.

If the terminal print statement were indented, the interpreter would classify it as part of the else block, fundamentally altering the programme’s static semantics. This architectural decision ensures that the visual hierarchy of the script perfectly mirrors its logical hierarchy.

Because indentation governs logic, the definition of a single line of code becomes paramount. To prevent long mathematical expressions from violating readability or indentation rules, Python permits explicit line continuation using the backslash (\) character, or implicit line continuation within enclosing structural brackets (parentheses, square brackets, or braces).

# Explicit line continuation
alpha = 1.61803398875 + \
        2.71828182845 + \
        3.14159265359

# Implicit line continuation (mathematically preferred)
beta = (1.61803398875 +
        2.71828182845 +
        3.14159265359)

Nested Conditionals and Mutually Exclusive Branching

When the resolution of one conditional block necessitates further logical evaluation, conditional statements may be nested. This constructs a decision tree where each subsequent branch is dependent on the prior evaluating to True or False.

For mutually exclusive propositions, writing deeply nested else and if blocks quickly becomes unreadable. Python condenses this structural nesting via the elif (else if) keyword, which evaluates a new proposition if and only if all preceding tests in the chain evaluated to False.

n = 15

if n % 2 == 0:
    if n % 3 == 0:
        print("n is a multiple of 6.")
    else:
        print("n is even, but not divisible by 3.")
elif n % 3 == 0:
    print("n is odd and divisible by 3.")
else:
    print("n is not divisible by 2 or 3.")

Algorithmic Efficiency and State Updating

Conditionals permit compound Boolean expressions, utilising the Python operators and, or and not (the counterparts of the logical connectives $\land$ , $\lor$ and $\neg$ from Lesson 1), to assess multiple variables simultaneously. However, how we structure these logical tests drastically impacts algorithmic elegance and computational efficiency.

Consider the problem of identifying the maximum positive integer among three variables: $a$ , $b$ and $c$ . If none are positive, the algorithm must return the minimum value among them.

A naive approach might construct a mutually exclusive test for every possible combination of states. Since each of the three variables can either be positive or non-positive, there are $2^3 = 8$ distinct cases. Exhaustively testing each using and statements requires cumbersome code and forces the interpreter to evaluate the same relational propositions repeatedly.

By leveraging the concept of variable binding and state updating, we can construct a significantly more elegant solution. We initialise a provisional binding, sequentially test individual propositions, and update the state strictly when required.

a, b, c = -5, 12, 7

# Assume none are positive initially; find the global minimum.
target_val = min(a, b, c)

# Sequentially update the state if a larger positive integer is discovered.
if a > 0:
    target_val = a
if b > 0 and b > target_val:
    target_val = b
if c > 0 and c > target_val:
    target_val = c

print(target_val)

This paradigm isolates the logical checks. It examines the polarity of each variable exactly once, demonstrating that well-structured imperative knowledge relies not just on valid syntax, but on the efficient manipulation of temporal state.

Conditional Expressions

For discrete, inline evaluations, Python supplies conditional expressions, which resolve an entire if-else branch into a single evaluable expression:

expression_if_true if proposition else expression_if_false

This permits the assignment of variables based on propositional logic without requiring multi-line blocks. A natural first application is the absolute value.

Definition 35 (Absolute Value)

The absolute value of a real number $x$ , denoted $|x|$ , is defined by

|x| = \begin{cases} x & \text{if } x \geqslant 0 \\ -x & \text{if } x < 0 \end{cases}

Equivalently, $|x|$ is the distance from $x$ to $0$ on the number line (Definition 20).

Because the absolute value is defined by cases, there are several ways to compute it. The most direct is a standard if block:

x = -10

if x < 0:
    x = -x

For a single-line body, Python permits writing the block on the same line as the if:

if x < 0: x = -x   # only for very short statements

One might also try to avoid branching entirely by exploiting the fact that True and False behave as $1$ and $0$ in arithmetic:

y = (x < 0) * (-x) + (x >= 0) * x   # this is awful

This works, but it is unreadable, evaluates both branches unconditionally, and obscures the logic. The conditional expression is the cleanest approach:

y = x if x >= 0 else -x

Python also provides the built-in abs for this purpose: abs(-10) returns 10.

With the absolute value defined, we can establish its most important mathematical property. Observe first that $|a| \geqslant 0$ for every real number $a$ , with $|a| = 0$ only when $a = 0$ . Moreover, every real number is bounded by its own absolute value:

-|a| \leqslant a \leqslant |a|

If $a \geqslant 0$ , then $|a| = a$ and $-|a| = -a \leqslant 0 \leqslant a$ . If $a < 0$ , then $|a| = -a > 0 > a$ and $-|a| = -(-a) = a$ . In both cases, the bound holds. This seemingly simple observation yields one of the most widely used inequalities in mathematics.

Theorem 7 (Triangle Inequality)

For all real numbers $a$ and $b$ ,

|a + b| \leqslant |a| + |b|

Proof

Since $-|a| \leqslant a \leqslant |a|$ and $-|b| \leqslant b \leqslant |b|$ , adding these inequalities gives

-(|a| + |b|) \leqslant a + b \leqslant |a| + |b|

Let $M = |a| + |b|$ . Then $-M \leqslant a + b \leqslant M$ . If $a + b \geqslant 0$ , then $|a + b| = a + b \leqslant M$ . If $a + b < 0$ , then $|a + b| = -(a + b) \leqslant M$ , since $a + b \geqslant -M$ . In either case, $|a + b| \leqslant |a| + |b|$ .

■

Problem 31

For which pairs of real numbers $a, b$ does equality hold in the Triangle Inequality? Prove your claim.

Constant Time Complexity

While branching conditionals drastically expand the utility of imperative programmes, they do not overcome the foundational limitation of straight-line execution.

If we define an atomic operation as taking one discrete unit of computational time, a branching programme comprising $N$ lines of code cannot possibly exceed $N$ units of time to run. Because each conditional block is strictly directional and no line is executed more than once, the maximum temporal execution is hard-bounded by a constant $k$ (where $k \leqslant N$ ).

Definition 36 (Constant Time)

An algorithm runs in constant time if its maximal execution time is bounded by a constant, independent of the size or magnitude of its inputs.

Remark (Beyond Constant Time)

Constant-time algorithms are highly restrictive. Consider the mathematical problem of calculating the factorial of a given integer $n$ . A constant-time branching programme cannot solve this for an arbitrary $n$ , as it would require a hard-coded conditional branch for every possible integer. To process computations whose execution length scales with the size of the input, our logical framework must be expanded to include cyclic control flow, abandoning constant-time limitations in favour of indefinite iteration.

Problem 32

Given integer variables a, b and c representing the coefficients of the polynomial $ax^2 + bx + c$ (Definition 31), write a Python programme that computes the discriminant $\Delta = b^2 - 4ac$ and prints whether the polynomial has two distinct real roots ( $\Delta > 0$ ), one repeated real root ( $\Delta = 0$ ), or no real roots ( $\Delta < 0$ ). Handle the case $a = 0$ separately: when $a = 0$ , the expression is linear, not quadratic.

Problem 33

Consider the following rule applied to integers: if $n$ is odd, replace it by $3n + 1$ ; if $n$ is even, replace it by $n/2$ . Write a Python programme that applies this rule three times in succession starting from a given positive integer n, using conditional expressions. Verify that starting from $n = 7$ , three applications produce $34$ .

Division and Divisibility

With the integers (Definition 24) and multiplication (Definition 25) established this morning, we can formalise one of the most fundamental relationships between them.

Definition 37 (Divisibility)

Let $a, b \in \mathbb{Z}$ . We say $b$ divides $a$ , written $b \mid a$ , if there exists an integer $q$ such that $a = qb$ . Equivalently: $a$ is divisible by $b$ , $b$ is a divisor (or factor) of $a$ , and $a$ is a multiple of $b$ .

Perhaps counterintuitively, this definition does not invoke the arithmetic operation of division (Definition 26). It is stated entirely in terms of multiplication.

Example 25

The integer $12$ is divisible by $1, 2, 3, 4, 6$ and $12$ , since

12 = 12 \cdot 1 = 6 \cdot 2 = 4 \cdot 3 = 3 \cdot 4 = 2 \cdot 6 = 1 \cdot 12

It is also divisible by the negatives of each: for instance, $(-3) \mid 12$ since $12 = (-4) \cdot (-3)$ .

Problem 34

Prove that $1$ divides every integer, and that every integer divides $0$ .

Remark (Zero Divides Zero)

A consequence is that $0 \mid 0$ , since $0 = q \cdot 0$ for any integer $q$ . This may seem surprising: we have been told our whole lives that we cannot divide by zero. But divisibility is not the same as performing division. Saying that $0$ divides $0$ simply means that $0$ can be multiplied by an integer to obtain $0$ , which is true. This does not imply that the expression $\frac{0}{0}$ can or should be meaningfully defined.

Using this definition, we can establish general facts about how divisibility propagates.

Theorem 8 (Transitivity of Divisibility)

Let $a, b, c \in \mathbb{Z}$ . If $c \mid b$ and $b \mid a$ , then $c \mid a$ .

Proof

Suppose $c \mid b$ and $b \mid a$ . By the definition of divisibility, there exist integers $q$ and $r$ such that $b = qc$ and $a = rb$ . Substituting the first equation into the second:

a = r(qc) = (rq)c

Since $rq$ is an integer, $c \mid a$ .

■

Problem 35

Let $a, b, d \in \mathbb{Z}$ . Suppose $d \mid a$ and $d \mid b$ . Prove that for any integers $u$ and $v$ , $d$ divides $au + bv$ .

We have used the words “even” and “odd” informally throughout these notes. To prove anything about evenness, we need a precise definition. Intuitively, an even integer is one that is “evenly divisible” by $2$ , but this circular phrasing is not a definition. One might try: $n$ is even if $\frac{n}{2}$ is an integer. This works in practice, but it depends on already knowing what “integer” means in a way that is hard to pin down without further machinery. A cleaner approach avoids division entirely and uses the concept we have just formalised.

Definition 38 (Even and Odd)

An integer $n$ is even if $2 \mid n$ , that is, if $n = 2k$ for some integer $k$ . An integer that is not even is odd.

This definition asks only for a multiplicative witness: to verify that $14$ is even, we exhibit $k = 7$ and check $14 = 2 \cdot 7$ . To verify that $9$ is odd, we must show that no integer $k$ satisfies $9 = 2k$ . With a precise definition in hand, we can now ask and answer questions that would otherwise be vague. Does the sum of two even numbers remain even? Is the product of an odd number and an even number always even? These are provable claims, not intuitions.

The Division Theorem

It is not just interesting to know when one integer divides another. Equally important is understanding what happens when the division is not exact. The following result, stated without proof, tells us that every integer division produces a unique quotient and remainder.

Theorem 9 (Division Theorem)

Let $a, b \in \mathbb{Z}$ with $b \neq 0$ . There exist unique integers $q$ and $r$ such that

a = qb + r \quad \text{and} \quad 0 \leqslant r < |b|

The integer $q$ is called the quotient and $r$ is called the remainder.

Example 26

The integer $12$ leaves a remainder of $2$ when divided by $5$ , since $12 = 2 \cdot 5 + 2$ . The quotient is $q = 2$ and the remainder is $r = 2$ .

What happens when the dividend is negative?

Theorem 10 (Negation of Remainders)

Let $a$ and $b$ be integers with $b > 0$ . If $a$ leaves a remainder of $r > 0$ when divided by $b$ , then $-a$ leaves a remainder of $b - r$ when divided by $b$ .

Proof

By the Division Theorem, $a = qb + r$ for some integer $q$ . Then

-a = -qb - r = -(q + 1)b + b - r = -(q + 1)b + (b - r)

Since $0 < r < b$ , we have $0 < b - r < b$ . The quotient of $-a$ when divided by $b$ is therefore $-(q + 1)$ , and the remainder is $b - r$ .

■

Problem 36

Prove that if an integer $a$ leaves a remainder of $r$ when divided by $b > 0$ , then $a$ leaves a remainder of $r$ when divided by $-b$ .

Python’s divmod returns both the quotient and remainder simultaneously, directly implementing the Division Theorem:

divmod(12, 5)    # (2, 2)   since 12 = 2*5 + 2
divmod(-12, 5)   # (-3, 3)  since -12 = (-3)*5 + 3

The quotient and remainder are also accessible individually via // (floor division) and % (modulo):

12 // 5    # 2
12 % 5     # 2

-12 // 5   # -3
-12 % 5    # 3

The second pair confirms the Negation of Remainders: $12$ leaves remainder $2$ when divided by $5$ , and $-12$ leaves remainder $5 - 2 = 3$ .

Computing Base Expansions

We can now connect the Division Theorem to the material on number bases. In Definition 33, we defined the base- $b$ expansion of a natural number, but did not describe how to compute one. The Division Theorem provides exactly this.

The natural number $n$ whose base- $b$ expansion is $d_r d_{r-1} \cdots d_1 d_0$ satisfies

n = d_0 + b(d_1 + b(d_2 + \cdots + b(d_{r-1} + b \, d_r) \cdots ))

Since $0 \leqslant d_i < b$ , dividing $n$ by $b$ yields remainder $d_0$ (the last digit) and quotient $n_0 = (n - d_0) / b$ , whose base- $b$ expansion is $d_r d_{r-1} \cdots d_1$ . Repeating this process extracts each digit in turn, from right to left, until the quotient reaches zero.

Example 27

We compute the base- $17$ expansion of $12345$ , using the letters $\text{A}$ through $\text{G}$ to represent $10$ through $16$ .

$12345 = 726 \cdot 17 + 3$ , so $d_0 = 3$ and the quotient is $726$ .

$726 = 42 \cdot 17 + 12$ , so $d_1 = 12 = \text{C}$ and the quotient is $42$ .

$42 = 2 \cdot 17 + 8$ , so $d_2 = 8$ and the quotient is $2$ .

$2 = 0 \cdot 17 + 2$ , so $d_3 = 2$ and the quotient is $0$ . We stop.

The base- $17$ expansion of $12345$ is therefore $(28\text{C}3)_{17}$ . Verification:

2 \cdot 17^3 + 8 \cdot 17^2 + 12 \cdot 17 + 3 = 9826 + 2312 + 204 + 3 = 12345

Each step of the algorithm applies the same pair of operations, // and %, to the current quotient:

n = 12345
b = 17

d0 = n % b       # 3
n0 = n // b       # 726

d1 = n0 % b       # 12 (C)
n1 = n0 // b      # 42

d2 = n1 % b       # 8
n2 = n1 // b      # 2

d3 = n2 % b       # 2
n3 = n2 // b      # 0  → stop

The repetitive structure is deliberate: each step is identical, differing only in the value of the quotient. Automating this process requires a mechanism for repeating instructions until a condition is met, which we do not yet possess. We will return to this when we introduce iteration.

Problem 37

Using the division algorithm above, find the base- $7$ and hexadecimal expansions of $9999$ by hand. Verify each answer in Python.

Strings and Input

When we classified Python’s type system in Lesson 1, we distinguished scalar objects (indivisible atoms like int and bool) from non-scalar objects that possess internal structure. We can now examine the simplest non-scalar type: str, which represents finite sequences of characters.

String Literals and Operator Overloading

A string literal is written by enclosing characters in matching quotation marks. Python accepts both single and double quotes, and the two forms are interchangeable:

greeting = 'hello'
message = "world"

It is essential to distinguish a numeric string from its corresponding integer. The literal '123' is a string of three characters; the literal 123 is an integer. These are objects of entirely different types, and confusing them is among the most common sources of error for new programmers.

The arithmetic operators + and *, which we have so far used for numerical computation, take on new meanings when applied to strings. The expression 'ab' + 'cd' evaluates to 'abcd': the + operator concatenates its operands end to end. The expression 3 * 'ha' evaluates to 'hahaha': the * operator, when applied to an integer and a string, repeats the string. There is a natural logic to this parallel: just as $3 \times 2 = 2 + 2 + 2$ in arithmetic, 3 * 'ha' is equivalent to 'ha' + 'ha' + 'ha'.

3 * 4       # 12
3 * 'a'     # 'aaa'
3 + 4       # 7
'a' + 'a'   # 'aa'

An operator whose behaviour depends on the types of its operands is said to be overloaded. This is not unique to programming; mathematical notation is overloaded routinely. The symbol " $+$ " denotes addition of natural numbers, integers, rationals, and reals, each of which was constructed as a distinct operation in the number systems we built this morning. Python merely extends this convention to non-numeric types.

Not every combination of types is permitted. Attempting to multiply two strings produces an error, because repetition requires one operand to be an integer:

'a' * 'a'
# TypeError: can't multiply sequence by non-int of type 'str'

Errors

When an instruction violates the language’s rules, the interpreter halts and reports an error. In Lesson 1, we distinguished the syntax of a language (its grammatical rules) from its static semantics (whether a grammatically valid string has a logical meaning) and its full semantics (what a programme actually does). Errors in Python fall into three corresponding categories.

A syntax error arises when the source code violates the grammatical rules of the language. The interpreter catches these before any code is executed:

print("Uh oh!)
# SyntaxError: unterminated string literal

A runtime error arises when a syntactically valid instruction encounters an illegal state during execution. The TypeError above is one example. Others include dividing by zero and referencing a name that has not been bound to any object:

print(1 / 0)
# ZeroDivisionError: division by zero

new_id
# NameError: name 'new_id' is not defined

A logical error is the most insidious: the programme executes without complaint but produces an incorrect result. Because the code is syntactically valid and encounters no illegal runtime states, the interpreter cannot detect the mistake. Only the programmer, by inspecting the output, can identify it.

print("2 + 2 =", 5)   # prints: 2 + 2 = 5

Indexing, Slicing, and Length

Strings are ordered sequences of characters, and Python provides operations to inspect and extract their contents.

The built-in len returns the number of characters in a string:

len('abc')   # 3
len('')      # 0

Individual characters are extracted by indexing. Python uses zero-based indexing: the first character occupies position $0$ , the second position $1$ , and in general the $k$ -th character (counting from one) occupies position $k - 1$ . For a string of length $n$ , the valid indices are $0, 1, \ldots, n - 1$ . Accessing an index outside this range produces an error:

s = 'abcde'

s[0]   # 'a'
s[4]   # 'e'
s[5]   # IndexError: string index out of range

Negative indices count backwards from the end: s[-1] is the last character, s[-2] the second-to-last, and so on.

s[-1]   # 'e'
s[-5]   # 'a'

A slice extracts a substring. The expression s[i:j] returns the characters from index $i$ up to but not including index $j$ . This half-open convention ensures that the length of the slice is always $j - i$ , and that s[0:len(s)] recovers the entire string:

s[1:4]   # 'bcd'
s[:3]    # 'abc'   (omitting the start defaults to 0)
s[2:]    # 'cde'   (omitting the end defaults to len(s))
s[:]     # 'abcde' (a copy of the entire string)

An optional third argument specifies the step size. The expression s[i:j:k] selects every $k$ -th character starting from index $i$ . A negative step traverses the string in reverse; the expression s[::-1] reverses a string entirely:

'0123456789'[::2]    # '02468'
'0123456789'[1::2]   # '13579'
'abcde'[::-1]        # 'edcba'

Problem 38

The built-in bin returns the binary expansion of an integer as a string prefixed with '0b' (for example, bin(1023) returns '0b1111111111'). Using slicing and len, write an expression that gives the number of binary digits in the expansion of a positive integer n. Verify that $2^{10} - 1 = 1023$ requires exactly $10$ binary digits.

Problem 39

A string is a palindrome if it reads the same forwards and backwards (for instance, 'racecar' and 'abba'). Using the reversing slice s[::-1] and a conditional expression, write a single Python expression that evaluates to 'Palindrome' if a string s is a palindrome and 'Not a palindrome' otherwise.

Type Conversions and Formatted Output

A type conversion (or type cast) converts an object from one type to another. In Python, the name of the target type serves as the conversion:

int('42')      # 42
float('3.14')  # 3.14
str(1023)      # '1023'

We have already encountered int used this way: the expression int('sf', 36) from the section on number bases converts a base- $36$ string to an integer. When converting a float to an int, Python truncates towards zero rather than rounding:

int(3.9)    # 3
int(-3.9)   # -3

Type conversions become essential when constructing formatted output. Recall that the print function inserts a space between each of its arguments:

fraction = 1/2
num = 30000000

print(num * fraction, 'is', fraction * 100, '% of', num)
# 15000000.0 is 50.0 % of 30000000

The unwanted space before the percent sign arises because '%' is a separate argument. The trailing .0 appears because 1/2 is a float, and the product of an int and a float is a float. Constructing the output as a single string via concatenation and explicit type conversions avoids both issues, but the result is verbose and difficult to read. Python provides a far cleaner mechanism: formatted string literals, or f-strings.

An f-string is a string literal prefixed with f that permits expressions enclosed in curly braces to be evaluated and embedded directly:

print(f'{int(num * fraction)} is {fraction * 100}% of {num}')
# 15000000 is 50.0% of 30000000

The expressions inside the braces are evaluated at runtime and automatically converted to strings. To include a literal curly brace, double it:

print(f'{{{3 * 5}}}')   # {15}

F-strings also support format modifiers, separated from the expression by a colon. The modifier .2f truncates a floating-point number to two decimal places, while , inserts thousands separators:

f'{3.14159:.2f}'   # '3.14'

print(f'{num * fraction:,.0f} is {fraction * 100}% of {num:,}')
# 15,000,000 is 50.0% of 30,000,000

Console Input

Python provides the built-in input for reading data from the user. It displays a prompt string, waits for the user to type a response and press Enter, then returns the response as a str:

name = input('Enter your name: ')
print(f'Hello, {name}!')

Critically, input always returns a string, regardless of what the user types:

n = input('Enter an integer: ')
print(type(n))   # <class 'str'>

If the user enters 3, the variable n is bound to the string '3', not the integer $3$ . Consequently, n * 4 evaluates to '3333' (string repetition), not $12$ (multiplication). To perform arithmetic on user input, an explicit type conversion is required:

n = int(input('Enter an integer: '))
print(n * 4)   # 12

Problem 40

Write a programme that asks the user to enter their birthday in the form dd/mm/yyyy and prints the message 'You were born in the year yyyy.', using slicing to extract the year from the input string.

Problem 41

Write a programme that reads two integers from the user and prints their sum, difference, product, quotient (using //), and remainder (using %), each on a separate line formatted with f-strings.

Character Encoding

Strings are sequences of characters, but computers store only numbers. Each character must therefore be assigned a numeric code. The oldest widely used standard, ASCII, assigns codes $0$ through $127$ to the characters common in English text. For instance, 'A' is encoded as $65$ , 'B' as $66$ , and 'a' as $97$ . Python’s built-in ord returns the code of a character, and chr converts a code back:

ord('A')   # 65
chr(65)    # 'A'
chr(97)    # 'a'

Remark (Characters as Numbers)

With $128 = 2^7$ available codes, ASCII requires exactly $7$ binary digits to represent any character. The connection to the earlier material on number bases is direct: the character 'A', stored as the integer $65$ , is held in memory as the binary string $(1000001)_2$ . Every piece of text you type is, at its lowest level, a sequence of numbers, each stored in binary.

ASCII is limited to $128$ characters, sufficient for English but inadequate for the world’s written languages. The Unicode standard, which defines over $120{,}000$ characters across more than $100$ scripts, is its modern replacement. Python 3 uses Unicode natively, most commonly in the UTF-8 encoding, allowing strings to contain characters from virtually any language.

Short-Circuit Evaluation

Earlier in this lesson, we used the Python operators and, or and not as computational counterparts of the logical connectives $\land$ , $\lor$ and $\neg$ . We noted in Lesson 1 that the conjunction $P \land Q$ is false whenever $P$ is false, regardless of $Q$ , and that the disjunction $P \lor Q$ is true whenever $P$ is true, regardless of $Q$ . Python exploits these logical facts. Rather than always evaluating both operands, and and or evaluate left to right and stop as soon as the outcome is determined. This is called short-circuit evaluation.

For and: if the left operand is False, the entire conjunction must be False, so the right operand is never evaluated. For or: if the left operand is True, the entire disjunction must be True, so the right operand is never evaluated.

The expression 1 / 0 produces a ZeroDivisionError if evaluated. We can use this as a diagnostic to test whether an operand is actually reached:

False and (1 / 0)   # False (right operand skipped)
True and (1 / 0)    # ZeroDivisionError (right operand evaluated)

True or (1 / 0)     # True (right operand skipped)
False or (1 / 0)    # ZeroDivisionError (right operand evaluated)

In each case, the left operand is evaluated first. If its value already determines the result, Python returns it immediately without touching the right operand. If not, the right operand is evaluated and its value determines the result.

Consider testing whether an integer n divides another integer m (Definition 37), but only when n is nonzero:

n = 0
m = 10

n != 0 and m % n == 0   # False (m % n is never evaluated)
m % n == 0 and n != 0   # ZeroDivisionError (order matters!)

The first expression is safe: n != 0 evaluates to False, so the and short-circuits and the dangerous m % n is never reached. The second expression crashes because m % n is evaluated first. The logical content of the two expressions is identical: both ask whether n is nonzero and divides m, but the order of the operands determines whether the programme runs or crashes.

Short-circuit evaluation transforms logical operators into compact conditional control flow. The expression A and B behaves identically to B if A else A, and A or B behaves identically to A if A else B.

Problem 42

Without running the code, determine what each of the following expressions evaluates to, or whether it produces an error. Then verify your answers in Python.

(a) False and (1 / 0)

(b) (1 / 0) and False

(c) True or (1 / 0)

(d) (1 / 0) or True

(e) True and False and (1 / 0)

(f) False or True or (1 / 0)

type versus isinstance

We introduced the built-in type in Lesson 1 as a means of inspecting an object’s classification. Python provides a second mechanism, isinstance, which tests whether an object belongs to a given type:

x = 42

type(x) == int      # True
isinstance(x, int)  # True

y = 'hello'

type(y) == str      # True
isinstance(y, str)  # True

For the types we have encountered so far, the two approaches produce identical results. However, isinstance is generally preferred because it handles a broader range of situations that arise once we study object hierarchies later in the course. As a rule of thumb: use isinstance for type checking, and reserve type for inspection and debugging.