Mascot image.
#EECS#Python#Intro

Bases, Branching, Division, and Strings

Number Bases

This morning, we constructed the number systems N,Z,Q,R\mathbb{N}, \mathbb{Z}, \mathbb{Q}, \mathbb{R} and C\mathbb{C}, building each from the insufficiency of the last. Throughout that construction, every number was written in its familiar decimal form. But the string "28122812" is a representation of a number, not the number itself. Each digit occupies a position whose value is determined by successive powers of ten:

2812=2103+8102+1101+21002812 = 2 \cdot 10^3 + 8 \cdot 10^2 + 1 \cdot 10^1 + 2 \cdot 10^0

The rightmost digit sits in the units place (100=110^0 = 1), and moving left, each position is worth ten times the previous. The leftmost "22" in 28122812 therefore represents two thousand, while the rightmost "22" represents two. This is a positional numeral system: the meaning of a digit depends entirely on where it appears.

The choice of ten as the base is a biological convention, not a mathematical one, corresponding with the fact that most humans have ten fingers. For many purposes, it is inconvenient. Ten can only be divided evenly by four numbers (1,2,51, 2, 5 and 1010), which limits the ease of performing fractional arithmetic. A base built on twelve, which can be divided evenly six ways (1,2,3,4,61, 2, 3, 4, 6 and 1212), would be considerably more convenient. More consequentially for this course, digital circuits operate in two voltage states (high and low), making base two the natural language of computation. In binary, arithmetic can be carried out directly by sequences of logic gates in an electrical circuit.

It is therefore worthwhile to understand positional numeral systems in bases other than ten.

Definition 33 (Base-bb Expansion)

Let bb be a natural number with b>1b > 1. The base-bb expansion of a natural number nn is the string of digits drdr1d0d_r d_{r-1} \ldots d_0 satisfying:

(i) n=drbr+dr1br1++d0b0n = d_r \cdot b^r + d_{r-1} \cdot b^{r-1} + \cdots + d_0 \cdot b^0;

(ii) Each digit satisfies 0di<b0 \leqslant d_i < b; and

(iii) If n>0n > 0, then the leading digit dr0d_r \neq 0. The expansion of zero is 00 in every base.

The definition is compact. Before working through examples, let us examine what each condition demands.

Condition (i) states that the digits encode how many copies of each power of bb sum to give nn. In decimal, the positions from right to left correspond to units, tens, hundreds, thousands, and so on. In binary, they correspond to 1,2,4,8,16,1, 2, 4, 8, 16, \ldots.

Condition (ii) restricts the available digits to 0,1,,b10, 1, \ldots, b - 1. In base four, the permissible digits are 0,1,2,30, 1, 2, 3. Permitting a digit with value bb would destroy uniqueness: if "X\text{X}" were a base-ten digit representing ten, then both "X2\text{X}2" and "102102" would represent one hundred and two.

Condition (iii) forbids leading zeros among positive numbers. Without it, "0142301423" and "14231423" would be distinct strings for the same number.

Remark

Certain bases carry standard names: the base-22, 33, 88, 1010 and 1616 expansions are called binary, ternary, octal, decimal and hexadecimal, respectively.

Example 23

Consider the number 10231023. Its decimal expansion is simply 10231023:

1023=1103+0102+2101+31001023 = 1 \cdot 10^3 + 0 \cdot 10^2 + 2 \cdot 10^1 + 3 \cdot 10^0

Since 1023=21011023 = 2^{10} - 1, every power of 22 from 202^0 to 292^9 appears exactly once, giving the binary expansion 11111111111111111111:

1023=129+128+127+126+125+124+123+122+121+1201023 = 1 \cdot 2^9 + 1 \cdot 2^8 + 1 \cdot 2^7 + 1 \cdot 2^6 + 1 \cdot 2^5 + 1 \cdot 2^4 + 1 \cdot 2^3 + 1 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0

For bases larger than ten, the ten decimal digits 00 through 99 are insufficient. The convention is to extend the digit set with letters: A=10,B=11,,Z=35\text{A} = 10, \text{B} = 11, \ldots, \text{Z} = 35. In base 3636, the expansion of 10231023 is SF\text{SF}:

1023=28361+15360=S36+F11023 = 28 \cdot 36^1 + 15 \cdot 36^0 = \text{S} \cdot 36 + \text{F} \cdot 1

When writing a number by its base-bb expansion, we need notation to indicate which base is intended.

Definition 34 (Base-bb Notation)

Let b>1b > 1 and let dr,dr1,,d0d_r, d_{r-1}, \ldots, d_0 be base-bb digits. We write

(drdr1d0)b=drbr+dr1br1++d0b0(d_r d_{r-1} \ldots d_0)_b = d_r \cdot b^r + d_{r-1} \cdot b^{r-1} + \cdots + d_0 \cdot b^0

for the natural number with this base-bb expansion. When no subscript is present and a base is not otherwise specified, the expansion is assumed to be decimal.

Example 24

Applying this notation, we can express 10231023 simultaneously in several bases:

1023=(1111111111)2=(1101220)3=(1777)8=(1023)10=(3FF)16=(SF)361023 = (1111111111)_2 = (1101220)_3 = (1777)_8 = (1023)_{10} = (3\text{FF})_{16} = (\text{SF})_{36}

Python provides built-in functions for converting between the most common bases. The functions bin, oct and hex each return a string prefixed with 0b, 0o or 0x to indicate the base:

n = 1023

bin(n)   # '0b1111111111'
oct(n)   # '0o1777'
hex(n)   # '0x3ff'

These prefixes also serve as integer literals, so base conversions can be verified directly:

a = 0b1111111111   # binary
b = 0o1777         # octal
c = 0x3FF          # hexadecimal

a == b == c == 1023   # True

To convert a string written in an arbitrary base back to an integer, int accepts an optional second argument specifying the base:

int('1101220', 3)   # 1023
int('sf', 36)       # 1023
Problem 28

Find the binary, ternary, octal, hexadecimal and base-3636 expansions of the number 17291729 by hand, using letters A\text{A} through F\text{F} as additional hexadecimal digits and A\text{A} through Z\text{Z} for base 3636. Verify your answers in Python.

Problem 29

The octal expansion of a certain natural number is (2745)8(2745)_8. Determine its decimal value and its hexadecimal expansion.

Problem 30

A three-digit number in base bb has the form (d2d1d0)b(d_2 d_1 d_0)_b where d20d_2 \neq 0. What are the smallest and largest possible values of such a number, expressed in terms of bb?

Branching Programs and Control Flow

The algorithms and computational sequences formulated thus far possess a strictly sequential topology. They are straight-line programmes, executing statements continuously in the exact order they are defined until the instructions are exhausted. While mathematically predictable, straight-line programmes are fundamentally limited. Because every instruction is executed exactly once, the maximum temporal duration of the computation is bounded entirely by the static length of the source code.

To construct more sophisticated imperative knowledge, we must introduce mechanisms that allow the interpreter to dynamically alter its execution path based on the state of the machine. We achieve this via branching programmes, which utilise conditional statements to dictate control flow.

Conditional Statements and Semantic Indentation

A conditional statement evaluates a proposition (Definition 1) and diverges the execution path based on whether the result is True (\top) or False (\bot). The most fundamental branching structure is the if-else construct.

This structure consists of:

  1. A logical test evaluating strictly to True or False.
  2. A primary block of instructions executed if and only if the test evaluates to True.
  3. An optional secondary block (the else clause) executed if and only if the test evaluates to False.

Computationally, Python enforces this topological structure through strict semantic indentation. While formal mathematical texts use formatting for human readability, Python interpreters rely on whitespace to define the boundaries of logical blocks.

x = 14

if x % 2 == 0:
    print("The integer is even.")
else:
    print("The integer is odd.")

print("Branching complete.")

The expression x % 2 computes the remainder when x is divided by 22. Equivalently, it extracts the last digit of the binary expansion of x (Definition 33): it is 00 when x is even and 11 when x is odd.

If the terminal print statement were indented, the interpreter would classify it as part of the else block, fundamentally altering the programme’s static semantics. This architectural decision ensures that the visual hierarchy of the script perfectly mirrors its logical hierarchy.

Because indentation governs logic, the definition of a single line of code becomes paramount. To prevent long mathematical expressions from violating readability or indentation rules, Python permits explicit line continuation using the backslash (\) character, or implicit line continuation within enclosing structural brackets (parentheses, square brackets, or braces).

# Explicit line continuation
alpha = 1.61803398875 + \
        2.71828182845 + \
        3.14159265359

# Implicit line continuation (mathematically preferred)
beta = (1.61803398875 +
        2.71828182845 +
        3.14159265359)

Nested Conditionals and Mutually Exclusive Branching

When the resolution of one conditional block necessitates further logical evaluation, conditional statements may be nested. This constructs a decision tree where each subsequent branch is dependent on the prior evaluating to True or False.

For mutually exclusive propositions, writing deeply nested else and if blocks quickly becomes unreadable. Python condenses this structural nesting via the elif (else if) keyword, which evaluates a new proposition if and only if all preceding tests in the chain evaluated to False.

n = 15

if n % 2 == 0:
    if n % 3 == 0:
        print("n is a multiple of 6.")
    else:
        print("n is even, but not divisible by 3.")
elif n % 3 == 0:
    print("n is odd and divisible by 3.")
else:
    print("n is not divisible by 2 or 3.")

Algorithmic Efficiency and State Updating

Conditionals permit compound Boolean expressions, utilising the Python operators and, or and not (the counterparts of the logical connectives \land, \lor and ¬\neg from Lesson 1), to assess multiple variables simultaneously. However, how we structure these logical tests drastically impacts algorithmic elegance and computational efficiency.

Consider the problem of identifying the maximum positive integer among three variables: aa, bb and cc. If none are positive, the algorithm must return the minimum value among them.

A naive approach might construct a mutually exclusive test for every possible combination of states. Since each of the three variables can either be positive or non-positive, there are 23=82^3 = 8 distinct cases. Exhaustively testing each using and statements requires cumbersome code and forces the interpreter to evaluate the same relational propositions repeatedly.

By leveraging the concept of variable binding and state updating, we can construct a significantly more elegant solution. We initialise a provisional binding, sequentially test individual propositions, and update the state strictly when required.

a, b, c = -5, 12, 7

# Assume none are positive initially; find the global minimum.
target_val = min(a, b, c)

# Sequentially update the state if a larger positive integer is discovered.
if a > 0:
    target_val = a
if b > 0 and b > target_val:
    target_val = b
if c > 0 and c > target_val:
    target_val = c

print(target_val)

This paradigm isolates the logical checks. It examines the polarity of each variable exactly once, demonstrating that well-structured imperative knowledge relies not just on valid syntax, but on the efficient manipulation of temporal state.

Conditional Expressions

For discrete, inline evaluations, Python supplies conditional expressions, which resolve an entire if-else branch into a single evaluable expression:

expression_if_true if proposition else expression_if_false

This permits the assignment of variables based on propositional logic without requiring multi-line blocks. A natural first application is the absolute value.

Definition 35 (Absolute Value)

The absolute value of a real number xx, denoted x|x|, is defined by

x={xif x0xif x<0|x| = \begin{cases} x & \text{if } x \geqslant 0 \\ -x & \text{if } x < 0 \end{cases}

Equivalently, x|x| is the distance from xx to 00 on the number line (Definition 20).

Because the absolute value is defined by cases, there are several ways to compute it. The most direct is a standard if block:

x = -10

if x < 0:
    x = -x

For a single-line body, Python permits writing the block on the same line as the if:

if x < 0: x = -x   # only for very short statements

One might also try to avoid branching entirely by exploiting the fact that True and False behave as 11 and 00 in arithmetic:

y = (x < 0) * (-x) + (x >= 0) * x   # this is awful

This works, but it is unreadable, evaluates both branches unconditionally, and obscures the logic. The conditional expression is the cleanest approach:

y = x if x >= 0 else -x

Python also provides the built-in abs for this purpose: abs(-10) returns 10.

With the absolute value defined, we can establish its most important mathematical property. Observe first that a0|a| \geqslant 0 for every real number aa, with a=0|a| = 0 only when a=0a = 0. Moreover, every real number is bounded by its own absolute value:

aaa-|a| \leqslant a \leqslant |a|

If a0a \geqslant 0, then a=a|a| = a and a=a0a-|a| = -a \leqslant 0 \leqslant a. If a<0a < 0, then a=a>0>a|a| = -a > 0 > a and a=(a)=a-|a| = -(-a) = a. In both cases, the bound holds. This seemingly simple observation yields one of the most widely used inequalities in mathematics.

Theorem 7 (Triangle Inequality)

For all real numbers aa and bb,

a+ba+b|a + b| \leqslant |a| + |b|
Proof

Since aaa-|a| \leqslant a \leqslant |a| and bbb-|b| \leqslant b \leqslant |b|, adding these inequalities gives

(a+b)a+ba+b-(|a| + |b|) \leqslant a + b \leqslant |a| + |b|

Let M=a+bM = |a| + |b|. Then Ma+bM-M \leqslant a + b \leqslant M. If a+b0a + b \geqslant 0, then a+b=a+bM|a + b| = a + b \leqslant M. If a+b<0a + b < 0, then a+b=(a+b)M|a + b| = -(a + b) \leqslant M, since a+bMa + b \geqslant -M. In either case, a+ba+b|a + b| \leqslant |a| + |b|.

Problem 31

For which pairs of real numbers a,ba, b does equality hold in the Triangle Inequality? Prove your claim.

Constant Time Complexity

While branching conditionals drastically expand the utility of imperative programmes, they do not overcome the foundational limitation of straight-line execution.

If we define an atomic operation as taking one discrete unit of computational time, a branching programme comprising NN lines of code cannot possibly exceed NN units of time to run. Because each conditional block is strictly directional and no line is executed more than once, the maximum temporal execution is hard-bounded by a constant kk (where kNk \leqslant N).

Definition 36 (Constant Time)

An algorithm runs in constant time if its maximal execution time is bounded by a constant, independent of the size or magnitude of its inputs.

Remark (Beyond Constant Time)

Constant-time algorithms are highly restrictive. Consider the mathematical problem of calculating the factorial of a given integer nn. A constant-time branching programme cannot solve this for an arbitrary nn, as it would require a hard-coded conditional branch for every possible integer. To process computations whose execution length scales with the size of the input, our logical framework must be expanded to include cyclic control flow, abandoning constant-time limitations in favour of indefinite iteration.

Problem 32

Given integer variables a, b and c representing the coefficients of the polynomial ax2+bx+cax^2 + bx + c (Definition 31), write a Python programme that computes the discriminant Δ=b24ac\Delta = b^2 - 4ac and prints whether the polynomial has two distinct real roots (Δ>0\Delta > 0), one repeated real root (Δ=0\Delta = 0), or no real roots (Δ<0\Delta < 0). Handle the case a=0a = 0 separately: when a=0a = 0, the expression is linear, not quadratic.

Problem 33

Consider the following rule applied to integers: if nn is odd, replace it by 3n+13n + 1; if nn is even, replace it by n/2n/2. Write a Python programme that applies this rule three times in succession starting from a given positive integer n, using conditional expressions. Verify that starting from n=7n = 7, three applications produce 3434.

Division and Divisibility

With the integers (Definition 24) and multiplication (Definition 25) established this morning, we can formalise one of the most fundamental relationships between them.

Definition 37 (Divisibility)

Let a,bZa, b \in \mathbb{Z}. We say bb divides aa, written bab \mid a, if there exists an integer qq such that a=qba = qb. Equivalently: aa is divisible by bb, bb is a divisor (or factor) of aa, and aa is a multiple of bb.

Perhaps counterintuitively, this definition does not invoke the arithmetic operation of division (Definition 26). It is stated entirely in terms of multiplication.

Example 25

The integer 1212 is divisible by 1,2,3,4,61, 2, 3, 4, 6 and 1212, since

12=121=62=43=34=26=11212 = 12 \cdot 1 = 6 \cdot 2 = 4 \cdot 3 = 3 \cdot 4 = 2 \cdot 6 = 1 \cdot 12

It is also divisible by the negatives of each: for instance, (3)12(-3) \mid 12 since 12=(4)(3)12 = (-4) \cdot (-3).

Problem 34

Prove that 11 divides every integer, and that every integer divides 00.

Remark (Zero Divides Zero)

A consequence is that 000 \mid 0, since 0=q00 = q \cdot 0 for any integer qq. This may seem surprising: we have been told our whole lives that we cannot divide by zero. But divisibility is not the same as performing division. Saying that 00 divides 00 simply means that 00 can be multiplied by an integer to obtain 00, which is true. This does not imply that the expression 00\frac{0}{0} can or should be meaningfully defined.

Using this definition, we can establish general facts about how divisibility propagates.

Theorem 8 (Transitivity of Divisibility)

Let a,b,cZa, b, c \in \mathbb{Z}. If cbc \mid b and bab \mid a, then cac \mid a.

Proof

Suppose cbc \mid b and bab \mid a. By the definition of divisibility, there exist integers qq and rr such that b=qcb = qc and a=rba = rb. Substituting the first equation into the second:

a=r(qc)=(rq)ca = r(qc) = (rq)c

Since rqrq is an integer, cac \mid a.

Problem 35

Let a,b,dZa, b, d \in \mathbb{Z}. Suppose dad \mid a and dbd \mid b. Prove that for any integers uu and vv, dd divides au+bvau + bv.

We have used the words “even” and “odd” informally throughout these notes. To prove anything about evenness, we need a precise definition. Intuitively, an even integer is one that is “evenly divisible” by 22, but this circular phrasing is not a definition. One might try: nn is even if n2\frac{n}{2} is an integer. This works in practice, but it depends on already knowing what “integer” means in a way that is hard to pin down without further machinery. A cleaner approach avoids division entirely and uses the concept we have just formalised.

Definition 38 (Even and Odd)

An integer nn is even if 2n2 \mid n, that is, if n=2kn = 2k for some integer kk. An integer that is not even is odd.

This definition asks only for a multiplicative witness: to verify that 1414 is even, we exhibit k=7k = 7 and check 14=2714 = 2 \cdot 7. To verify that 99 is odd, we must show that no integer kk satisfies 9=2k9 = 2k. With a precise definition in hand, we can now ask and answer questions that would otherwise be vague. Does the sum of two even numbers remain even? Is the product of an odd number and an even number always even? These are provable claims, not intuitions.

The Division Theorem

It is not just interesting to know when one integer divides another. Equally important is understanding what happens when the division is not exact. The following result, stated without proof, tells us that every integer division produces a unique quotient and remainder.

Theorem 9 (Division Theorem)

Let a,bZa, b \in \mathbb{Z} with b0b \neq 0. There exist unique integers qq and rr such that

a=qb+rand0r<ba = qb + r \quad \text{and} \quad 0 \leqslant r < |b|

The integer qq is called the quotient and rr is called the remainder.

Example 26

The integer 1212 leaves a remainder of 22 when divided by 55, since 12=25+212 = 2 \cdot 5 + 2. The quotient is q=2q = 2 and the remainder is r=2r = 2.

What happens when the dividend is negative?

Theorem 10 (Negation of Remainders)

Let aa and bb be integers with b>0b > 0. If aa leaves a remainder of r>0r > 0 when divided by bb, then a-a leaves a remainder of brb - r when divided by bb.

Proof

By the Division Theorem, a=qb+ra = qb + r for some integer qq. Then

a=qbr=(q+1)b+br=(q+1)b+(br)-a = -qb - r = -(q + 1)b + b - r = -(q + 1)b + (b - r)

Since 0<r<b0 < r < b, we have 0<br<b0 < b - r < b. The quotient of a-a when divided by bb is therefore (q+1)-(q + 1), and the remainder is brb - r.

Problem 36

Prove that if an integer aa leaves a remainder of rr when divided by b>0b > 0, then aa leaves a remainder of rr when divided by b-b.

Python’s divmod returns both the quotient and remainder simultaneously, directly implementing the Division Theorem:

divmod(12, 5)    # (2, 2)   since 12 = 2*5 + 2
divmod(-12, 5)   # (-3, 3)  since -12 = (-3)*5 + 3

The quotient and remainder are also accessible individually via // (floor division) and % (modulo):

12 // 5    # 2
12 % 5     # 2

-12 // 5   # -3
-12 % 5    # 3

The second pair confirms the Negation of Remainders: 1212 leaves remainder 22 when divided by 55, and 12-12 leaves remainder 52=35 - 2 = 3.

Computing Base Expansions

We can now connect the Division Theorem to the material on number bases. In Definition 33, we defined the base-bb expansion of a natural number, but did not describe how to compute one. The Division Theorem provides exactly this.

The natural number nn whose base-bb expansion is drdr1d1d0d_r d_{r-1} \cdots d_1 d_0 satisfies

n=d0+b(d1+b(d2++b(dr1+bdr)))n = d_0 + b(d_1 + b(d_2 + \cdots + b(d_{r-1} + b \, d_r) \cdots ))

Since 0di<b0 \leqslant d_i < b, dividing nn by bb yields remainder d0d_0 (the last digit) and quotient n0=(nd0)/bn_0 = (n - d_0) / b, whose base-bb expansion is drdr1d1d_r d_{r-1} \cdots d_1. Repeating this process extracts each digit in turn, from right to left, until the quotient reaches zero.

Example 27

We compute the base-1717 expansion of 1234512345, using the letters A\text{A} through G\text{G} to represent 1010 through 1616.

12345=72617+312345 = 726 \cdot 17 + 3, so d0=3d_0 = 3 and the quotient is 726726.

726=4217+12726 = 42 \cdot 17 + 12, so d1=12=Cd_1 = 12 = \text{C} and the quotient is 4242.

42=217+842 = 2 \cdot 17 + 8, so d2=8d_2 = 8 and the quotient is 22.

2=017+22 = 0 \cdot 17 + 2, so d3=2d_3 = 2 and the quotient is 00. We stop.

The base-1717 expansion of 1234512345 is therefore (28C3)17(28\text{C}3)_{17}. Verification:

2173+8172+1217+3=9826+2312+204+3=123452 \cdot 17^3 + 8 \cdot 17^2 + 12 \cdot 17 + 3 = 9826 + 2312 + 204 + 3 = 12345

Each step of the algorithm applies the same pair of operations, // and %, to the current quotient:

n = 12345
b = 17

d0 = n % b       # 3
n0 = n // b       # 726

d1 = n0 % b       # 12 (C)
n1 = n0 // b      # 42

d2 = n1 % b       # 8
n2 = n1 // b      # 2

d3 = n2 % b       # 2
n3 = n2 // b      # 0  → stop

The repetitive structure is deliberate: each step is identical, differing only in the value of the quotient. Automating this process requires a mechanism for repeating instructions until a condition is met, which we do not yet possess. We will return to this when we introduce iteration.

Problem 37

Using the division algorithm above, find the base-77 and hexadecimal expansions of 99999999 by hand. Verify each answer in Python.

Strings and Input

When we classified Python’s type system in Lesson 1, we distinguished scalar objects (indivisible atoms like int and bool) from non-scalar objects that possess internal structure. We can now examine the simplest non-scalar type: str, which represents finite sequences of characters.

String Literals and Operator Overloading

A string literal is written by enclosing characters in matching quotation marks. Python accepts both single and double quotes, and the two forms are interchangeable:

greeting = 'hello'
message = "world"

It is essential to distinguish a numeric string from its corresponding integer. The literal '123' is a string of three characters; the literal 123 is an integer. These are objects of entirely different types, and confusing them is among the most common sources of error for new programmers.

The arithmetic operators + and *, which we have so far used for numerical computation, take on new meanings when applied to strings. The expression 'ab' + 'cd' evaluates to 'abcd': the + operator concatenates its operands end to end. The expression 3 * 'ha' evaluates to 'hahaha': the * operator, when applied to an integer and a string, repeats the string. There is a natural logic to this parallel: just as 3×2=2+2+23 \times 2 = 2 + 2 + 2 in arithmetic, 3 * 'ha' is equivalent to 'ha' + 'ha' + 'ha'.

3 * 4       # 12
3 * 'a'     # 'aaa'
3 + 4       # 7
'a' + 'a'   # 'aa'

An operator whose behaviour depends on the types of its operands is said to be overloaded. This is not unique to programming; mathematical notation is overloaded routinely. The symbol "++" denotes addition of natural numbers, integers, rationals, and reals, each of which was constructed as a distinct operation in the number systems we built this morning. Python merely extends this convention to non-numeric types.

Not every combination of types is permitted. Attempting to multiply two strings produces an error, because repetition requires one operand to be an integer:

'a' * 'a'
# TypeError: can't multiply sequence by non-int of type 'str'

Errors

When an instruction violates the language’s rules, the interpreter halts and reports an error. In Lesson 1, we distinguished the syntax of a language (its grammatical rules) from its static semantics (whether a grammatically valid string has a logical meaning) and its full semantics (what a programme actually does). Errors in Python fall into three corresponding categories.

A syntax error arises when the source code violates the grammatical rules of the language. The interpreter catches these before any code is executed:

print("Uh oh!)
# SyntaxError: unterminated string literal

A runtime error arises when a syntactically valid instruction encounters an illegal state during execution. The TypeError above is one example. Others include dividing by zero and referencing a name that has not been bound to any object:

print(1 / 0)
# ZeroDivisionError: division by zero

new_id
# NameError: name 'new_id' is not defined

A logical error is the most insidious: the programme executes without complaint but produces an incorrect result. Because the code is syntactically valid and encounters no illegal runtime states, the interpreter cannot detect the mistake. Only the programmer, by inspecting the output, can identify it.

print("2 + 2 =", 5)   # prints: 2 + 2 = 5

Indexing, Slicing, and Length

Strings are ordered sequences of characters, and Python provides operations to inspect and extract their contents.

The built-in len returns the number of characters in a string:

len('abc')   # 3
len('')      # 0

Individual characters are extracted by indexing. Python uses zero-based indexing: the first character occupies position 00, the second position 11, and in general the kk-th character (counting from one) occupies position k1k - 1. For a string of length nn, the valid indices are 0,1,,n10, 1, \ldots, n - 1. Accessing an index outside this range produces an error:

s = 'abcde'

s[0]   # 'a'
s[4]   # 'e'
s[5]   # IndexError: string index out of range

Negative indices count backwards from the end: s[-1] is the last character, s[-2] the second-to-last, and so on.

s[-1]   # 'e'
s[-5]   # 'a'

A slice extracts a substring. The expression s[i:j] returns the characters from index ii up to but not including index jj. This half-open convention ensures that the length of the slice is always jij - i, and that s[0:len(s)] recovers the entire string:

s[1:4]   # 'bcd'
s[:3]    # 'abc'   (omitting the start defaults to 0)
s[2:]    # 'cde'   (omitting the end defaults to len(s))
s[:]     # 'abcde' (a copy of the entire string)

An optional third argument specifies the step size. The expression s[i:j:k] selects every kk-th character starting from index ii. A negative step traverses the string in reverse; the expression s[::-1] reverses a string entirely:

'0123456789'[::2]    # '02468'
'0123456789'[1::2]   # '13579'
'abcde'[::-1]        # 'edcba'
Problem 38

The built-in bin returns the binary expansion of an integer as a string prefixed with '0b' (for example, bin(1023) returns '0b1111111111'). Using slicing and len, write an expression that gives the number of binary digits in the expansion of a positive integer n. Verify that 2101=10232^{10} - 1 = 1023 requires exactly 1010 binary digits.

Problem 39

A string is a palindrome if it reads the same forwards and backwards (for instance, 'racecar' and 'abba'). Using the reversing slice s[::-1] and a conditional expression, write a single Python expression that evaluates to 'Palindrome' if a string s is a palindrome and 'Not a palindrome' otherwise.

Type Conversions and Formatted Output

A type conversion (or type cast) converts an object from one type to another. In Python, the name of the target type serves as the conversion:

int('42')      # 42
float('3.14')  # 3.14
str(1023)      # '1023'

We have already encountered int used this way: the expression int('sf', 36) from the section on number bases converts a base-3636 string to an integer. When converting a float to an int, Python truncates towards zero rather than rounding:

int(3.9)    # 3
int(-3.9)   # -3

Type conversions become essential when constructing formatted output. Recall that the print function inserts a space between each of its arguments:

fraction = 1/2
num = 30000000

print(num * fraction, 'is', fraction * 100, '% of', num)
# 15000000.0 is 50.0 % of 30000000

The unwanted space before the percent sign arises because '%' is a separate argument. The trailing .0 appears because 1/2 is a float, and the product of an int and a float is a float. Constructing the output as a single string via concatenation and explicit type conversions avoids both issues, but the result is verbose and difficult to read. Python provides a far cleaner mechanism: formatted string literals, or f-strings.

An f-string is a string literal prefixed with f that permits expressions enclosed in curly braces to be evaluated and embedded directly:

print(f'{int(num * fraction)} is {fraction * 100}% of {num}')
# 15000000 is 50.0% of 30000000

The expressions inside the braces are evaluated at runtime and automatically converted to strings. To include a literal curly brace, double it:

print(f'{{{3 * 5}}}')   # {15}

F-strings also support format modifiers, separated from the expression by a colon. The modifier .2f truncates a floating-point number to two decimal places, while , inserts thousands separators:

f'{3.14159:.2f}'   # '3.14'

print(f'{num * fraction:,.0f} is {fraction * 100}% of {num:,}')
# 15,000,000 is 50.0% of 30,000,000

Console Input

Python provides the built-in input for reading data from the user. It displays a prompt string, waits for the user to type a response and press Enter, then returns the response as a str:

name = input('Enter your name: ')
print(f'Hello, {name}!')

Critically, input always returns a string, regardless of what the user types:

n = input('Enter an integer: ')
print(type(n))   # <class 'str'>

If the user enters 3, the variable n is bound to the string '3', not the integer 33. Consequently, n * 4 evaluates to '3333' (string repetition), not 1212 (multiplication). To perform arithmetic on user input, an explicit type conversion is required:

n = int(input('Enter an integer: '))
print(n * 4)   # 12
Problem 40

Write a programme that asks the user to enter their birthday in the form dd/mm/yyyy and prints the message 'You were born in the year yyyy.', using slicing to extract the year from the input string.

Problem 41

Write a programme that reads two integers from the user and prints their sum, difference, product, quotient (using //), and remainder (using %), each on a separate line formatted with f-strings.

Character Encoding

Strings are sequences of characters, but computers store only numbers. Each character must therefore be assigned a numeric code. The oldest widely used standard, ASCII, assigns codes 00 through 127127 to the characters common in English text. For instance, 'A' is encoded as 6565, 'B' as 6666, and 'a' as 9797. Python’s built-in ord returns the code of a character, and chr converts a code back:

ord('A')   # 65
chr(65)    # 'A'
chr(97)    # 'a'
Remark (Characters as Numbers)

With 128=27128 = 2^7 available codes, ASCII requires exactly 77 binary digits to represent any character. The connection to the earlier material on number bases is direct: the character 'A', stored as the integer 6565, is held in memory as the binary string (1000001)2(1000001)_2. Every piece of text you type is, at its lowest level, a sequence of numbers, each stored in binary.

ASCII is limited to 128128 characters, sufficient for English but inadequate for the world’s written languages. The Unicode standard, which defines over 120,000120{,}000 characters across more than 100100 scripts, is its modern replacement. Python 3 uses Unicode natively, most commonly in the UTF-8 encoding, allowing strings to contain characters from virtually any language.

Short-Circuit Evaluation

Earlier in this lesson, we used the Python operators and, or and not as computational counterparts of the logical connectives \land, \lor and ¬\neg. We noted in Lesson 1 that the conjunction PQP \land Q is false whenever PP is false, regardless of QQ, and that the disjunction PQP \lor Q is true whenever PP is true, regardless of QQ. Python exploits these logical facts. Rather than always evaluating both operands, and and or evaluate left to right and stop as soon as the outcome is determined. This is called short-circuit evaluation.

For and: if the left operand is False, the entire conjunction must be False, so the right operand is never evaluated. For or: if the left operand is True, the entire disjunction must be True, so the right operand is never evaluated.

The expression 1 / 0 produces a ZeroDivisionError if evaluated. We can use this as a diagnostic to test whether an operand is actually reached:

False and (1 / 0)   # False (right operand skipped)
True and (1 / 0)    # ZeroDivisionError (right operand evaluated)

True or (1 / 0)     # True (right operand skipped)
False or (1 / 0)    # ZeroDivisionError (right operand evaluated)

In each case, the left operand is evaluated first. If its value already determines the result, Python returns it immediately without touching the right operand. If not, the right operand is evaluated and its value determines the result.

Consider testing whether an integer n divides another integer m (Definition 37), but only when n is nonzero:

n = 0
m = 10

n != 0 and m % n == 0   # False (m % n is never evaluated)
m % n == 0 and n != 0   # ZeroDivisionError (order matters!)

The first expression is safe: n != 0 evaluates to False, so the and short-circuits and the dangerous m % n is never reached. The second expression crashes because m % n is evaluated first. The logical content of the two expressions is identical: both ask whether n is nonzero and divides m, but the order of the operands determines whether the programme runs or crashes.

Short-circuit evaluation transforms logical operators into compact conditional control flow. The expression A and B behaves identically to B if A else A, and A or B behaves identically to A if A else B.

Problem 42

Without running the code, determine what each of the following expressions evaluates to, or whether it produces an error. Then verify your answers in Python.

(a) False and (1 / 0)

(b) (1 / 0) and False

(c) True or (1 / 0)

(d) (1 / 0) or True

(e) True and False and (1 / 0)

(f) False or True or (1 / 0)

type versus isinstance

We introduced the built-in type in Lesson 1 as a means of inspecting an object’s classification. Python provides a second mechanism, isinstance, which tests whether an object belongs to a given type:

x = 42

type(x) == int      # True
isinstance(x, int)  # True

y = 'hello'

type(y) == str      # True
isinstance(y, str)  # True

For the types we have encountered so far, the two approaches produce identical results. However, isinstance is generally preferred because it handles a broader range of situations that arise once we study object hierarchies later in the course. As a rule of thumb: use isinstance for type checking, and reserve type for inspection and debugging.