10 Representations

Jed Rembold & Fred Agbo

February 3, 2025

Announcements

  • Problem Set 1 grading is posted: check to confirm
    • Many students did not test their code on all the scenarios described in the instructions
    • Follow the submission guidelines and do not create a separate file
    • Make sure your names are written in full at the top of the starter file
  • Problem Set 2 is due tonight at 10 pm
  • Remember to attend your section meetings
    • Get help from your section leader
    • Also visit the QUAD Center for help: open six days a week for in-person (drop-in) tutoring, with Zoom appointments also available
  • We are jumping to Chapter 7 today! Read along with the chapter this week
  • Link to Polling https://www.polleverywhere.com/agbofred203

Review Question

Kilburn’s algorithm for finding the largest factor of a number using a brute-force strategy is shown below. What would be the printed output of the code?

  1. 12
  2. 10
  3. 5
  4. 15
def largest_factor(n):
    factor = n - 1
    while n % factor != 0:
        factor -= 1
    return factor

if __name__ == '__main__':
    fac = largest_factor(20)
    print(fac)

Representations

  • Focusing today on how a computer can internally store more complex and abstract information
  • Initially will look at numbers
  • Then we’ll transition to strings

Bit Power

  • The fundamental unit of memory inside a computer is called a bit
    • Coined from a contraction of the words binary and digit
  • An individual bit exists in one of two states, usually denoted as 0 or 1.
  • More complex data can be represented by combining larger numbers of bits:
    • Two bits can represent 4 (\(2\times 2\)) values
    • Three bits can represent 8 (\(2\times2\times2\)) values
    • Four bits can represent 16 (\(2\times2\times2\times2\) or \(2^4\)) values, etc
  • Many laptops have 16GB of system memory (roughly 128 billion bits), and can therefore keep track of approximately \(2^{128,000,000,000}\) states!
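The doubling pattern above is easy to check directly in Python:

```python
# Each additional bit doubles the number of representable values,
# since every existing pattern can be extended with a 0 or a 1.
for bits in [1, 2, 3, 4]:
    print(f"{bits} bits -> {2**bits} values")
```

Running this prints 2, 4, 8, and 16 values, matching the list above.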

An old code, but it checks out

  • Binary notation is an old idea
    • Described by German mathematician Leibniz back in 1703
  • Leibniz describes his use of binary notation in an easy-to-follow style
  • Leibniz’s paper notes that the Chinese had discovered binary arithmetic 2000 years earlier, as illustrated by the patterns of lines in the I Ching!
Gottfried Wilhelm von Leibniz

Back to Grade school

[Figure: decimal place values \(10^0\), \(10^1\), \(10^2\) — 10 digits: 0–9]

Now in Binary

[Figure: binary place values \(2^3\), \(2^2\), \(2^1\), \(2^0\) = 8, 4, 2, 1 — 2 digits: 0 & 1 — a number is the sum of each digit times its place value]

Representing Integers

  • The number of symbols available to count with determines the base of a number system
    • Decimal is base 10, as we have 10 symbols (0-9) to count with
      • Each successive digit position counts for 10 times as much as the previous
    • Binary is base 2, as we only have 2 symbols (0 and 1) to count with
      • Each successive digit position counts for twice as much as the previous
  • You can always determine what number a representation corresponds to by adding up the individual digit contributions
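The "add up the contributions" idea can be written out directly. This is a sketch, not code from the text; Python's built-in int(s, 2) does the same job:

```python
def binary_to_decimal(digits):
    """Sum each digit's contribution: the k-th digit from the
    right (counting from 0) is worth digit * 2**k."""
    total = 0
    for k, digit in enumerate(reversed(digits)):
        total += int(digit) * 2**k
    return total

print(binary_to_decimal("00101010"))        # 42
print(int("00101010", 2))                   # 42, via the built-in
```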

Specifying Bases

  • So the binary number 00101010 is equivalent to the decimal number 42
  • We distinguish by subscripting the bases: \[ 00101010_2 = 42_{10}\]
  • The number itself still is the same! All that changes is how we represent that number.
    • Numbers do not have bases – representations of numbers have bases
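Python makes the same point: a binary literal (written with a 0b prefix) and a decimal literal are just two representations of one number:

```python
# The representation differs; the number itself is identical.
print(0b00101010 == 42)   # True
print(bin(42))            # '0b101010'
```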

Binary Clocks

Other Bases

  • Binary is not a particularly compact representation to write out, so computer scientists will often use more compact representations as well
    • Octal (base 8) uses the digits 0 to 7
    • Hexadecimal (base 16) uses the digits 0 to 9 and then the letters A through F


  • Why octal or hexadecimal over our trusty old decimal system?
    • Both bases are powers of 2, which makes it easy to convert to and from binary
      • 1 octal digit = 3 binary digits, 1 hex digit = 4 binary digits
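A quick sketch of the digit-grouping correspondence (the value 0b111100 here is just an arbitrary illustration):

```python
n = 0b111100   # 60 in decimal
# Groups of 3 bits map to one octal digit:  111 | 100 -> 7 4
print(oct(n))  # '0o74'
# Groups of 4 bits map to one hex digit:   0011 | 1100 -> 3 C
print(hex(n))  # '0x3c'
```

Decimal offers no such shortcut, because 10 is not a power of 2.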

Base(ic) Practice

  • The Java compiler has a fun quirk where every binary file it produces begins with


  • What is this in decimal? octal? hexadecimal?
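A small helper like the sketch below answers all three questions at once for any binary pattern; the input string here is a placeholder, not the Java value from the slide:

```python
def show_bases(binary_string):
    """Print a binary string's value in decimal, octal, and hex."""
    n = int(binary_string, 2)
    print(f"decimal: {n}  octal: {oct(n)}  hex: {hex(n)}")

show_bases("00101010")   # decimal: 42  octal: 0o52  hex: 0x2a
```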

Representation

  • Sequences of bits have no intrinsic meaning!
    • Just the representations we assign to them by convention or by building certain operations into hardware
    • A 32-bit sequence represents an integer only because we have designed hardware to manipulate those sequences arithmetically: applying operations like addition, subtraction, etc
  • By choosing an appropriate representation, you can use bits to represent any value you could imagine!
    • Characters represented by numeric character codes
    • Floating-point representations to support real numbers
    • Two-dimensional arrays of bits representing images
  • To be useful though, everyone needs to agree on a representation!

Representation Pitfalls

  • How we choose to represent values has consequences!
  • Python represents floating point (fractional) numbers using two integers
    • One to represent the significant digits
    • One to represent the exponent (where the decimal place is)
  • \(1\frac{1}{4}\) Example
    • In decimal: \(\quad\displaystyle 1\frac{1}{4} = \frac{1}{1} + \frac{2}{10} + \frac{5}{100} = 1.25 = (125, -2)\)
    • In binary: \(\quad\displaystyle 1\frac{1}{4} = \frac{1}{1} + \frac{0}{2} + \frac{1}{4} = 1.01 = (101, -10)\)

Consequences

  • The best we can do for \(0.1\) within the range of normal integers \[\frac{3602879701896397}{2^{55}} = 0.10000000000000000555111512312578270\]
  • When doing operations on these numbers, extra digits sometimes get rounded off, which can make a result look exact, but a tiny bit of this rounding error may always be lurking in floating-point values.
  • So be careful using == for numeric comparisons! Rounding might result in unexpected falsehoods
    • 0.1 + 0.1 + 0.1 != 0.3
    • Far better to check if two numbers are within a small margin of one another, or greater or less than the other
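The recommended comparison can be sketched with math.isclose, and as_integer_ratio reveals the exact fraction from the slide:

```python
import math

total = 0.1 + 0.1 + 0.1
print(total == 0.3)              # False -- rounding error strikes
print(math.isclose(total, 0.3))  # True  -- compare within a margin
print((0.1).as_integer_ratio())  # (3602879701896397, 36028797018963968)
```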

Representing Characters

  • Computers use numeric encodings to represent character data inside the machine, where each character is assigned an integer value.
  • Character codes are not very useful unless standardized though!
    • Competing encodings in the early years made it difficult to share data across machines
  • First widely adopted character encoding was ASCII (American Standard Code for Information Interchange)
  • With at most 256 possible characters (standard ASCII uses only 128), ASCII proved inadequate in an international world, and has therefore been superseded by Unicode.
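Python exposes these character codes directly through the built-ins ord and chr:

```python
# ord gives a character's numeric code; chr goes the other way.
print(ord('A'))   # 65
print(chr(97))    # 'a'
print(ord('é'))   # 233 -- beyond 7-bit ASCII, covered by Unicode
```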

ASCII

          0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
    0x   \0                              \b  \t  \n  \v  \f  \r
    1x   (control characters)
    2x        !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
    3x    0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
    4x    @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
    5x    P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
    6x    `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
    7x    p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~
The ASCII subset of Unicode