10 Representations

Jed Rembold & Fred Agbo

February 5, 2024

Announcements

  • Problem Set 1 grades are posted: check to confirm yours
    • Most submissions were not tested on all the scenarios described in the instructions
    • Follow the submission guidelines and do not create a separate file
    • Make sure your full name is written at the top of the starter file
  • Problem Set 2 is due tonight at 10 pm
  • Remember to attend your section meetings
    • Get help from your section leader
    • Also go to the QUAD Center for help: open six days a week for in-person (drop-in) tutoring, with Zoom appointments also available
  • We are jumping to Chapter 7 today! Read along in the chapter this week
  • Link to Polling https://www.polleverywhere.com/agbofred203

Review Question

Kilburn’s algorithm for finding the largest factor of a number by brute force is shown below. What would the code print?

  1. 12
  2. 10
  3. 5
  4. 15
def largest_factor(n):
    factor = n - 1
    while n % factor != 0:
        factor -= 1
    return factor

if __name__ == '__main__':
    fac = largest_factor(20)
    print(fac)

Representations

  • Focusing today on how a computer can internally store more complex and abstract information
  • Initially will look at numbers
  • Then we’ll transition to strings

Bit Power

  • The fundamental unit of memory inside a computer is called a bit
    • Coined from a contraction of the words binary and digit
  • An individual bit exists in one of two states, usually denoted as 0 or 1.
  • More complex data can be represented by combining larger numbers of bits:
    • Two bits can represent 4 (\(2\times 2\)) values
    • Three bits can represent 8 (\(2\times2\times2\)) values
    • Four bits can represent 16 (\(2\times2\times2\times2\) or \(2^4\)) values, etc
  • Many laptops have 16 GB of system memory, which is about 128 billion bits, and can therefore keep track of approximately \(2^{128,000,000,000}\) states!
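
As a rough illustration of the doubling pattern above, this Python sketch lists every pattern of n bits and counts them. The helper name `bit_patterns` is made up for this example.

```python
from itertools import product

def bit_patterns(n):
    """Return every distinct pattern of n bits as a string of 0s and 1s."""
    return ["".join(bits) for bits in product("01", repeat=n)]

for n in (2, 3, 4):
    patterns = bit_patterns(n)
    # Each added bit doubles the number of patterns: 2**n in total
    print(n, "bits ->", len(patterns), "values")
```

Listing patterns is only practical for small n, since the count doubles with each extra bit.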

An old code, but it checks out

  • Binary notation is an old idea
    • Described by German mathematician Leibniz back in 1703
  • Leibniz describes his use of binary notation in an easy-to-follow style
  • Leibniz’s paper notes that the Chinese had discovered binary arithmetic 2000 years earlier, as illustrated by the patterns of lines in the I Ching!
Gottfried Wilhelm von Leibniz

Back to Grade school

[Figure: decimal place values \(10^0\), \(10^1\), \(10^2\) (1, 10, 100); 10 digits: 0 - 9]

Now in Binary

[Figure: binary place values \(2^0\), \(2^1\), \(2^2\), \(2^3\) (1, 2, 4, 8); 2 digits: 0 & 1; a 4-bit number written as the sum of each digit times its place value]

Representing Integers

  • The number of symbols available to count with determines the base of a number system
    • Decimal is base 10, as we have 10 symbols (0-9) to count with
      • Each digit position counts for 10 times as much as the position to its right
    • Binary is base 2, as we only have 2 symbols (0 and 1) to count with
      • Each digit position counts for twice as much as the position to its right
  • Can always determine what number a representation corresponds to by adding up the individual contributions
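
The "adding up the individual contributions" idea can be sketched in Python as follows. The function name `binary_to_decimal` is invented for this example, and it assumes the input is a string containing only 0s and 1s.

```python
def binary_to_decimal(bits):
    """Add up the contribution of each binary digit in a string such as '1010'."""
    total = 0
    # The rightmost digit is worth 1, and each step to the left is worth twice as much
    for position, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** position
    return total

print(binary_to_decimal("1010"))      # 8 + 0 + 2 + 0
print(binary_to_decimal("00101010"))  # 32 + 8 + 2
```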

Specifying Bases

  • So the binary number 00101010 is equivalent to the decimal number 42
  • We distinguish between them by subscripting the bases: \[ 00101010_2 = 42_{10}\]
  • The number itself still is the same! All that changes is how we represent that number.
    • Numbers do not have bases – representations of numbers have bases
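
Python reflects the same distinction: a literal's prefix selects the representation, but the resulting value is the same integer. A short sketch using only built-ins (the variable names are just for illustration):

```python
# Two ways of writing the same number in Python source code
from_binary = 0b00101010
from_decimal = 42
print(from_binary == from_decimal)

# int() with an explicit base reads a string representation
print(int("00101010", 2))

# format() goes the other way: "08b" means binary, zero-padded to 8 digits
print(format(42, "08b"))
```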

Binary Clocks

Other Bases

  • Binary is not a particularly compact representation to write out, so computer scientists will often use more compact representations as well
    • Octal (base 8) uses the digits 0 to 7
    • Hexadecimal (base 16) uses the digits 0 to 9 and then the letters A through F


  • Why octal or hexadecimal over our trusty old decimal system?
    • Both bases are powers of 2, which makes it easy to convert to and from binary
      • 1 octal digit = 3 binary digits, 1 hex digit = 4 binary digits
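
A minimal sketch of why the power-of-2 bases are convenient: each group of 3 (or 4) bits maps directly to one octal (or hex) digit. The helpers `group_bits` and `binary_to_base` are hypothetical names for this example, and the input is assumed to be a string of 0s and 1s.

```python
def group_bits(bits, group_size):
    """Split a bit string into groups of group_size, padding on the left with zeros."""
    padded_length = -(-len(bits) // group_size) * group_size  # round up to a multiple
    padded = bits.zfill(padded_length)
    return [padded[i:i + group_size] for i in range(0, len(padded), group_size)]

def binary_to_base(bits, group_size):
    """Convert a bit string to octal (group_size=3) or hexadecimal (group_size=4)."""
    digits = "0123456789ABCDEF"
    return "".join(digits[int(group, 2)] for group in group_bits(bits, group_size))

bits = "00101010"  # 42 in decimal
print(group_bits(bits, 3), binary_to_base(bits, 3))
print(group_bits(bits, 4), binary_to_base(bits, 4))
```

Python's built-in `oct()` and `hex()` do the same conversion starting from an integer; the sketch just makes the grouping visible.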

Base(ic) Practice

  • The Java compiler has a fun quirk where every binary file it produces begins with


  • What is this in decimal? octal? hexadecimal?

Representation

  • Sequences of bits have no intrinsic meaning!
    • Just the representations we assign to them by convention or by building certain operations into hardware
    • A 32-bit sequence represents an integer only because we have designed hardware to manipulate those sequences arithmetically: applying operations like addition, subtraction, etc
  • By choosing an appropriate representation, you can use bits to represent any value you could imagine!
    • Characters represented by numeric character codes
    • Floating-point representations to support real numbers
    • Two-dimensional arrays of bits representing images
  • To be useful though, everyone needs to agree on a representation!
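
To make this concrete, here is a sketch that reads the same 32 bits under three different conventions using Python's `struct` module and `int.from_bytes`. The particular byte values are chosen only for this example.

```python
import struct

raw = bytes([0x42, 0x28, 0x00, 0x00])  # 32 bits, chosen for this example

# The same four bytes read as a big-endian unsigned integer...
as_integer = int.from_bytes(raw, byteorder="big")
# ...as a big-endian 32-bit IEEE 754 floating-point number...
(as_float,) = struct.unpack(">f", raw)
# ...and, for the first two bytes, as ASCII character codes
as_text = raw[:2].decode("ascii")

print(as_integer)
print(as_float)
print(as_text)
```

Nothing in the bytes says which reading is "right"; the program decides by choosing how to decode them.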

Representation Pitfalls

  • How we choose to represent values has consequences!
  • Python represents floating point (fractional) numbers using two integers
    • One to represent the significant digits
    • One to represent the exponent (where the decimal place is)
  • \(1\frac{1}{4}\) Example
    • In decimal: \(\quad\displaystyle 1\frac{1}{4} = \frac{1}{1} + \frac{2}{10} + \frac{5}{100} = 1.25 = (125, -2)\)
    • In binary: \(\quad\displaystyle 1\frac{1}{4} = \frac{1}{1} + \frac{0}{2} + \frac{1}{4} = 1.01 = (101, -10)\)
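
The binary pair above can be recovered in Python. This is a simplified sketch in the spirit of the slide, not the exact bit layout the hardware uses; the function name `significand_and_exponent` is made up here, and it relies on `float.as_integer_ratio()`, whose denominator is always a power of two.

```python
def significand_and_exponent(x):
    """Return integers (m, e) with x == m * 2**e, for a finite float x."""
    numerator, denominator = x.as_integer_ratio()
    # For a float the denominator is always a power of two, 2**k
    k = denominator.bit_length() - 1
    return numerator, -k

m, e = significand_and_exponent(1.25)
print(m, e)                   # the two integers, written in decimal
print(bin(m), m * 2.0 ** e)   # significand in binary, and the value rebuilt
```

Here the significand 5 is 101 in binary and the exponent -2 is -10 in binary, matching the pair (101, -10).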

Consequences

  • Some values, such as 0.1, have no exact binary representation. The best approximation to 0.1 within the range of normal integers is \[\frac{3602879701896397}{2^{55}} = 0.10000000000000000555111512312578270\]
  • When doing operations on these numbers, the extra digits sometimes get rounded off, making the result look exact, but a tiny bit of this rounding error can always show up in floating-point values.
  • So be careful using == for numeric comparisons! Rounding might result in unexpected falsehoods
    • 0.1 + 0.1 + 0.1 != 0.3
    • Far better to check if two numbers are within a small margin of one another, or greater or less than the other
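
A short sketch of these points using only the standard library: `as_integer_ratio()` and `Decimal` expose the value actually stored for 0.1, and `math.isclose` is one way to compare within a margin (its default relative tolerance is an assumption you may want to adjust).

```python
import math
from decimal import Decimal

# The exact fraction Python stores for 0.1
numerator, denominator = (0.1).as_integer_ratio()
print(numerator, denominator == 2 ** 55)

# Decimal(float) shows the stored value exactly, with no rounding for display
print(Decimal(0.1))

total = 0.1 + 0.1 + 0.1
print(total == 0.3)              # exact comparison
print(math.isclose(total, 0.3))  # comparison within a small relative tolerance
```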

Representing Characters

  • Use numeric encodings to represent character data inside the machine, where each character is assigned an integer value.
  • Character codes are not very useful unless standardized though!
    • Competing encodings in the early years made it difficult to share data across machines
  • First widely adopted character encoding was ASCII (American Standard Code for Information Interchange)
  • With only 128 possible characters (7 bits), ASCII proved inadequate in the international world, and has therefore been superseded by Unicode.
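
In Python, the built-ins `ord()` and `chr()` convert between characters and their numeric codes. The sketch below also shows a character that ASCII cannot encode; the sample characters are arbitrary.

```python
# ord() gives a character's numeric code; chr() goes the other way
code = ord("A")
print(code, hex(code))
print(chr(97))

# Characters outside ASCII still have a Unicode code point
e_acute = "\u00e9"  # the letter e with an acute accent
print(ord(e_acute))

# Encoding turns a string into bytes; ASCII has no code for this character
print("Hi".encode("ascii"))
try:
    e_acute.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)
```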

ASCII

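[Figure: ASCII table, with rows 0x to 7x giving the high hexadecimal digit and columns 0 to F giving the low hexadecimal digit]
  0x: control characters, including \0 (00), \b (08), \t (09), \n (0A), \v (0B), \f (0C), \r (0D)
  1x: control characters
  2x: (space) ! " # $ % & ' ( ) * + , - . /
  3x: 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
  4x: @ A B C D E F G H I J K L M N O
  5x: P Q R S T U V W X Y Z [ \ ] ^ _
  6x: ` a b c d e f g h i j k l m n o
  7x: p q r s t u v w x y z { | } ~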
The ASCII subset of Unicode