The union of the sets \(A\) and \(B\) (\(A \cup
B\)) consists of all elements in either \(A\) or \(B\) or both.
The intersection of the sets \(A\) and \(B\) (\(A \cap
B\)) consists of all elements in both \(A\) and \(B\).
The set difference of \(A\) and \(B\) (\(A -
B\)) consists of all elements in \(A\) but not in \(B\).
The symmetric set difference of \(A\) and \(B\) (\(A\triangle
B\)) consists of all elements in \(A\) or \(B\) but not in both.
Python Implementations
Python’s built-in implementation of sets supports all these same
operations
Can either use appropriately named methods on sets or operators
between sets
Membership: 3 in primes
Union: A.union(B) or A | B
Intersection: A.intersection(B) or A & B
Difference: A.difference(B) or A - B
Symmetric difference: A.symmetric_difference(B) or A ^ B
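A minimal sketch of the method and operator forms side by side. The sets A and B here are small illustrative values, not taken from the slides, and sorted() is used only so the printed order is predictable.

```python
# Two small example sets (values chosen only for illustration)
A = {1, 2, 3, 4}
B = {3, 4, 5}

# Each operator has an equivalent named method
union = A | B             # same as A.union(B)
intersection = A & B      # same as A.intersection(B)
difference = A - B        # same as A.difference(B)
symmetric = A ^ B         # same as A.symmetric_difference(B)

print(3 in A)                 # True
print(sorted(union))          # [1, 2, 3, 4, 5]
print(sorted(intersection))   # [3, 4]
print(sorted(difference))     # [1, 2]
print(sorted(symmetric))      # [1, 2, 5]
```

One practical difference: the operators require sets on both sides, while the methods accept any iterable, so A.union([5, 6]) works but A | [5, 6] raises a TypeError.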
Venn Diagrams
A Venn diagram is a graphical representation of sets in which
common elements appear as overlapping areas
The following Venn diagrams illustrate the effect of the 4 primary
set operations
Practice
If we have the following sets from earlier:
digits = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
evens = { 0, 2, 4, 6, 8 }
odds = { 1, 3, 5, 7, 9 }
primes = { 2, 3, 5, 7 }
squares = { 0, 1, 4, 9 }
What is the value of each of the following:
evens ∪ squares
odds ∩ primes
primes - evens
odds ∆ squares
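Once you have worked these out by hand, one way to check your answers is to let Python evaluate the same expressions. This is only a sketch; the results list and its labels are hypothetical names introduced for the example.

```python
evens = {0, 2, 4, 6, 8}
odds = {1, 3, 5, 7, 9}
primes = {2, 3, 5, 7}
squares = {0, 1, 4, 9}

# Each exercise written with the matching Python operator
results = [
    ("evens | squares", evens | squares),
    ("odds & primes", odds & primes),
    ("primes - evens", primes - evens),
    ("odds ^ squares", odds ^ squares),
]

for label, value in results:
    # sorted() gives a predictable display order; sets themselves are unordered
    print(label, "=", sorted(value))
```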
Understanding Check
Looking at the same sets:
digits = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
evens = { 0, 2, 4, 6, 8 }
odds = { 1, 3, 5, 7, 9 }
primes = { 2, 3, 5, 7 }
squares = { 0, 1, 4, 9 }
What is the set resulting from: \[
(\text{primes} \cap \text{evens}) \cup
(\text{odds}\cap\text{squares})\]
∅
{ 1, 2, 9 }
{ 1, 3, 4, 5}
{ 0, 3, 4, 5, 7}
Set Relationships
Sets \(A\) and \(B\) are equal (\(A = B\)) if they have the same elements.
This would make them the same circle in a Venn diagram
Set \(A\) is a subset of
\(B\) (\(A\subseteq B\)) if all the elements in
\(A\) are also in \(B\).
This would mean that the circle for \(A\) would be entirely inside (or equal to)
the circle for \(B\)
Set \(A\) is a proper
subset of \(B\) (\(A\subset B\)) if \(A\) is a subset of \(B\) and the two sets are not equal
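Python's comparison operators express these same relationships on sets. The short sketch below reuses the digits, odds, and primes sets from the practice slides.

```python
digits = set(range(10))
odds = {1, 3, 5, 7, 9}
primes = {2, 3, 5, 7}

print(primes == {7, 5, 3, 2})    # True: same elements, order is irrelevant
print(primes <= digits)          # True: every prime here is a digit (subset)
print(primes.issubset(digits))   # True: method form of <=
print(primes < digits)           # True: a subset and not equal (proper subset)
print(digits <= digits)          # True: every set is a subset of itself
print(digits < digits)           # False: no set is a proper subset of itself
print(primes <= odds)            # False: 2 is prime but not odd
```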
Informal Proofs
You can use Venn diagrams to justify different set identities
Example: Say you wanted to show that: \[
A - (B \cap C) = (A-B) \cup (A-C)\]
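A Venn diagram argues the identity visually. As a complementary sanity check (not a proof), the sketch below tests it on every combination of subsets of a small universe. The helper all_subsets and the universe {1, 2, 3} are hypothetical choices made for this example.

```python
from itertools import combinations

def all_subsets(universe):
    """Return every subset of the given collection as a list of sets."""
    items = list(universe)
    return [set(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]

subsets = all_subsets({1, 2, 3})   # 2**3 = 8 subsets

# Check A - (B ∩ C) == (A - B) ∪ (A - C) for all 8 * 8 * 8 choices
identity_holds = all(
    A - (B & C) == (A - B) | (A - C)
    for A in subsets for B in subsets for C in subsets
)
print(identity_holds)   # True
```

Passing on a small universe does not establish the identity in general, but a single failing triple would be enough to disprove it.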
Python Set Methods
Can also use “set comprehension” to generate a set
{ x for x in range(0,100,2) }
Function             Description
len(set)             Returns the number of elements in a set
elem in set          Returns True if elem is in the set
set.copy()           Creates and returns a shallow copy of the set
set.add(elem)        Adds the specified elem to the set
set.remove(elem)     Removes the element from the set, raising a KeyError if it is missing
set.discard(elem)    Removes the element from the set, doing nothing if it is missing
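A short sketch exercising a set comprehension and the functions listed above. The variable names are illustrative. Note how remove and discard differ when the element is absent.

```python
evens = {x for x in range(0, 10, 2)}   # set comprehension: 0, 2, 4, 6, 8

backup = evens.copy()   # separate set; adding to evens will not change it
evens.add(10)
evens.discard(99)       # 99 is absent, so nothing happens
evens.remove(0)         # 0 is present, so it is removed

try:
    evens.remove(99)    # removing a missing element raises KeyError
except KeyError:
    print("99 was not in the set")

print(len(evens))       # 5
print(0 in evens)       # False
print(0 in backup)      # True
```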
Why use sets?
Sets come up naturally in many situations
Many real-world applications involve unordered collections of unique
elements, for which sets are the natural model.
Sets have a well-established mathematical foundation
If you can frame your application in terms of sets, you can rely on the
various mathematical properties that apply to sets.
Sets can make it easier to reason about your program
One of the advantages of mathematical abstraction is that using it often
makes it easy to think clearly and rigorously about what your program
does.
Many important algorithms are described in terms of sets
If you look at websites that describe some of the most important
algorithms in computer science, many of them frame those descriptions in
terms of set operations.
Representing Data
To use computation effectively, we frequently need to be able to
represent real world data in a way that computers can easily work with
Real world data is often more complicated or nuanced than just “a
list of numbers”
Python’s existing data structures are tools, which
you can use to help represent certain ideas
Lists when you have sequential data with a logical ordering
(where position matters)
Example: GPA over the course of 4 years
Tuples or classes when you have elements that
should be grouped together but which have no inherent ordering.
Generally use tuples for simple records and write custom classes for
more complex ones. A dictionary could also work.
Example: Student names in a class
Maps or dictionaries when you have specific keys
corresponding to other values.
Example: Student grades
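As a rough illustration of matching a structure to the shape of the data, the sketch below uses one of each. All variable names and values are made-up placeholders, not data from the course.

```python
# All values below are made-up placeholders for illustration.

# List: position carries meaning (index 0 is year 1, index 3 is year 4)
gpa_by_year = [3.2, 3.4, 3.5, 3.7]

# Tuple: a simple record grouping related fields (name, class)
student = ("Sally", "Python")

# Dictionary: specific keys (names) map to values (grades)
grades = {"Sally": "A", "Jake": "B"}

print(gpa_by_year[0])    # 3.2
print(student[0])        # Sally
print(grades["Jake"])    # B
```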
Tricky Data
Human readable data is not always the best machine readable
data!
Name  Class  Q1  Mid  Q3  Final
Sally  Python  A  B  B  A
Jake  Python  B  B  B  C
James  Astro  B  B  A
Lily  Astro  A  A  B
Ben  Python  C  B  B  A
Storing the above in a 2D array would work, but would be frustrating
to work with
A Computer Friendly Approach
Student grades are time ordered, so we could use a list for the
grades
Each student has a corresponding sequence of grades (and students
are unordered), so we could use a dictionary where student names are the
keys and the list of grades the values
Each class corresponds to an unordered set of students. Could have
another dictionary where the keys were the class names and the values
were the dictionary of students/grades
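The three layers described above could be sketched as a nested dictionary. The name gradebook is hypothetical, and the grades are copied from the table in the order they appear there.

```python
# class name -> student name -> time-ordered list of grades
gradebook = {
    "Python": {
        "Sally": ["A", "B", "B", "A"],
        "Jake": ["B", "B", "B", "C"],
        "Ben": ["C", "B", "B", "A"],
    },
    "Astro": {
        # Only three grades are listed for each Astro student in the table;
        # they are stored here in the order listed.
        "James": ["B", "B", "A"],
        "Lily": ["A", "A", "B"],
    },
}

print(gradebook["Python"]["Sally"][0])   # A  (Sally's first grade)
print(sorted(gradebook["Astro"]))        # ['James', 'Lily']

# Most recent grade for each student in the Python class
for name, grades in gradebook["Python"].items():
    print(name, grades[-1])
```

With this layout, a lookup goes class first, then student, then position in time, which mirrors how the questions about the data are usually phrased.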
Structures representing complicated data can often be large enough
that you don’t want to store them within your program itself
We can put them in their own file, but reading them in with our
current tools would be complicated
Current methods read in text, so we would need to then
parse the text to identify what data structures we needed to
create and what elements we needed to add
This is certainly possible, but is sometimes more overhead than what
we would like
It can be useful then to store the data structure in a file in
a format that can be easily read into Python
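The text above does not name a particular format. As one possibility, and purely as an assumption for this sketch, JSON with Python's standard json module can write a nested dictionary to a file and read it back without any manual parsing. The file name gradebook.json is hypothetical.

```python
import json
import os
import tempfile

gradebook = {"Python": {"Sally": ["A", "B", "B", "A"]}}

# A temporary directory keeps the example self-contained;
# "gradebook.json" is a hypothetical file name.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "gradebook.json")

    # Write the structure out as text
    with open(path, "w") as f:
        json.dump(gradebook, f, indent=2)

    # Read it back in as dictionaries and lists
    with open(path) as f:
        loaded = json.load(f)

print(loaded == gradebook)   # True
```

JSON has no set or tuple type, so a structure containing those would need converting (for example, to lists) before being saved this way.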