CMU 15-110: Principles of Computing Sets

1. Creating Sets
• Create an empty set
s = set() print(s) # prints set()

• Create a set from a list
# Right way: L = ['cat', 'cow', 'dog'] s = set(L) print(s) # prints {'cow', 'dog', 'cat'} # Wrong way: L = ['cat', 'cow', 'dog'] s = set() s.add(L) # This crashes: TypeError: unhashable type: 'list'

• Create a set from a string
print('wrong:', set('abc')) # {'c', 'a', 'b'} print('right:', { 'abc' }) # {'abc'} print('right:', set(['abc'])) # {'abc'}

2. Properties of Sets
1. Sets are Unordered
s = set([2, 4, 8]) print(s) # prints {8, 2, 4} in standard Python (though not in brython) for value in s: # prints 8, 2, 4 print(value)

2. Elements are Unique
s = set([2,2,2]) print(s) # prints {2} print(len(s)) # prints 1

3. Elements Must Be Immutable
s = set() s.add(12345) # fine, since numbers are immutable s.add('wow') # fine, since strings are immutable s.add([1,2]) # fails: unhashable type: 'list' (since lists are mutable)

4. Sets are Very Efficient
• For a list L of size n, `(value in L)` is O(n).
• For a set S of size n, `(value in S)` is O(1). Wow.

3. Set Operations

1. Operations on a set

 Operation Result len(s) the number of elements in s x in s True if x is in s, and False otherwise x not in s opposite of `in`: True if x is not in s, and False otherwise s.add(x) add element x to set s, where x is immutable s.discard(x) removes x from set s if present (no error if not present) for value in s: loop over each value in set s

2. Operations on two sets

 Operation Equivalent Result s.issubset(t) s <= t test whether every element in s is in t s.issuperset(t) s >= t test whether every element in t is in s s.union(t) s | t new set with elements from both s and t s.intersection(t) s & t new set with elements common to s and t s.difference(t) s - t new set with elements in s but not in t

4. Example: Contains Duplicates
def containsDuplicates(L): return (len(L) > len(set(L))) print(containsDuplicates([2,3,5,7])) print(containsDuplicates([2,3,5,2]))

5. Example: Comparing Two Large Lists
# This is a perfect case for using set differences (s - t) # We start with two large lists: list1 = ['Lemon', 'Mauve', 'Lilac', 'Taupe', 'Slate', 'Pear', 'Copper', 'Amaranth', 'Lavender', 'Amethyst', 'Champagne', 'Harlequin', 'Periwinkle', 'Teal', 'Carmine', 'Red', 'Baby', 'Magenta', 'White', 'Silver', 'Byzantium', 'Brown', 'Cerulean', 'Blush', 'Ruby', 'Raspberry', 'Gold', 'Tan', 'Cyan', 'Scarlet', 'Desert', 'Viridian', 'Jungle', 'Spring', 'Orchid', 'Beige', 'Emerald', 'Prussian', 'Olive', 'Chartreuse', 'Indigo', 'Pink', 'Salmon', 'Puce', 'Blue', 'Electric', 'Persian', 'Violet', 'Bronze', 'Ivory', 'Cobalt', 'Navy', 'Maroon', 'Azure', 'Turquoise', 'Jade', 'Orange-red', 'Spring', 'Coral', 'Plum', 'Green', 'Aquamarine', 'Peach', 'Apricot', 'Blue-violet', 'Black', 'Ocher', 'Coffee', 'Sangria', 'Magenta', 'Crimson', 'Cerise', 'Sapphire', 'Purple', 'Orange', 'Amber', 'Gray', 'Erin', 'Chocolate', 'Red-violet', 'Rose', 'Lime', 'Yellow'] list2 = ['Coffee', 'Navy', 'Lilac', 'Mauve', 'Plum', 'Ocher', 'Bronze', 'Baby', 'Slate', 'Pink', 'Copper', 'Silver', 'Olive', 'White', 'Sapphire', 'Spring', 'Jungle', 'Apricot', 'Raspberry', 'Lime', 'Magenta', 'Periwinkle', 'Maroon', 'Carmine', 'Gold', 'Pear', 'Cyan', 'Red', 'Amethyst', 'Turquoise', 'Harlequin', 'Chartreuse', 'Emerald', 'Blue', 'Blue-green', 'Chocolate', 'Crimson', 'Indigo', 'Orchid', 'Ivory', 'Azure', 'Blue-violet', 'Burgundy', 'Violet', 'Salmon', 'Spring', 'Lavender', 'Tan', 'Taupe', 'Erin', 'Beige', 'Cobalt', 'Red-violet', 'Rose', 'Amber', 'Amaranth', 'Teal', 'Yellow', 'Purple', 'Byzantium', 'Blush', 'Desert', 'Lemon', 'Electric', 'Sangria', 'Prussian', 'Aquamarine', 'Orange-red', 'Cerulean', 'Brown', 'Puce', 'Coral', 'Gray', 'Viridian', 'Jade', 'Scarlet', 'Champagne', 'Peach', 'Green', 'Magenta', 'Orange', 'Black'] # And we ask: which colors are only in one of the two lists? colors1 = set(list1) colors2 = set(list2) print('Colors in list1 but not in list2:', colors1 - colors2) print('Colors in list2 but not in list1:', colors2 - colors1)