Computer Science 15-112, Fall 2011
Optional Lecture Notes: Combinatorial Iterators

These notes were prepared by students attending the optional lecture on matrices. As such, they may contain a small error here or there (standard disclaimer). They are provided here as a service to those who may have missed the lecture.

First, here are two versions of the code we wrote:
combinatorics-prep.py (written in preparation for the lecture)
combinatorics-from-class.py (the actual code we wrote, which differs in some small but interesting ways)

Combinatorial Iterators

Problems to solve today:

· Subset Sum

o Partition

· Satisfyability (SAT)

· Odds of 3 Card Poker

· Krypto

· Cryptarithm

An Introduction to the Itertools Package

The package that allows us to solve combinatorial problems is called the itertools module, which you can import with import itertools. (The itertools functions are the real stars of the lecture!) For our purposes we will use from itertools import * in order to call any of the functions in the module without having to use itertools. in front of the function name. When calling any of these functions, it’s important to call them in a list() otherwise, they just give an alias in memory where the result is stored.

product(iterables) takes the Cartesian product of any arguments, returning a list of these products. For example:
list(product([1,2], [3,4])) returns [(1, 3), (1, 4), (2, 3), (2, 4)].

permutations(iterable) gives you all arrangements of the input, for example, but is the equivalent of using the product function and then excluding all non-unique arrangements. Order matters in permutations. (A,B) is not the same as (B,A).

combination(iterable, r) returns combinations of the iterable with length of r - in other words, returns all subsets of the input with size of r. Order does not matter in combinations. (A,B) is the same as (B,A)

combination_with_replacement(iterable, r) returns combinations that permit one element to be used more than once. Order does not matter, so this can be done as the product of an iterable with itself, and then excluding results that are not in sorted order. (A,B,C) becomes (A,A), (A,B), (A,C), (B,C), (B,B), and (C,C).

Everything else we can do with combinatorics is just a mashup of these functions! These are called recipes by the Python Docs. Recipes are not included in the module, so we’ll have to list them ourselves as helper functions. The recipes that will be important for us include:

unique(iterable) which returns all the unique elements of the iterable, remembering all the elements its seen so far. It’s called unique_everseen(iterable) in the Python Docs.

powerset(iterable) gives the powerset of the iterable, which are the all of the subsets of the iterable of all possible lengths.

counterInNBase(base,digits) uses product to give permutations of digits in a given base over digits number of digits. For example, counterInNBase(2,2) gives 00, 01, 10, and 11.

count(iterable) counts the iterables. How is the different from len(iterable)?

countIf(iterable, test) counts the iterables that fulfill a certain condition made in test.

We will use these techniques in our functions to solve combinatorial problems!

NP (Nondeterministic Polynomial) Complete

linear -> polynomial (quadratic, cubic) -> super polynomial (exponential)

go in order of good -> bad -> really bad in terms of efficiency

Types of problems:

NP: nondeterministic polynomial time

- can be verified in polynomial time (efficiently check to see if a solution is correct)

P: polynomial time

- can be solved in polynomial time (efficiently)

- P is a subset of NP

NP-C: NP-complete

- No known efficient (polynomial time) solution to these problems

- A subset of NP problems such that a solution for one can be used to solve any NP problem.

- If the NP-C problems are solvable in polynomial time, then since that solution can be used to solve all other NP problems, including NP-C problems, then all NP problems would have an efficient (polynomial) solution, meaning that the set of NP problems would equal the set of P problems, or P = NP

- If the NP-C problems cannot be solved in polynomial time, then there are problems in NP that are not in the set P, or P =/= NP

Is P != NP (cannot be solved in sub-exponential time) or is P = NP (can be solved in sub-exponential time)

This is THE outstanding problem of computer science!

Examples of NPC problems:

· Traveling salesman

o A salesman has to visit a certain amount of cities while traveling the least amount of distance

· Airline seating

o Given all passengers, seat them in such a way that burns the least amount of airline fuel

· UPS bin packing

o How can UPS most efficiently transport all of their packages using the least amount of miles traveled for all of the trucks together

· Other optimization problems (some that seem similar, and many others that on the surface don't)

It’s easy to verify that the answer is correct, though much harder to come up with the answer!

Why is this interesting? If you can solve one, you can solve them all in polynomial time!

If we have a problem that is exponential in nature, we can only solve a portion of it

As an efficient algorithm to solve these problems has yet to be discovered (if it even exists), companies currently use estimation algorithms that will result in roughly 10% waste. In a 10 Trillion dollar world economy, this means that one TRILLION dollars are wasted every year.

Unfortunately, we’re not going to get a polynomial solution to subset sum. But we can get an exponential solution!

Subset Sum Overview

Problem we’re trying to solve:

If I have a set of numbers, is there some subset such that the sum of all its elements is 0?

We know that powersets give every possible subset. So for each set in the powerset, we want to take its sum and see if it equals 0.

def subsetSum(iterable):

for subset in powerset(iterable):

if (len(subset) > 0) and sum(subset) == 0: #subset must not be the empty set!

return subset

return None

Partition Overview

Problem we’re trying to solve:

If I have a set of numbers, is there a partition I can make such that each partition equals the other?

This is the same as asking if I can take any subset of a set where the sum is equal to the sum of the set’s complement. Using powerset(iterable), we take all possible sets and verify if they equal their complements, which is the difference between the input set and the subset.

Satisfyability Overview

Problem we’re trying to solve:

If I have an expression using variables x and y, which solution(s) of x and y will return True? For example, the expression (x&y) or (not x & not y) will have certain values of x and y to yield True.

The function SAT will find all assignments of the variables that make the given expression true. To solve this in a combinatorial manner, you would have to assign T/F to every variable in turn, revealing a factor of 2 possibilities for every variable, so 2ⁿpossibilities of assignments. We can do this using counterInBase() to assign variables with any value in a base of digits (base 2 in this case) and then evaluate the expression using eval(). We’ll note all the assignments that produce True.

def SAT(vars, expr):

#Is there a True/False assignment to vars that makes expr True?

for variableTruthValues in counterInBaseN(2, len(vars)): #Here, 2 is the base and len(vars) is the number of variables

#Assign to each variable the corresponding truth value

locals = dict() #We need a dictionary of assignments for the truth values of the variables

for (var,val) in zip(vars, variableTruthValues): #Go through and bind variables to values by zipping them together

locals[var] = val

Now we want to evaluate the expression! We can either define as globals OR create a map of locals (See below.*)

In this loop, we’re trying every possible set of truth values. If we’ve satisfied the expression with this, we return the locals. If it returns True, we’re found a winning assignment, so return the locals!

if (eval(expr, globals(), locals) == True):

return locals

return None

*Explanation of eval(expr, globals(), locals)’s arguments:

expr: a string or code object to be evaluated

globals(): dictionary containing ‘global’ variables and other mappings

locals(): mapping containing local variables and other mappings

(the above allow you to save variables to use in future eval calls)

For the first three problems, we used some variation of powerset.

Now, let’s use combinations and permutations!

Three Card Poker Overview

How it works – You’re given 3 cards in a hand, so there are 22,100 unique hands. We want to use combinations here because the order of the cards doesn’t matter.

In order to get the deck of all the cards, we want to take the product of the suits and faces…

def getDeck():

suits = ["clubs", "diamonds", "hearts", "spades"]

faces = range(1,14)

return list(product(suits, faces))

Now in order to find all possible hands, we want to find all possible combinations.

Combinations() gives an iterator. This is clumsy because lists are expensive! Instead, we will use count from the recipes. A variation of count is countIF, which counts lists of values only if the value = True.

def totalHands(cardsPerHand):

return count(combinations(getDeck(), cardsPerHand))

This counting approach will give an exact answer as opposed to the Monte Carlo techniques, which give an approximate. However, as a tradeoff, the Monte Carlo techniques tend to be more applicable.

Now we want to count all the pairs. In order to do this, we’re going to define hasPairs INSIDE of totalPairs! We need to see if there are duplicate face cards in the hand, so we’re going to use sets. Originally, each card is represented as a tuple of its suit and face, though we can ignore the suit when looking for pairs. We can unzip the set to separate the suit from the face values.

def totalPairs(cardsPerHand): #Count the single pairs, but not trips or two-pairs!

def hasPair(hand):

#Hand == ((suit0, face0), (suit1, face1), ..., (suitK, faceK))

(suits, faces) = zip(*hand) #Unzip to extract the face values

faceSet = set(faces) #Create set of all faces

return (len(faceSet) == len(hand)-1) # Why -1? Put 3 in, put 5 in, put 3: just 3 & 5 in the set so len is 2!

#This also discounts 3 of a kind (ie 2 clubs, 2 hearts, and 2 diamonds, which is not a pair!)

return countIf(combinations(getDeck(), cardsPerHand), hasPair) #We use countIf here to see if all the hands fulfill hasPair

Krypto Overview

Given an arbitrarily long list of numbers and a target, the aim is to reach the target by inserting operators in between the numbers and solving without using order of operations but instead going left to right.

For example: List (3, 7, 5, 4) Target 46

Solution: 3 + 7 * 5 – 4

Our strategy will be to reason over every permutation and slip in all possible operators.

:Screen shot 2011-10-20 at 9.52.55 PM.png

Q. How do we iterate over all operations?

Take the product!

def krypto(numbers, target, ops="+-*/"):

def makeKryptoExpr(operands, operators):

#Operands = [2,3,56,-38]

#Operators = ["+", "*", "+"]

#Result --> "(((2+3)*56)+-38)"

result = "(" * len(operators) + str(operands[0]) #Every operation needs parentheses to force L->R

for i in xrange(len(operators)):

result += operators[i] + str(operands[i+1]) + ")" #One more operand than operator

return result

for operands in permutations(numbers):

for operators in product(ops, repeat=len(numbers)-1): #Try every possible product of my operators!

#Why -1? If there are 5 numbers, there are 4 gaps!

expr = makeKryptoExpr(operands, operators)

if (eval(expr) == target): #Can’t use eval because it uses order of operations UNLESS we use parentheses

# We win!!!!

return expr.replace("(","").replace(")","")

return None

Cryptarithm Overview

The general idea of a cryptarithm is to assign integers to letters.

For example, SEND + MORE = MONEY is “9567 + 1085 = 10652”

Q. How to attack this problem?

-Need list of unique letters

-Try all possible assignments of single digits

-Different arrangement would produce a different answer

So, we want a permutation of the number of unique letters with unique numbers. First we need to get an expression…

def cryptarithm(expr):

We need to have a function that takes in a word and replaces the letters with digits, so it gives 3 5 6 4 instead of SEND. We can define this now inside cryptarithm and call it later.

def makeCryptValue(word, map):

# MakeCryptValue("SEND", {"S":5, "E":2, "N":1, "D":8 }) returns 5218

result = 0

for letter in word: #Walk through the result, get the digit and shift

result = 10*result + map[letter] #10*result is the shift, map[letter] is the digit value

return result

# 1) Make list of words, so word[0] + word[1] = word[2]. Assume that the expression is well-formed.

# Words = ["SEND", "MORE", "MONEY"]

words = expr.replace("+"," ").replace("="," ").split(" ") #use str.split()

# 2) Want a list of unique letters. Use sets for uniqueness

letters = sorted(set("".join(words))) #Returns unique letters

for digitAssignment in permutations(range(10), len(letters)): #Try every permutation of digits against letters

#If 4 letters, groups of 4. If more than 10 letters, change the base

map = dict(zip(letters, digitAssignment)) #Create tuples of letter/digit and put into a dictionary

values = [makeCryptValue(word, map) for word in words] #Now call the function that we created earlier

if (0 in [map[word[0]] for word in words]):

#The first number of any word cannot be a leading 0! We can create a map to check this

continue

if (values[0] + values[1] == values[2]): #We win!

return "%d+%d=%d" % (values[0], values[1], values[2])

return None