Recursion 3 - Divide and conquer
When confronted with a problem, we can try to solve it by splitting its dimension in half (or more), look for solutions in each of the halves with two recursive calls, and then decide what to do with the found solutions, if any.
Why should we do this? So far we wrote sequential algorithms that basically exploit only one CPU core on one computer, but modern CPUs have multiple cores and data centers multiple servers. If we could keep busy all the CPU cores on all the servers our algorithms would perform way faster.
By partitioning a problem into completely separate parts with a divide and conquer strategy, we might hope it becomes amenable to parallelization. To understand if a strategy really separates the problem into clear-cut parts, without possibility of i.e. recursive calls overwriting data which should only be read by other calls, we might ask ourselves the following question: if we swapped the recursive calls, would we get a different result? If the answer is no, the program is probably going to be a good candidate for parallelization.
Note in these worksheets we only care about the logic and will not actually see how to execute programs in parallel - if interested, search for multithreading programming or big data frameworks such as Spark)
Managing solutions
After we split a problem in halves and do recursive calls on each halve, several cases may occur:
No solution is found
One solution is found
Two solutions are found
case 1): we can only give up
case 2): we have only one solution, so we can just return that one
case 3): we have two solutions, so we need to decide what is the purpose of the algorithm
Is the purpose to …
find all possible solutions? Then we return both of them.
find the best solution, according to some measure of ‘goodness’? Then we measure each of the solutions and give back the highest scoring one.
provide a combination of existing solutions, according to some mixing method? Then we combine the found solutions and give them back
Divide and conquer - SimpleFP style
Example - dsdouble
Suppose we are tasked with reimplementing the double
like this:
Take a lst
of numbers and RETURN a NEW list with the numbers doubled as in SimpleFP style but this time following a divide and conquer strategy.
Let’s first focus on the logic, without caring about performance: we will divide the input in half, compute the doubled version of the halved lists (that is, find two solutions), and then join them at the end (combine the found solutions).
Notice the two recursive calls here are done sequentially one after each other, but if we wanted nothing would prevent us from executing them in parallel (in other words, if you swapped the two recursive calls, you would obtain exactly the same result)
[2]:
def dsdouble(lst):
if len(lst) == 0: # base case
return []
elif len(lst) == 1: # base case
return [lst[0] * 2]
else:
left = dsdouble(lst[:len(lst)//2])
right = dsdouble(lst[len(lst)//2:])
return left + right
result = dsdouble([4,6,2,7,3])
jupman.pytut()
[2]:
Exercise - debug dsdouble
Add a level
parameter and some debugging indented print to the function
[4]:
result = dsdouble([4,6,2,7,3])
print('final result=', result)
lst= [4, 6, 2, 7, 3]
left call
lst= [4, 6]
left call
lst= [4]
left= [8]
right call
lst= [6]
right= [12]
left= [8, 12]
right call
lst= [2, 7, 3]
left call
lst= [2]
left= [4]
right call
lst= [7, 3]
left call
lst= [7]
left= [14]
right call
lst= [3]
right= [6]
right= [14, 6]
right= [4, 14, 6]
final result= [8, 12, 4, 14, 6]
Exercise - dssum
Takes a lst
of numbers and RETURN the sum of all numbers.
USE SimpleFP style with a divide and conquer strategy.
DO NOT use
sum
function ;-)
[6]:
def dssum(lst):
raise Exception('TODO IMPLEMENT ME !')
result = dssum([4,2,6,3])
print(result) # 15
assert dssum([4,2,6,3]) == 15
assert dssum([]) == 0
assert dssum([3]) == 3
assert dssum([5,1]) == 6
Exercise - dsmin
Given a lst
of numbers, RETURN the minimum
assume the list has at least one element
USE SimpleFP style with a divide and conquer strategy
DON’T use
min
function..
[7]:
def dsmin(lst):
raise Exception('TODO IMPLEMENT ME !')
result = dsmin([3,6,5,8,-3,8,-2,-10,4])
print(result) # -10
assert dsmin([4]) == 4
assert dsmin([4,1]) == 1
assert dsmin([0,9]) == 0
assert dsmin([6,8,5]) == 5
assert dsmin([3,6,5,8,3,8,2,10,4]) == 2
Exercise - dszip
Given two lists, RETURN a NEW list having two-element tuples with elements taken pairwise from lst1
and lst2
USE SimpleFP style with a divide and conquer strategy
DO NOT use
zip
function
[8]:
def dszip(lst1, lst2):
raise Exception('TODO IMPLEMENT ME !')
result = dszip([3,5,1,6], ['a','d','e','w'])
print(result) # [(3,'a'), (5,'d'), (1,'e'), (6,'w')]
assert dszip([], []) == []
assert dszip(['a'], [0]) == [('a',0)]
assert dszip(['z','a'], [2,4]) == [('z',2), ('a',4)]
assert dszip([3,5,1,6], ['a','d','e','w']) == [(3,'a'), (5,'d'), (1,'e'), (6,'w')]
Exercise - dsunnest
Given a lst
of lists, RETURN a NEW list without sublists (also called shallow unnest)
assume sublists cannot have further sublists inside
USE SimpleFP style with a divide and conquer strategy
[9]:
def dsunnest(lst):
raise Exception('TODO IMPLEMENT ME !')
result = dsunnest([[3,2,5], [2,4], [0,9,5,1]])
print(result) # [3,2,5,2,4,0,9,5,1]
assert dsunnest([]) == []
assert dsunnest([[]]) == []
assert dsunnest([[],[]]) == []
assert dsunnest([[],[4]]) == [4]
assert dsunnest([[3],[]]) == [3]
assert dsunnest([[3,2]]) == [3,2]
assert dsunnest([[3,2]]) == [3,2]
assert dsunnest([[3,2,5], [2,4], [0,9,5,1]]) == [3,2,5,2,4,0,9,5,1]
Exercise - dsall
Given a lst
of booleans, RETURN True
if all elements are True
, otherwise RETURN False
DO NOT use
all
functionassume
dsall
of empty list isTrue
USE SimpleFP style with a divide and conquer strategy
HINT: consider exploiting Boolean evaluation order to shorten your code
[10]:
def dsall(lst):
raise Exception('TODO IMPLEMENT ME !')
assert dsall([]) == True
assert dsall([True]) == True
assert dsall([False]) == False
assert dsall([False, False]) == False
assert dsall([False, True]) == False
assert dsall([True, False]) == False
assert dsall([True, True]) == True
assert dsall([False, False, False]) == False
assert dsall([False, False, True]) == False
assert dsall([False, True, False]) == False
assert dsall([False, True, True]) == False
assert dsall([True, False, False]) == False
assert dsall([True, False, True]) == False
assert dsall([True, True, False]) == False
assert dsall([True, True, True]) == True
Exercise - dsrev
Given a list, RETURN a NEW list reversed.
USE SimpleFP style with a divide and conquer strategy
DO NOT use
reversed
function
[11]:
def dsrev(lst):
raise Exception('TODO IMPLEMENT ME !')
result = dsrev([4,9,7,5,6])
print(result)
assert dsrev([]) == []
assert dsrev([3]) == [3]
assert dsrev([5,3]) == [3,5]
assert dsrev([6,6,2]) == [2,6,6]
assert dsrev([4,9,7,5,6]) == [6,5,7,9,4]
Exercise - dsrep
Given a list la
, and a list lb
of integer numbers \(\geq 0\), RETURN a NEW list where all elements of la
are repeated the amount of times indicated in the corresponding cell of lb
. A bit hard, but gives satisfaction.
assume
la
andlb
have the same sizeUSE SimpleFP style with a divide and conquer strategy
DO NOT use
*
replication operatorHINT: since both lists have equal length, you can apply splitting logic to both
[12]:
def dsrep(la, lb):
raise Exception('TODO IMPLEMENT ME !')
result = dsrep(['a','b','c','d'], [2,5,0,3])
print(result) # ['a','a','b','b','b','b','b','d','d','d']
assert dsrep([], []) == []
assert dsrep(['a'], [1]) == ['a']
assert dsrep(['a'], [0]) == []
assert dsrep(['b'], [2]) == ['b','b']
assert dsrep(['a','b'], [1,1]) == ['a','b']
assert dsrep(['a','b'], [1,2]) == ['a','b','b']
assert dsrep(['a','b'], [2,1]) == ['a','a','b']
assert dsrep(['a','b'], [2,0]) == ['a','a']
assert dsrep(['a','b'], [0,2]) == ['b','b']
assert dsrep(['a','b','c'], [1,1,1]) == ['a','b','c']
assert dsrep(['a','b','c'], [0,1,1]) == ['b','c']
assert dsrep(['a','b','c'], [0,1,0]) == ['b']
assert dsrep(['a','b','c'], [3,1,2]) == ['a','a','a','b','c','c']
assert dsrep(['a','b','c','d'], [2,5,0,3]) == ['a','a','b','b','b','b','b','d','d','d']
Divide and conquer - Accumulator style
So far we’ve been creating lists at each call, thus wasting lots of memory and computation time.
Let’s first try using indeces with a simple example:
Example - dacount
Suppose given a number n
, you want to build a list with first numbers from 1
to n
without using range
and using ModAcc style with a divide and conquer strategy
Notie a couple of things:
to promote parallelism, ideally it should be possibile to execute left and right calls independently of their order: to allow for this you should first allocate all the needed space in the accumulator. Since for us it’s a special variable, you can preallocate space using the
*
operator before calling the helpers.since
acc
is created in the main function and keeps getting modified, we dropped thereturn
statements in the helper
[13]:
def dacount_helper(i, j, acc):
if i > j:
pass
elif i == j:
acc[i - 1] = i
else:
k = (i + j) // 2
dacount_helper(i, k, acc) # no need to return
dacount_helper(k+1, j, acc) # no need to return
def dacount(n):
acc = [0]*n # let's create space to be filled in later
# (acc is special so let's allow us the freedom of using * operator)
dacount_helper(1,n, acc)
return acc
result = dacount(5)
print(result)
jupman.pytut()
[1, 2, 3, 4, 5]
[13]:
Example - dadouble
Let’s see now how we could use indeces and accumulators with our double
function:
Take a lst
of numbers and RETURN a NEW list with the numbers doubled, following the ModAcc style with a Divide and Conquer strategy
Again, notice we preallocated the accumulator in the main function and dropped the return instructions in the helper
[15]:
def dadouble_helper(lst, i,j,acc):
if i >= j: # note the >= in this case
return acc
else:
k = (i + j) // 2
acc[k] = lst[k] * 2
dadouble_helper(lst, i, k, acc)
dadouble_helper(lst, k+1, j, acc)
return acc
def dadouble(lst):
acc = [None]*len(lst) # let's create space to be filled in later
# (acc is special so let's allow us the freedom of using * operator)
dadouble_helper(lst, 0, len(lst), acc)
return acc
result = dadouble([4,6,2])
print(result)
jupman.pytut()
[8, 12, 4]
[15]:
Exercise - debug dadouble
Add a level
parameter and some debugging indented print to the dadouble
function.
[17]:
result = dadouble([4,6,2,7,3])
print("Final result=", result)
i= 0 j= 5 acc= [None, None, None, None, None]
left call
i= 0 j= 2 acc= [None, None, 4, None, None]
left call
i= 0 j= 1 acc= [None, 12, 4, None, None]
left call
i= 0 j= 0 acc= [8, 12, 4, None, None]
right call
i= 1 j= 1 acc= [8, 12, 4, None, None]
right call
i= 2 j= 2 acc= [8, 12, 4, None, None]
right call
i= 3 j= 5 acc= [8, 12, 4, None, None]
left call
i= 3 j= 4 acc= [8, 12, 4, None, 6]
left call
i= 3 j= 3 acc= [8, 12, 4, 14, 6]
right call
i= 4 j= 4 acc= [8, 12, 4, 14, 6]
right call
i= 5 j= 5 acc= [8, 12, 4, 14, 6]
Final result= [8, 12, 4, 14, 6]
Exercise - dahist
Given a lst
, RETURN a NEW dictionary with the counts of elements of lst
DO NOT use search methods (
.index
,.count
,.remove
…)NOTE in this case we can start with an empty dictionary
[19]:
def dahist(lst):
raise Exception('TODO IMPLEMENT ME !')
result = dahist(['a','b','a','c','b','b','c','a','b','c','d'])
print(result) # {'a':3,
# 'b':4,
# 'c':3,
# 'd':1}
assert dahist([]) == {}
assert dahist(['a']) == {'a':1}
assert dahist(['a','a']) == {'a':2}
assert dahist(['a','b']) == {'a':1,
'b':1}
assert dahist(['b','c']) == {'c':1,
'b':1}
assert dahist(['a','c','a']) == {'a':2,
'c':1}
assert dahist(['c','c','a']) == {'a':1,
'c':2}
assert dahist(['a','b','a','c','b','b','c','a','b','c','d']) == {'a':3,
'b':4,
'c':3,
'd':1}
TODO will add more
More exercises
Computer Science Circles: nice chapter about recursion with some exercises
W3 Resources: some recursion exercises
Edabit: Search by setting ‘Recursion’ as tag - notice many provided solutions are not actually in recursive format.
Geeks for geeks:
Continue
Go on with the challenges