Recursion 3 - Divide and conquer

When confronted with a problem, we can try to solve it by splitting its dimension in half (or more), look for solutions in each of the halves with two recursive calls, and then decide what to do with the found solutions, if any.

Why should we do this? So far we wrote sequential algorithms that basically exploit only one CPU core on one computer, but modern CPUs have multiple cores and data centers multiple servers. If we could keep busy all the CPU cores on all the servers our algorithms would perform way faster.

By partitioning a problem into completely separate parts with a divide and conquer strategy, we might hope it becomes amenable to parallelization. To understand if a strategy really separates the problem into clear-cut parts, without possibility of i.e. recursive calls overwriting data which should only be read by other calls, we might ask ourselves the following question: if we swapped the recursive calls, would we get a different result? If the answer is no, the program is probably going to be a good candidate for parallelization.

Note in these worksheets we only care about the logic and will not actually see how to execute programs in parallel - if interested, search for multithreading programming or big data frameworks such as Spark)

Managing solutions

After we split a problem in halves and do recursive calls on each halve, several cases may occur:

  1. No solution is found

  2. One solution is found

  3. Two solutions are found

case 1): we can only give up

case 2): we have only one solution, so we can just return that one

case 3): we have two solutions, so we need to decide what is the purpose of the algorithm

Is the purpose to …

  • find all possible solutions? Then we return both of them.

  • find the best solution, according to some measure of ‘goodness’? Then we measure each of the solutions and give back the highest scoring one.

  • provide a combination of existing solutions, according to some mixing method? Then we combine the found solutions and give them back

Divide and conquer - SimpleFP style

Example - dsdouble

Suppose we are tasked with reimplementing the double like this:

Take a lst of numbers and RETURN a NEW list with the numbers doubled as in SimpleFP style but this time following a divide and conquer strategy.

Let’s first focus on the logic, without caring about performance: we will divide the input in half, compute the doubled version of the halved lists (that is, find two solutions), and then join them at the end (combine the found solutions).

Notice the two recursive calls here are done sequentially one after each other, but if we wanted nothing would prevent us from executing them in parallel (in other words, if you swapped the two recursive calls, you would obtain exactly the same result)

[2]:
def dsdouble(lst):
    if len(lst) == 0:            # base case
        return []
    elif len(lst) == 1:          # base case
        return [lst[0] * 2]
    else:
        left = dsdouble(lst[:len(lst)//2])
        right = dsdouble(lst[len(lst)//2:])
        return left + right

result = dsdouble([4,6,2,7,3])
jupman.pytut()
[2]:

Exercise - debug dsdouble

Add a level parameter and some debugging indented print to the function

Show solution
[4]:


result = dsdouble([4,6,2,7,3])
print('final result=', result)
 lst= [4, 6, 2, 7, 3]
 left call
   lst= [4, 6]
   left call
     lst= [4]
   left= [8]
   right call
     lst= [6]
   right= [12]
 left= [8, 12]
 right call
   lst= [2, 7, 3]
   left call
     lst= [2]
   left= [4]
   right call
     lst= [7, 3]
     left call
       lst= [7]
     left= [14]
     right call
       lst= [3]
     right= [6]
   right= [14, 6]
 right= [4, 14, 6]
final result= [8, 12, 4, 14, 6]

Exercise - dssum

Takes a lst of numbers and RETURN the sum of all numbers.

  • USE SimpleFP style with a divide and conquer strategy.

  • DO NOT use sum function ;-)

Show solution
[6]:

def dssum(lst):
    raise Exception('TODO IMPLEMENT ME !')

result = dssum([4,2,6,3])
print(result)  # 15

assert dssum([4,2,6,3]) == 15
assert dssum([]) == 0
assert dssum([3]) == 3
assert dssum([5,1]) == 6

Exercise - dsmin

Given a lst of numbers, RETURN the minimum

  • assume the list has at least one element

  • USE SimpleFP style with a divide and conquer strategy

  • DON’T use min function..

Show solution
[7]:


def dsmin(lst):
    raise Exception('TODO IMPLEMENT ME !')

result = dsmin([3,6,5,8,-3,8,-2,-10,4])
print(result)  # -10

assert dsmin([4]) == 4
assert dsmin([4,1]) == 1
assert dsmin([0,9]) == 0
assert dsmin([6,8,5]) == 5
assert dsmin([3,6,5,8,3,8,2,10,4]) == 2

Exercise - dszip

Given two lists, RETURN a NEW list having two-element tuples with elements taken pairwise from lst1 and lst2

  • USE SimpleFP style with a divide and conquer strategy

  • DO NOT use zip function

Show solution
[8]:

def dszip(lst1, lst2):
    raise Exception('TODO IMPLEMENT ME !')

result = dszip([3,5,1,6], ['a','d','e','w'])
print(result)  # [(3,'a'), (5,'d'), (1,'e'), (6,'w')]

assert dszip([], []) == []
assert dszip(['a'], [0]) == [('a',0)]
assert dszip(['z','a'], [2,4]) == [('z',2), ('a',4)]
assert dszip([3,5,1,6], ['a','d','e','w']) == [(3,'a'), (5,'d'), (1,'e'), (6,'w')]

Exercise - dsunnest

Given a lst of lists, RETURN a NEW list without sublists (also called shallow unnest)

  • assume sublists cannot have further sublists inside

  • USE SimpleFP style with a divide and conquer strategy

Show solution
[9]:

def dsunnest(lst):
    raise Exception('TODO IMPLEMENT ME !')
result = dsunnest([[3,2,5], [2,4], [0,9,5,1]])
print(result)  # [3,2,5,2,4,0,9,5,1]

assert dsunnest([]) == []
assert dsunnest([[]]) == []
assert dsunnest([[],[]]) == []
assert dsunnest([[],[4]]) == [4]
assert dsunnest([[3],[]]) == [3]
assert dsunnest([[3,2]]) == [3,2]
assert dsunnest([[3,2]]) == [3,2]
assert dsunnest([[3,2,5], [2,4], [0,9,5,1]]) == [3,2,5,2,4,0,9,5,1]

Exercise - dsall

Given a lst of booleans, RETURN True if all elements are True, otherwise RETURN False

  • DO NOT use all function

  • assume dsall of empty list is True

  • USE SimpleFP style with a divide and conquer strategy

  • HINT: consider exploiting Boolean evaluation order to shorten your code

Show solution
[10]:

def dsall(lst):
    raise Exception('TODO IMPLEMENT ME !')

assert dsall([]) == True
assert dsall([True]) == True
assert dsall([False]) == False
assert dsall([False, False]) == False
assert dsall([False, True]) == False
assert dsall([True, False]) == False
assert dsall([True, True]) == True
assert dsall([False, False, False]) == False
assert dsall([False, False, True]) == False
assert dsall([False, True, False]) == False
assert dsall([False, True, True]) == False
assert dsall([True, False, False]) == False
assert dsall([True, False, True]) == False
assert dsall([True, True, False]) == False
assert dsall([True, True, True]) == True

Exercise - dsrev

Given a list, RETURN a NEW list reversed.

  • USE SimpleFP style with a divide and conquer strategy

  • DO NOT use reversed function

Show solution
[11]:

def dsrev(lst):
    raise Exception('TODO IMPLEMENT ME !')

result = dsrev([4,9,7,5,6])
print(result)

assert dsrev([]) == []
assert dsrev([3]) == [3]
assert dsrev([5,3]) == [3,5]
assert dsrev([6,6,2]) == [2,6,6]
assert dsrev([4,9,7,5,6]) == [6,5,7,9,4]

Exercise - dsrep

Given a list la, and a list lb of integer numbers \(\geq 0\), RETURN a NEW list where all elements of la are repeated the amount of times indicated in the corresponding cell of lb. A bit hard, but gives satisfaction.

  • assume la and lb have the same size

  • USE SimpleFP style with a divide and conquer strategy

  • DO NOT use * replication operator

  • HINT: since both lists have equal length, you can apply splitting logic to both

Show solution
[12]:


def dsrep(la, lb):
    raise Exception('TODO IMPLEMENT ME !')

result = dsrep(['a','b','c','d'], [2,5,0,3])
print(result) # ['a','a','b','b','b','b','b','d','d','d']

assert dsrep([], []) == []
assert dsrep(['a'], [1]) == ['a']
assert dsrep(['a'], [0]) == []
assert dsrep(['b'], [2]) == ['b','b']
assert dsrep(['a','b'], [1,1]) == ['a','b']
assert dsrep(['a','b'], [1,2]) == ['a','b','b']
assert dsrep(['a','b'], [2,1]) == ['a','a','b']
assert dsrep(['a','b'], [2,0]) == ['a','a']
assert dsrep(['a','b'], [0,2]) == ['b','b']
assert dsrep(['a','b','c'], [1,1,1]) == ['a','b','c']
assert dsrep(['a','b','c'], [0,1,1]) == ['b','c']
assert dsrep(['a','b','c'], [0,1,0]) == ['b']
assert dsrep(['a','b','c'], [3,1,2]) == ['a','a','a','b','c','c']
assert dsrep(['a','b','c','d'], [2,5,0,3]) == ['a','a','b','b','b','b','b','d','d','d']

Divide and conquer - Accumulator style

So far we’ve been creating lists at each call, thus wasting lots of memory and computation time.

Let’s first try using indeces with a simple example:

Example - dacount

Suppose given a number n, you want to build a list with first numbers from 1 to n without using range and using ModAcc style with a divide and conquer strategy

Notie a couple of things:

  1. to promote parallelism, ideally it should be possibile to execute left and right calls independently of their order: to allow for this you should first allocate all the needed space in the accumulator. Since for us it’s a special variable, you can preallocate space using the * operator before calling the helpers.

  2. since acc is created in the main function and keeps getting modified, we dropped the return statements in the helper

[13]:
def dacount_helper(i, j, acc):
    if i > j:
        pass
    elif i == j:
        acc[i - 1] = i
    else:
        k = (i + j) // 2
        dacount_helper(i, k, acc)    # no need to return
        dacount_helper(k+1, j, acc)  # no need to return

def dacount(n):
    acc = [0]*n  # let's create space to be filled in later
                 # (acc is special so let's allow us the freedom of using * operator)
    dacount_helper(1,n, acc)
    return acc

result = dacount(5)
print(result)
jupman.pytut()
[1, 2, 3, 4, 5]
[13]:

Example - dadouble

Let’s see now how we could use indeces and accumulators with our double function:

Take a lst of numbers and RETURN a NEW list with the numbers doubled, following the ModAcc style with a Divide and Conquer strategy

Again, notice we preallocated the accumulator in the main function and dropped the return instructions in the helper

[15]:
def dadouble_helper(lst, i,j,acc):
    if i >= j:    # note the >= in this case
        return acc
    else:
        k = (i + j) // 2
        acc[k] = lst[k] * 2
        dadouble_helper(lst, i, k, acc)
        dadouble_helper(lst, k+1, j, acc)
        return acc

def dadouble(lst):
    acc = [None]*len(lst)  # let's create space to be filled in later
                           # (acc is special so let's allow us the freedom of using * operator)
    dadouble_helper(lst, 0, len(lst), acc)
    return acc

result = dadouble([4,6,2])
print(result)

jupman.pytut()
[8, 12, 4]
[15]:

Exercise - debug dadouble

Add a level parameter and some debugging indented print to the dadouble function.

Show solution
[17]:


result = dadouble([4,6,2,7,3])
print("Final result=", result)
 i= 0 j= 5 acc= [None, None, None, None, None]
 left call
   i= 0 j= 2 acc= [None, None, 4, None, None]
   left call
     i= 0 j= 1 acc= [None, 12, 4, None, None]
     left call
       i= 0 j= 0 acc= [8, 12, 4, None, None]
     right call
       i= 1 j= 1 acc= [8, 12, 4, None, None]
   right call
     i= 2 j= 2 acc= [8, 12, 4, None, None]
 right call
   i= 3 j= 5 acc= [8, 12, 4, None, None]
   left call
     i= 3 j= 4 acc= [8, 12, 4, None, 6]
     left call
       i= 3 j= 3 acc= [8, 12, 4, 14, 6]
     right call
       i= 4 j= 4 acc= [8, 12, 4, 14, 6]
   right call
     i= 5 j= 5 acc= [8, 12, 4, 14, 6]
Final result= [8, 12, 4, 14, 6]

Exercise - dahist

Given a lst, RETURN a NEW dictionary with the counts of elements of lst

  • DO NOT use search methods (.index, .count, .remove…)

  • NOTE in this case we can start with an empty dictionary

Show solution
[19]:



def dahist(lst):
    raise Exception('TODO IMPLEMENT ME !')

result = dahist(['a','b','a','c','b','b','c','a','b','c','d'])
print(result)  # {'a':3,
               #  'b':4,
               #  'c':3,
               #  'd':1}

assert dahist([]) == {}
assert dahist(['a']) == {'a':1}
assert dahist(['a','a']) == {'a':2}
assert dahist(['a','b']) == {'a':1,
                            'b':1}
assert dahist(['b','c']) == {'c':1,
                            'b':1}
assert dahist(['a','c','a']) == {'a':2,
                                'c':1}
assert dahist(['c','c','a']) == {'a':1,
                                'c':2}
assert dahist(['a','b','a','c','b','b','c','a','b','c','d']) == {'a':3,
                                                                 'b':4,
                                                                 'c':3,
                                                                 'd':1}

TODO will add more

More exercises

Computer Science Circles: nice chapter about recursion with some exercises

W3 Resources: some recursion exercises

Edabit: Search by setting ‘Recursion’ as tag - notice many provided solutions are not actually in recursive format.

Geeks for geeks:

Continue

Go on with the challenges