The effect of direction on recursion and understanding code - Longest Common Subsequence

Today I’m going to discuss another fun and common challenge. It’s the Longest Common Subsequence a.k.a. LCS.

I’ll first focus on discussing a pain point I went through when I was trying to compare the algorithm I deduced on my own vs a few other algorithms I saw online. Our algorithms seemed very similar. Yet different. It made debugging my code based on other code very difficult. This is a very common problem I face when I doing leetcode. For whatever reason my way of coding is more than often to a good extent different from what I see from others. I find it easier to compare the high level of my code with a video from Youtube since most other submissions on Leetcode itself don’t contain a good breakdown pseudocode of what’s happening.

Let’s first explain the problem. Suppose we have: “dabc” and “aafb”. The longest common subsequence is “ab”. Other examples:

Text1	Text2	LCS
“abc”	“abc”	“abc”
“dab”	“db	“db”
“eaf”	“abeaf”	“eaf”
“kpm”	“qspzm”	“pm”

The way to do this is to have two pointers grid. One pointer for the current index you’re trying to match for text1, another for text2. Each location in the grid below corresponds to the two pointers. Example:

text1: kpm
text2: qspzm

Approach One

       0   1   2    
       k   p   m    
0  q   🏁  x   x  🔚 
1  s   x   x   x  🔚
2  p   x   x   x  🔚
3  z   x   x   x  🔚
4  m   x   x   🎯 🔚
       🔚 🔚  🔚

_{🏁: start

🎯: end

🔚: out of bounds. return 0}

Start from (0,0) and finish at bottom right corner.
return 0 for any value that’s beyond the right or top edges of the grid.

Approach Two

           0   1   2    
           k   p   m
           🔚  🔚  🔚    
0  q  🔚   🎯  x   x
1  s  🔚   x   x   x
2  p  🔚   x   x   x
3  z  🔚   x   x   x
4  m  🔚   x   x   🏁

Start from (2,4) and finish at top left corner.
return 0 for any value that’s beyond the left or bottom edges of the grid.

My confusion originated from the fact that I was coding from one approach, but trying to fix my code based on another approach — without realizing the approaches are different! 🙃🫤🫤😐🤔

Approach One Code

/// Starts from 0,0. Ends at m * n
/// Out of bounds would be the BOTTOM and RIGHT edges.
func longestCommonSubsequence(_ text1: String, _ text2: String) -> Int {
    var t1: [Character] = Array(text1)
    var t2: [Character] = Array(text2)

    func helper(s: Stage) -> Int {
        // 💡 doing an index check vs subscripting which can crash is a good middle ground. Index checking is a lot better than having to create a safe-subscript that returns an optional where you ultimately have to default its value to something.  
        guard s.x < t1.count && s.y < t2.count else {
            return 0
        }
        // 💡 Skip certain neighbors in the graph if items at the node match. 
        if t1[s.x] == t2[s.y] {
            return helper(s: Stage(s.x + 1, s.y + 1)) + 1
        }
        return max(helper(s: Stage(s.x + 1, s.y)), helper(s: Stage(s.x, s.y + 1)))
    }

    return helper(s: Stage(0, 0))
}

Helper

💡 Using this helper to build an easier mental model was crucial for me.

struct Stage: Hashable {
    var x: Int
    var y: Int
    
    init(_ x: Int, _ y: Int) {
        self.x = x
        self.y = y
    }
}

extension Array<Array<Int>> {
  subscript (stage: Stage) -> Int {
    get {
      self[stage.y][stage.x] // a good way to conceptualize is: `x` is as progress, `y` is as level. 
    }
    set {
      self[stage.y][stage.x] = newValue
    }
  }
}

Approach Two Code

/// Starts from m * n. Ends at 0 * 0
/// Out of bounds would be the TOP and LEFT edges. 
func longestCommonSubsequence(_ text1: String, _ text2: String) -> Int {
    var t1: [Character] = Array(text1)
    var t2: [Character] = Array(text2)
    
    func helper(_ s: Stage) -> Int {
        
        guard s.x >= 0 && s.y >= 0 else {
            return 0
        }

        if t1[s.x] == t2[s.y] {
            return helper(Stage(s.x - 1, s.y - 1)) + 1 
        } else {
            return max(helper(Stage(s.x - 1, s.y)), helper(Stage(s.x, s.y - 1)))
        }
    }
    return helper(Stage(t1.count - 1, t2.count - 1))
}

💡 When is a 2D Graph a good model for your challenge? What does the 2D grid model?

Any time you have two items with two sets of indexes and you have to compare them as you go, then a 2D graph becomes a very good candidate. The grid models the following:

It models a graph. Not a tree.
Every node in the graph, marks index combinations of the two strings.
You can start from any node.
- If you start from an edge node then you have to expand into only two direction.
- If you start from a side-node then you have to expand in three directions.
- If you start from a node that’s not on the edges, then if you have to expand in all four directions.
- Staring from a corner is just much more natural.

Your traversal can be to your adjacent neighbors or diagonal. In this challenge you will skip adjacent neighbors if certain conditions are met. That is if the values of the indexes in the strings match then move diagonally otherwise bifurcate into the two adjacent indexes of the grid.

Summary and some other confusions I had along the way

To figure out the issue with my code I thought I had look up ‘top down vs ‘bottom up’ & ‘memoization vs tabulation’. But while those are unrelated to figuring out this issue and might confuse you even more, unless fully understood, understanding their differences might make it easier for you to compare different approaches. Try not to spend too much time on the differences. Instead focus on learning how to break down your problem into a simple subproblem.

About Top down approach, Bard said:

Recursive algorithms are called top-down because they break down a problem into smaller and smaller sub problems until they reach a base case, which is a simple problem that can be solved directly. Once the base case is solved, the results are used to solve the larger sub problems, and so on, until the original problem is solved. This process is called recursion, and it is a powerful way to solve many different types of problems.

The term “top-down” is used to describe this approach because the algorithm starts at the top of the problem hierarchy and works its way down to the base cases. This is in contrast to a bottom-up approach, which starts at the base cases and works its way up to the top of the problem hierarchy.

Hence both approaches are top down. Only that their direction is different. The first time I heard the term ‘top-down’, I thought it had something to do with the fact that in tree traversals you start at the root and the root is always (in algorithm world, not real life) at the top of tree. But it has nothing to do with direction. It’s more about approach and if big sub problems depend on small sub problems vs if you’re building incrementally.

Optimizations

For both of these solutions you can do memoization.

memoization for when you start from (m,n). See this gist for its code.

Essentially where you start from will dictate the direction. The direction often happens to also dictate the most performant algorithm you can choose.

If you wanted to go for a tabulation approach, then it’s better if start from 0,0 and build from there.

tabulation for when you start from (0,0). See this gist for its code

With tabulation you compute sub problems in advance, with memoization you compute sub problems on demand.

Note on readability

The returning value for the two approaches are:

return helper(Stage(0, 0))
return helper(Stage(t1.count - 1, t2.count - 1))

Which one makes more sense?
The second approach can be slightly more readable in this case. Readers might be confused as to why you’re returning the value for Stage(0,0) and not the last. That said for me personally it was easier to write code that starts from (0,0) and expand.

Are you always able to write your from both directions?

Trees No. Graphs Yes.

To better understand this, what is the node root node below?

Click to see answer 👇

There isn’t any root. This isn’t a tree, it’s a graph and that twisted my perception for the longest time. Any moment that you have a left and right movement then it’s no longer a tree. With Trees you can only go up and down.

Cycles / multiple paths, break the tree property. A tree can’t have multiple ways to get to the same node — a graph can.

Any of the A,B,C,D nodes can be deemed as a good starting node of the graph. To be clear you could start from any node of the graph, but starting from either A or C seems the most natural / easiest.

If you pick A as your starting node, then C becomes the last visiting node of the graph.
If you pick C as your starting node, then A becomes the last visiting node of the graph.
If you pick B as your starting node, then D becomes the last visiting node of the graph.
If you pick D as your starting node, then B becomes the last visiting node of the graph.

If you add the edges it becomes more clear that it’s a graph.

“2D Graph with edges”

As humans we usually start from the origin node (0,0) but also less frequently from the destination node (m,n).

Acknowledgements

Special thanks to Tim Vermeulen for answering dozens of my questions so I was able to put this post together and Maks Verner for writing this post

Approach One#

Approach Two#

Approach One Code#

Helper#

Approach Two Code#

💡 When is a 2D Graph a good model for your challenge? What does the 2D grid model?#

Summary and some other confusions I had along the way#

Optimizations#

Note on readability#

Are you always able to write your from both directions?#

Acknowledgements#