[Dynamic Programming] Longest Common Subsequence of Three Strings - Hard

The longest common subsequence (LCS) problem for three strings is a well-known problem in computer science. It is especially important in dynamic programming. This problem asks us to find the longest sequence that appears in the same order in all three strings. The sequence does not need to be next to itself.

To solve this problem, we use dynamic programming. We create a three-dimensional table. This table keeps track of the lengths of the longest common subsequences for each pair of positions in the three strings. This method helps us find the best solution.

In this article, we will explore the LCS problem in detail. We will look at the problem statement and the dynamic programming method to find the longest common subsequence of three strings. We will share code examples in Java, Python, and C++. We will also discuss ways to reduce space use.

We will point out common mistakes when we implement the algorithm. We will also compare different methods for solving the LCS problem. Lastly, we will answer frequently asked questions to help clear up any confusion.

  • Dynamically Finding Longest Common Subsequence of Three Strings - Hard
  • Understanding the Problem Statement of Longest Common Subsequence
  • Dynamic Programming Approach to Longest Common Subsequence of Three Strings
  • Java Implementation of Longest Common Subsequence of Three Strings
  • Python Implementation of Longest Common Subsequence of Three Strings
  • C++ Implementation of Longest Common Subsequence of Three Strings
  • Optimizing Space Complexity for Longest Common Subsequence
  • Common Pitfalls in Implementing Longest Common Subsequence
  • Comparative Analysis of Different Approaches to Longest Common Subsequence
  • Frequently Asked Questions

Understanding the Problem Statement of Longest Common Subsequence

The Longest Common Subsequence (LCS) problem is about finding the longest subsequence that is common in three input strings. Substrings are different from subsequences. We can get subsequences by removing some characters from the original string. We do not change the order of the remaining characters.

Problem Definition

We have three strings, X, Y, and Z. Our job is to find the length of the longest subsequence that is in all three strings. The subsequence does not have to be next to each other.

Example

Let’s look at these three strings: - X = "AGGTAB" - Y = "GXTXAYB" - Z = "AGXGTAB"

The longest common subsequence here is "GTAB", and it has a length of 4.

Input/Output Format

  • Input: Three strings X, Y, Z.
  • Output: A number that shows the length of the longest common subsequence.

Constraints

  • Each string can be up to 100 characters long.
  • Strings can have both uppercase and lowercase letters.

This problem is a good example of dynamic programming. We can build our solution step by step. We use results we already calculated to find the final answer more easily.

When we understand this problem, we can create a smart dynamic programming solution to find the longest common subsequence of three strings.

Dynamic Programming Approach to Longest Common Subsequence of Three Strings

The Longest Common Subsequence (LCS) problem can also work with three strings. Our goal is to find the longest subsequence that is present in all three strings. A subsequence means a sequence that appears in the same order but does not need to be next to each other.

Problem Definition

We have three strings: X, Y, and Z. Our task is to find the length of the longest subsequence that is common to all three.

Dynamic Programming Solution

To solve this problem, we will use a 3D array dp. Here, dp[i][j][k] shows the length of the longest common subsequence of substrings X[0...i-1], Y[0...j-1], and Z[0...k-1].

Recurrence Relation

The rule for finding the longest common subsequence is like this:

  • If X[i-1], Y[j-1], and Z[k-1] are the same: [ dp[i][j][k] = dp[i-1][j-1][k-1] + 1 ]

  • If they are not the same: [ dp[i][j][k] = (dp[i-1][j][k], dp[i][j-1][k], dp[i][j][k-1]) ]

Base Case

For the base case, we set dp[i][0][0], dp[0][j][0], and dp[0][0][k] to 0 for all i, j, and k.

Time and Space Complexity

The time it takes for this method is (O(n m p)). Here, (n), (m), and (p) are the lengths of strings X, Y, and Z. The space needed is also (O(n m p)).

Implementation

Now we show how to write the LCS of three strings in Java, Python, and C++.

Java Implementation

public class LCS3 {
    public int longestCommonSubsequence(String X, String Y, String Z) {
        int n = X.length(), m = Y.length(), p = Z.length();
        int[][][] dp = new int[n + 1][m + 1][p + 1];

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                for (int k = 1; k <= p; k++) {
                    if (X.charAt(i - 1) == Y.charAt(j - 1) && Y.charAt(j - 1) == Z.charAt(k - 1)) {
                        dp[i][j][k] = dp[i - 1][j - 1][k - 1] + 1;
                    } else {
                        dp[i][j][k] = Math.max(dp[i - 1][j][k], Math.max(dp[i][j - 1][k], dp[i][j][k - 1]));
                    }
                }
            }
        }
        return dp[n][m][p];
    }
}

Python Implementation

def longest_common_subsequence(X, Y, Z):
    n, m, p = len(X), len(Y), len(Z)
    dp = [[[0] * (p + 1) for _ in range(m + 1)] for _ in range(n + 1)]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            for k in range(1, p + 1):
                if X[i - 1] == Y[j - 1] == Z[k - 1]:
                    dp[i][j][k] = dp[i - 1][j - 1][k - 1] + 1
                else:
                    dp[i][j][k] = max(dp[i - 1][j][k], dp[i][j - 1][k], dp[i][j][k - 1])

    return dp[n][m][p]

C++ Implementation

#include <vector>
#include <string>
using namespace std;

int longestCommonSubsequence(string X, string Y, string Z) {
    int n = X.size(), m = Y.size(), p = Z.size();
    vector<vector<vector<int>>> dp(n + 1, vector<vector<int>>(m + 1, vector<int>(p + 1, 0)));

    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= m; ++j) {
            for (int k = 1; k <= p; ++k) {
                if (X[i - 1] == Y[j - 1] && Y[j - 1] == Z[k - 1]) {
                    dp[i][j][k] = dp[i - 1][j - 1][k - 1] + 1;
                } else {
                    dp[i][j][k] = max(dp[i - 1][j][k], max(dp[i][j - 1][k], dp[i][j][k - 1]));
                }
            }
        }
    }
    return dp[n][m][p];
}

This dynamic programming way helps us find the length of the longest common subsequence for three strings. It gives us a strong solution to this problem. For more about dynamic programming, we can read Dynamic Programming - Longest Common Subsequence.

Java Implementation of Longest Common Subsequence of Three Strings

We will show how to find the Longest Common Subsequence (LCS) of three strings using Java. We use dynamic programming for this. We create a 3D array to keep the lengths of the longest common subsequences for parts of the three input strings.

Steps:

  1. First, we find the lengths of the three strings.
  2. Then, we create a 3D array called dp. Here, dp[i][j][k] holds the length of the LCS of parts str1[0...i-1], str2[0...j-1], and str3[0...k-1].
  3. Next, we use loops to fill the dp array based on some rules:
    • If the characters match (str1[i-1] == str2[j-1] == str3[k-1]), we do:

      dp[i][j][k] = dp[i-1][j-1][k-1] + 1;
    • If they do not match, we take the biggest value from the options:

      dp[i][j][k] = max(dp[i-1][j][k], dp[i][j-1][k], dp[i][j][k-1]);

Java Code:

public class LongestCommonSubsequenceThreeStrings {
    public static int lcsOfThree(String str1, String str2, String str3) {
        int l1 = str1.length();
        int l2 = str2.length();
        int l3 = str3.length();
        
        int[][][] dp = new int[l1 + 1][l2 + 1][l3 + 1];

        for (int i = 1; i <= l1; i++) {
            for (int j = 1; j <= l2; j++) {
                for (int k = 1; k <= l3; k++) {
                    if (str1.charAt(i - 1) == str2.charAt(j - 1) && str1.charAt(i - 1) == str3.charAt(k - 1)) {
                        dp[i][j][k] = dp[i - 1][j - 1][k - 1] + 1;
                    } else {
                        dp[i][j][k] = Math.max(dp[i - 1][j][k], Math.max(dp[i][j - 1][k], dp[i][j][k - 1]));
                    }
                }
            }
        }
        return dp[l1][l2][l3];
    }

    public static void main(String[] args) {
        String str1 = "abc";
        String str2 = "ac";
        String str3 = "abc";
        System.out.println("Length of LCS is " + lcsOfThree(str1, str2, str3));
    }
}

Explanation of the Code:

We start with the lcsOfThree method. It makes a 3D array called dp to store the lengths of the common subsequences.

The nested loops go through the lengths of the input strings. They fill the dp array based on the matching rules we talked about.

At the end, the method gives back the value at dp[l1][l2][l3]. This value shows the length of the longest common subsequence of the three strings.

This way, we compute the LCS of three strings. This method works well with a time complexity of O(n * m * p). Here, n, m, and p are the lengths of the three strings.

Python Implementation of Longest Common Subsequence of Three Strings

We can implement the Longest Common Subsequence (LCS) of three strings using dynamic programming in Python. We will use a 3D list to keep track of the lengths of the longest common subsequence for each index of the three strings.

Here is a simple implementation:

def longest_common_subsequence(str1, str2, str3):
    len1, len2, len3 = len(str1), len(str2), len(str3)
    
    # Create a 3D DP array initialized to 0
    dp = [[[0] * (len3 + 1) for _ in range(len2 + 1)] for __ in range(len1 + 1)]

    for i in range(1, len1 + 1):
        for j in range(1, len2 + 1):
            for k in range(1, len3 + 1):
                if str1[i - 1] == str2[j - 1] == str3[k - 1]:
                    dp[i][j][k] = dp[i - 1][j - 1][k - 1] + 1
                else:
                    dp[i][j][k] = max(dp[i - 1][j][k], 
                                      dp[i][j - 1][k], 
                                      dp[i][j][k - 1])

    return dp[len1][len2][len3]

# Example usage
str1 = "abcde"
str2 = "ace"
str3 = "abc"
print("Length of LCS is:", longest_common_subsequence(str1, str2, str3))

Explanation:

  • 3D DP Array: We use dp[i][j][k] to show the length of the LCS for the first i characters of str1, the first j characters of str2, and the first k characters of str3.
  • Base Case: If any string is empty (when i, j, or k is 0), the LCS length is 0.
  • Recurrence Relation:
    • If the current characters from all three strings are the same, we add 1 to the count from the previous indices.
    • If not, we find the maximum value from the previous states by leaving out one of the strings.

This implementation can find the length of the longest common subsequence of three strings using dynamic programming. For more details on advanced dynamic programming methods, you can check this related article.

C++ Implementation of Longest Common Subsequence of Three Strings

To find the longest common subsequence (LCS) of three strings in C++, we use a dynamic programming method. We create a 3D table called dp. Here, dp[i][j][k] shows the length of the LCS of the substrings str1[0:i-1], str2[0:j-1], and str3[0:k-1].

C++ Code Implementation

#include <iostream>
#include <vector>
#include <string>
using namespace std;

int longestCommonSubsequence(string str1, string str2, string str3) {
    int len1 = str1.size();
    int len2 = str2.size();
    int len3 = str3.size();
    
    vector<vector<vector<int>>> dp(len1 + 1, vector<vector<int>>(len2 + 1, vector<int>(len3 + 1, 0)));
    
    for (int i = 1; i <= len1; i++) {
        for (int j = 1; j <= len2; j++) {
            for (int k = 1; k <= len3; k++) {
                if (str1[i - 1] == str2[j - 1] && str2[j - 1] == str3[k - 1]) {
                    dp[i][j][k] = dp[i - 1][j - 1][k - 1] + 1;
                } else {
                    dp[i][j][k] = max({dp[i - 1][j][k], dp[i][j - 1][k], dp[i][j][k - 1]});
                }
            }
        }
    }
    
    return dp[len1][len2][len3];
}

int main() {
    string str1 = "abcde";
    string str2 = "ace";
    string str3 = "bce";
    
    int result = longestCommonSubsequence(str1, str2, str3);
    cout << "Length of Longest Common Subsequence: " << result << endl;
    
    return 0;
}

Explanation of the Code

We start by making a 3D vector called dp. It has size (len1 + 1) x (len2 + 1) x (len3 + 1). This will hold our results.

Then we loop through each character of the three strings. If the characters at the current position are the same, we add one to the LCS length from the previous positions. If they are not the same, we choose the biggest value from the three cases where we leave out one string at a time.

At the end, the value in dp[len1][len2][len3] gives us the length of the longest common subsequence.

Complexity

  • Time Complexity: O(n * m * p). Here, n, m, and p are the lengths of the three strings.
  • Space Complexity: O(n * m * p) for the DP table.

This C++ code shows how to find the longest common subsequence of three strings using dynamic programming. For more details about dynamic programming, you can check articles like Dynamic Programming Fibonacci Number or Dynamic Programming Edit Distance.

Optimizing Space Complexity for Longest Common Subsequence

When we implement the Longest Common Subsequence (LCS) algorithm for three strings, space complexity can be a big issue. This is true, especially when input sizes are large. The simple dynamic programming method uses a 3D array to store results for each substring combination. This can use a lot of memory. To make space usage better, we can remember that we only need to keep track of the current and previous rows in our DP table.

Space Optimization Technique

  1. Reduced DP Table: Instead of keeping a full 3D array, we can use two 2D arrays or even better, one 2D array and one 1D array to store the results. This really helps to lower memory use.

  2. Iterative Calculation: By going through the strings and updating our tables as we go, we can make our algorithm better while still getting the right results.

Implementation Example

Here is how we can write the optimized LCS algorithm for three strings in Python:

def lcs_three_strings(s1, s2, s3):
    l1, l2, l3 = len(s1), len(s2), len(s3)
    dp = [[[0] * (l3 + 1) for _ in range(l2 + 1)] for __ in range(2)]

    for i in range(1, l1 + 1):
        for j in range(1, l2 + 1):
            for k in range(1, l3 + 1):
                if s1[i - 1] == s2[j - 1] == s3[k - 1]:
                    dp[i % 2][j][k] = dp[(i - 1) % 2][j - 1][k - 1] + 1
                else:
                    dp[i % 2][j][k] = max(dp[(i - 1) % 2][j][k], 
                                          dp[i % 2][j - 1][k], 
                                          dp[i % 2][j][k - 1])
    
    return dp[l1 % 2][l2][l3]

# Example usage
s1 = "abcde"
s2 = "ace"
s3 = "abc"
print(lcs_three_strings(s1, s2, s3))  # Output: 2

Key Points

  • Memory Efficiency: We reduce space complexity from O(n * m * p) to O(min(n, m, p) * m * p) by only keeping the rows we need.
  • Time Complexity: The time complexity stays O(n * m * p). Here, n, m, and p are the lengths of the three strings.
  • Practical Usage: This way is very helpful in competitive programming and cases where memory use is very important.

For more reading on similar dynamic programming methods, we can check out Dynamic Programming: Edit Distance and Dynamic Programming: Longest Common Subsequence.

Common Pitfalls in Implementing Longest Common Subsequence

When we implement the Longest Common Subsequence (LCS) algorithm for three strings, we can face some common mistakes. These mistakes can cause wrong results or slow performance. Here are some important things to be careful about:

  1. Incorrect Base Case Handling:
    • We need to handle the base cases correctly. For example, if one of the strings is empty, the length of the LCS should be zero.
    if (i == 0 || j == 0 || k == 0) {
        return 0; // LCS length is 0
    }
  2. State Definition Errors:
    • We must define the states in our dynamic programming table properly. For three strings X, Y, and Z, we should use dp[i][j][k]. This shows the LCS length for the first i characters of X, j characters of Y, and k characters of Z.
  3. Overlooking Character Comparison:
    • We should compare characters correctly. It’s easy to forget to compare the characters at indices i-1, j-1, and k-1 right.
    if (X.charAt(i - 1) == Y.charAt(j - 1) && Y.charAt(j - 1) == Z.charAt(k - 1)) {
        dp[i][j][k] = 1 + dp[i - 1][j - 1][k - 1];
    } else {
        dp[i][j][k] = Math.max(dp[i - 1][j][k], Math.max(dp[i][j - 1][k], dp[i][j][k - 1]));
    }
  4. Memory Management Issues:
    • When we work with big strings, memory use can grow a lot. We should be careful about how much space our solution uses. We can use better methods to save memory, like keeping only the last two rows of the DP table if the strings are not changed.
  5. Not Accounting for All Characters:
    • We need to check that we consider all characters from all strings in our comparison. It is easy to miss some characters or think we know what the strings contain.
  6. Edge Case Oversights:
    • We should pay attention to edge cases. This includes strings with special characters, numbers, or spaces. We must make sure our code handles these cases well.
  7. Inefficient Backtracking:
    • When we rebuild the LCS, we should make the backtracking process fast. We should avoid unnecessary checks that can slow down our code.
  8. Testing with Diverse Inputs:
    • We should test our code with different inputs. This includes small, large, and edge cases. This helps make sure our code works well. If we do not test enough, we can have hidden bugs that show up only in certain cases.

By knowing these common mistakes in implementing the Longest Common Subsequence algorithm for three strings, we can improve the accuracy and speed of our solution. If we want to read more about dynamic programming, we can check out Dynamic Programming: Longest Common Subsequence.

Comparative Analysis of Different Approaches to Longest Common Subsequence

The Longest Common Subsequence (LCS) problem has different ways to solve it. Each way has its own features for time complexity, space complexity, and how easy it is to use. Here, we look at three main methods: Recursive, Dynamic Programming (DP), and Optimized Space Dynamic Programming.

1. Recursive Approach

  • Time Complexity: O(2^n) where n is the length of the shortest string.
  • Space Complexity: O(n) because of the call stack.
  • Description: We define the problem using recursion. If the last characters of the three strings match, we find the LCS of the remaining parts. If they do not match, we find the maximum LCS by skipping one string at a time.
public int LCSRecursive(String a, String b, String c, int i, int j, int k) {
    if (i == 0 || j == 0 || k == 0) return 0;
    if (a.charAt(i - 1) == b.charAt(j - 1) && b.charAt(j - 1) == c.charAt(k - 1)) {
        return 1 + LCSRecursive(a, b, c, i - 1, j - 1, k - 1);
    } else {
        return Math.max(Math.max(LCSRecursive(a, b, c, i - 1, j, k), 
                                  LCSRecursive(a, b, c, i, j - 1, k)), 
                         LCSRecursive(a, b, c, i, j, k - 1));
    }
}

2. Dynamic Programming Approach

  • Time Complexity: O(n * m * p), where n, m, and p are the lengths of the three strings.
  • Space Complexity: O(n * m * p) for the DP table.
  • Description: We use a 3D array to save the length of LCS for smaller problems. We build the solution step by step, checking for matching characters and updating the table.
public int LCSDynamicProgramming(String a, String b, String c) {
    int n = a.length(), m = b.length(), p = c.length();
    int[][][] dp = new int[n + 1][m + 1][p + 1];

    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= m; j++) {
            for (int k = 1; k <= p; k++) {
                if (a.charAt(i - 1) == b.charAt(j - 1) && b.charAt(j - 1) == c.charAt(k - 1)) {
                    dp[i][j][k] = 1 + dp[i - 1][j - 1][k - 1];
                } else {
                    dp[i][j][k] = Math.max(Math.max(dp[i - 1][j][k], dp[i][j - 1][k]), dp[i][j][k - 1]);
                }
            }
        }
    }
    return dp[n][m][p];
}

3. Optimized Space Dynamic Programming Approach

  • Time Complexity: O(n * m * p).
  • Space Complexity: O(m * p) using two 2D arrays for current and previous results.
  • Description: We use two 2D arrays instead of a 3D array. This way, we keep track of the last row we computed. This helps us save space a lot.
public int LCSOptimized(String a, String b, String c) {
    int n = a.length(), m = b.length(), p = c.length();
    int[][] dp = new int[m + 1][p + 1];
    int[][] current = new int[m + 1][p + 1];

    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= m; j++) {
            for (int k = 1; k <= p; k++) {
                if (a.charAt(i - 1) == b.charAt(j - 1) && b.charAt(j - 1) == c.charAt(k - 1)) {
                    current[j][k] = 1 + dp[j - 1][k - 1];
                } else {
                    current[j][k] = Math.max(Math.max(dp[j][k], current[j - 1][k]), current[j][k - 1]);
                }
            }
        }
        // Swap arrays
        int[][] temp = dp; 
        dp = current; 
        current = temp;
    }
    return dp[m][p];
}

Summary of Comparisons

  • Recursive Approach: It is easy to understand. But it is not good for big strings because of the high time complexity.
  • Dynamic Programming: It is better than recursion and works well for medium-sized strings.
  • Optimized Space DP: It keeps the same speed as DP but uses less memory. This makes it good for bigger input sizes.

For more information on dynamic programming methods, we can check out Dynamic Programming - Fibonacci Number and Dynamic Programming - Longest Common Subsequence.

Frequently Asked Questions

1. What is the Longest Common Subsequence (LCS) of three strings?

The Longest Common Subsequence (LCS) of three strings is a well-known problem in dynamic programming. We try to find the longest sequence that shows up in all three strings in the same order. The sequence does not have to be next to each other. This problem is harder than finding the LCS of two strings. It needs a three-dimensional table to solve it well.

2. How can I optimize the space complexity for the LCS of three strings?

To make the space usage better for the LCS of three strings, we can use a rolling array method. Instead of using a big 3D array, we can use two 2D arrays. We switch between these arrays while we calculate the LCS. This way, we save memory and still find the LCS in a good way.

3. What are common pitfalls when implementing LCS for three strings?

Some common mistakes when we implement the LCS for three strings are wrong index handling and not setting up the dynamic programming table right. We also may forget edge cases like empty strings. It is very important to take care of the transitions in the DP table and check that we cover all parts to avoid mistakes.

4. Can I use memoization for solving the LCS problem of three strings?

Yes, we can use memoization to solve the LCS problem for three strings. By saving the results of smaller problems, we can avoid doing the same work again. This makes our solution faster. This way works well when the strings are long and have some similar parts.

5. How does the LCS of three strings compare to two strings in terms of complexity?

Finding the LCS of three strings is more complex than doing it for two strings. The time it takes usually goes up from O(mn) for two strings to O(mn*p) for three strings. Here, m, n, and p are the lengths of the three strings. This increase in complexity means we need better methods to handle larger inputs.

For more information on dynamic programming techniques, we can read articles about Fibonacci numbers, climbing stairs problem, and more. You can find them in Best Online Tutorial’s dynamic programming section.