[Dynamic Programming] Longest Repeating Subsequence - Medium

The Longest Repeating Subsequence (LRS) problem is a classic dynamic programming challenge. The goal is to find the longest subsequence that appears at least twice in a string. Unlike substrings, we get subsequences by removing some characters from the original string. We do this without changing the order of the remaining characters.

To solve this problem well, we can use a dynamic programming approach. This method helps us build a solution step by step. We will store results we find along the way. This way, we avoid doing the same calculations again. In the end, we will know the length of the longest repeating subsequence.

In this article, we will look closely at the Longest Repeating Subsequence problem. We will start with a clear explanation of the problem. Then, we will go through a complete dynamic programming approach. We will share Java, Python, and C++ code examples for real use.

We will also look at ways to save space, check the time it takes to run, point out common mistakes we can make, and answer some frequently asked questions. The sections we will cover include:

Dynamic Programming Longest Repeating Subsequence Problem Explained
Understanding the Problem Statement
Dynamic Programming Approach for Longest Repeating Subsequence
Java Implementation of Longest Repeating Subsequence
Python Code for Longest Repeating Subsequence
C++ Solution for Longest Repeating Subsequence
Optimizing Space Complexity in Longest Repeating Subsequence
Analyzing Time Complexity for Longest Repeating Subsequence
Common Mistakes in Longest Repeating Subsequence Implementation
Frequently Asked Questions

Understanding the Problem Statement

The Longest Repeating Subsequence (LRS) problem is about finding the longest subsequence in a string that shows up at least two times but does not overlap. This is different from the Longest Common Subsequence (LCS). In LCS, we compare two different sequences. But in LRS, we use the same sequence for both times. This means the characters must come from different places.

Problem Definition

We have a string X. Our job is to find the length of the longest subsequence Y such that:

Y is a subsequence of X
The occurrences of Y in X do not overlap

Example

For the string “AABEBCDD”, the longest repeating subsequence is “ABD”. Its length is 3.

The subsequence “ABD” shows up in the original string like this:
- First occurrence: A A B B C D D
- Second occurrence: A A B B C D D

Input and Output

Input: A string X
Output: An integer which tells the length of the longest repeating subsequence

Constraints

The input string can be up to 1000 characters long.
The characters can be any printable ASCII character.

We often solve this problem using dynamic programming. We make a 2D table to keep track of our results and build up to the final answer. The dynamic programming method looks at all pairs of indices in the string. It checks if characters match while making sure they are not from the same instance.

This LRS problem is similar to other dynamic programming problems, like the Longest Common Subsequence (LCS). For more details on dynamic programming techniques, you can visit the Dynamic Programming - Longest Common Subsequence.

Dynamic Programming Approach for Longest Repeating Subsequence

The Longest Repeating Subsequence (LRS) problem is about finding the longest part of a string that appears at least two times. We can have repeated characters but we cannot use the same character from the same position. For example, if we take the string “AABBC”, the longest repeating subsequence is “AB”.

To solve this problem, we can use a dynamic programming method. We will create a 2D table called dp. In this table, dp[i][j] will hold the length of the longest repeating subsequence for the first i characters of the string X and the first j characters of the same string X. We need to make sure that i and j do not point to the same position in the string.

Steps:

Initialization:
- First, we create a 2D array dp with size (n+1) x (n+1). Here, n is the length of the string.
- We set all the elements in this array to 0.
Filling the DP Table:
- Next, we loop through the string with two loops. For each pair of indices i and j:
  - If X[i-1] is equal to X[j-1] and i is not equal to j, we add 1 to dp[i][j] from dp[i-1][j-1].
  - If not, we take the bigger value between dp[i-1][j] and dp[i][j-1].
Result:
- Finally, the value at dp[n][n] tells us the length of the longest repeating subsequence.

Dynamic Programming Table Update:

int longestRepeatingSubsequence(String X) {
    int n = X.length();
    int[][] dp = new int[n + 1][n + 1];

    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= n; j++) {
            if (X.charAt(i - 1) == X.charAt(j - 1) && i != j) {
                dp[i][j] = dp[i - 1][j - 1] + 1;
            } else {
                dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
    }
    return dp[n][n];
}

This algorithm works in O(n^2) time and uses O(n^2) space. For better performance, check the next section about space complexity.

Java Implementation of Longest Repeating Subsequence

We can solve the Longest Repeating Subsequence (LRS) problem in Java using a simple dynamic programming method. We will create a 2D array that helps us store the lengths of the longest repeating subsequences at different points.

Here is the Java code for the Longest Repeating Subsequence:

public class LongestRepeatingSubsequence {
    public static int longestRepeatingSubseq(String str) {
        int n = str.length();
        int[][] dp = new int[n + 1][n + 1];

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= n; j++) {
                if (str.charAt(i - 1) == str.charAt(j - 1) && i != j) {
                    dp[i][j] = 1 + dp[i - 1][j - 1];
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[n][n];
    }

    public static void main(String[] args) {
        String str = "AABEBCDD";
        System.out.println("The length of the longest repeating subsequence is: " + longestRepeatingSubseq(str));
    }
}

Explanation of the Code:

Input: We pass the input string to the longestRepeatingSubseq method.
DP Table Initialization: We create a 2D array dp of size (n+1) x (n+1). Here n is the length of the input string.
Filling the DP Table:
- If the characters at positions i-1 and j-1 are the same and the indices are not the same, we add one to the count from the diagonal (dp[i-1][j-1]).
- If not, we take the maximum from either the left (dp[i][j-1]) or from above (dp[i-1][j]).
Output: The value in dp[n][n] gives us the length of the longest repeating subsequence.

This way, the program runs with O(n^2) time and uses O(n^2) space. It is good for moderate input sizes. If you want to see more dynamic programming problems, you can check Dynamic Programming - Longest Common Subsequence.

Python Code for Longest Repeating Subsequence

We can solve the Longest Repeating Subsequence problem using dynamic programming in Python. We will write a function that uses a 2D array. This array will help us store the lengths of the longest repeating subsequences for different parts of the input string.

Here is a simple code to do it:

def longest_repeating_subsequence(seq):
    n = len(seq)
    # Create a 2D array to store lengths of longest repeating subsequence
    dp = [[0] * (n + 1) for _ in range(n + 1)]

    # Build the dp array
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            # If characters match and indexes are not the same
            if seq[i - 1] == seq[j - 1] and i != j:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # The length of the longest repeating subsequence
    return dp[n][n]

# Example usage
sequence = "AABEBCDD"
print("Length of Longest Repeating Subsequence is", longest_repeating_subsequence(sequence))

Explanation of the Code:

Input: We take a string seq to find the longest repeating subsequence.
DP Table: We make a 2D list dp. This list has size (n+1)x(n+1) to hold lengths based on subsequences.
Nested Loop: We go through each character of the string. We check if characters match and update the dp table when needed.
Result: The value at dp[n][n] tells us the length of the longest repeating subsequence.

This method has a time complexity of O(n^2) and space complexity of O(n^2).

If we want to learn more about dynamic programming, we can read other articles like Dynamic Programming: Longest Common Subsequence and Dynamic Programming: Longest Increasing Subsequence.

C++ Solution for Longest Repeating Subsequence

The Longest Repeating Subsequence (LRS) problem is to find the longest part of a string that shows up at least two times. But we have to make sure that the two parts do not come from the same place in the original string.

C++ Implementation

Here is a simple C++ solution using dynamic programming:

#include <iostream>
#include <vector>
#include <string>

using namespace std;

int longestRepeatingSubsequence(string str) {
    int n = str.length();
    vector<vector<int>> dp(n + 1, vector<int>(n + 1, 0));

    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= n; j++) {
            if (str[i - 1] == str[j - 1] && i != j) {
                dp[i][j] = 1 + dp[i - 1][j - 1];
            } else {
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
    }
    return dp[n][n];
}

int main() {
    string str = "aabb";
    cout << "Length of Longest Repeating Subsequence: " << longestRepeatingSubsequence(str) << endl;
    return 0;
}

Explanation of the Code

Dynamic Programming Table Initialization: We create a 2D vector dp that has size (n+1) x (n+1). We set all values to zero. Here n is the length of the input string.
Filling the DP Table:
- If the characters match and their positions are not the same, we add one to the value from the diagonal cell dp[i-1][j-1].
- If they do not match, we take the bigger value from either the left cell or the cell above.
Result: The length of the longest repeating subsequence will be in dp[n][n].

Complexity Analysis

Time Complexity: O(n^2). This is because we have nested loops, where n is the length of the input string.
Space Complexity: O(n^2). This is for keeping the DP table.

This C++ solution works well to find the length of the longest repeating subsequence. It works best for strings that are not too long. To save more space, we can use a rolling array technique to reduce the space needed to O(n).

For more problems on dynamic programming, check this Dynamic Programming - Longest Common Subsequence.

Optimizing Space Complexity in Longest Repeating Subsequence

When we work on the Longest Repeating Subsequence (LRS) problem using Dynamic Programming, space complexity can be a problem. This is true especially for bigger strings. The usual way uses a 2D array to keep track of the lengths of subsequences. This leads to a space complexity of O(n^2), where n is the length of the input string. But we can make this better to O(n) space by using a 1D array.

Space Optimization Technique

Observation: The current state of the DP array only needs the previous row. So, instead of keeping a full 2D table, we can just store the current and previous rows.
Implementation Steps:
- First, we create two 1D arrays called current and previous, each of size n+1 for the lengths of subsequences.
- Next, we go through the string and update the current array using the values from the previous array.
- Once we finish processing, we swap the references of current and previous.

Here is the optimized Java code:

public class LongestRepeatingSubsequence {
    public static int longestRepeatingSubsequence(String str) {
        int n = str.length();
        int[] previous = new int[n + 1];
        int[] current = new int[n + 1];
        
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= n; j++) {
                if (str.charAt(i - 1) == str.charAt(j - 1) && i != j) {
                    current[j] = previous[j - 1] + 1;
                } else {
                    current[j] = Math.max(previous[j], current[j - 1]);
                }
            }
            // Swap references for the next iteration
            int[] temp = previous;
            previous = current;
            current = temp;
        }
        return previous[n];
    }

    public static void main(String[] args) {
        String str = "AABEBCDD";
        System.out.println("Length of Longest Repeating Subsequence is " + longestRepeatingSubsequence(str));
    }
}

Key Benefits

Reduced Space Complexity: We lower the space complexity from O(n^2) to O(n).
Maintains Time Complexity: The time complexity stays O(n^2) since we still check each character pair.

By using this space optimization method, we can solve the Longest Repeating Subsequence problem without using too much memory. If you want to learn more about dynamic programming, you can check out related problems like Dynamic Programming - Longest Common Subsequence.

Analyzing Time Complexity for Longest Repeating Subsequence

We can find the time complexity of the Longest Repeating Subsequence (LRS) problem using dynamic programming. We solve the LRS problem with a 2D table. The size of the table depends on the length of the input string.

Time Complexity Analysis

Dynamic Programming Table Construction:
- We make a 2D array called dp. Its size is (n+1) x (n+1), where n is the length of the input string.
- We need to look at each character of the input string with itself. This makes a loop inside a loop.
Iterations:
- The first loop goes for n times (for each character in the string).
- The second loop also goes for n times.
- So, the total number of loops is O(n^2).
Final Complexity:
- Each time we do something in the loops (like checking characters and changing the dp table) takes a constant time of O(1).
- So, the total time complexity for the Longest Repeating Subsequence is: [ O(n^2) ]

Example

For a string s = "AABBC", we fill the dynamic programming table by comparing the characters of the string. This gives us a time complexity of O(n^2) as we said before.

If we want to learn more about dynamic programming, we can check out articles like Dynamic Programming: Longest Common Subsequence and Dynamic Programming: Fibonacci Number.

Common Mistakes in Longest Repeating Subsequence Implementation

When we implement the Longest Repeating Subsequence (LRS) with dynamic programming, we can make some common mistakes. It is important to see and avoid these mistakes for a good solution.

Incorrect Initialization of DP Array:
We need to make sure the DP table starts right. The first row and first column should be zero. This shows subsequences of length zero.
```
int[][] dp = new int[m + 1][n + 1]; // m and n are lengths of the input strings
```
Misunderstanding the Problem Definition:
The LRS is not the same as the Longest Common Subsequence (LCS). The LRS counts the same character only once in each subsequence. We must check that indices are not the same when we compare.
```
if (str1.charAt(i - 1) == str2.charAt(j - 1) && i != j) {
    dp[i][j] = dp[i - 1][j - 1] + 1;
}
```
Improper Transition Logic:
If we do not write the transition logic correctly, it can give wrong values in the DP table. The transition must handle both cases: when characters match and when they do not.
```
else {
    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
}
```
Boundary Conditions:
We must pay attention to the array boundaries during looping. Small mistakes can lead to errors or bad results.
Not Returning the Correct Result:
We find the final result for the LRS in dp[m][n]. We must return this value correctly after we fill the DP table.
Space Optimization Neglect:
If space usage is important, we can use a rolling array. This reduces space from O(m*n) to O(n). We can do this by only saving the current and previous rows of the DP table.
```
int[] prev = new int[n + 1];
int[] curr = new int[n + 1];
```
Failure to Handle Edge Cases:
We should always test our code with edge cases. This means checking empty strings or strings with no repeating characters. This helps us make sure our code is strong.
Ignoring Time Complexity:
The usual time complexity for LRS is O(m*n). We must make sure our code follows this rule. We should not add extra loops or unneeded calculations.

By knowing these common mistakes, we can implement the Longest Repeating Subsequence better. This helps both our accuracy and performance in dynamic programming. For more knowledge on dynamic programming, check out this Dynamic Programming article.

Frequently Asked Questions

What is the Longest Repeating Subsequence?

The Longest Repeating Subsequence (LRS) problem is about finding the longest part in a string that appears at least two times and does not overlap. We can solve this problem well using dynamic programming. We make a table to keep track of the lengths of the longest repeating subsequences. It is important to know how we can solve this problem to get better at dynamic programming.

How is Dynamic Programming Used in Longest Repeating Subsequence?

Dynamic Programming (DP) plays a key role in solving the Longest Repeating Subsequence problem. We create a 2D array. Each cell (i, j) shows the length of the longest repeating subsequence we have found. We compare characters and update the DP table based on what we found before. This way, we can find the final answer in a good way. This method helps us avoid doing the same calculations again, so it is faster.

What are Common Mistakes Made in LRS Implementation?

When we implement the Longest Repeating Subsequence, we often make some mistakes. One common mistake is not setting up the DP table correctly. We might forget to think about non-overlapping subsequences. Another mistake is not understanding when to update the DP table. This can give us wrong results. We need to pay close attention to these points to make sure our implementation works well.

Can the Longest Repeating Subsequence be Optimized for Space Complexity?

Yes, we can make the space used by the Longest Repeating Subsequence algorithm better. Instead of using a 2D DP array, we can use two 1D arrays. These will hold the current and previous states of our calculations. This way, we lower the space complexity from O(n^2) to O(n). It makes our solution faster while keeping the same time complexity.

What is the Time Complexity of the Longest Repeating Subsequence Algorithm?

The time complexity for the Longest Repeating Subsequence algorithm with dynamic programming is O(n^2). Here, n is the length of the input string. This complexity comes from filling an n x n DP table. We compare each character in the string to every other character. It is important to understand this complexity to see how efficient our implementation is.

For more insight into dynamic programming concepts, check out related articles like Dynamic Programming: Longest Common Subsequence and Dynamic Programming: Longest Increasing Subsequence.

Understanding the Problem Statement

Problem Definition

Example

Input and Output

Constraints

Dynamic Programming Approach for Longest Repeating Subsequence

Steps:

Dynamic Programming Table Update:

Java Implementation of Longest Repeating Subsequence

Explanation of the Code:

Python Code for Longest Repeating Subsequence

Explanation of the Code:

C++ Solution for Longest Repeating Subsequence

C++ Implementation

Explanation of the Code

Complexity Analysis

Optimizing Space Complexity in Longest Repeating Subsequence

Space Optimization Technique

Key Benefits

Analyzing Time Complexity for Longest Repeating Subsequence

Time Complexity Analysis

Example

Related Articles

Common Mistakes in Longest Repeating Subsequence Implementation

Frequently Asked Questions

What is the Longest Repeating Subsequence?

How is Dynamic Programming Used in Longest Repeating Subsequence?

What are Common Mistakes Made in LRS Implementation?

Can the Longest Repeating Subsequence be Optimized for Space Complexity?

What is the Time Complexity of the Longest Repeating Subsequence Algorithm?