Longest Common Subsequence (LCS)

Last Updated : 02 Dec, 2024
Given two strings, s1 and s2, the task is to find the length of their Longest Common Subsequence (LCS). If there is no common subsequence, return 0.
A subsequence is a string generated from the original string by deleting 0 or more characters without changing the relative order of the remaining characters. For example, the subsequences of “ABC” are “”, “A”, “B”, “C”, “AB”, “AC”, “BC” and “ABC”.
In general, a string of length n has 2^n subsequences, since each character is either kept or deleted (some of them coincide when the string has repeated characters).
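
To make the definition concrete, here is a minimal Python sketch that enumerates subsequences by choosing which positions to keep. The function name `all_subsequences` is an illustrative choice, not part of the original article.

```python
from itertools import combinations


def all_subsequences(s):
    """Return every subsequence of s, one per choice of kept positions."""
    result = []
    # Choose k positions to keep, for every k from 0 to len(s).
    for k in range(len(s) + 1):
        for kept in combinations(range(len(s)), k):
            # combinations() yields indices in increasing order,
            # so the relative order of characters is preserved.
            result.append("".join(s[i] for i in kept))
    return result


if __name__ == "__main__":
    subs = all_subsequences("ABC")
    print(subs)       # ['', 'A', 'B', 'C', 'AB', 'AC', 'BC', 'ABC']
    print(len(subs))  # 8, i.e. 2^3
```

Because the enumeration is by position, a string with repeated characters produces some subsequences more than once; the count of position choices is still 2^n.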

Examples:

Input: s1 = “ABC”, s2 = “ACD”
Output: 2
Explanation: The longest subsequence which is present in both strings is “AC”.

Input: s1 = “AGGTAB”, s2 = “GXTXAYB”
Output: 4
Explanation: The longest common subsequence is “GTAB”.

Input: s1 = “ABC”, s2 = “CBA”
Output: 1
Explanation: There are three longest common subsequences of length 1, “A”, “B” and “C”.

[Naive Approach] Using Recursion – O(2 ^ (m + n)) Time and O(m + n) Space

The idea is to compare the last characters of s1 and s2. While comparing the strings s1 and s2 two cases arise:

  1. Match : Make the recursion call for the remaining strings (strings of lengths m-1 and n-1) and add 1 to result.
  2. Do not Match : Make two recursive calls. First for lengths m-1 and n, and second for m and n-1. Take the maximum of two results.

Base case: If either string becomes empty, we return 0.

For example, consider the input strings s1 = “ABX” and s2 = “ACX”.

LCS(“ABX”, “ACX”) = 1 + LCS(“AB”, “AC”) [Last Characters Match]

LCS(“AB”, “AC”) = max( LCS(“A”, “AC”) , LCS(“AB”, “A”) ) [Last Characters Do Not Match]

LCS(“A”, “AC”) = max( LCS(“”, “AC”) , LCS(“A”, “A”) ) = max(0, 1 + LCS(“”, “”)) = 1

LCS(“AB”, “A”) = max( LCS(“A”, “A”) , LCS(“AB”, “”) ) = max(1 + LCS(“”, “”), 0) = 1

Therefore LCS(“AB”, “AC”) = max(1, 1) = 1, and the overall result is 1 + 1 = 2.

Below is the implementation of the recursive approach:

C++
// A Naive recursive implementation of LCS problem
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;

// Returns length of LCS for s1[0..m-1], s2[0..n-1]
int lcs(string &s1, string &s2, int m, int n) {
  
    // Base case: If either string is empty, the length of LCS is 0
    if (m == 0 || n == 0)
        return 0;

    // If the last characters of both substrings match
    if (s1[m - 1] == s2[n - 1])
      
        // Include this character in LCS and recur for remaining substrings
        return 1 + lcs(s1, s2, m - 1, n - 1);

    else
        // If the last characters do not match
        // Recur for two cases:
        // 1. Exclude the last character of s1 
        // 2. Exclude the last character of s2 
        // Take the maximum of these two recursive calls
        return max(lcs(s1, s2, m, n - 1), lcs(s1, s2, m - 1, n));
}

int main() {
    string s1 = "AGGTAB";
    string s2 = "GXTXAYB";
    int m = s1.size();
    int n = s2.size();

    cout << lcs(s1, s2, m, n) << endl;

    return 0;
}
C
// A Naive recursive implementation of LCS problem
#include <stdio.h>
#include <string.h>

int max(int x, int y) {
      return x > y ? x : y; 
}

// Returns length of LCS for s1[0..m-1], s2[0..n-1]
int lcs(char *s1, char *s2, int m, int n) {

    // Base case: If either string is empty, the length of LCS is 0
    if (m == 0 || n == 0)
        return 0;

    // If the last characters of both substrings match
    if (s1[m - 1] == s2[n - 1])

        // Include this character in LCS and recur for remaining substrings
        return 1 + lcs(s1, s2, m - 1, n - 1);

    else
        // If the last characters do not match
        // Recur for two cases:
        // 1. Exclude the last character of s1
        // 2. Exclude the last character of s2
        // Take the maximum of these two recursive calls
        return max(lcs(s1, s2, m, n - 1), lcs(s1, s2, m - 1, n));
}

int main() {
    char s1[] = "AGGTAB";
    char s2[] = "GXTXAYB";
    int m = strlen(s1);
    int n = strlen(s2);

    printf("%d\n", lcs(s1, s2, m, n));
    return 0;
}
Java
// A Naive recursive implementation of LCS problem
class GfG {

    // Returns length of LCS for s1[0..m-1], s2[0..n-1]
    static int lcs(String s1, String s2, int m, int n) {

        // Base case: If either string is empty, the length of LCS is 0
        if (m == 0 || n == 0)
            return 0;

        // If the last characters of both substrings match
        if (s1.charAt(m - 1) == s2.charAt(n - 1))

            // Include this character in LCS and recur for remaining substrings
            return 1 + lcs(s1, s2, m - 1, n - 1);

        else
            // If the last characters do not match
            // Recur for two cases:
            // 1. Exclude the last character of s1
            // 2. Exclude the last character of s2
            // Take the maximum of these two recursive calls
            return Math.max(lcs(s1, s2, m, n - 1), lcs(s1, s2, m - 1, n));
    }

    public static void main(String[] args) {
        String s1 = "AGGTAB";
        String s2 = "GXTXAYB";
        int m = s1.length();
        int n = s2.length();

        System.out.println(lcs(s1, s2, m, n));
    }
}
Python
# A Naive recursive implementation of LCS problem

# Returns length of LCS for s1[0..m-1], s2[0..n-1]
def lcs(s1, s2, m, n):
  
    # Base case: If either string is empty, the length of LCS is 0
    if m == 0 or n == 0:
        return 0

    # If the last characters of both substrings match
    if s1[m - 1] == s2[n - 1]:

        # Include this character in LCS and recur for remaining substrings
        return 1 + lcs(s1, s2, m - 1, n - 1)

    else:
        # If the last characters do not match
        # Recur for two cases:
        # 1. Exclude the last character of s1
        # 2. Exclude the last character of s2
        # Take the maximum of these two recursive calls
        return max(lcs(s1, s2, m, n - 1), lcs(s1, s2, m - 1, n))

if __name__ == "__main__":
    s1 = "AGGTAB"
    s2 = "GXTXAYB"
    m = len(s1)
    n = len(s2)

    print(lcs(s1, s2, m, n))
C#
// A Naive recursive implementation of LCS problem
using System;

class GfG {

    // Returns length of LCS for s1[0..m-1], s2[0..n-1]
    static int lcs(string s1, string s2, int m, int n) {

        // Base case: If either string is empty, the length of LCS is 0
        if (m == 0 || n == 0)
            return 0;

        // If the last characters of both substrings match
        if (s1[m - 1] == s2[n - 1])

            // Include this character in LCS and recur for remaining substrings
            return 1 + lcs(s1, s2, m - 1, n - 1);

        else
            // If the last characters do not match
            // Recur for two cases:
            // 1. Exclude the last character of s1
            // 2. Exclude the last character of s2
            // Take the maximum of these two recursive calls
            return Math.Max(lcs(s1, s2, m, n - 1), lcs(s1, s2, m - 1, n));
    }

    static void Main() {
        string s1 = "AGGTAB";
        string s2 = "GXTXAYB";
        int m = s1.Length;
        int n = s2.Length;

        Console.WriteLine(lcs(s1, s2, m, n));
    }
}
JavaScript
// A Naive recursive implementation of LCS problem

// Returns length of LCS for s1[0..m-1], s2[0..n-1]
function lcs(s1, s2, m, n) {
  
    // Base case: If either string is empty, the length of LCS is 0
    if (m === 0 || n === 0)
        return 0;

    // If the last characters of both substrings match
    if (s1[m - 1] === s2[n - 1])

        // Include this character in LCS and recur for remaining substrings
        return 1 + lcs(s1, s2, m - 1, n - 1);

    else
        // If the last characters do not match
        // Recur for two cases:
        // 1. Exclude the last character of s1
        // 2. Exclude the last character of s2
        // Take the maximum of these two recursive calls
        return Math.max(lcs(s1, s2, m, n - 1), lcs(s1, s2, m - 1, n));
}

// driver code
let s1 = "AGGTAB";
let s2 = "GXTXAYB";
let m = s1.length;
let n = s2.length;

console.log(lcs(s1, s2, m, n));

Output
4

Time Complexity: O(2 ^ (m + n)) in the worst case (when no characters match), where m and n are the lengths of strings s1 and s2.
Auxiliary Space: O(m + n) for the recursion stack.
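
To get a feel for how fast the naive recursion grows when few characters match, the following Python sketch counts the calls instead of computing the LCS length. The helper `count_calls` is a hypothetical instrumented copy of the recursion above, not part of the original implementations.

```python
def count_calls(s1, s2, m, n):
    """Count the calls the naive recursion makes for s1[0..m-1], s2[0..n-1]."""
    # Base case: this call itself is the only one.
    if m == 0 or n == 0:
        return 1

    # Match: one call here plus a single recursive branch.
    if s1[m - 1] == s2[n - 1]:
        return 1 + count_calls(s1, s2, m - 1, n - 1)

    # Mismatch: one call here plus two recursive branches.
    return (1 + count_calls(s1, s2, m, n - 1)
              + count_calls(s1, s2, m - 1, n))


if __name__ == "__main__":
    # Strings with no common character force the two-branch case everywhere.
    for k in range(1, 11):
        print(k, count_calls("A" * k, "B" * k, k, k))
```

With fully matching strings the recursion is linear (one branch per character), so the exponential behaviour is a worst case rather than the typical case.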

[Better Approach] Using Memoization – O(m * n) Time and O(m * n) Space

If we use the above recursive approach for the strings “AXYT” and “AYZX”, the partial recursion tree shows that the subproblem L(“AXY”, “AYZ”) is calculated more than once. If the whole tree is considered, there are many such overlapping subproblems. Hence we can optimize the solution using either memoization or tabulation.

Figure: Overlapping Subproblems in Longest Common Subsequence

  • There are two parameters that change in the recursive solution and these parameters go from 0 to m and 0 to n. So we create a 2D array of size (m+1) x (n+1).
  • We initialize this array with -1 to indicate that nothing has been computed yet.
  • Now we modify our recursive solution to first look up this table and make recursive calls only if the value is -1. This way we avoid recomputing the same subproblems.

Below is the implementation of the above approach:

C++
// C++ implementation of Top-Down DP
// of LCS problem
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// Returns length of LCS for s1[0..m-1], s2[0..n-1]
int lcs(string &s1, string &s2, int m, int n, vector<vector<int>> &memo) {

    // Base Case
    if (m == 0 || n == 0)
        return 0;

    // Already exists in the memo table
    if (memo[m][n] != -1)
        return memo[m][n];

    // Match
    if (s1[m - 1] == s2[n - 1])
        return memo[m][n] = 1 + lcs(s1, s2, m - 1, n - 1, memo);

    // Do not match
    return memo[m][n] = max(lcs(s1, s2, m, n - 1, memo), lcs(s1, s2, m - 1, n, memo));
}

int main() {
    string s1 = "AGGTAB";
    string s2 = "GXTXAYB";

    int m = s1.length();
    int n = s2.length();
    vector<vector<int>> memo(m + 1, vector<int>(n + 1, -1));
    cout << lcs(s1, s2, m, n, memo) << endl;
    return 0;
}
C
// C implementation of Top-Down DP
// of LCS problem
#include <stdio.h>
#include <string.h>

// Define a maximum size for the strings
#define MAX 1000

// Function to find the maximum of two integers
int max(int a, int b) {
    return (a > b) ? a : b;
}

// Returns length of LCS for s1[0..m-1], s2[0..n-1]
int lcs(const char *s1, const char *s2, int m, int n, int memo[MAX][MAX]) {
  
    // Base Case
    if (m == 0 || n == 0) {
        return 0;
    }

    // Already exists in the memo table
    if (memo[m][n] != -1) {
        return memo[m][n];
    }

    // Match
    if (s1[m - 1] == s2[n - 1]) {
        return memo[m][n] = 1 + lcs(s1, s2, m - 1, n - 1, memo);
    }

    // Do not match
    return memo[m][n] = max(lcs(s1, s2, m, n - 1, memo), lcs(s1, s2, m - 1, n, memo));
}

int main() {
    const char *s1 = "AGGTAB";
    const char *s2 = "GXTXAYB";

    int m = strlen(s1);
    int n = strlen(s2);

    // Create memo table with fixed size; static storage keeps the
    // MAX * MAX ints (about 4 MB) off the stack
    static int memo[MAX][MAX];
    for (int i = 0; i <= m; i++) {
        for (int j = 0; j <= n; j++) {
            // Initialize memo table with -1
            memo[i][j] = -1;
        }
    }

    printf("%d\n", lcs(s1, s2, m, n, memo));

    return 0;
}
Java
// Java implementation of Top-Down DP of LCS problem
import java.util.Arrays;

class GfG {
  
    // Returns length of LCS for s1[0..m-1], s2[0..n-1]
    static int lcs(String s1, String s2, int m, int n,
                   int[][] memo) {
        // Base Case
        if (m == 0 || n == 0)
            return 0;

        // Already exists in the memo table
        if (memo[m][n] != -1)
            return memo[m][n];

        // Match
        if (s1.charAt(m - 1) == s2.charAt(n - 1)) {
            return memo[m][n]
                = 1 + lcs(s1, s2, m - 1, n - 1, memo);
        }

        // Do not match
        return memo[m][n]
            = Math.max(lcs(s1, s2, m, n - 1, memo),
                       lcs(s1, s2, m - 1, n, memo));
    }

    public static void main(String[] args) {
        String s1 = "AGGTAB";
        String s2 = "GXTXAYB";

        int m = s1.length();
        int n = s2.length();
        int[][] memo = new int[m + 1][n + 1];

        // Initialize the memo table with -1
        for (int i = 0; i <= m; i++) {
            Arrays.fill(memo[i], -1);
        }

        System.out.println(lcs(s1, s2, m, n, memo));
    }
}
Python
# Python implementation of Top-Down DP of LCS problem

# Returns length of LCS for s1[0..m-1], s2[0..n-1]
def lcs(s1, s2, m, n, memo):
    # Base Case
    if m == 0 or n == 0:
        return 0

    # Already exists in the memo table
    if memo[m][n] != -1:
        return memo[m][n]

    # Match
    if s1[m - 1] == s2[n - 1]:
        memo[m][n] = 1 + lcs(s1, s2, m - 1, n - 1, memo)
        return memo[m][n]

    # Do not match
    memo[m][n] = max(lcs(s1, s2, m, n - 1, memo),
                     lcs(s1, s2, m - 1, n, memo))
    return memo[m][n]

if __name__ == "__main__":
    s1 = "AGGTAB"
    s2 = "GXTXAYB"

    m = len(s1)
    n = len(s2)
    memo = [[-1 for _ in range(n + 1)] for _ in range(m + 1)]
    print(lcs(s1, s2, m, n, memo))
C#
// C# implementation of Top-Down DP of LCS problem
using System;
class GfG {

    // Returns length of LCS for s1[0..m-1], s2[0..n-1]
    static int lcs(string s1, string s2, int m,
                          int n, int[, ] memo) {
        // Base Case
        if (m == 0 || n == 0)
            return 0;

        // Already exists in the memo table
        if (memo[m, n] != -1)
            return memo[m, n];

        // Match
        if (s1[m - 1] == s2[n - 1]) {
            return memo[m, n]
                = 1 + lcs(s1, s2, m - 1, n - 1, memo);
        }

        // Do not match
        return memo[m, n]
            = Math.Max(lcs(s1, s2, m, n - 1, memo),
                       lcs(s1, s2, m - 1, n, memo));
    }

    public static void Main() {
        string s1 = "AGGTAB";
        string s2 = "GXTXAYB";

        int m = s1.Length;
        int n = s2.Length;
        int[, ] memo = new int[m + 1, n + 1];

        // Initialize memo array with -1
        for (int i = 0; i <= m; i++) {
            for (int j = 0; j <= n; j++) {
                memo[i, j] = -1;
            }
        }

        Console.WriteLine(lcs(s1, s2, m, n, memo));
    }
}
JavaScript
// A Top-Down DP implementation of LCS problem

// Returns length of LCS for s1[0..m-1], s2[0..n-1]
function lcs(s1, s2, m, n, memo) {
    // Base Case
    if (m === 0 || n === 0)
        return 0;

    // Already exists in the memo table
    if (memo[m][n] !== -1)
        return memo[m][n];

    // Match
    if (s1[m - 1] === s2[n - 1]) {
        memo[m][n] = 1 + lcs(s1, s2, m - 1, n - 1, memo);
        return memo[m][n];
    }

    // Do not match
    memo[m][n] = Math.max(lcs(s1, s2, m, n - 1, memo),
                          lcs(s1, s2, m - 1, n, memo));
    return memo[m][n];
}

// driver code
const s1 = "AGGTAB";
const s2 = "GXTXAYB";

const m = s1.length;
const n = s2.length;
const memo = Array.from({length : m + 1},
                        () => Array(n + 1).fill(-1));

console.log(lcs(s1, s2, m, n, memo));

Output
4

Time Complexity: O(m * n), where m and n are the lengths of strings s1 and s2.
Auxiliary Space: O(m * n) for the memo table, plus the recursion stack.

[Expected Approach 1] Using Bottom-Up DP (Tabulation) – O(m * n) Time and O(m * n) Space

There are two parameters that change in the recursive solution and these parameters go from 0 to m and 0 to n. So we create a 2D dp array of size (m+1) x (n+1).

  • We first fill the known entries when m is 0 or n is 0.
  • Then we fill the remaining entries using the recursive formula.

For example, for s1 = “AXTY” and s2 = “AYZX” the table has 5 x 5 entries, and the final entry dp[4][4] = 2 is the answer (the LCS is “AX” or “AY”).
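
As a sketch of how the table fills for s1 = “AXTY” and s2 = “AYZX”, the Python snippet below builds the full dp table and prints it row by row. The function name `lcs_table` is an illustrative assumption; the recurrence is the same one used in the implementations that follow.

```python
def lcs_table(s1, s2):
    """Return the full (m+1) x (n+1) LCS table for s1 and s2."""
    m, n = len(s1), len(s2)

    # Row 0 and column 0 stay 0: an empty prefix has no common subsequence.
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                # Match: extend the LCS of both shorter prefixes.
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                # No match: drop one character from either string.
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp


if __name__ == "__main__":
    for row in lcs_table("AXTY", "AYZX"):
        print(row)
    # [0, 0, 0, 0, 0]
    # [0, 1, 1, 1, 1]
    # [0, 1, 1, 1, 2]
    # [0, 1, 1, 1, 2]
    # [0, 1, 2, 2, 2]
```

Row i corresponds to the prefix of s1 of length i and column j to the prefix of s2 of length j, so the bottom-right entry is the LCS length of the full strings.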


Below is the implementation of the above approach:

C++
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// Returns length of LCS for s1[0..m-1], s2[0..n-1]
int lcs(string &s1, string &s2) {
    int m = s1.size();
    int n = s2.size();

    // Initializing a matrix of size (m+1)*(n+1)
    vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));

    // Building dp[m+1][n+1] in bottom-up fashion
    for (int i = 1; i <= m; ++i) {
        for (int j = 1; j <= n; ++j) {
            if (s1[i - 1] == s2[j - 1])
                dp[i][j] = dp[i - 1][j - 1] + 1;
            else
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
        }
    }

    // dp[m][n] contains length of LCS for s1[0..m-1]
    // and s2[0..n-1]
    return dp[m][n];
}

int main() {
    string s1 = "AGGTAB";
    string s2 = "GXTXAYB";
    cout << "Length of LCS is " << lcs(s1, s2) << endl;

    return 0;
}
C
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int max(int x, int y);

// Function to find length of LCS for s1[0..m-1], s2[0..n-1]
int lcs(const char *S1, const char *S2) {
    int m = strlen(S1);
    int n = strlen(S2);

    // Initializing a matrix of size (m+1)*(n+1)
    int dp[m + 1][n + 1];

    // Building dp[m+1][n+1] in bottom-up fashion
    for (int i = 0; i <= m; i++) {
        for (int j = 0; j <= n; j++) {
          
            if (i == 0 || j == 0)
                dp[i][j] = 0;
            
            else if (S1[i - 1] == S2[j - 1])
                dp[i][j] = dp[i - 1][j - 1] + 1;

            else
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
        }
    }
   
    return dp[m][n];
}

int max(int x, int y) {
    return (x > y) ? x : y;
}

int main() {
    const char *S1 = "AGGTAB";
    const char *S2 = "GXTXAYB";
    printf("Length of LCS is %d\n", lcs(S1, S2));

    return 0;
}
Java
import java.util.Arrays;

class GfG {
  
    // Returns length of LCS for s1[0..m-1], s2[0..n-1]
    static int lcs(String S1, String S2) {
        int m = S1.length();
        int n = S2.length();

        // Initializing a matrix of size (m+1)*(n+1)
        int[][] dp = new int[m + 1][n + 1];

        // Building dp[m+1][n+1] in bottom-up fashion
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (S1.charAt(i - 1) == S2.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                }
                else {
                    dp[i][j] = Math.max(dp[i - 1][j],
                                        dp[i][j - 1]);
                }
            }
        }

        // dp[m][n] contains length of LCS for S1[0..m-1]
        // and S2[0..n-1]
        return dp[m][n];
    }

  
    public static void main(String[] args)
    {
        String S1 = "AGGTAB";
        String S2 = "GXTXAYB";
        System.out.println("Length of LCS is "
                           + lcs(S1, S2));
    }
}
Python
def get_lcs_length(S1, S2):
    m = len(S1)
    n = len(S2)

    # Initializing a matrix of size (m+1)*(n+1)
    dp = [[0] * (n + 1) for x in range(m + 1)]

    # Building dp[m+1][n+1] in bottom-up fashion
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if S1[i - 1] == S2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j],
                               dp[i][j - 1])

    # dp[m][n] contains length of LCS for S1[0..m-1]
    # and S2[0..n-1]
    return dp[m][n]


if __name__ == "__main__":
    S1 = "AGGTAB"
    S2 = "GXTXAYB"
    print("Length of LCS is", get_lcs_length(S1, S2))
C#
using System;

class Gfg {
    // Returns length of LCS for S1[0..m-1], S2[0..n-1]
    static int GetLCSLength(string S1, string S2) {
        int m = S1.Length;
        int n = S2.Length;

        // Initializing a matrix of size (m+1)*(n+1)
        int[, ] dp = new int[m + 1, n + 1];

        // Building dp[m+1][n+1] in bottom-up fashion
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (S1[i - 1] == S2[j - 1]) {
                    dp[i, j] = dp[i - 1, j - 1] + 1;
                }
                else {
                    dp[i, j] = Math.Max(dp[i - 1, j],
                                        dp[i, j - 1]);
                }
            }
        }

        // dp[m, n] contains length of LCS for S1[0..m-1]
        // and S2[0..n-1]
        return dp[m, n];
    }

    static void Main() {
        string S1 = "AGGTAB";
        string S2 = "GXTXAYB";
        Console.WriteLine("Length of LCS is "
                          + GetLCSLength(S1, S2));
    }
}
JavaScript
function getLcsLength(S1, S2) {
    const m = S1.length;
    const n = S2.length;

    // Initializing a matrix of size (m+1)*(n+1)
    const dp = Array.from({length : m + 1},
                          () => Array(n + 1).fill(0));

    // Building dp[m+1][n+1] in bottom-up fashion
    for (let i = 1; i <= m; i++) {
        for (let j = 1; j <= n; j++) {
            if (S1[i - 1] === S2[j - 1]) {
                dp[i][j] = dp[i - 1][j - 1] + 1;
            }
            else {
                dp[i][j]
                    = Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
    }

    // dp[m][n] contains length of LCS for
    // S1[0..m-1] and S2[0..n-1]
    return dp[m][n];
}

const S1 = "AGGTAB";
const S2 = "GXTXAYB";
console.log("Length of LCS is", getLcsLength(S1, S2));

Output
Length of LCS is 4

Time Complexity: O(m * n) which is much better than the worst-case time complexity of Naive Recursive implementation. 
Auxiliary Space: O(m * n) because the algorithm uses an array of size (m+1)*(n+1) to store the length of the common subsequence.
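
The implementations above return only the length. As a hedged sketch (this goes slightly beyond what the article implements), the same table can be walked backwards from dp[m][n] to recover one actual LCS string. The function name `lcs_string` and the tie-breaking rule are assumptions of this example; when several LCSs exist, a different rule may return a different one.

```python
def lcs_string(s1, s2):
    """Return one longest common subsequence of s1 and s2."""
    m, n = len(s1), len(s2)

    # Fill the same table as the tabulation approach.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back from the bottom-right corner, collecting matched characters.
    chars = []
    i, j = m, n
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            chars.append(s1[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            # Tie-break (assumed): prefer moving up when both are equal.
            i -= 1
        else:
            j -= 1

    # Characters were collected from the end, so reverse them.
    return "".join(reversed(chars))


if __name__ == "__main__":
    print(lcs_string("AGGTAB", "GXTXAYB"))  # GTAB
```

The backward walk takes O(m + n) steps, so the overall cost is still dominated by filling the table.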

[Expected Approach 2] Using Bottom-Up DP (Space-Optimization) – O(m * n) Time and O(n) Space

One important observation in the above implementation is that each iteration of the outer loop only needs values from the previous row and the part of the current row already filled. So there is no need to store all rows of the DP matrix; we can store just two rows at a time. This can be optimized further to use only one array.
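
The two-row idea can be sketched in Python as follows. The function name `lcs_two_rows` is illustrative, and the swap that keeps the shorter string along the columns is an optional refinement of this sketch rather than something the article prescribes.

```python
def lcs_two_rows(s1, s2):
    """Length of the LCS of s1 and s2 using only two rows of the DP table."""
    # Optional: iterate over the longer string so each row has
    # min(m, n) + 1 entries.
    if len(s2) > len(s1):
        s1, s2 = s2, s1

    n = len(s2)
    prev = [0] * (n + 1)  # row i - 1 of the full table

    for ch in s1:
        curr = [0] * (n + 1)  # row i of the full table
        for j in range(1, n + 1):
            if ch == s2[j - 1]:
                # Match: diagonal neighbour lives in the previous row.
                curr[j] = prev[j - 1] + 1
            else:
                # No match: best of the cell above and the cell to the left.
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr

    return prev[n]


if __name__ == "__main__":
    print(lcs_two_rows("AGGTAB", "GXTXAYB"))  # 4
```

The time complexity is unchanged at O(m * n); only the auxiliary space drops to two rows.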

Please refer to this post: A Space Optimized Solution of LCS

Applications of LCS

LCS is used to implement the diff utility (which finds the differences between two data sources). It is also widely used by revision control systems such as Git to reconcile multiple changes made to a revision-controlled collection of files.
