Edit Distance - Codeprg

Breaking

programing News Travel Computer Engineering Science Blogging Earning

Saturday 29 August 2020

Edit Distance

Edit Distance

 

Edit Distance

Given two words word1 and word2, find the minimum number of operations required to convert word1 to word2.

You have the following 3 operations permitted on a word:

  1. Insert a character
  2. Delete a character
  3. Replace a character

Example 1:

Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation: 
horse -> rorse (replace 'h' with 'r')
rorse -> rose (remove 'r')
rose -> ros (remove 'e')

Example 2:

Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation: 
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')
C++ Solution :
To apply DP, we define the state dp[i][j] to be the minimum number of operations to convert word1[0..i) to word2[0..j).

For the base case, that is, to convert a string to an empty string, the mininum number of operations (deletions) is just the length of the string. So we have dp[i][0] = i and dp[0][j] = j.

For the general case to convert word1[0..i) to word2[0..j), we break this problem down into sub-problems. Suppose we have already known how to convert word1[0..i - 1) to word2[0..j - 1) (dp[i - 1][j - 1]), if word1[i - 1] == word2[j - 1], then no more operation is needed and dp[i][j] = dp[i - 1][j - 1].

If word1[i - 1] != word2[j - 1], we need to consider three cases.

Replace word1[i - 1] by word2[j - 1] (dp[i][j] = dp[i - 1][j - 1] + 1);
If word1[0..i - 1) = word2[0..j) then delete word1[i - 1] (dp[i][j] = dp[i - 1][j] + 1);
If word1[0..i) + word2[j - 1] = word2[0..j) then insert word2[j - 1] to word1[0..i) (dp[i][j] = dp[i][j - 1] + 1).
So when word1[i - 1] != word2[j - 1], dp[i][j] will just be the minimum of the above three cases.

class Solution {
public:
    int minDistance(string word1, string word2) {
        int m = word1.size(), n = word2.size();
        vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));
        for (int i = 1; i <= m; i++) {
            dp[i][0] = i;
        }
        for (int j = 1; j <= n; j++) {
            dp[0][j] = j;
        }
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1[i - 1] == word2[j - 1]) {
                    dp[i][j] = dp[i - 1][j - 1];
                } else {
                    dp[i][j] = min(dp[i - 1][j - 1], min(dp[i][j - 1], dp[i - 1][j])) + 1;
                }
            }
        }
        return dp[m][n];
    }
};
Note that each time when we update dp[i][j], we only need dp[i - 1][j - 1], dp[i][j - 1] and dp[i - 1][j]. We may optimize the space of the code to use only two vectors.

class Solution {
public:
    int minDistance(string word1, string word2) {
        int m = word1.size(), n = word2.size();
        vector<int> pre(n + 1, 0), cur(n + 1, 0);
        for (int j = 1; j <= n; j++) {
            pre[j] = j;
        }
        for (int i = 1; i <= m; i++) {
            cur[0] = i;
            for (int j = 1; j <= n; j++) {
                if (word1[i - 1] == word2[j - 1]) {
                    cur[j] = pre[j - 1];
                } else {
                    cur[j] = min(pre[j - 1], min(cur[j - 1], pre[j])) + 1;
                }
            }
            fill(pre.begin(), pre.end(), 0);
            swap(pre, cur);
        }
        return pre[n];
    }
};
Or even just one vector.

class Solution {
public:
    int minDistance(string word1, string word2) {
        int m = word1.size(), n = word2.size(), pre;
        vector<int> cur(n + 1, 0);
        for (int j = 1; j <= n; j++) {
            cur[j] = j;
        }
        for (int i = 1; i <= m; i++) {
            pre = cur[0];
            cur[0] = i;
            for (int j = 1; j <= n; j++) {
                int temp = cur[j];
                if (word1[i - 1] == word2[j - 1]) {
                    cur[j] = pre;
                } else {
                    cur[j] = min(pre, min(cur[j - 1], cur[j])) + 1;
                }
                pre = temp;
            }
        }
        return cur[n];
    }
};
For those having difficulty cracking dynamic programming solutions, I find it easiest to 
solve by first starting with a naive, but working recursive implementation. It's essential
to do so, because dynamic programming is basically recursion with caching. With this workflow, 
deciphering dynamic programming problems becomes just a little more manageable for us normal people. :)

Thought process:
Given two strings, we're tasked with finding the minimum number of transformations we 
need to make to arrive with equivalent strings. From the get-go, there doesn't seem to
 be any way around trying all possibilities, and in this, possibilities refers to inserting, 
 deleting, or replacing a character. Recursion is usually a good choice for trying all possilbilities.

Whenever we write recursive functions, we'll need some way to terminate, or else we'll end up 
overflowing the stack via infinite recursion. With strings, the natural state to keep track of
 is the index. We'll need two indexes, one for word1 and one for word2. Now we just need to handle
  our base cases, and recursive cases.
What happens when we're done with either word? Some thought will tell you that the minimum number 
of transformations is simply to insert the rest of the other word. This is our base case. What 
about when we're not done with either string? We'll either match the currently indexed characters
 in both strings, or mismatch. In the first case, we don't incur any penalty, and we can continue
  to compare the rest of the strings by recursing on the rest of both strings. In the case of a 
  mismatch, we either insert, delete, or replace. To recap:

base case: word1 = "" or word2 = "" => return length of other string
recursive case: word1[0] == word2[0] => recurse on word1[1:] and word2[1:]
recursive case: word1[0] != word2[0] => recurse by inserting, deleting, or replacing
And in Python:

class Solution:
    def minDistance(self, word1, word2):
        """Naive recursive solution"""
        if not word1 and not word2:
            return 0
        if not word1:
            return len(word2)
        if not word2:
            return len(word1)
        if word1[0] == word2[0]:
            return self.minDistance(word1[1:], word2[1:])
        insert = 1 + self.minDistance(word1, word2[1:])
        delete = 1 + self.minDistance(word1[1:], word2)
        replace = 1 + self.minDistance(word1[1:], word2[1:])
        return min(insert, replace, delete)
With a solution in hand, we're ecstatic and we go to submit our code. All is well until 
we see the dreaded red text... TIME LIMIT EXCEEDED. What did we do wrong? Let's look at 
a simple example, and for sake of brevity I'll annotate the minDistance function as md.

word1 = "horse"
word2 = "hello"

The tree of recursive calls, 3 levels deep, looks like the following. I've highlighted 
recursive calls with multiple invocations. So now we see that we're repeating work. I'm 
not going to try and analyze the runtime of this solution, but it's exponential.

md("horse", "hello")
    md("orse", "ello")
        md("orse", "llo")
            md("orse", "lo")
            md("rse", "llo") <- 
            md("rse", "lo")
        md("rse", "ello")
            md("rse", "llo") <-
            md("se", "ello")
            md("se", "llo") <<-
        md("rse", "llo")
            md("rse", "llo") <-
            md("se", "llo") <<-
            md("se", "lo")
The way we fix this is by caching. We save intermediate computations in a dictionary and if 
we recur on the same subproblem, instead of doing the same work again, we return the saved 
value. Here is the memoized solution, where we build from bigger subproblems to smaller subproblems (top-down).

class Solution:
    def minDistance(self, word1, word2, i, j, memo):
        """Memoized solution"""
        if i == len(word1) and j == len(word2):
            return 0
        if i == len(word1):
            return len(word2) - j
        if j == len(word2):
            return len(word1) - i

        if (i, j) not in memo:
            if word1[i] == word2[j]:
                ans = self.minDistance2(word1, word2, i + 1, j + 1, memo)
            else: 
                insert = 1 + self.minDistance2(word1, word2, i, j + 1, memo)
                delete = 1 + self.minDistance2(word1, word2, i + 1, j, memo)
                replace = 1 + self.minDistance2(word1, word2, i + 1, j + 1, memo)
                ans = min(insert, delete, replace)
            memo[(i, j)] = ans
        return memo[(i, j)]
Of course, an interative implementation is usually better than its recursive counterpart because 
we don't risk blowing up our stack in case the number of recursive calls is very deep. We can also
 use a 2D array to do essentially the same thing as the dictionary of cached values. When we do this,
  we build up solutions from smaller subproblems to bigger subproblems (bottom-up). In this case, 
  since we are no longer "recurring" in the traditional sense, we initialize our 2D table with base 
  constraints. The first row and column of the table has known values since if one string is empty,
   we simply add the length of the non-empty string since that is the minimum number of edits necessary 
   to arrive at equivalent strings. For both the memoized and dynamic programming solutions, the runtime
    is O(mn) and the space complexity is O(mn) where m and n are the lengths of word1 and word2, respectively.

class Solution:
    def minDistance(self, word1, word2):
        """Dynamic programming solution"""
        m = len(word1)
        n = len(word2)
        table = [[0] * (n + 1) for _ in range(m + 1)]

        for i in range(m + 1):
            table[i][0] = i
        for j in range(n + 1):
            table[0][j] = j

        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if word1[i - 1] == word2[j - 1]:
                    table[i][j] = table[i - 1][j - 1]
                else:
                    table[i][j] = 1 + min(table[i - 1][j], table[i][j - 1], table[i - 1][j - 1])
        return table[-1][-1]
JAVA SOLUTION
Let following be the function definition :-

f(i, j) := minimum cost (or steps) required to convert first i characters
 of word1 to first j characters of word2

Case 1: word1[i] == word2[j], i.e. the ith the jth character matches.

f(i, j) = f(i - 1, j - 1)

Case 2: word1[i] != word2[j], then we must either insert, delete or
 replace, whichever is cheaper

f(i, j) = 1 + min { f(i, j - 1), f(i - 1, j), f(i - 1, j - 1) }

f(i, j - 1) represents insert operation
f(i - 1, j) represents delete operation
f(i - 1, j - 1) represents replace operation
Here, we consider any operation from word1 to word2. It means, when we say
 insert operation, we insert a new character after word1 that matches the 
 jth character of word2. So, now have to match i characters of word1 to j - 1 
 characters of word2. Same goes for other 2 operations as well.

Note that the problem is symmetric. The insert operation in one direction
 (i.e. from word1 to word2) is same as delete operation in other. So, we 
 could choose any direction.

Above equations become the recursive definitions for DP.

Base Case:

f(0, k) = f(k, 0) = k

Below is the direct bottom-up translation of this recurrent relation. 
It is only important to take care of 0-based index with actual code :-

public class Solution {
    public int minDistance(String word1, String word2) {
        int m = word1.length();
        int n = word2.length();
        
        int[][] cost = new int[m + 1][n + 1];
        for(int i = 0; i <= m; i++)
            cost[i][0] = i;
        for(int i = 1; i <= n; i++)
            cost[0][i] = i;
        
        for(int i = 0; i < m; i++) {
            for(int j = 0; j < n; j++) {
                if(word1.charAt(i) == word2.charAt(j))
                    cost[i + 1][j + 1] = cost[i][j];
                else {
                    int a = cost[i][j];
                    int b = cost[i][j + 1];
                    int c = cost[i + 1][j];
                    cost[i + 1][j + 1] = a < b ? (a < c ? a : c) : (b < c ? b : c);
                    cost[i + 1][j + 1]++;
                }
            }
        }
        return cost[m][n];
    }
}
Time complexity : If n is the length of word1, m of word2, because of the
 two indented loops, it is O(nm)Let following be the function definition :-

f(i, j) := minimum cost (or steps) required to convert first i characters
 of word1 to first j characters of word2

Case 1: word1[i] == word2[j], i.e. the ith the jth character matches.

f(i, j) = f(i - 1, j - 1)

Case 2: word1[i] != word2[j], then we must either insert, delete or
 replace, whichever is cheaper

f(i, j) = 1 + min { f(i, j - 1), f(i - 1, j), f(i - 1, j - 1) }

f(i, j - 1) represents insert operation
f(i - 1, j) represents delete operation
f(i - 1, j - 1) represents replace operation
Here, we consider any operation from word1 to word2. It means, when we
 say insert operation, we insert a new character after word1 that matches 
 the jth character of word2. So, now have to match i characters of word1
  to j - 1 characters of word2. Same goes for other 2 operations as well.

Note that the problem is symmetric. The insert operation in one direction
 (i.e. from word1 to word2) is same as delete operation in other. So, 
 we could choose any direction.

Above equations become the recursive definitions for DP.

Base Case:

f(0, k) = f(k, 0) = k

Below is the direct bottom-up translation of this recurrent relation.
 It is only important to take care of 0-based index with actual code :-

public class Solution {
    public int minDistance(String word1, String word2) {
        int m = word1.length();
        int n = word2.length();
        
        int[][] cost = new int[m + 1][n + 1];
        for(int i = 0; i <= m; i++)
            cost[i][0] = i;
        for(int i = 1; i <= n; i++)
            cost[0][i] = i;
        
        for(int i = 0; i < m; i++) {
            for(int j = 0; j < n; j++) {
                if(word1.charAt(i) == word2.charAt(j))
                    cost[i + 1][j + 1] = cost[i][j];
                else {
                    int a = cost[i][j];
                    int b = cost[i][j + 1];
                    int c = cost[i + 1][j];
                    cost[i + 1][j + 1] = a < b ? (a < c ? a : c) : (b < c ? b : c);
                    cost[i + 1][j + 1]++;
                }
            }
        }
        return cost[m][n];
    }
}
Time complexity : If n is the length of word1, m of word2, because of the
 two indented loops, it is O(nm)