The dynamic planning process is: each decision depends on the current state, and then the state is transferred. A decision sequence is generated in a changing state, so this multi-stage optimization decision-making process is called dynamic programming.
Dynamic programming is actually a general term for a type of topic, not a fixed algorithm. The meaning of dynamic programming is to solve the overall approach by adopting recursive (or division and conquer) strategies and solving the sub-problems of big problems. The core idea of dynamic programming is to cleverly split the problem into multiple sub-problems, and to obtain the overall solution to the problem by calculating the sub-problems. The sub-problems can be split into more sub-problems, thereby solving the required problems using methods similar to recursive iteration. Question description:
For the sequences S and T, the distance between them is defined as: performing the following operations several times on one of the two: 1, delete a character; 2, insert a character; 3, change a character. Each time the operation is performed, the count is increased by 1. The minimum count of turning S and T into an equal sequence is the edit distance (editdistance) or similarity between the two. Please give the corresponding algorithm and its implementation.
analyze:
Assume that the lengths of the sequences S and T are m and n respectively, and the edit distance between them is expressed as edit[m][n]. Then there are the following situations when operating the sequence:
a. When the end characters of S and T are equal, no one of the above definition operations (i.e. "Edit") is required for the end characters, that is, no count is required. The condition is satisfied: edit[m][n]=edit[m-1][n-1].
b. When the end characters of S and T are not equal, the end of either of them needs to be edited, and the corresponding count will be increased by 1.
b1, modify the end of S or T to make it equal to T or S, then edit[m][n]=edit[m-1][n-1]+1;
b2, delete the element at the end of S to make S and T equal, then edit[m][n]=edit[m-1][n]+1;
b3, delete the element at the end of T to make T and S equal, then edit[m][n]=edit[m][n-1]+1;
b4, add the tail element of T at the end of S to make S and T equal, then the length of S becomes m+1, but at this time, the end elements of S and T are already equal. You only need to compare the first m elements of S with the first n-1 elements of T, so edit[m][n]=edit[m][n-1]+1;
b5, add the tail element of S at the end of T to make T and S equal. The situation at this time is the same as b4, satisfying edit[m][n]=edit[m-1][n]+1;
c. A special case is that when S is empty, edit[0][n]=n; and when T is empty, edit[m][0]=m; This is easy to understand. For example, for sequences "" and "abc", the minimum operation of both is 3, that is, sequence "" performs 3 insertion operations, or sequence "abc" performs 3 delete operations.
Therefore, it is not difficult for us to deduce the dynamic programming equation of editing distance above:
Therefore, the recursive implementation of the dynamic programming algorithm for string editing distance can be expressed in the following Java code:
public static int editDistance(String a, String b) { if (a == null || b == null) { return -1; } return editDistance(a, a.length() - 1, b, b.length() - 1); } public static int editDistance(String a, int m, String b, int n) { if (m < 0 || n < 0) { return 1; } else if (a.charAt(m) == b.charAt(n)) { return editDistance(a, m - 1, b, n - 1); } else { return Math.min(Math.min(editDistance(a, m - 1, b, n) + 1, editDistance(a, m, b, n - 1) + 1), editDistance(a, m - 1, b, n - 1) + 1); } }UPDATE:
At the same time, from the dynamic programming equation of edit distance, we can see that edit[m][n] can be derived from edit[m - 1][n - 1], edit[m - 1][n], edit[m][n - 1], and if edit is a two-dimensional array, edit[m][n] can be determined by conditions by the elements at its upper, left and upper left positions. That is, we can traverse the two-dimensional array and then calculate the current value through backtracking.
For example, for the strings S = "sailn" and T = "failing", the two-dimensional array is initialized as:
| m/n | f | a | i | l | i | n | g | |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | |
| s | 1 | 1 | ||||||
| a | 2 | |||||||
| i | 3 | |||||||
| l | 4 | |||||||
| n | 5 |
Because S[0] = s, T[0] = f, then S[0] != T[0], then corresponds to the above two-dimensional matrix, edit[1][1] = min(edit[0][0], edit[0][1], edit[1][0]) + 1 i.e. edit[1][1] = min(0, 1, 1) + 1 i.e. edit[1][1] = 0 + 1 = 1.
| m/n | f | a | i | l | i | n | g | |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | |
| s | 1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| a | 2 | 2 | 1 | |||||
| i | 3 | |||||||
| l | 4 | |||||||
| n | 5 |
For S[1] = a, T[1] = a, S[1] = T[1], it corresponds to a two-dimensional matrix, edit[2][2] = edit[1][1], so edit[2][2] = 1. So according to this rule, fill the above two-dimensional matrix as follows:
| m/n | f | a | i | l | i | n | g | |
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | |
| s | 1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| a | 2 | 2 | 1 | 2 | 3 | 4 | 5 | 6 |
| i | 3 | 3 | 2 | 1 | 2 | 3 | 4 | 5 |
| l | 4 | 4 | 3 | 2 | 1 | 2 | 3 | 4 |
| n | 5 | 5 | 4 | 3 | 2 | 2 | 2 | 3 |
Therefore, the edit distance between the two is edit[m][n] = edit[5][7] = 3.
Therefore, according to the above idea, the Java version of the backtracking solution of dynamic programming can be performed as follows:
public static int editDistance(String a, String b) { if (a == null || b == null) { return -1; } int[][] matrix = new int[a.length() + 1][b.length() + 1]; for (int i = 0; i < a.length() + 1; i++) { for (int j = 0; j < b.length() + 1; j++) { if (i == 0) { matrix[i][j] = j; } else if (j == 0) { matrix[i][j] = i; } else { if (a.charAt(i - 1) == b.charAt(j - 1)) { matrix[i][j] = matrix[i - 1][j - 1]; } else { matrix[i][j] = 1 + Math.min(Math.min(matrix[i - 1][j], matrix[i][j - 1]), matrix[i - 1]); } } } return matrix[a.length()][b.length()]; }Summarize
The above is the entire content of this article about the edit distance problem sample code for Java dynamic programming. I hope it will be helpful to everyone. Interested friends can continue to refer to other related topics on this site. If there are any shortcomings, please leave a message to point it out.