This article describes the method of java to find the largest common substring of two strings. Share it for your reference, as follows:
Recently, I have a requirement for text comparison in my project work. After studying this period of time, I have summarized the content of this blog: Find the largest common substring of two strings.
Algorithm idea: calculate the common substring of two strings based on graphs. For specific algorithm ideas, please refer to the following figure:
Input string S1: achmacmh Input string S2: macham
The specific implementation code is as follows:
package cn.lulei.compare; import java.util.ArrayList; import java.util.Collections; import java.util.Comparator; import java.util.List; public class StringCompare { private int a; private int b; public String getMaxLengthCommonString(String s1, String s2) { if (s1 == null || s2 == null) { return null; } a = s1.length();//s1 length makes line b = s2.length();//Set the column of s2 length if(a== 0 || b == 0) { return ""; } //Set the matching matrix boolean [][] array = new boolean[a][b]; for (int i = 0; i < a; i++) { char c1 = s1.charAt(i); for (int j = 0; j < b; j++) { char c2 = s2.charAt(j); if (c1 == c2) { array[i][j] = true; } else { array[i][j] = false; } } } } //Find all common factor strings, save the information as the starting position and length relative to the second string List<ChildString> childStrings = new ArrayList<ChildString>(); for (int i = 0; i < a; i++) { getMaxSort(i, 0, array, childStrings); } for (int i = 1; i < b; i++) { getMaxSort(0, i, array, childStrings); } //Sort sort(childStrings); if (childStrings.size() < 1) { return ""; } //Return the maximum common factor string int max = childStrings.get(0).maxLength; StringBuffer sb = new StringBuffer(); for (ChildString s: childStrings) { if (max != s.maxLength) { break; } sb.append(s2.substring(s.maxStart, s.maxStart + s.maxLength)); sb.append("/n"); } return sb.toString(); } //Sorting, flashback private void sort(List<ChildString> list) { Collections.sort(list, new Comparator<ChildString>(){ public int compare(ChildString o1, ChildString o2) { return o2.maxLength - o1.maxLength; } }); } //Find the common factor string on a slash private void getMaxSort(int i, int j, boolean [][] array, List<ChildString> sortBean) { int length = 0; int start = j; for (; i < a && j < b; i++,j++) { if (array[i][j]) { length++; } else { sortBean.add(new ChildString(length, start)); length = 0; start = j + 1; } if (i == a-1 || j == b-1) { sortBean.add(new ChildString(length, start)); } } } //Public factor class ChildString { int maxLength; int maxStart; ChildString(int maxLength, int maxStart){ this.maxLength = maxLength; this.maxStart = maxStart; } } /** * @param args */ public static void main(String[] args) { // TODO Auto-generated method stub System.out.println(new StringCompare().getMaxLengthCommonString("achmacmh", "macham")); } } The final execution result of the program is:
For comparison of the two files, I personally think that this algorithmic idea can be referred to (and now and for implementation), which will be written in a future blog.
In the above implementation process, all common substring information is saved with an array, and then the largest substring is sorted. If this method is just to find the largest substring, the algorithm is not very reasonable. Therefore, the following modifications are made. List only saves the largest substring in the current calculation. The specific implementation is as follows:
/** *@Description: String comparison*/ package com.lulei.test; import java.util.ArrayList; import java.util.List; public class StringCompare { private int a; private int b; private int maxLength = -1; public String getMaxLengthCommonString(String s1, String s2) { if (s1 == null || s2 == null) { return null; } a = s1.length();//s1 length is used to make row b = s2.length();//s2 length is used to make column if(a== 0 || b == 0) { return ""; } //Set the matching matrix boolean [][] array = new boolean[a][b]; for (int i = 0; i < a; i++) { char c1 = s1.charAt(i); for (int j = 0; j < b; j++) { char c2 = s2.charAt(j); if (c1 == c2) { array[i][j] = true; } else { array[i][j] = false; } } } //Finish all common factor strings and save the information as the starting position and length relative to the second string List<ChildString> childStrings = new ArrayList<ChildString>(); for (int i = 0; i < a; i++) { getMaxSort(i, 0, array, childStrings); } for (int i = 1; i < b; i++) { getMaxSort(0, i, array, childStrings); } StringBuffer sb = new StringBuffer(); for (ChildString s: childStrings) { sb.append(s2.substring(s.maxStart, s.maxStart + s.maxLength)); sb.append("/n"); } return sb.toString(); } //Find the common factor string on a slash private void getMaxSort(int i, int j, boolean [][] array, List<ChildString> sortBean) { int length = 0; int start = j; for (; i < a && j < b; i++,j++) { if (array[i][j]) { length++; } else { //Directly add, save all substrings, the following judgment will only save the current largest substring //sortBean.add(new ChildString(length, start)); if (length == maxLength) { sortBean.add(new ChildString(length, start)); } else if (length > maxLength) { sortBean.clear(); maxLength = length; sortBean.add(new ChildString(length, start)); } length = 0; start = j + 1; } if (i == a-1 || j == b-1) { //Directly add, save all substrings, the following judgment only saves the current largest substring //sortBean.add(new ChildString(length, start)); if (length == maxLength) { sortBean.add(new ChildString(length, start)); } else if (length > maxLength) { sortBean.clear(); maxLength = length; sortBean.add(new ChildString(length, start)); } } } } } //Public factor class ChildString { int maxLength; int maxStart; ChildString(int maxLength, int maxStart){ this.maxLength = maxLength; this.maxStart = maxStart; } } /** * @param args */ public static void main(String[] args) { // TODO Auto-generated method stub System.out.println(new StringCompare().getMaxLengthCommonString("abcdef", "defabc")); } } Thank you for reading, I hope it can help you. Thank you for your support for this site!