How to find repeated content in a piece of text using Java and regular expressions

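How can I use Java and regular expressions to find repeated content in a piece of text, covering both (1) consecutive repeats and (2) non-consecutive repeats? The repeated content is not known in advance (neither its length nor its characters). The program should output the length of the source string, the time taken for the comparison, and the repeated strings.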

This answer draws on ChatGPT:
Finding consecutive repeats:


import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String input = "abababcdabcdcdcdabcde";
        // (.+?) lazily captures the shortest unit; \1+ requires one or more adjacent copies of it.
        Pattern pattern = Pattern.compile("(.+?)\\1+");
        Matcher matcher = pattern.matcher(input);
        long start = System.nanoTime();
        while (matcher.find()) {
            // group() is the whole run (e.g. "ababab"); group(1) would be the repeated unit ("ab").
            String repeat = matcher.group();
            System.out.println("Found continuous repeat: " + repeat);
        }
        long end = System.nanoTime();
        System.out.println("Length of input string: " + input.length());
        // The timing covers matching and printing, not pattern compilation.
        System.out.println("Time taken: " + (end - start) + " ns");
    }
}

Output (the timing varies from run to run):

Found continuous repeat: ababab
Found continuous repeat: cdcdcd
Length of input string: 21
Time taken: 40907 ns
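
To also report which unit repeats and how many times, group 1 of the same pattern can be read alongside the full match. The sketch below is only an illustration: the class name ConsecutiveRepeats and the find helper are made up for this example and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConsecutiveRepeats {
    // Same pattern as in the answer: a lazily matched unit followed by one or more copies of itself.
    private static final Pattern REPEAT = Pattern.compile("(.+?)\\1+");

    /** Returns one "unit x count @ startIndex" entry per consecutive run (hypothetical format). */
    static List<String> find(String input) {
        List<String> runs = new ArrayList<>();
        Matcher m = REPEAT.matcher(input);
        while (m.find()) {
            String unit = m.group(1);
            // The full match consists of whole copies of the unit, so this division is exact.
            int count = m.group().length() / unit.length();
            runs.add(unit + " x " + count + " @ " + m.start());
        }
        return runs;
    }

    public static void main(String[] args) {
        String input = "abababcdabcdcdcdabcde";
        long start = System.nanoTime();
        List<String> runs = find(input);
        long elapsed = System.nanoTime() - start;
        System.out.println("Length of input string: " + input.length());
        System.out.println("Runs (unit x count @ index): " + runs);
        System.out.println("Time taken: " + elapsed + " ns");
    }
}
```

For the sample input this reports [ab x 3 @ 0, cd x 3 @ 10]. Because the unit is matched lazily, a run such as aaaa comes back as a x 4 rather than aa x 2, and matches never overlap.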

Finding non-consecutive repeats:


import java.util.LinkedHashSet;
import java.util.Set;

public class Main {
    public static void main(String[] args) {
        String input = "abababcdabcdcdcdabcde";
        // A LinkedHashSet keeps discovery order and rejects duplicates in one step.
        Set<String> repeats = new LinkedHashSet<>();
        for (int i = 0; i < input.length(); i++) {
            for (int j = i + 1; j <= input.length(); j++) {
                String substring = input.substring(i, j);
                // indexOf(substring, j) searches from j onward, so the later copy cannot overlap this one.
                if (input.indexOf(substring, j) != -1 && repeats.add(substring)) {
                    System.out.println("Found non-continuous repeat: " + substring);
                }
            }
        }
        System.out.println("Length of input string: " + input.length());
        System.out.println("Number of repeats found: " + repeats.size());
    }
}

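Output:

Found non-continuous repeat: a
Found non-continuous repeat: ab
Found non-continuous repeat: b
Found non-continuous repeat: ba
Found non-continuous repeat: abc
Found non-continuous repeat: abcd
Found non-continuous repeat: bc
Found non-continuous repeat: bcd
Found non-continuous repeat: c
Found non-continuous repeat: cd
Found non-continuous repeat: cda
Found non-continuous repeat: cdab
Found non-continuous repeat: cdabc
Found non-continuous repeat: cdabcd
Found non-continuous repeat: d
Found non-continuous repeat: da
Found non-continuous repeat: dab
Found non-continuous repeat: dabc
Found non-continuous repeat: dabcd
Found non-continuous repeat: dc
Length of input string: 21
Number of repeats found: 20

The loop reports every substring, single characters included, that occurs again later in the input without overlapping its first occurrence.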
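
Since the question asks for a regex, non-consecutive repeats can also be approached with a backreference inside a lookahead. The following is a minimal sketch, assuming that at each position only the longest substring that reappears later (without overlapping) is wanted; the class name LookaheadRepeats and the helper find are hypothetical. Note that . does not match line terminators unless Pattern.DOTALL is set, and the backtracking can get slow on long inputs.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadRepeats {
    // (.+)      greedy: the longest candidate starting at the current position
    // (?=.*\1)  lookahead: the same text must appear again somewhere after the candidate
    private static final Pattern LATER_COPY = Pattern.compile("(.+)(?=.*\\1)");

    /** Returns the distinct matches in the order they are first found. */
    static List<String> find(String input) {
        Set<String> repeats = new LinkedHashSet<>();
        Matcher m = LATER_COPY.matcher(input);
        while (m.find()) {
            repeats.add(m.group(1)); // the set drops substrings that were already reported
        }
        return new ArrayList<>(repeats);
    }

    public static void main(String[] args) {
        String input = "abababcdabcdcdcdabcde";
        long start = System.nanoTime();
        List<String> repeats = find(input);
        long elapsed = System.nanoTime() - start;
        System.out.println("Length of input string: " + input.length());
        System.out.println("Repeats: " + repeats);
        System.out.println("Time taken: " + elapsed + " ns");
    }
}
```

For the sample input this prints Repeats: [ab, abcd, cd]. Each search resumes after the previous match, so shorter repeats nested inside a reported one (such as bc inside abcd) are skipped.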

The real question is what result you want back. Take a = 'abcaabc', for example:

do you want just abc, or a, b, c, abc?
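
Both readings can be sketched side by side. This is an illustration under assumptions: a "repeat" is taken to mean a substring that occurs again later without overlapping, "longest only" drops any repeat contained in a longer one, and the names RepeatModes, allRepeats, and longestOnly are invented here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RepeatModes {
    /** Every substring that occurs again, without overlap, later in the input. */
    static List<String> allRepeats(String input) {
        Set<String> found = new LinkedHashSet<>();
        for (int i = 0; i < input.length(); i++) {
            for (int j = i + 1; j <= input.length(); j++) {
                String candidate = input.substring(i, j);
                if (input.indexOf(candidate, j) == -1) {
                    break; // if this prefix has no later copy, no longer one starting at i can
                }
                found.add(candidate);
            }
        }
        return new ArrayList<>(found);
    }

    /** Only the repeats that are not contained in a longer repeat. */
    static List<String> longestOnly(String input) {
        List<String> all = allRepeats(input);
        List<String> result = new ArrayList<>();
        for (String s : all) {
            boolean covered = false;
            for (String other : all) {
                if (other.length() > s.length() && other.contains(s)) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String a = "abcaabc";
        System.out.println("All repeats:  " + allRepeats(a));
        System.out.println("Longest only: " + longestOnly(a));
    }
}
```

For a = 'abcaabc' the first line lists [a, ab, abc, b, bc, c] and the second lists [abc]. Which one is "right" depends on the requirement, so it is worth settling that before picking an approach.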