leetcode467. Unique Substrings in Wraparound String

Subject requirements

Consider the string s to be the infinite wraparound string of "abcdefghijklmnopqrstuvwxyz", so s will look like this: "...zabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcd....".

Now we have another string p. Your job is to find out how many unique non-empty substrings of p are present in s. In particular, your input is the string p and you need to output the number of different non-empty substrings of p in the string s.

Note: p consists of only lowercase English letters and the size of p might be over 10000.

Example 1:
Input: "a"
Output: 1

Explanation: Only the substring "a" of string "a" is in the string s.
Example 2:
Input: "cac"
Output: 2
Explanation: There are two substrings "a", "c" of string "cac" in the string s.
Example 3:
Input: "zab"
Output: 6
Explanation: There are six substrings "z", "a", "b", "za", "ab", "zab" of string "zab" in the string s.

Suppose there is a string s that loops infinitely from a-z26 letters. Now enter a string p and ask how many substrings appear in S.

Ideas and Codes

It is known that s is a series of orderly strings consisting of alphabetical loops from A-Z. Therefore, any string that appears in s must follow a sequence rule such as a-z-a. Therefore, if two adjacent characters in p are not continuous, then these two characters will not appear in a loop. For example, in the case of cac, because ca and ac are not continuous, the two letters must not constitute a circular substring.

Then there is how to reduce the scenario of repeated computing. Suppose you start traversing the number of substrings that end with p[i]. If p[i] and p[i-1] are not continuous, there will be at most one cyclic substring. If p[i] and p[i-1] are continuous, and it is known that there are count[i-1] substrings in the loop ending with p[i-1], count[i] = count[i-1] + 1

But the problem is not over at this time, there is still a duplicate scene, such as ABC abc, if calculated according to the above method, it will be calculated 12 substrings, but in fact, because ABC repeats, so there are only 6 circular substrings. The method of de-duplication here is to always keep the longest substring length ending with each character. This value is preserved using an array of int[26] maxSubstring. After calculating count[i] with the above method, it needs to be compared with maxSubstring[p[i]-'a']. If the length does not exceed the longest substring, then all the current situations ending with that character have been taken into account. Otherwise, you need to add the total number of substrings to the difference between them.

    public int findSubstringInWraproundString(String p) {
        int[] preMaxSubstring = new int[26];
        int prevLength = 0;
        int count = 0;
        for(int i = 0 ; i<p.length() ; i++) {
            char c = p.charAt(i);
            if(i == 0) {
                count++;
                preMaxSubstring[c-'a']++;
                prevLength++;
            }else {
                char prev = p.charAt(i-1);
                prevLength = (prev-'a'+1) % 26 == (c-'a') ? prevLength+1 : 1;
                if(prevLength > preMaxSubstring[c-'a']) {
                    count += prevLength - preMaxSubstring[c-'a'];
                    preMaxSubstring[c-'a'] = prevLength;
                }
            }
        }
        return count;
    }

Tags: Java

Posted on Tue, 08 Oct 2019 07:54:23 -0700 by om.bitsian