Link to the problem on UVa Online Judge
Presumably you have read the problem. You have to count the number of unique 3-letter subsequences of input string. For example if AABCD is given then AAB, AAC, AAD, ABC, ABD, ACD and BCD are unique subsequences of it. How many 3-letter strings do we have at all? 26 choices for the first letter, 26 choices for the second letter and 26 choices for the third letter, overall we have roughly 18,000 3-letter strings. We can generate them with three nested for loops. Next we have to find out whether a generated 3-letter string is a subsequence of the input string or not?
How shall we do this? The naive solution is to use nested loops to find out whether these 3 letters are a subsequence of input or not, which is very expensive for us, so we can't use this method.
Let's say we want to check whether BHK is a subsequence of the input or not. Do you agree we have to match 'B' with the first occurrence of 'B' in the input? It is the best choice. Let's say we have matched 'B' from BHK with a 'B' at position 1500 from input. Now we have to find a 'H' which is to the right of 1500th position, what if there are several 'H's to the right of this position? Again, isn't it the best solution to match 'H' with the nearest 'H' after index 1500? It is the best choice, because if we match this 'H' with a farther 'H' we might lose the chance to match 'K' with any 'K's at all.
So it seems for each index we need the nearest index of each character that occurs after it, for example nextIndex[position][character]. nextIndex[position][character] is -1 if there are no character's in range [position...length] or index of the nearest character to position.
Presumably you have read the problem. You have to count the number of unique 3-letter subsequences of input string. For example if AABCD is given then AAB, AAC, AAD, ABC, ABD, ACD and BCD are unique subsequences of it. How many 3-letter strings do we have at all? 26 choices for the first letter, 26 choices for the second letter and 26 choices for the third letter, overall we have roughly 18,000 3-letter strings. We can generate them with three nested for loops. Next we have to find out whether a generated 3-letter string is a subsequence of the input string or not?
How shall we do this? The naive solution is to use nested loops to find out whether these 3 letters are a subsequence of input or not, which is very expensive for us, so we can't use this method.
Let's say we want to check whether BHK is a subsequence of the input or not. Do you agree we have to match 'B' with the first occurrence of 'B' in the input? It is the best choice. Let's say we have matched 'B' from BHK with a 'B' at position 1500 from input. Now we have to find a 'H' which is to the right of 1500th position, what if there are several 'H's to the right of this position? Again, isn't it the best solution to match 'H' with the nearest 'H' after index 1500? It is the best choice, because if we match this 'H' with a farther 'H' we might lose the chance to match 'K' with any 'K's at all.
So it seems for each index we need the nearest index of each character that occurs after it, for example nextIndex[position][character]. nextIndex[position][character] is -1 if there are no character's in range [position...length] or index of the nearest character to position.
We calculate nextIndex like this picture. First position has no idea about the nearest character after it, so it asks position+1 about them, the only character position knows about is s[position], so it updates it to position.
And this is how we can find out whether a string of size M is a subsequence of another string of size N with O(M) for each query.