MO640 - Exercises - Informatics Concepts, Setubal and Meidanis 1997 - chapter 2

Exercises marked with (*) require further reading/search beyond the suggested texts.

1. Design and analyze an algorithm that checks whether a string p is a substring of another string t. Can you design an algorithm that runs in time O(|p| + |t|)?

Answer:

The Knuth-Morris-Pratt (KMP) algorithm is a procedure that can find whether p is a substring of t in O(|t| + |p|) time. Reference: Knuth–Morris–Pratt algorithm (Wikipedia). It is interesting to notice that a naïve algorithm (such as the one below) would have a significantly larger worst-case running time.

      NaiveSubstring(string p, string t): boolean
      Input: p: string, t: string
      Output: boolean
      
      boolean NaiveSubstring(string t, string p) {
        if (t.length < p.length) {
          return FALSE;
        }
        for (int i = 0; i < p.length; i++) {
          if t[i] != p[i]) {
	    return NaiveSubstring(++t, p);
	  }
	}
	return TRUE;
      }
			    

Complexity Analysis: The FOR loop can iterate at most |p| times. Furthermore, every time the comparison between characters fails, t is shortened by one character. This shortening can happen at most |t|-|p| times (since more than that would return FALSE). So, the approximate running time of the algorithm is |p|*(|t|-|p|), which leads to an O(|t|*|p|) upper bound.


MO640 Home

© 2015 Joao Meidanis