Given two sequences of letters, such as A = HELLO and B = ALOHA,
find the longest contiguous sequence appearing in both.
One solution: (assume strings have lengths m and n)
For each of the m starting points of A, check for the longest common string starting at each of the n starting points of B.
The checks could average Θ(m) time → a total of Θ(m^2*n) time.
Dynamic programming solution:
Let Li, j = maximum length of common strings that end at A[i] & B[j]. Then,
Li, j = 0 if i = 0 or j = 0
= Li-1, j-1 + 1 if A[i] = B[j]
= 0 if A[i] != B[j]
int LONGEST_COMMON_SUBSTRING(A,m,B,n)
{
if (m==0 || n ==0) return 0;
for i = 0 to m
L[i,0] = 0
for j = 0 to n
L[0,j] = 0
len = 0 ;
answer = L[0,0];
for i = 1 to m do
for j = 1 to n do
if A[i] != B[j]
L[i,j ] = 0;
else
Li,j = 1 + L[i-1,j-1];
if L[i,j] > len then
len := Li,j
}
Example:
A L O H A
H 0 0 0 1 0
E 0 0 0 0 0
L 0 1 0 0 0
L 0 1 0 0 0
O 0 0 2 0 0
Monday, July 5, 2010
Longest common substring problem
Subscribe to:
Post Comments (Atom)
0 comments:
Post a Comment