1

Python beginner here and I am trying to make a program that outputs the longest common substring of two strings from the input.

Example:

word1 = input('Give 1. word: xlaqseabcitt')
word2 = input('Give 2. word: peoritabcpeor')

Wanted output:

abc

I have code like this so far:

word1 = input("Give 1. word: ") 
word2 = input("Give 2. word: ")

longestSegment = "" 
tempSegment = ""

for i in range(len(word1)): 
    if word1[i] == word2[i]:
        tempSegment += word1[i] 
    else: 
        tempSegment = ""

if len(tempSegment) > len(longestSegment):
    longestSegment = tempSegment

print(longestSegment)

I end up with IndexError when word2 is shorter than word1 and it does not give me the common substring.

Any advice?

Thanks for the help!

EDIT: I found this solution:

string1 = input('Give 1. word: ')
string2 = input('Give 2. word: ')
answer = ""
len1, len2 = len(string1), len(string2)
for i in range(len1):
    for j in range(len2):
        lcs_temp=0
        match=''
        while ((i+lcs_temp < len1) and (j+lcs_temp<len2) and   string1[i+lcs_temp] == string2[j+lcs_temp]):
            match += string2[j+lcs_temp]
            lcs_temp+=1
        if (len(match) > len(answer)):
            answer = match

print(answer)

Is there any way to make this code more clear and suitable for beginners?

Emiiri93
  • 27
  • 5
  • 2
    handle the case when the strings are of different length – sahasrara62 Feb 11 '21 at 21:01
  • 1
    Does this answer your question? [Find common substring between two strings](https://stackoverflow.com/questions/18715688/find-common-substring-between-two-strings) – sahasrara62 Feb 11 '21 at 21:01
  • Thank for the link! it does answer my question. I was too eager to make a post about it then search for it first. – Emiiri93 Feb 11 '21 at 21:20
  • 1
    @sahasrara62 I find that the link you provided is a lot confusing. The accepted answer there is point out as wrong by many people. Other answer suggesting use of existing library like difflib, supposedly isn't good for use in real practice. – Harsh Feb 11 '21 at 21:20
  • Yes, it was very confusing but I managed to get code from there that works fine I think and I edited it a little bit. If you guys want to check it out and give any advice to improve it? You can see it in the post EDIT. – Emiiri93 Feb 11 '21 at 21:27
  • Yes, your right but the answer what I was looking for the most consecutive, matching characters in two strings. If this makes any sense. – Emiiri93 Feb 11 '21 at 21:51
  • 1
    @Harsh there are multiple solution, and OP problem is same as them. if you think the accepted solution in link is wrong, you can point out there why it is wrong. – sahasrara62 Feb 11 '21 at 22:05
  • 1
    @Emiiri93 you can post your working code as an answer – sahasrara62 Feb 11 '21 at 22:05

1 Answers1

2

You can build a dictionary from the first string containing the positions of each character, keyed on the characters. Then go through the second string and compare the substring of each character with the rest of the second string at that position:

# extract common prefix
def common(A,B) :
    firstDiff = (i for i,(a,b) in enumerate(zip(A,B)) if a!=b) # 1st difference
    commonLen = next(firstDiff,min(len(A),len(B)))             # common length
    return A[:commonLen]

word1 = "xlaqseabcitt"
word2 = "peoritabcpeor"

# position(s) of each character in word1
sub1 = dict()  
for i,c in enumerate(word1): sub1.setdefault(c,[]).append(i)

# maximum (by length) of common prefixes from matching first characters
maxSub = max((common(word2[i:],word1[j:]) 
                       for i,c in enumerate(word2) 
                                 for j in sub1.get(c,[])),key=len)


print(maxSub) # abc
Alain T.
  • 34,859
  • 4
  • 30
  • 47