My tackle. I tested it with a list of 729 lists and it still works fast.
I'm not sure on how faster it is, if at all honestly. But it doesn't use sets.
Here (It has a test in it, just use the function):
a = [1,2,3,4,5,6,7,8,9]
b = [9,10,11,12,13,14,15]
c = [11,3,99]
d = [9,10,11,12,50,1]
ls = [a,b,c,d]
for i in range(100,3000,2):
ls.append([i,i+1,i+2,i+3])
def f(ls):
wordDic = {}
countDic = {}
ind = 0
for t in range(len(ls)):
countDic[t]={}
for l in ls:
for word in l:
try:
for sharedIndex in wordDic[word]: #For every list that we already know has this word
try:
countDic[sharedIndex][ind] += 1
try:
countDic[ind][sharedIndex] += 1
except KeyError:
countDic[ind][sharedIndex] = 1
except KeyError:
countDic[sharedIndex][ind] = 1
wordDic[word].append(ind)
except KeyError:
wordDic[word] = [ind]
ind += 1
mx = 0
ans = -1
ind = 0
for sLs in countDic.values():
for rLr in sLs:
if mx < sLs[rLr]:
ans = (ls[ind],ls[rLr])
mx = max(mx,sLs[rLr])
ind += 1
return ans
print(f(ls))
What it does:
It's based on these two dictionaries: wordDic and countDic.
wordDic keys are each "word" that is used, with its value being a list of the indexes where said word was found.
countDic keys are the indexes of each list, and its values are dictionaries that hold how many
countDic = { listInd: {otherLists:sharedAmount , ...} , ...}
First it creates the dictionaries. Then it goes through each list once and saves the words that it has. It adds its own index to the list of indexes that each word has, after adding one to the amount of "shared words" count in the second dictionary.
When it finishes you'd have something like this:
{0: {1: 1, 2: 1, 3: 2}, 1: {2: 1, 3: 4}, 2: {3: 1}, 3: {1: 3, 0: 1}}
([9, 10, 11, 12, 13, 14, 15], [9, 10, 11, 12, 50, 1])
which reads as:
{(List zero: elements shared with list 1 = 1, elements shared with 2=1, elements shared with list 3=2.),(List One: elements shared with list 2=1, elements shared with list 3=4),....}
In this case List 1 shares the most elements with list 3 as any other. The rest of the function simply goes through the dictionary and finds this maximum.
I probably messed up my explanation. I think it would be best that you check if the function works better than your own first, and then try to understand it.
I also just noticed that you probably only need to add 1 to previously found lists, and don't need to add that other list to the one you're currently testing. I'll see if that works
EDIT1: It seems so. The lines:
try:
countDic[ind][sharedIndex] += 1
except KeyError:
countDic[ind][sharedIndex] = 1
Can be commented out.
I hope this helps.