I'm struggling to find a clear mathematical explanation of ensemble learning.
I can simulate it very easily, e.g.:
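import numpy as np
import scipy.stats

r = np.random.random(1000)   # target
d = np.zeros(1000)           # running weighted sum (the ensemble)
cors = []
for i in range(100):
    v = np.random.random(1000)                  # one random "predictor"
    c = scipy.stats.pearsonr(v, r).statistic    # its correlation with the target
    cors.append(c)
    d = d + v * c  # <- ensembling: weight each predictor by its correlation
print(np.max(cors))                             # best single predictor
print(scipy.stats.pearsonr(d, r).statistic)     # ensemble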
0.07027996608008028
0.2646662315626593
It seems like a very simple concept, and yet I can't find any clear mathematical description as to why it works.
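One heuristic that can at least be checked numerically: if the predictors were exactly uncorrelated with each other, the correlation-weighted sum would have correlation sqrt(c_1^2 + ... + c_k^2) with the target, which grows as more predictors are added. The sketch below is a seeded variant of the simulation above that compares the two numbers. The function name `run_experiment`, its parameters, and the use of `np.corrcoef` in place of `scipy.stats.pearsonr` are choices made here for illustration, not part of the original experiment, and the "uncorrelated predictors" condition only holds approximately in a finite sample.

```python
import numpy as np


def run_experiment(n_samples=1000, n_models=100, seed=0):
    """Seeded variant of the simulation above (hypothetical helper).

    Returns the individual correlations, the ensemble correlation, and the
    heuristic sqrt(sum(c_i^2)), which assumes the predictors are mutually
    uncorrelated.
    """
    rng = np.random.default_rng(seed)
    r = rng.random(n_samples)          # target
    d = np.zeros(n_samples)            # correlation-weighted sum
    cors = np.empty(n_models)
    for i in range(n_models):
        v = rng.random(n_samples)      # one random predictor
        c = np.corrcoef(v, r)[0, 1]    # Pearson correlation with the target
        cors[i] = c
        d += c * v                     # ensembling step
    ensemble_cor = np.corrcoef(d, r)[0, 1]
    heuristic = np.sqrt(np.sum(cors ** 2))
    return cors, ensemble_cor, heuristic


if __name__ == "__main__":
    cors, ensemble_cor, heuristic = run_experiment()
    print("best single |correlation|:", np.max(np.abs(cors)))
    print("ensemble correlation:     ", ensemble_cor)
    print("sqrt(sum of c_i^2):       ", heuristic)
```

Because every `v` here is generated independently of `r`, the individual correlations are sampling noise, so both numbers describe in-sample fit on these 1000 points rather than predictive skill on new data.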