numpy - Fast incremental update of the mean and covariance in Python
I have a Python script that needs to update a mean and a covariance matrix. Currently, each time I get a new data point $x$ (a vector), I recompute the mean and covariance as follows:
    data.append(x)  # `data` is a list of lists of floats (i.e., x is a list of floats)
    self.mean = np.mean(data, axis=0)  # self.mean is a list representing the center of the data
    self.cov = np.cov(data, rowvar=0)
The problem is that this is not fast enough for me. Is there any way to be more efficient by incrementally updating mean and cov without re-computing them from all of data?
Computing the mean incrementally should be easy, and I can figure that out. My main problem is how to update the covariance matrix self.cov.
I'd do it by keeping track of the sum and the sum of squares.
In __init__:

    self.sumx = 0
    self.sumx2 = 0
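And in append:

    data.append(x)
    x = np.asarray(x, dtype=float)  # x arrives as a list of floats
    self.sumx += x
    self.sumx2 += x * x[:, np.newaxis]  # outer product of x with itself
    self.mean = self.sumx / len(data)
    # divide-by-n covariance: E[x x^T] - mean mean^T
    # (np.cov divides by n - 1 by default, so rescale by n / (n - 1) to match it)
    self.cov = self.sumx2 / len(data) - self.mean * self.mean[:, np.newaxis]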
Note that the [:, np.newaxis] indexing uses broadcasting to produce the product of every pair of elements.
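For anyone who wants to try the running-sums idea in isolation, here is a minimal self-contained sketch. The class name `RunningStats`, its `dim` argument, and the `ddof` parameter are my own illustrative choices, not part of the original script. It drops the `data` list entirely and checks the result against `np.mean` and `np.cov` on random data.

```python
import numpy as np


class RunningStats:
    """Hypothetical helper: running mean and covariance from accumulated sums."""

    def __init__(self, dim):
        self.n = 0
        self.sumx = np.zeros(dim)          # running sum of x
        self.sumx2 = np.zeros((dim, dim))  # running sum of outer products x x^T

    def append(self, x):
        x = np.asarray(x, dtype=float)
        self.n += 1
        self.sumx += x
        self.sumx2 += np.outer(x, x)  # same values as x * x[:, np.newaxis]

    @property
    def mean(self):
        return self.sumx / self.n

    def cov(self, ddof=1):
        """Covariance; ddof=1 matches np.cov's default (needs n >= 2)."""
        mean = self.mean
        # sum of (x - mean)(x - mean)^T, expanded in terms of the running sums
        scatter = self.sumx2 - self.n * np.outer(mean, mean)
        return scatter / (self.n - ddof)


# Compare against the batch computation on random test data.
rng = np.random.default_rng(0)
points = rng.normal(size=(200, 3))

stats = RunningStats(dim=3)
for p in points:
    stats.append(p)

matches_mean = np.allclose(stats.mean, points.mean(axis=0))
matches_cov = np.allclose(stats.cov(), np.cov(points, rowvar=False))
print(matches_mean, matches_cov)
```

Each update costs O(d^2) for a d-dimensional point, independent of how many points have been seen. One caveat: subtracting two large, nearly equal quantities can lose precision when the mean is large compared with the spread of the data, so if that applies to you, a Welford-style update of the centered sums may be the safer choice.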