numpy - Fast incremental update of the mean and covariance in Python
I have a Python script that needs to update a mean and a covariance matrix. Currently, each time I get a new data point $x$ (a vector), I recompute the mean and covariance as follows:

    data.append(x)  # `data` is a list of lists of floats (i.e., x is a list of floats)
    self.mean = np.mean(data, axis=0)  # self.mean is a list representing the center of the data
    self.cov = np.cov(data, rowvar=0)

The problem is that this is not fast enough for me. Is there a more efficient way to update the mean and cov incrementally, without recomputing them from all of the data?

Computing the mean incrementally should be easy, and I can figure that out. My main problem is how to update the covariance matrix self.cov.
I'd do it by keeping track of the sum and the sum of squares.

In __init__:

    self.sumx = 0
    self.sumx2 = 0

And in append:
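    x = np.asarray(x, dtype=float)  # broadcasting below needs a NumPy array, not a list
    data.append(x)
    self.sumx += x
    self.sumx2 += x * x[:, np.newaxis]  # outer product of x with itself
    self.mean = self.sumx / len(data)
    # mean of the outer products minus the outer product of the means
    self.cov = self.sumx2 / len(data) - self.mean * self.mean[:, np.newaxis]

Note that [:, np.newaxis] uses broadcasting to find the product of every pair of elements. This divides by n, whereas np.cov divides by n - 1 by default, so multiply the result by n / (n - 1) if you need the two to match.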
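As a complement to the running-sums idea above, here is a minimal sketch of a Welford-style update. This is a different technique from the sum and sum-of-squares approach: it accumulates deviations from the running mean, which tends to be more numerically stable when the data sits far from zero. The class name `RunningStats`, its `dim` and `ddof` parameters, and the `_m2` attribute are hypothetical names chosen for illustration, not part of the original code.

```python
import numpy as np


class RunningStats:
    """Hypothetical helper: running mean and covariance of d-dimensional points."""

    def __init__(self, dim):
        self.n = 0                       # number of points seen so far
        self.mean = np.zeros(dim)        # running mean
        self._m2 = np.zeros((dim, dim))  # running sum of outer products of deviations

    def append(self, x):
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta = x - self.mean            # deviation from the old mean
        self.mean += delta / self.n      # move the mean toward the new point
        # Outer product of (deviation from old mean) and (deviation from new mean)
        # keeps _m2 equal to the sum of (x_i - mean)(x_i - mean)^T over all points.
        self._m2 += np.outer(delta, x - self.mean)

    def cov(self, ddof=1):
        """Covariance estimate; ddof=1 matches the default normalization of np.cov."""
        if self.n - ddof <= 0:
            raise ValueError("not enough data points")
        return self._m2 / (self.n - ddof)
```

Each `append` costs O(d^2) for a d-dimensional point, independent of how many points have been seen, and the list of past points no longer needs to be kept if only the mean and covariance are required.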