numpy - Fast incremental update of the mean and covariance in Python


I have a Python script that needs to update a mean and a covariance matrix. Currently, each time I get a new data point $x$ (a vector), I recompute the mean and covariance as follows:

    data.append(x)  # `data` is a list of lists of floats (i.e., x is a list of floats)
    self.mean = np.mean(data, axis=0)  # self.mean is a list representing the center of the data
    self.cov = np.cov(data, rowvar=0)

The problem is that this is not fast enough for me. Is there a more efficient way to incrementally update the mean and cov without recomputing them from all of the data?

Computing the mean incrementally should be easy, and I can figure that out. My main problem is how to update the covariance matrix self.cov.
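For reference, the incremental mean mentioned above could be sketched roughly as follows. This is only a minimal illustration, and the function name `update_mean` is hypothetical, not part of the original script.

```python
import numpy as np

def update_mean(mean, n, x):
    """Return the mean of n + 1 points, given the mean of the first n and a new point x."""
    x = np.asarray(x, dtype=float)
    return mean + (x - mean) / (n + 1)

# Feed the points one at a time instead of recomputing from scratch.
points = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]
mean = np.zeros(2)
for n, x in enumerate(points):
    mean = update_mean(mean, n, x)
```

After the loop, `mean` should agree with `np.mean(points, axis=0)` up to floating-point rounding.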

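I'd suggest keeping track of the sum and the sum of squares.

In __init__:

    self.sumx = 0
    self.sumx2 = 0

And in append:

    data.append(x)
    x = np.asarray(x, dtype=float)  # x arrives as a list of floats
    n = len(data)
    self.sumx += x
    self.sumx2 += x * x[:, np.newaxis]  # product of every pair of elements of x
    self.mean = self.sumx / n
    self.cov = self.sumx2 / n - self.mean * self.mean[:, np.newaxis]

Note that the [:, np.newaxis] broadcasting is used to form the product of every pair of elements. This gives the population covariance (dividing by n); np.cov divides by n - 1 by default, so multiply the result by n / (n - 1) if you need to match it exactly.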
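The sums above are simple, but subtracting two large, nearly equal quantities can lose precision when the mean is large relative to the spread of the data. A commonly used alternative is a Welford-style update, which accumulates deviations from the running mean instead. The sketch below uses that different technique, so it is not a transcription of the code above; the class name `RunningCov` and its `dim` parameter are hypothetical.

```python
import numpy as np

class RunningCov:
    """Running mean and covariance, updated one point at a time (hypothetical helper)."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros((dim, dim))  # accumulated outer products of deviations

    def append(self, x):
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta = x - self.mean                      # deviation from the old mean
        self.mean += delta / self.n                # mean now includes x
        self.m2 += np.outer(delta, x - self.mean)  # old deviation times new deviation

    def cov(self, ddof=1):
        # ddof=1 divides by n - 1, which matches np.cov's default
        return self.m2 / (self.n - ddof)


# Compare against recomputing from the full data set.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3))
rc = RunningCov(3)
for x in data:
    rc.append(x)
```

Each call to `append` costs on the order of d^2 operations for d-dimensional points, regardless of how many points have already been seen, and `rc.cov()` should agree with `np.cov(data, rowvar=False)` up to rounding.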

