OpenMP optimization with C


I'm supposed to optimize the code below to make it run at least 16x faster using OpenMP and memory blocking. So far, all I can think of is collapsing the loops with the simple statement below. That makes it run about 3 times faster. Any ideas on getting it closer to 16x?

int i, j;
#pragma omp parallel for collapse(2)    // my inserted code
for (i = 0; i < msize; i++)
    for (j = 0; j < msize; j++)
        d[i][j] = c[j][i];

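When you declare the inner loop index in an outer scope and the inner loop is not covered by collapse, you must use the private clause to give each thread its own copy; otherwise j is shared between threads in C. Also note that collapse may interfere with SIMD vectorization.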
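As a rough sketch of both points, the first function below shows the private clause on an outer-scope j, and the second shows one possible way to add memory blocking (tiling) to the transpose. The names MSIZE and BLOCK, the int element type, and the sizes chosen are assumptions for illustration and are not from the question. No particular speedup is implied, since the gain depends on the cache sizes and the matrix size.

```c
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration. */
#define MSIZE 500   /* matrix edge (deliberately not a multiple of BLOCK) */
#define BLOCK 32    /* tile edge; a tuning parameter, not a known optimum */

static int c[MSIZE][MSIZE];
static int d[MSIZE][MSIZE];

/* Outer loop parallelized only; j lives in the outer scope,
 * so it must be listed as private to avoid a data race. */
void transpose_private(void)
{
    int i, j;
    #pragma omp parallel for private(j)
    for (i = 0; i < MSIZE; i++)
        for (j = 0; j < MSIZE; j++)
            d[i][j] = c[j][i];
}

/* Tiled version: each thread transposes whole BLOCK x BLOCK tiles,
 * so the strided reads of c stay within a small working set. */
void transpose_blocked(void)
{
    /* collapse(2) applies to the two tile loops only; the inner
     * element loops declare their own indices, so they are private. */
    #pragma omp parallel for collapse(2) schedule(static)
    for (int ii = 0; ii < MSIZE; ii += BLOCK) {
        for (int jj = 0; jj < MSIZE; jj += BLOCK) {
            /* Clamp tile bounds for the ragged edge of the matrix. */
            int i_end = (ii + BLOCK < MSIZE) ? ii + BLOCK : MSIZE;
            int j_end = (jj + BLOCK < MSIZE) ? jj + BLOCK : MSIZE;
            for (int i = ii; i < i_end; i++)
                for (int j = jj; j < j_end; j++)
                    d[i][j] = c[j][i];
        }
    }
}
```

Compile with OpenMP enabled (for example -fopenmp with GCC) and call either function after filling c. Collapsing only the tile loops leaves the innermost loop as a plain loop, which may give the compiler a better chance to vectorize it.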

