OpenMP optimization with C
I'm supposed to optimize the code below to make it run at least 16x faster using OpenMP and memory blocking. So far, all I can think of is collapsing the loops with the simple statement below, which makes it run 3 times faster. Any ideas on getting it closer to 16x?
int i, j;
#pragma omp parallel for collapse(2) // my inserted code
for (i = 0; i < msize; i++)
    for (j = 0; j < msize; j++)
        d[i][j] = c[j][i];
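When you declare the inner loop index in the outer scope, you must use a private clause to give each thread its own copy, unless that loop is covered by collapse, in which case the indices of all collapsed loops are made private automatically. Also note that collapse may interfere with SIMD vectorization.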
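Since the question also mentions memory blocking, here is a minimal sketch of what a tiled version of the transpose could look like. It is only an illustration under some assumptions: the function name `transpose_blocked`, the flat row-major layout, and the `block` parameter are hypothetical and not from the original post, and the best tile size has to be found by measurement on the target machine. Declaring the loop indices inside the `for` statements makes them private to each thread without a `private` clause.

```c
#include <assert.h>

/* Hypothetical helper (not from the original post): writes the transpose
   of the n x n row-major matrix src into dst, one tile at a time.
   Assumes src and dst do not overlap. */
void transpose_blocked(int n, const double *src, double *dst, int block)
{
    assert(n >= 0 && block > 0);

    /* Each (ii, jj) pair identifies one tile. Tiles write disjoint parts
       of dst, so they can be distributed across threads. Only the two
       tile loops are collapsed; the inner loops stay plain so the
       compiler is free to vectorize them. */
    #pragma omp parallel for collapse(2) schedule(static)
    for (int ii = 0; ii < n; ii += block) {
        for (int jj = 0; jj < n; jj += block) {
            /* Clamp the tile so n need not be a multiple of block. */
            int i_end = (ii + block < n) ? ii + block : n;
            int j_end = (jj + block < n) ? jj + block : n;

            for (int i = ii; i < i_end; i++)
                for (int j = jj; j < j_end; j++)
                    dst[i * n + j] = src[j * n + i];
        }
    }
}
```

With the arrays from the question stored contiguously, a call might look like `transpose_blocked(msize, &c[0][0], &d[0][0], 32);`, where 32 is just a starting guess for the tile size. The idea is that each tile of `c` and `d` is small enough to stay in cache while it is being read column-wise and written row-wise. Whether this gets anywhere near 16x depends on the core count and memory bandwidth. Compile with OpenMP enabled (for example `-fopenmp` with GCC); without it the pragma is ignored and the code runs serially with the same result.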