java - Performance impact of RDD to JavaRDD conversion


I have some code and want to work against JavaRDD instead of RDD, so I'm doing the conversion shown here. I'd like to know the performance impact of this conversion, especially when I'm dealing with gigabytes of data.

RDD<String> textFile = sc.textFile(filePath, 2);
JavaRDD<String> javaRDD = textFile.toJavaRDD();

Is this a wide or a narrow transformation? What is the difference between JavaRDD and RDD?

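There's no significant performance penalty: JavaRDD is a simple wrapper around RDD that makes calls from Java code more convenient. The conversion is therefore not a transformation at all, wide or narrow, and it moves no data. JavaRDD holds the original RDD as a member and calls that member's method on every method invocation, for example (from JavaRDD.scala):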

def cache(): JavaRDD[T] = wrapRDD(rdd.cache())

wrapRDD boils down to new JavaRDD[T](rdd), so the only performance penalty is the creation of a thin Java object on every method invocation. That cost is entirely negligible, because it is paid once for the entire RDD object, not once per element in the RDD.
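
To make the wrapper argument concrete, here is a minimal sketch in plain Java that does not use Spark at all. CoreData and JavaWrapper are hypothetical stand-ins for RDD and JavaRDD, invented for this illustration. They only mimic the delegation pattern described above and are not Spark's actual classes.

```java
import java.util.ArrayList;
import java.util.List;

public class WrapperSketch {

    // Hypothetical stand-in for the Scala RDD: this is the object that owns the data.
    static final class CoreData<T> {
        private final List<T> elements;
        private boolean cached = false;

        CoreData(List<T> elements) {
            this.elements = elements;
        }

        CoreData<T> cache() {
            cached = true;
            return this;
        }

        boolean isCached() {
            return cached;
        }

        long count() {
            return elements.size();
        }
    }

    // Hypothetical stand-in for JavaRDD: it holds a reference to the core object
    // and forwards every call to it. No element is copied or visited on wrapping.
    static final class JavaWrapper<T> {
        // Counts wrapper allocations so the cost can be observed.
        static int wrappersCreated = 0;

        private final CoreData<T> core;

        JavaWrapper(CoreData<T> core) {
            this.core = core;
            wrappersCreated++;
        }

        // Mirrors the shape of cache() quoted above: delegate, then re-wrap.
        JavaWrapper<T> cache() {
            return new JavaWrapper<>(core.cache());
        }

        long count() {
            return core.count();
        }

        CoreData<T> unwrap() {
            return core;
        }
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            lines.add("line " + i);
        }

        CoreData<String> core = new CoreData<>(lines);
        JavaWrapper<String> wrapped = new JavaWrapper<>(core); // one allocation
        JavaWrapper<String> cached = wrapped.cache();          // one more allocation

        System.out.println(cached.unwrap() == core);       // true: same underlying object
        System.out.println(JavaWrapper.wrappersCreated);   // 2, regardless of element count
        System.out.println(cached.count());                // 100000
    }
}
```

The number of wrapper objects depends only on how many methods are called, not on how many elements the underlying collection holds, which is why the cost stays constant whether the data is a few kilobytes or many gigabytes.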