Hi all, I just started experimenting with "sparklyr" and am already loving it.
I'm trying to build an extension by constructing an R wrapper to Spark's Gaussian Mixtures. My attempt is below, and so is the error message. Not sure if this is possible to do, and if so, what is wrong with my code. Any hints would be much appreciated.

Best,
Axel.

-----

library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")
x <- copy_to(sc, iris)
x <- x %>% select(Petal_Width, Petal_Length)

# set params
k <- 3
iter.max <- 100
features <- dplyr::tbl_vars(x)
compute.cost <- TRUE
tolerance <- 1e-4
ml.options <- ml_options()

df <- spark_dataframe(x)
sc <- spark_connection(df)
df <- ml_prepare_features(
  x = df,
  features = features,
  envir = environment()
  # ml.options = ml.options
)

envir <- new.env(parent = emptyenv())
envir$id <- ml.options$id.column
df <- df %>%
  sdf_with_unique_id(envir$id) %>%
  spark_dataframe()

tdf <- ml_prepare_dataframe(df, features, ml.options = ml.options, envir = envir)

envir$model <- "org.apache.spark.ml.clustering.GaussianMixture"
gmm <- invoke_new(sc, envir$model)

>Error: failed to invoke spark command
>16/10/09 16:35:35 ERROR <init> on org.apache.spark.ml.clustering.GaussianMixture failed

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.