Dear all,

I have been trying to run machine learning and feature selection tasks in R using various packages (e.g. mlr and FSelector). However, when I pass larger data frames to these functions, R crashes with a segmentation fault (memory not mapped).

This first happened when using the mlr benchmark function with data frames on the order of 200 rows x 10,000 columns (all integer values).

I prepared a minimal working example in which I get a segmentation fault when calculating the information gain with the FSelector package:

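library(FSelector)
# Random data frame: 200 rows x 25,000 columns of 0/1 integers
large.df <- data.frame(replicate(25000, sample(0:1, 200, replace = TRUE)))
# Information gain of all other columns with respect to X24978;
# the segmentation fault occurs during this calculation
weights <- information.gain(X24978 ~ ., large.df)
print(weights)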
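
For scale, the raw data in the example above is small compared to the available memory. The following is only a rough sketch in base R: it builds a reduced version of the same data frame (250 columns, a number chosen purely for illustration) and estimates the size of the full 200 x 25,000 case, assuming 4 bytes per integer value and ignoring any per-column overhead. That comes to about 19 MiB.

```r
# Same construction as in the example above, but with only 250 columns
# (an arbitrary, illustrative size) so that it builds quickly.
small.df <- data.frame(replicate(250, sample(0:1, 200, replace = TRUE)))

# Confirm that every column is stored as an integer vector.
stopifnot(all(vapply(small.df, is.integer, logical(1))))

# Back-of-the-envelope size of the full 200 x 25,000 case,
# assuming 4 bytes per integer and ignoring data frame overhead.
n_rows <- 200
n_cols <- 25000
approx_mib <- n_rows * n_cols * 4 / 1024^2
cat(sprintf("approx. %.1f MiB of raw integer data\n", approx_mib))
```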


I am using R version 3.3.0 (64-bit) on Ubuntu 14.04.4 LTS with FSelector v0.20 and rJava v0.9.8, on a machine with a 32-core Intel i7 and 250 GB RAM. Java is OpenJDK 1.7, 64-bit.
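
The version details above can be collected from within the R session with something like the following sketch. It uses base R only; `pkg_version` is a hypothetical helper written just for this illustration, and it reads the installed package metadata without loading the packages.

```r
# Report the R version and platform of the current session.
cat(R.version.string, "\n")
cat("Platform:", R.version$platform, "\n")

# Hypothetical helper (not part of any package): return the installed
# version of a package as a string, or NA if it is not installed.
pkg_version <- function(pkg) {
  if (nzchar(system.file(package = pkg))) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# Versions of the packages mentioned in this post.
versions <- vapply(c("FSelector", "rJava", "mlr"), pkg_version, character(1))
print(versions)
```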

I would highly appreciate any hint on how to solve this problem.

Best
ssalentin

--
Sebastian Salentin, PhD student
Bioinformatics Group

Technische Universität Dresden
Biotechnology Center (BIOTEC)
Tatzberg 47/49
01307 Dresden, Germany

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
