Hi folks,
Any ideas on this? It sounds like a fairly common situation: reading a
large data file into R.
Thanks.
Andy
--
View this message in context:
http://r.789695.n4.nabble.com/Reading-large-sparse-arff-files-into-R-tp4249409p4252393.html
Sent from the R help mailing list archive a
Hi,
I am trying to read a large, highly sparse ARFF file produced by WEKA into
R. However, the 'RWeka' package just chokes on this file. The data set has
about 40k observations and about 20k dimensions. Even after 1 hr, the
read.arff method of RWeka is still trying to read in the file, w
Hi all,
I am new to R, and am trying to do feature selection on my text data that
has about 30k observations and about 15k features. I am interested in using
Chi-Squared and Mutual Information-based feature selection. I tried using
the FSelector package but found it too slow for my purposes.
Are t
Daniel Malter wrote:
>
> Take a look here: http://www.jstatsoft.org/v25/i05/paper
>
> HTH,
> Da.
>
>
> andy1234 wrote:
>>
>> Dear everyone,
>>
>> I am new to R, and I am looking at doing text classification on a huge
>> collection o
Dear everyone,
I am new to R, and I am looking at doing text classification on a huge
collection of documents (>500,000) which are distributed among 300 classes
(so basically, this is my training data). Would someone please be kind
enough to let me know about the R packages to use and their scala
Hello all,
I am looking at doing text classification on very high dimensional data
(about 300,000 or more features) and up to 2000 documents. I am quite new to
R though, and was just wondering if R and its libraries would scale to such
high dimensions.
Any thoughts will be much appreciated.
Th
Hello everyone,
Any thoughts on this one, please?
The only thing I found was the FSelector package
(http://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Dimensionality_Reduction/Feature_Selection#Aviable_Feature_Ranking_Techniques_in_FSelector_Package).
Unfortunately though it seems to be far
I need to use entropy-based feature selection to reduce the term space while
doing text classification. Are there any R packages available that would
help me do this?
I can also make do with a chi-squared-based algorithm, if there are packages
for that.
Thanks in advance.
Andy