Hello,

I'm trying to use the "lsa" (latent semantic analysis) package and am running
into a problem that seems related to the number of documents being
processed.  Here's the code I'm running (after loading the lsa and Rstem
packages), followed by the error message:

> SnippetsPath <- "c:\\OED\\AuditExplain\\"  # path where to find text snippets
> data(stopwords_en)
> tdm <- textmatrix(SnippetsPath, stopwords=stopwords_en)

I get this error message with ~280 documents:
"Error in sort(unique.default(x), na.last = TRUE) : 'x' must be atomic"

The error does not occur if I reduce the number of documents (to 220, for
instance), so I'm not sure whether this is a memory/capacity issue or
something else.
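
Since the error appears only once certain documents are included, I wonder
whether one particular file is malformed rather than this being a capacity
limit.  Here is a quick check I could run over the snippet files (just a
sketch, assuming the snippets are plain-text files, and only guessing that an
empty or unreadable file might be what trips up textmatrix):

    files <- dir(SnippetsPath, full.names = TRUE)
    for (f in files) {
        txt <- tryCatch(readLines(f, warn = FALSE),
                        error = function(e) NULL)
        # flag files that are unreadable, empty, or contain only blank lines
        if (is.null(txt) || length(txt) == 0 || all(!nzchar(txt)))
            cat("suspicious file:", f, "\n")
    }
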
A traceback returns the following, but interpreting it is outside my
league ;-)  Any idea what the problem could be?  I'd greatly appreciate
your advice.

> traceback()
10: stop("'x' must be atomic")
9: sort(unique.default(x), na.last = TRUE)
8: factor(a, exclude = exclude)
7: table(txt)
6: inherits(x, "factor")
5: is.factor(x)
4: sort(table(txt), decreasing = TRUE)
3: FUN(X[[238]], ...)
2: lapply(dir(mydir, full.names = TRUE), textvector, stemming, language,
       minWordLength, minDocFreq, stopwords, vocabulary)
1: textmatrix(SnippetsPath, stopwords = stopwords_en)
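
If I'm reading the traceback correctly, the lapply() at step 2 walks over
dir(mydir, full.names = TRUE), and step 3 fails on FUN(X[[238]], ...), i.e.
the 238th file in that listing.  So presumably I can inspect that file
directly (a sketch, assuming dir() returns the files in the same order on
each run):

    files <- dir(SnippetsPath, full.names = TRUE)
    files[238]                                 # file being processed when the error occurred
    head(readLines(files[238], warn = FALSE))  # peek at its first lines
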

Alex
