Never mind, I found the problem: I was running an incorrect variant of the function.

Quoting "Sarah Goslee" <sarah.gos...@gmail.com>:

That's not enough information for us to be able to help you, since
there's no reproducible code.

Here's the crucial bit:
   stem_freqs_list <- mget(uniq_stems,stem_dict)
   stem_freqs <- do.call(rbind,stem_freqs_list)

What does stem_freqs_list look like?
What does stem_freqs look like?

dim() and str() would both be helpful here.

Sarah
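
The inspection requested above can be sketched as follows. This is only an illustration under assumptions: the two stems and their counts are toy values copied from the output quoted further down, and the small environment stands in for the real stem_dict built inside the function.

```r
# Toy stand-in for stem_dict: an environment used as a hash table
# (the values here are illustrative, not the poster's real data)
stem_dict <- new.env(parent = emptyenv(), hash = TRUE)
assign("bridesmaid", 56, envir = stem_dict)
assign("wed", 35, envir = stem_dict)
uniq_stems <- c("bridesmaid", "wed")

stem_freqs_list <- mget(uniq_stems, envir = stem_dict)
stem_freqs <- do.call(rbind, stem_freqs_list)

str(stem_freqs_list)  # a named list with one numeric element per stem
dim(stem_freqs_list)  # NULL: a plain list has no dim attribute
str(stem_freqs)       # a numeric matrix with the stems as row names
dim(stem_freqs)       # one row per stem, a single column
```

Comparing the two results shows whether the object being returned is the list or the row-bound matrix, which is usually the quickest way to see where the dimensions go wrong.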


On Thu, Jun 9, 2011 at 3:48 PM, <nty...@clovermail.net> wrote:
Hello R-philes:

I have the following function that gets the output of mget() and converts it
to a data frame to return.  What I am finding is that the dimensions are
wrong.  Basically, I get:

  bridesmaid wed  u see  m gt lt like love X.0 dress pagetrack one go X3 get
1         56  35 27  30 24 20 20   23   28  17    25        16  16 28 15  26

Instead, I want something like:

[1] bridesmaid 56

In other words, I want the word in the first column and the frequency in the
second column.

Any help would be very much appreciated.

Regards,

Na'im

library(Rstem)

# make a data frame of stems and their frequencies
stem_freq_list <- function(freqFile) {
   stem_dict <- new.env(parent=emptyenv(), hash=TRUE)
   freq_dist <- read.csv(freqFile,header=TRUE)
   words <- as.character(freq_dist[,1])
   freqs <- as.numeric(freq_dist[,2])
   stems <- wordStem(words, language="english")
   uniq_stems <- c()

   # make a hash table of stems and their frequencies
   for (i in seq_along(words)) {
       word <- words[i]; stem <- stems[i]; freq <- freqs[i]
       if (exists(stem, envir=stem_dict)) {
           cnt <- get(stem, envir=stem_dict)
           cnt <- cnt + freqs[i]
           assign(stem,cnt,envir=stem_dict)
       } else {
           assign(stem, freq, envir=stem_dict)
           uniq_stems <- append(uniq_stems, stem)
       }
   }

   # return data frame of stems and their frequencies
   stem_freqs_list <- mget(uniq_stems,stem_dict)
   stem_freqs <- do.call(rbind,stem_freqs_list)
   return(stem_freqs_list)
}
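
One possible way to get the word in the first column and the frequency in the second is sketched below. It is not the fix the poster eventually used (the thread does not say what that was). The helper name stems_to_df and the toy list are hypothetical. The column labels X.0 and X3 in the quoted output suggest, though this is a guess, that the list was converted directly to a data frame, which yields one column per stem.

```r
# Hypothetical helper, not part of the original post: turn the named list
# returned by mget() into a data frame with one row per stem.
stems_to_df <- function(stem_freqs_list) {
    data.frame(
        stem = names(stem_freqs_list),
        freq = unlist(stem_freqs_list, use.names = FALSE),
        stringsAsFactors = FALSE
    )
}

# Toy input standing in for mget(uniq_stems, stem_dict)
toy <- list(bridesmaid = 56, wed = 35, see = 30)

wide <- as.data.frame(toy)  # one row, one column per stem
long <- stems_to_df(toy)    # one row per stem, columns "stem" and "freq"

print(dim(wide))
print(long)
```

Inside stem_freq_list(), the same idea would amount to returning stems_to_df(stem_freqs_list) instead of the list itself, assuming every stem maps to a single number.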



--
Sarah Goslee
http://www.functionaldiversity.org



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
