[R] "subscript out of bounds" error when using koRpus+Tree Tagger

Jiayue Wang Sat, 08 Dec 2018 22:24:13 -0800

Hi,

I'm trying to do text corpus processing on some novels with the koRpus package and Tree Tagger. The script lists all txt files (11 in all) in a directory and processes them one by one.


##########
rm(list=ls())
library(koRpus)
library(koRpus.lang.en)
set.kRp.env(TT.cmd = "/pathto/tree-tagger-english", lang = "en")
outdir <- "/pathto/corpora"
corpdir <- paste0(outdir,"/","morrison11")

# pattern is a regular expression, not a shell glob
files <- list.files(path = corpdir, pattern = "\\.txt$", full.names = FALSE)
n <- length(files)

output <- file(paste0(outdir,"/calc_results_morrison11.txt"), open="at")
for (i in 1:n) {
  cat(i," - ",files[i],"\n", file = output)
  tagged.results <- treetag(paste0(corpdir,'/',files[i]),
     treetagger="kRp.env")
  capture.output(flesch(tagged.results), file = output)
  cat("\n", file=output)
  capture.output(TTR(tagged.results), file = output)
  cat("\n", file=output)
  capture.output(textFeatures(tagged.results), file=output)
  cat("\n===========================\n", file = output)
}
close(output)
#########

The problem is that the script always throws the following error when it works on the last txt file, and then exits prematurely:


  Error in all.patterns[[word.length]] : subscript out of bounds

I can't figure out what this message means. The directories are correct; there's no problem with the Tree Tagger installation; and n and files have the correct values.
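
For reference, the message itself is what base R reports when `[[` is given an index past the end of a list. The sketch below reproduces that message with stand-in objects (they only borrow the names from the error text and are not koRpus's real internals), and adds a hypothetical helper, `try_step`, that is not part of koRpus. It wraps a single call in `tryCatch` so that a failure is logged with the step and file name instead of stopping the loop. It is only a way to narrow down which call raises the error, not a fix.

```r
# Base R only; no koRpus functions are called in this sketch.

# Reproduce the message: `[[` with an index beyond the length of a list.
all.patterns <- list("a", "b", "c")  # stand-in list, NOT koRpus's real object
word.length  <- 5                    # stand-in index larger than the list
msg <- tryCatch(all.patterns[[word.length]],
                error = function(e) conditionMessage(e))
print(msg)

# Hypothetical helper: evaluate one step for one file. On error, report
# the step label, the file name and the message, then return NULL so
# the surrounding loop can carry on with the next step or file.
try_step <- function(label, file, expr) {
  tryCatch(expr,
           error = function(e) {
             message("FAILED [", label, "] on ", file, ": ",
                     conditionMessage(e))
             NULL
           })
}

# Demonstration with the stand-in objects above.
res <- try_step("demo", "example.txt", all.patterns[[word.length]])
ok  <- try_step("demo", "example.txt", all.patterns[[2]])
```

Inside the loop, each call could be wrapped the same way, for example `try_step("flesch", files[i], capture.output(flesch(tagged.results), file = output))`, and likewise for `treetag`, `TTR` and `textFeatures`. The log would then show which step fails on the last file.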


Please help, many thanks!

Jiayue

