> On Nov 6, 2015, at 3:28 AM, Karl <josip.2...@gmail.com> wrote:
>
> Hi All,
>
> Using R for text processing is quite new to me, while I have found a lot of
> useful functions and I'm beginning to learn regex, I need help with the
> following task. How do I calculate the distance between words?
>
> That is, given a specific keyword, I need to assign labels to the other
> words based on the distance (number of words) to this keyword.
>
> For example, if the keyword is "amet" and the string of words is

strng <- "Lorem ipsum dolor sit amet, consectetur adipiscing elit."

> -> "dolor" would get a value of -2
> -> "elit" would get a value of 3

# Split on non-word characters; this leaves empty strings wherever
# punctuation is followed by a space.
words <- unlist(strsplit(strng, "\\W"))
words[words != ""]
#[1] "Lorem"       "ipsum"       "dolor"       "sit"
#[5] "amet"        "consectetur" "adipiscing"  "elit"

# Drop the empty strings, then locate the keyword.
real <- words[words != ""]
which(real == "amet")
#[1] 5
length(real)
#[1] 8

# Signed offset of every word from the keyword.
# Note: this assumes the keyword occurs exactly once in the sentence.
vec <- seq_along(real) - which(real == "amet")
names(vec) <- real
vec["dolor"]
#dolor
#   -2

> #
> If the sentence contains more than one instance of the keyword, I need
> values for each instance. Moreover, one can assume that I can split my data
> into sentences, so there is no need to search and recognize sentences (this
> is a separate problem).
>
> Thank you!
>
> Best regards,
> Jay
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA
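
The code in the reply handles a single occurrence of the keyword. For the multiple-occurrence case raised in the question, one possible sketch (not from the thread) is to compute one row of offsets per occurrence with outer(). The function name keyword_distances and the example sentence with a second "amet" are made up for illustration, and matching is assumed to be exact and case-sensitive.

```r
# Hypothetical helper: signed word offsets from every occurrence of
# `keyword` in a single sentence. Rows are occurrences, columns are words.
keyword_distances <- function(sentence, keyword) {
  words <- unlist(strsplit(sentence, "\\W+"))
  words <- words[words != ""]          # guard against a leading empty string
  hits  <- which(words == keyword)     # positions of every occurrence
  # For each occurrence h and word position i, the offset is i - h.
  dist <- outer(hits, seq_along(words), function(h, i) i - h)
  dimnames(dist) <- list(sprintf("%s@%d", keyword, hits), words)
  dist
}

# Made-up variation of the example sentence with two occurrences of "amet".
d <- keyword_distances(
  "Lorem amet dolor sit amet, consectetur adipiscing elit.", "amet")
d[, "dolor"]   # offsets of "dolor" from each "amet"
d[, "elit"]    # offsets of "elit" from each "amet"
```

Each row is labelled with the keyword and its position in the sentence, so the row for the second "amet" reproduces the -2 and 3 from the original example.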