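How about reading the text in with scan(..., what="character")?
(read.table(..., sep=" ") would give you a data frame instead.)
That gives you a character vector of single words. Then
grepl("\\[[9-z]+\\]",x)
will return a logical vector that is TRUE for each word wrapped in brackets: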
> x<-c('test','[bracket]','hi]','[blah','foo','[bar]')
> grepl('\\[[9-z]+\\]',x)
[1] FALSE TRUE FALSE FALSE FALSE TRUE
> x[grepl('\\[[9-z]+\\]',x)]
[1] "[bracket]" "[bar]"
You might need a more complex regex to catch them all, for example in
case of ([citation]) instances.
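Since the brackets in your example contain several words, splitting on
spaces first would break them apart. A rough sketch of an alternative,
assuming the whole text sits in one character string (the names txt, m
and refs are just made up for illustration), is to match on the full
string with gregexpr() and pull out the hits with regmatches():

```r
# Hypothetical input: the example paragraph held in a single string.
txt <- paste(
  "Most fundamentally, it has led to an effort to clarify the organizational",
  "form concept. According to them [see also Smith, Jones and Carroll 2002],",
  "categories emerge as audience members recognize dissimilarities among groups",
  "of consumers and label them as members of a common set [Nicol 2000]."
)

# Match "[", then any run of characters other than "]", then "]".
m <- regmatches(txt, gregexpr("\\[[^]]*\\]", txt))[[1]]

# Drop the first and last character of each match (the brackets themselves).
refs <- substring(m, 2, nchar(m) - 1)
refs
```

This does not handle nested brackets, and I have not tried it on text
pulled out of real Word files, so treat it as a starting point.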
Justin
On Tue, Jan 24, 2012 at 6:52 AM, mdvaan <[email protected]> wrote:
> Hi,
>
> I have a series of MS word files and each file contains plain text. From
> these texts I would like to extract only those elements (read: words) that
> are between square brackets. Example of a text:
>
> Most fundamentally, it has led to an effort to clarify the organizational
> form concept. According to them [see also Smith, Jones and Carroll 2002],
> categories emerge as audience members recognize dissimilarities among groups
> of consumers and label them as members of a common set [Nicol 2000].
>
> Now I would like to get the following selection:
>
> see also Smith, Jones and Carroll 2002
> Nicol 2000
>
> Any ideas on how to do this? What would be the best way to import the text
> in R? The entire text as an element in a dataframe? Thank you very much!
>
> Best,
>
> Mathijs
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Select-elements-from-text-tp4323947p4323947.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>