Well, I guess I'm not quite there yet. What I gave earlier was a simplified
example that did not accurately reflect the complexity of the task.
Here is my real-world example. As you can see, I need to delete an
arbitrary number of characters, together with the brackets or parentheses
enclosing them, multiple times within the same string. Help?
myCharVec <- "medicare [link 220.30.05] ssa (1-800-772-1213). 2008 [link
145.30.05] amounts (2d) gross income (magi) here. (2e)"
myCharVec
myCharVec <- gsub('\\[.*\\]', '', myCharVec)
myCharVec
myCharVec <- gsub('\\(.*\\)', '', myCharVec)
myCharVec
#what I want
# "medicare ssa . 2008 amounts gross income here."
myCharVec <- "medicare [link 220.30.05] ssa (1-800-772-1213). 2008 [link
145.30.05] amounts (2d) gross income (magi) here. (2e)"
> myCharVec
[1] "medicare [link 220.30.05] ssa (1-800-772-1213). 2008 [link
145.30.05] amounts (2d) gross income (magi) here. (2e)"
> myCharVec <- gsub('\\[.*\\]', '', myCharVec)
> myCharVec
[1] "medicare amounts (2d) gross income (magi) here. (2e)"
> myCharVec <- gsub('\\(.*\\)', '', myCharVec)
> myCharVec
[1] "medicare amounts "
>
> #what I want
> # "medicare ssa . 2008 amounts gross income here."
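
The transcript above shows the cause: '.*' is greedy, so '\\[.*\\]' runs from
the first '[' to the last ']' in the string and swallows everything in
between. One possible fix, offered as a minimal sketch rather than a definitive
answer, is to replace '.*' with a negated character class so each match stops
at the nearest closing delimiter. The names noBrackets, noParens and cleaned
are made up for illustration, the string is written on one line (assuming the
line break in the post came from e-mail wrapping), and the final
whitespace clean-up is an assumption about the desired output.

```r
# Sample string from the post, written on a single line.
myCharVec <- "medicare [link 220.30.05] ssa (1-800-772-1213). 2008 [link 145.30.05] amounts (2d) gross income (magi) here. (2e)"

# [^]]* matches any run of characters that are not ']', so each match
# ends at the nearest closing bracket instead of the last one.
noBrackets <- gsub("\\[[^]]*\\]", "", myCharVec)

# Same idea for parentheses: [^)]* cannot run past the nearest ')'.
noParens <- gsub("\\([^)]*\\)", "", noBrackets)

# Collapse the runs of whitespace left behind, then trim both ends.
cleaned <- gsub("[[:space:]]+", " ", noParens)
cleaned <- gsub("^[[:space:]]+|[[:space:]]+$", "", cleaned)
cleaned
```

This sketch does not handle nested brackets or parentheses. A non-greedy
quantifier, e.g. gsub('\\[.*?\\]', '', myCharVec, perl = TRUE), is another
option along the same lines.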
------------------------------------------------------------
Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine
15032 Hunter Court, Westfield, IN 46074
(317) 490-5129 Work, & Mobile & VoiceMail
"The real problem is not whether machines think but whether men do." -- B.
F. Skinner
******************************************************************
On Thu, Aug 20, 2009 at 11:39 AM, William Dunlap <[email protected]> wrote:
>
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]] On Behalf Of Mark Kimpel
> > Sent: Thursday, August 20, 2009 8:31 AM
> > To: [email protected]
> > Subject: [R] help with regular expressions in R
> > ...
> > myCharVec <- c("[the rain in spain]", "(the rain in spain)")
> > gsub('\\[*.\\]', '', myCharVec)
>
> Change the '*.' to '.*'.
>
> Your expression matches 0 or more left square brackets,
> followed by 1 character, followed by a right square bracket.
>
> "\\[.*\\]" matches a left square bracket, followed by 0 or more
> characters, followed by a right square bracket.
>
> Bill Dunlap
> TIBCO Software Inc - Spotfire Division
> wdunlap tibco.com
>
> >
> > #what I get
> > # [1] "[the rain in spai" "(the rain in spain)"
> >
> > #what I want
> > [1] "" "(the rain in spain)"
> >
> > > sessionInfo()
> > R version 2.10.0 Under development (unstable) (2009-08-12 r49193)
> > x86_64-unknown-linux-gnu
> >
> > locale:
> > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> > [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> > [9] LC_ADDRESS=C LC_TELEPHONE=C
> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] stats graphics grDevices datasets utils methods base
> >
> > other attached packages:
> > [1] RWeka_0.3-20 tm_0.4
> >
> > loaded via a namespace (and not attached):
> > [1] grid_2.10.0 rJava_0.6-3 slam_0.1-3
> >
> > ______________________________________________
> > [email protected] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>