Thank you.
If I use
gsub(" \xad", "-", x)
[1] "NEW YORK-NEW ENGLAND"
I get what I want.
Adrian
sessionInfo()
R version 2.9.2 (2009-08-24)
i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
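For readers in a different locale, the same replacement can be sketched byte-wise so that it does not depend on cp1252 being the session encoding. This is only a sketch, not something posted in the thread: the bytes are copied from the charToRaw(x) output quoted further down, and the names x_bytes, pattern and fixed_x are made up for illustration.

```r
# Bytes reported by charToRaw(x) in the thread; note 0xad right after the space.
x_bytes <- as.raw(c(0x4e, 0x45, 0x57, 0x20, 0x59, 0x4f, 0x52, 0x4b, 0x20, 0xad,
                    0x4e, 0x45, 0x57, 0x20, 0x45, 0x4e, 0x47, 0x4c, 0x41, 0x4e, 0x44))
x <- rawToChar(x_bytes)

# Pattern: a space followed by the single byte 0xad, built from raw bytes
# so no non-ASCII literal appears in the source.
pattern <- rawToChar(as.raw(c(0x20, 0xad)))

# fixed = TRUE turns off regex interpretation; useBytes = TRUE compares
# byte by byte, so the result should not depend on the session locale.
fixed_x <- gsub(pattern, "-", x, fixed = TRUE, useBytes = TRUE)
fixed_x
```

The result contains only ASCII bytes, with 0x2d (a plain hyphen) where the space and 0xad used to be.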
On Wed, 14 Oct 2009, Prof Brian Ripley wrote:
On Wed, 14 Oct 2009, Adrian Dragulescu wrote:
charToRaw(x)
[1] 4e 45 57 20 59 4f 52 4b 20 ad 4e 45 57 20 45 4e 47 4c 41 4e 44
charToRaw(y)
[1] 4e 45 57 20 59 4f 52 4b 20 2d 4e 45 57 20 45 4e 47 4c 41 4e 44
So they are different.
We really do need the 'at a minimum' information we asked you for in the
posting guide. But in cp1252 (a guess as to what you might be using) \xad is
a 'soft hyphen', and that is not the same thing as a hyphen -- you will get
the same issues with 'non-breaking space'.
BDR
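A rough sketch of handling both characters mentioned above at once, by converting to UTF-8 and replacing the soft hyphen (U+00AD) and the non-breaking space (U+00A0). Assumptions: the input string here is invented for illustration, and "latin1" is used as the source encoding because bytes 0xa0 and 0xad denote the same two characters in latin1 and cp1252; the actual encoding of the pdftotext output is only a guess in this thread.

```r
# Hypothetical input: "A", non-breaking space (0xa0), "B", soft hyphen (0xad), "C".
s <- rawToChar(as.raw(c(0x41, 0xa0, 0x42, 0xad, 0x43)))

# Convert to UTF-8, treating the bytes as latin1 (an assumption, see above).
u <- iconv(s, from = "latin1", to = "UTF-8")

# Replace each look-alike with the ordinary ASCII character.
u <- gsub("\u00ad", "-", u, fixed = TRUE)  # soft hyphen -> hyphen-minus
u <- gsub("\u00a0", " ", u, fixed = TRUE)  # non-breaking space -> space
u
```

After these two substitutions, ordinary patterns such as " -" behave the same as on typed input.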
Adrian
I use R 2.8.1 on WinXP
On Wed, 14 Oct 2009, Duncan Murdoch wrote:
On 10/14/2009 1:30 PM, Adrian Dragulescu wrote:
Hello,
Below is some output that shows my issue.
I have a variable x that I read from a file (more on this below)
x
[1] "NEW YORK NEW ENGLAND"
gsub(" -", "-", x) # this does not work!
[1] "NEW YORK NEW ENGLAND"
Well, I see no hyphen at all here, but then I am not on Windows.
It looks as though it worked, presumably because something got lost in
your email.
Could you post charToRaw(x) so we can see what's in x?
Duncan Murdoch
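A small sketch of how the charToRaw() output can be scanned for bytes outside the ASCII range, which is where look-alike characters tend to hide. The bytes are the ones Adrian later reported for x; the names bytes and non_ascii_pos are illustrative only.

```r
# Rebuild x from the bytes reported in the thread.
x <- rawToChar(as.raw(c(0x4e, 0x45, 0x57, 0x20, 0x59, 0x4f, 0x52, 0x4b, 0x20, 0xad,
                        0x4e, 0x45, 0x57, 0x20, 0x45, 0x4e, 0x47, 0x4c, 0x41, 0x4e, 0x44)))

bytes <- charToRaw(x)

# Positions of bytes above 0x7f, i.e. anything that is not plain ASCII.
non_ascii_pos <- which(as.integer(bytes) > 127)
non_ascii_pos          # where the suspicious bytes are
bytes[non_ascii_pos]   # which byte values they have
```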
Encoding(x) # is x in a special encoding? no
[1] "unknown"
y = "NEW YORK -NEW ENGLAND" # I type in variable y
gsub(" -", "-", y) # and gsub works as expected
[1] "NEW YORK-NEW ENGLAND"
I'm sure the problem has to do with the way I read the variable x. But
even if I change the encoding for x to ASCII, I still cannot do the sub.
I get x by reading a pdf file with pdftotext so you will not be able to
replicate my issue.
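One possible reason why changing the encoding does not help, sketched under the assumption that the change was made by re-marking the string with Encoding<- (the message does not say how it was done): the mark only declares how the bytes should be interpreted and leaves the bytes themselves alone. The short input string below is invented for illustration.

```r
# Hypothetical short string: "K", space, byte 0xad, "N".
x <- rawToChar(as.raw(c(0x4b, 0x20, 0xad, 0x4e)))

before <- charToRaw(x)
Encoding(x) <- "latin1"    # changes the declared encoding only
after <- charToRaw(x)

identical(before, after)   # the underlying bytes are untouched

# The ASCII pattern " -" (space, 0x2d) still does not occur in x.
has_ascii_hyphen <- grepl(" -", x, fixed = TRUE, useBytes = TRUE)
has_ascii_hyphen
```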
Thanks for any suggestions,
Adrian
--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595