Dear Emily,
I have written a more robust version of the function:
extract.nonLetters = function(x, rm.space = TRUE, normalize=TRUE,
sort=TRUE) {
if(normalize) x = stringi::stri_trans_nfc(x);
ch = strsplit(x, "", fixed = TRUE);
ch = unique(unlist(ch));
if(sort) ch = sort(ch
Dear Emily,
Using a look-behind solves the split problem in this case. (Note: a
regex is in many cases the simplest solution.)
str = c("leucocyten + gramnegatieve staven +++ grampositieve staven ++",
"leucocyten – grampositieve coccen +")
tokens = strsplit(str, "(?<=[-+])\\s++", perl
Since any space that follows 2 or 3 + signs (or - signs) also follows
a single + (or -), this can be done with positive look behind, which
may be a little simpler:
x <- c(
'leucocyten + gramnegatieve staven +++ grampositieve staven ++',
'leucocyten - grampositieve coccen +'
)
strsplit(x, "(?<=
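
For reference, here is a minimal runnable sketch of the look-behind idea described above. It is not necessarily the exact call from the truncated message: it assumes the dash in the data has already been converted to an ASCII "-", and the pattern and the name `tokens` are only illustrative.

```r
# Example strings, with the en dash already replaced by an ASCII hyphen
# (an assumption; the original data used a typographic dash).
x <- c(
  "leucocyten + gramnegatieve staven +++ grampositieve staven ++",
  "leucocyten - grampositieve coccen +"
)

# Split on whitespace that is immediately preceded by '+' or '-'.
# The look-behind (?<=[-+]) consumes nothing, so the delimiter stays
# attached to the token on its left. Look-behind requires perl = TRUE.
tokens <- strsplit(x, "(?<=[-+])\\s+", perl = TRUE)

print(tokens)
```

With the two example strings, the first element of `tokens` should hold three pieces ("leucocyten +", "gramnegatieve staven +++", "grampositieve staven ++") and the second should hold two.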
I always find regex puzzles amusing, so after changing the Unicode
typographic quotes and dashes to ASCII, the following simple prescription,
similar to those proffered by others, seems to produce what you
requested with your example:
x <- c("leucocyten + gramnegatieve staven +++ grampositieve staven ++"
rg
Subject: Re: [R] Split String in regex while Keeping Delimiter
I thought replacing the spaces following instances of +++,++,+,- with "\n" and
then reading with scan should succeed. Like Ivan Krylov I was fairly sure that
you meant the minus sign to be "-" rather than "–", but perhaps you were using
MS Word as an editor which is inconsistent with effective
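
A rough sketch of the newline-and-scan idea described in this message, under the assumption that the dash is an ASCII "-". The message body is truncated here, so this may differ from the code actually posted; the names `marked` and `pieces` are made up for illustration.

```r
# Example strings with an ASCII hyphen (assumed, see above).
x <- c(
  "leucocyten + gramnegatieve staven +++ grampositieve staven ++",
  "leucocyten - grampositieve coccen +"
)

# Replace the whitespace after each run of '+' or '-' with a newline,
# keeping the run itself via the back-reference \\1.
marked <- gsub("([-+]+)\\s+", "\\1\n", x)

# Read the result back one line per item. sep = "\n" keeps the spaces
# inside each item; what = "" asks scan() for character data.
pieces <- scan(text = marked, what = "", sep = "\n", quiet = TRUE)

print(pieces)
```

Note that `scan` returns one flat character vector, so the grouping by input string is lost. If the per-string grouping matters, `strsplit` on the inserted newline would keep it.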
On Wed, 12 Apr 2023 08:29:50 +
Emily Bakker wrote:
> Some example data:
> “leucocyten + gramnegatieve staven +++ grampositieve staven ++”
> “leucocyten – grampositieve coccen +”
>
> I want to split the strings such that I get the following result:
> c(“leucocyten +”, “gramnegatieve staven
This seems to do the job but there are probably more elegant solutions:
f <- function(s) { sub("^ ","",unlist(strsplit(gsub("\\+ ","+@ ",s),"@"))) }
g <- function(s) { sub("^ ","",unlist(strsplit(gsub("- ","-@ ",s),"@"))) }
h <- function(s) { g(f(s)) }
To try it out:
s <- “leucocyten + gramnegati
Hello List,
I have a dataset consisting of strings that I want to split while saving the
delimiter.
Some example data:
“leucocyten + gramnegatieve staven +++ grampositieve staven ++”
“leucocyten – grampositieve coccen +”
I want to split the strings such that I get the following result:
c(“le