Hi,
Here is the Biostrings solution in case you need to chop a long
string into hundreds or thousands of fragments (a situation where
base::substring() is very inefficient):
library(Biostrings)
## Call as.character() on the result if you want it back as
## a character vector.
fast_chop_string <- function(x, ends)
{
    if (!is(x, "XString"))
        x <- as(x, "XString")
    extractAt(x, at=PartitioningByEnd(ends))
}
This will be much faster than substring() (e.g. 100x or 1000x) when
chopping a string like a human chromosome into hundreds or
thousands of fragments.
Biostrings is a Bioconductor package:
https://bioconductor.org/packages/Biostrings
Cheers,
H.
On 05/12/2016 01:18 AM, Jan Kacaba wrote:
Nice solution Jim, thank you.
2016-05-12 2:45 GMT+02:00 Jim Lemon <drjimle...@gmail.com>:
Hi again,
Sorry, that should be:
chop_string<-function(x,ends) {
  starts<-c(1,ends[-length(ends)]+1)
  return(substring(x,starts,ends))
}
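A quick sanity check, as a sketch: with starts computed as "previous end plus one", the fragments should tile the string up to the last end with no gaps or overlaps, so pasting them back together should reproduce that prefix. The sample string and the `ends` values below are made up for illustration.

```r
# Same logic as the corrected chop_string() in this message.
chop_string <- function(x, ends) {
  starts <- c(1, ends[-length(ends)] + 1)
  substring(x, starts, ends)
}

x <- "abcdefghijklmnopqrstuvwxyz"   # illustrative input
ends <- c(5, 10, 19)                # illustrative end positions

pieces <- chop_string(x, ends)
print(pieces)

# The pieces should rebuild the first max(ends) characters exactly.
rebuilt <- paste(pieces, collapse = "")
stopifnot(identical(rebuilt, substr(x, 1, max(ends))))
```

Anything after the last end position is dropped, which matches what substring() would do if called by hand with these starts and ends.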
Jim
On Thu, May 12, 2016 at 10:05 AM, Jim Lemon <drjimle...@gmail.com> wrote:
Hi Jan,
This might be helpful:
chop_string<-function(x,ends) {
  starts<-c(1,ends[-length(ends)]-1)
  return(substring(x,starts,ends))
}
Jim
On Thu, May 12, 2016 at 7:23 AM, Jan Kacaba <jan.kac...@gmail.com> wrote:
Here is my attempt at a function that computes margins from positions.
require("stringr")
require("dplyr")
ends<-seq(10,100,8) # end margins
test_string<-paste("Lorem ipsum dolor sit amet, consectetuer adipiscing",
"elit. Aliquam in lorem sit amet leo accumsan lacinia.")
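sekoj <- function(ends) {
  l_ends <- length(ends)
  begs <- vector(mode="integer", l_ends)
  begs[1] <- 1
  # each fragment begins right after the previous end;
  # seq_len(l_ends)[-1] is empty when there is only one end
  for (i in seq_len(l_ends)[-1]) {
    begs[i] <- ends[i-1] + 1
  }
  margs <- rbind(begs, ends)
  # last fragment: from the last end + 1 to the end of the string (-1)
  margs <- cbind(margs, c(ends[l_ends]+1, -1))
  #rownames(margs)<-c("beg","end")
  return(margs)
}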
margins<-sekoj(ends)
str_sub(test_string,margins[1,],margins[2,]) %>% print
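For comparison, the same idea (fragments plus the leftover tail) can be sketched in base R without a loop. The helper name `chop_with_tail` is hypothetical and not from this thread, and dropping end positions at or beyond the string length is an added assumption, made so that no empty trailing fragment is produced.

```r
# Hypothetical helper: split x after each position in `ends` and also
# return whatever is left after the last end position.
chop_with_tail <- function(x, ends) {
  n <- nchar(x)
  ends <- ends[ends < n]        # assumption: ignore ends at or past the string end
  starts <- c(1, ends + 1)      # each piece starts right after the previous end
  stops <- c(ends, n)           # the last piece runs to the end of the string
  substring(x, starts, stops)
}

parts <- chop_with_tail("abcdefghijklmnopqrstuvwxyz", c(5, 10, 19))
print(parts)
```

Because `starts` and `stops` are built as whole vectors, a single vectorized substring() call replaces both the loop and the extra column with -1.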
Code to run in browser:
http://www.r-fiddle.org/#/fiddle?id=rVmNVxDV
2016-05-11 23:12 GMT+02:00 Bert Gunter <bgunter.4...@gmail.com>:
Dunno -- but you might have a look at Hadley Wickham's 'stringr' package:
https://cran.r-project.org/web/packages/stringr/stringr.pdf
Cheers,
Bert
Bert Gunter
"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Wed, May 11, 2016 at 1:12 PM, Jan Kacaba <jan.kac...@gmail.com> wrote:
Dear R-help
I would like to split a long string at specified, precomputed positions.
'substring' needs both beginnings and ends. Is there a native function that
accepts only the end positions, so I don't have to compute the other argument?
For example, I have a vector of positions pos<-c(5,10,19). substring
needs first=c(1,6,11) and last=c(5,10,19) as input. It is no problem
to write my own function; I am just asking.
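To make the example concrete, here is a minimal sketch of how `first` follows from `pos`: the first fragment starts at 1, and every later fragment starts one character after the previous end. The sample string is made up for illustration.

```r
pos <- c(5, 10, 19)                 # end positions from the example above
first <- c(1, head(pos, -1) + 1)    # 1, then each previous end plus one
last <- pos

x <- "The quick brown fox jumps"    # illustrative string
fragments <- substring(x, first, last)
print(first)
print(fragments)
```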
Derek
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpa...@fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319