Hi,

Here is the Biostrings solution in case you need to chop a long
string into hundreds or thousands of fragments (a situation where
base::substring() is very inefficient):

  library(Biostrings)

  ## Call as.character() on the result if you want it back as
  ## a character vector.
  fast_chop_string <- function(x, ends)
  {
    if (!is(x, "XString"))
        x <- as(x, "XString")
    extractAt(x, at=PartitioningByEnd(ends))
  }

This will be much faster than substring() (e.g. 100x or 1000x) when
chopping a string like a human chromosome into hundreds or
thousands of fragments.

Biostrings is a Bioconductor package:

  https://bioconductor.org/packages/Biostrings

Cheers,
H.


On 05/12/2016 01:18 AM, Jan Kacaba wrote:
Nice solution Jim, thank you.



2016-05-12 2:45 GMT+02:00 Jim Lemon <drjimle...@gmail.com>:
Hi again,
Sorry, that should be:

chop_string<-function(x,ends) {
  starts<-c(1,ends[-length(ends)]+1)
  return(substring(x,starts,ends))
}
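
As a quick check of the corrected function, here is a small worked example using the positions from Jan's original question (pos <- c(5,10,19)). The sample string is made up for illustration only. Note that anything after the last end position is not returned.

```r
# Corrected chop_string() from the message above
chop_string <- function(x, ends) {
  starts <- c(1, ends[-length(ends)] + 1)
  substring(x, starts, ends)
}

# Sample string (an assumption for illustration), 26 characters long
x <- "abcdefghijklmnopqrstuvwxyz"
pos <- c(5, 10, 19)

# Each fragment starts one character after the previous end
starts <- c(1, pos[-length(pos)] + 1)   # 1 6 11
pieces <- chop_string(x, pos)
print(pieces)
# "abcde" "fghij" "klmnopqrs"
```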

Jim

On Thu, May 12, 2016 at 10:05 AM, Jim Lemon <drjimle...@gmail.com> wrote:
Hi Jan,
This might be helpful:

chop_string<-function(x,ends) {
  starts<-c(1,ends[-length(ends)]-1)
  return(substring(x,starts,ends))
}

Jim


On Thu, May 12, 2016 at 7:23 AM, Jan Kacaba <jan.kac...@gmail.com> wrote:
Here is my attempt at a function that computes margins from positions.

require("stringr")
require("dplyr")

ends<-seq(10,100,8)  # end margins
test_string<-paste("Lorem ipsum dolor sit amet, consectetuer adipiscing",
                   "elit. Aliquam in lorem sit amet leo accumsan lacinia.")

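sekoj=function(ends){
   l_ends<-length(ends)
   begs=vector(mode="integer",l_ends)
   begs[1]=1
   # each fragment begins one character after the previous end;
   # seq_len(l_ends)[-1] is empty when there is only one end
   for (i in seq_len(l_ends)[-1]){
     begs[i]<-ends[i-1]+1
   }
   margs<-rbind(begs,ends)
   # extra column for the tail: from after the last end to the
   # end of the string (-1 means "last character" in str_sub)
   margs<-cbind(margs,c(ends[l_ends]+1,-1))
   #rownames(margs)<-c("beg","end")
   return(margs)
}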
margins<-sekoj(ends)
str_sub(test_string,margins[1,],margins[2,]) %>% print
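
The same margins can probably be computed without a loop and without stringr. Below is a minimal base R sketch; chop_with_tail is a hypothetical name, and it assumes that ends is sorted in increasing order and that every value is smaller than nchar(x). Like sekoj() above, it also returns whatever follows the last position.

```r
# Hypothetical helper: split x after each position in 'ends' and
# also return the tail that follows the last position.
chop_with_tail <- function(x, ends) {
  begs  <- c(1, ends + 1)        # one start per fragment, plus the tail
  lasts <- c(ends, nchar(x))     # the tail runs to the end of the string
  substring(x, begs, lasts)
}

# Sample string (an assumption for illustration), 26 characters long
x <- "abcdefghijklmnopqrstuvwxyz"
parts <- chop_with_tail(x, c(5, 10, 19))
print(parts)
# "abcde" "fghij" "klmnopqrs" "tuvwxyz"
```

If the last position equals nchar(x), the final element is an empty string, which can be dropped afterwards if it is not wanted.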

Code to run in browser:
http://www.r-fiddle.org/#/fiddle?id=rVmNVxDV

2016-05-11 23:12 GMT+02:00 Bert Gunter <bgunter.4...@gmail.com>:
Dunno -- but you might have a look at Hadley Wickham's 'stringr' package:
https://cran.r-project.org/web/packages/stringr/stringr.pdf

Cheers,

Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, May 11, 2016 at 1:12 PM, Jan Kacaba <jan.kac...@gmail.com> wrote:
Dear R-help

I would like to split a long string at specified precomputed positions.
'substring' needs beginnings and ends. Is there a native function which
accepts positions so I don't have to compute the second argument?

For example, I have a vector of positions pos<-c(5,10,19). substring
needs the input first=c(1,6,11) and last=c(5,10,19). It is no problem
to write my own function. Just asking.

Derek

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
