Hi,
On Sep 29, 2009, at 12:03 PM, <mau...@alice.it> wrote:
After converting a miRNA file from FASTA to character format, I
get a vector that looks like the following:
nml
[1] "hsa-let-7a MIMAT0000062 Homo sapiens let-7a"
[2] "hsa-let-7b MIMAT0000063 Homo sapiens let-7b"
[3] "hsa-let-7c MIMAT0000064 Homo sapiens let-7c"
[4] "hsa-let-7d MIMAT0000065 Homo sapiens let-7d"
[5] "hsa-let-7e MIMAT0000066 Homo sapiens let-7e"
[6] "hsa-let-7f MIMAT0000067 Homo sapiens let-7f"
[7] "hsa-miR-15a MIMAT0000068 Homo sapiens miR-15a"
[8] "hsa-miR-16 MIMAT0000069 Homo sapiens miR-16"
[9] "hsa-miR-17 MIMAT0000070 Homo sapiens miR-17"
[10] "hsa-miR-18a MIMAT0000072 Homo sapiens miR-18a"
.......................................................................................................
[888] "hsa-miR-675* MIMAT0006790 Homo sapiens miR-675*"
[889] "hsa-miR-888* MIMAT0004917 Homo sapiens miR-888*"
[890] "hsa-miR-541* MIMAT0004919 Homo sapiens miR-541*"
My goal is to extract into a vector only the first substring of each
element, i.e. everything before the first space from the left.
For the records listed above, I would obtain:
[1] "hsa-let-7a"
[2] "hsa-let-7b"
[3] "hsa-let-7c"
[4] "hsa-let-7d"
[5] "hsa-let-7e"
[6] "hsa-let-7f"
[7] "hsa-miR-15a"
[8] "hsa-miR-16"
[9] "hsa-miR-17"
[10] "hsa-miR-18a"
.......................................................................................................
[888] "hsa-miR-675*"
[889] "hsa-miR-888*"
[890] "hsa-miR-541*"
pieces <- strsplit(nml, " ")
sapply(pieces, '[', 1)
Or, the same as a one-liner:
sapply(strsplit(nml, " "), '[', 1)
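For reference, here is a small self-contained sketch of the same idea. The three-element `nml` below is just a stand-in for the full vector, and the names `ids_split` and `ids_sub` are made up for illustration. It also shows a regex-based alternative with sub(), which should give the same result as long as the fields are separated by plain spaces:

```r
# Small sample standing in for the full 890-element vector `nml`
nml <- c("hsa-let-7a MIMAT0000062 Homo sapiens let-7a",
         "hsa-miR-16 MIMAT0000069 Homo sapiens miR-16",
         "hsa-miR-675* MIMAT0006790 Homo sapiens miR-675*")

# Split each element on single spaces, then keep the first piece
ids_split <- sapply(strsplit(nml, " ", fixed = TRUE), `[`, 1)

# Alternative: delete everything from the first space onward
ids_sub <- sub(" .*", "", nml)

print(ids_split)
print(identical(ids_split, ids_sub))
```

The sub() version avoids building the intermediate list of pieces, which may be convenient when only the first field is needed.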
Hope that helps,
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.