Hi Eliza,
Suppose you have 147 data files in the same working directory. Here, I am
using "Eliza1.txt" and a modified "Eliza2.txt" (attached).
list.files()
#[1] "Eliza1.txt" "Eliza2.txt"
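Note that list.files() with no arguments returns every file in the working directory, not only the data files. A minimal sketch, assuming the data files all end in ".txt"; the scratch folder and the file "notes.csv" are made up for illustration:

```r
# Build a scratch folder so the example is self-contained (hypothetical files)
d <- file.path(tempdir(), "eliza_demo")
dir.create(d, showWarnings = FALSE)
writeLines(paste0("1911.01.01", strrep(" ", 7), "7.87"), file.path(d, "Eliza1.txt"))
writeLines("not a data file", file.path(d, "notes.csv"))

# Keep only the .txt files; full.names = TRUE returns paths usable by readLines()
files <- list.files(d, pattern = "\\.txt$", full.names = TRUE)
basename(files)
```

With full paths in `files`, the lapply() calls can loop over `files` instead of list.files(), and the working directory no longer matters.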
library(stringr)
lapply(list.files(),function(i) str_count(gsub(" $","",readLines(i))," "))
#Count the spaces. gsub() is used because there were spaces at the end of the
#lines (possibly due to a formatting error), which were removed. If there are
#no spaces at the end, you don't need gsub().
#[[1]]
#[1] 7 7 7 7 6 7 7 7 7 7 6 6 7 7 7 7 6 7 7 7 7 7 6 6
#
#[[2]]
# [1] 7 7 7 7 6 7 7 7 7 7 6 6 7 7 7 7 6 7 7 7 7 7 6 6
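If stringr is not available, the same count can be sketched in base R. This is only an illustration: the helper name count_spaces and the two input lines are made up, and the second line carries a trailing space to show why it is stripped first.

```r
# Two made-up lines: 7 spaces in the first; 6 plus one trailing space in the second
x <- c(paste0("1911.01.01", strrep(" ", 7), "7.87"),
       paste0("1911.01.05", strrep(" ", 6), "12.90", " "))

# Hypothetical helper: count literal spaces in each element
count_spaces <- function(s) lengths(regmatches(s, gregexpr(" ", s, fixed = TRUE)))

before <- count_spaces(x)                  # trailing space is included in the count
after  <- count_spaces(sub(" +$", "", x))  # trailing spaces removed first
before
after
```

Here `before` is 7 7 and `after` is 7 6, so a stray trailing space would put the second line in the wrong group.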
res<- lapply(list.files(),function(i) {
  Lines2<-gsub(" $","",readLines(i)) #remove a space at the end, if any
  n<-str_count(Lines2," ")
  Lines2[n==7]<- str_replace(Lines2[n==7],"\\s+","     ") #7 spaces reduced to 5
  Lines2[n==6]<- str_replace(Lines2[n==6],"\\s+","    ") #6 spaces reduced to 4
  #month and day are tested separately, so that e.g. "1911.10.01" keeps its month
  substr(Lines2[substr(Lines2,6,6)=="0"],6,6)<-" " #leading zero of the month
  substr(Lines2[substr(Lines2,9,9)=="0"],9,9)<-" " #leading zero of the day
  Lines2})
names(res)<-gsub("\\..*","",list.files())
res
#$Eliza1
# [1] "1911. 1. 1     7.87" "1911. 1. 2     9.26" "1911. 1. 3     8.06"
# [4] "1911. 1. 4     8.13" "1911. 1. 5    12.90" "1911. 2. 6     5.45"
# [7] "1911. 2. 7     3.26" "1911. 3. 8     5.70" "1911. 3. 9     9.24"
#[10] "1911. 4.10     7.60" "1911. 5.11    14.82" "1911. 5.12    14.10"
#[13] "1911. 6.13     7.87" "1911. 6.14     9.26" "1911. 7.15     8.06"
#[16] "1911. 7.16     8.13" "1911. 8.17    12.90" "1911. 8.18     5.45"
#[19] "1911. 9.19     3.26" "1911. 9.20     5.70" "1911.10.21     9.24"
#[22] "1911.10.22     7.60" "1911.11.23    14.82" "1911.12.24    14.10"
#$Eliza2
# [1] "1911. 1. 1     4.87"  "1911. 1. 2     11.26" "1911. 1. 3     6.06"
# [4] "1911. 1. 4     8.13"  "1911. 1. 5    11.90"  "1911. 2. 6     5.55"
# [7] "1911. 2. 7     3.16"  "1911. 3. 8     5.10"  "1911. 3. 9     9.34"
#[10] "1911. 4.10     7.10"  "1911. 5.11    14.92"  "1911. 5.12    14.20"
#[13] "1911. 6.13     7.77"  "1911. 6.14     9.36"  "1911. 7.15     8.66"
#[16] "1911. 7.16     8.23"  "1911. 8.17    11.90"  "1911. 8.18     15.45"
#[19] "1911. 9.19     13.26" "1911. 9.20     15.77" "1911.10.21     19.34"
#[22] "1911.10.22     7.66"  "1911.11.23    14.84"  "1911.12.24    14.11"
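To save each element of the list back to its own ".txt" file, something along these lines should work. It is only a sketch: `res` is rebuilt here from two short made-up vectors so the block runs on its own, and the output folder and the "_new" suffix are assumptions, chosen so that the original files are not overwritten.

```r
# Stand-in for the named list built above (two lines per file, for illustration)
res <- list(Eliza1 = c("1911. 1. 1     7.87", "1911. 1. 5    12.90"),
            Eliza2 = c("1911. 1. 1     4.87", "1911. 1. 5    11.90"))

# Hypothetical output folder; use any directory you like
out_dir <- file.path(tempdir(), "reformatted")
dir.create(out_dir, showWarnings = FALSE)

# One file per list element, named after the element
for (nm in names(res)) {
  writeLines(res[[nm]], file.path(out_dir, paste0(nm, "_new.txt")))
}
list.files(out_dir)
```

writeLines() writes the strings exactly as they are, so the spacing inside each line is preserved.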
lapply(res,function(x) str_count(x," "))
#$Eliza1
# [1] 7 7 7 7 6 7 7 7 7 6 5 5 6 6 6 6 5 6 6 6 5 5 4 4
#$Eliza2
# [1] 7 7 7 7 6 7 7 7 7 6 5 5 6 6 6 6 5 6 6 6 5 5 4 4
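The code above depends on every line having exactly 6 or 7 spaces. As a hedged alternative, not part of the original answer, the same result can be sketched with base R sub() and regular expressions. The helper name reformat_lines is hypothetical, and the sketch assumes dates of the form YYYY.MM.DD followed by at least two spaces; the third input line is made up to show a month without a leading zero.

```r
# Hypothetical helper: reformat a character vector of "YYYY.MM.DD   value" lines
reformat_lines <- function(x) {
  x <- sub(" +$", "", x)                           # drop trailing spaces
  x <- sub("^([^ ]+)  ", "\\1", x)                 # remove two spaces from the gap
  x <- sub("^([0-9]{4}\\.)0", "\\1 ", x)           # month: leading zero -> space
  x <- sub("^([0-9]{4}\\..[0-9]\\.)0", "\\1 ", x)  # day: leading zero -> space
  x
}

# Made-up input lines with 7, 6 and 7 spaces between the columns
x <- c(paste0("1911.01.01", strrep(" ", 7), "7.87"),
       paste0("1911.01.05", strrep(" ", 6), "12.90"),
       paste0("1911.10.01", strrep(" ", 7), "9.24"))
out <- reformat_lines(x)
out
```

Because the month and the day are matched separately, a date such as 1911.10.01 keeps its month and only loses the zero in the day.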
Hope this helps.
A.K.
________________________________
From: eliza botto <eliza_bo...@hotmail.com>
To: "smartpink...@yahoo.com" <smartpink...@yahoo.com>
Sent: Friday, February 15, 2013 4:47 PM
Subject: RE: data formatting
Thank you very much for replying, Arun. I just need to know what change I will
have to make if I am importing 147 data files into a list. What difference
will it make to the first command, which is:
Lines1<-readLines(textConnection("1911.01.01       7.87
1911.01.02       9.26
1911.01.03       8.06
1911.01.04       8.13
1911.01.05      12.90
1911.02.06       5.45
1911.02.07       3.26
1911.03.08       5.70
1911.03.09       9.24
1911.04.10       7.60
1911.05.11      14.82
1911.05.12      14.10
1911.06.13       7.87
1911.06.14       9.26
1911.07.15       8.06
1911.07.16       8.13
1911.08.17      12.90
1911.08.18       5.45
1911.09.19       3.26
1911.09.20       5.70
1911.10.21       9.24
1911.10.22       7.60
1911.11.23      14.82
1911.12.24      14.10"))
Thank you so very much...
elisa
> Date: Fri, 15 Feb 2013 11:11:36 -0800
> From: smartpink...@yahoo.com
> Subject: Re: data formatting
> To: eliza_bo...@hotmail.com
> CC: r-help@r-project.org
>
>
>
> Dear Eliza,
>
> Try this:
>
> Lines1<-readLines(textConnection("1911.01.01       7.87
> 1911.01.02       9.26
> 1911.01.03       8.06
> 1911.01.04       8.13
> 1911.01.05      12.90
> 1911.02.06       5.45
> 1911.02.07       3.26
> 1911.03.08       5.70
> 1911.03.09       9.24
> 1911.04.10       7.60
> 1911.05.11      14.82
> 1911.05.12      14.10
> 1911.06.13       7.87
> 1911.06.14       9.26
> 1911.07.15       8.06
> 1911.07.16       8.13
> 1911.08.17      12.90
> 1911.08.18       5.45
> 1911.09.19       3.26
> 1911.09.20       5.70
> 1911.10.21       9.24
> 1911.10.22       7.60
> 1911.11.23      14.82
> 1911.12.24      14.10"))
>
> Lines2<-Lines1[Lines1!=""]
> library(stringr)
> str_count(Lines2, " ")
> # [1] 7 7 7 7 6 7 7 7 7 7 6 6 7 7 7 7 6 7 7 7 7 7 6 6
>
>
> Lines2[str_count(Lines2," ")==7]<- str_replace(Lines2[str_count(Lines2," ")==7],"\\s+","     ") #reduced by 2 spaces
>
> Lines2[str_count(Lines2," ")==6]<- str_replace(Lines2[str_count(Lines2," ")==6],"\\s+","    ") #reduced by 2 spaces
> str_count(Lines2," ")
> # [1] 5 5 5 5 4 5 5 5 5 5 4 4 5 5 5 5 4 5 5 5 5 5 4 4
> #month and day are tested separately, so that e.g. "1911.10.01" keeps its month
> substr(Lines2[substr(Lines2,6,6)=="0"],6,6)<-" "
> substr(Lines2[substr(Lines2,9,9)=="0"],9,9)<-" "
> str_count(Lines2," ") #See the difference in spaces. This counts all the
> #spaces; here up to 2 spaces per line are added where a 0 was replaced.
> # [1] 7 7 7 7 6 7 7 7 7 6 5 5 6 6 6 6 5 6 6 6 5 5 4 4
> Lines2
> # [1] "1911. 1. 1     7.87" "1911. 1. 2     9.26" "1911. 1. 3     8.06"
> # [4] "1911. 1. 4     8.13" "1911. 1. 5    12.90" "1911. 2. 6     5.45"
> # [7] "1911. 2. 7     3.26" "1911. 3. 8     5.70" "1911. 3. 9     9.24"
> #[10] "1911. 4.10     7.60" "1911. 5.11    14.82" "1911. 5.12    14.10"
> #[13] "1911. 6.13     7.87" "1911. 6.14     9.26" "1911. 7.15     8.06"
> #[16] "1911. 7.16     8.13" "1911. 8.17    12.90" "1911. 8.18     5.45"
> #[19] "1911. 9.19     3.26" "1911. 9.20     5.70" "1911.10.21     9.24"
> #[22] "1911.10.22     7.60" "1911.11.23    14.82" "1911.12.24    14.10"
>
> A.K.
> ________________________________
> From: eliza botto <eliza_bo...@hotmail.com>
> To: "smartpink...@yahoo.com" <smartpink...@yahoo.com>
> Sent: Friday, February 15, 2013 12:38 PM
> Subject: data formatting
>
>
>
> Dear Arun,
> [A text file is also attached in case the formatting is changed.]
> I need your data-management expertise on the following issue.
> I have data like the following table:
>
> 1911.01.01       7.87 ##(7 spaces between the columns)
> 1911.01.02       9.26 ##(7 spaces between the columns)
> 1911.01.03       8.06 ##(7 spaces between the columns)
> 1911.01.04       8.13 ##(7 spaces between the columns)
> 1911.01.05      12.90 ##(6 spaces between the columns)
> 1911.02.06       5.45 ##(7 spaces between the columns)
> 1911.02.07       3.26 ##(7 spaces between the columns)
> 1911.03.08       5.70 ##(7 spaces between the columns)
> 1911.03.09       9.24 ##(7 spaces between the columns)
> 1911.04.10       7.60 ##(7 spaces between the columns)
> 1911.05.11      14.82 ##(6 spaces between the columns)
> 1911.05.12      14.10 ##(6 spaces between the columns)
> 1911.06.13       7.87 ##(7 spaces between the columns)
> 1911.06.14       9.26 ##(7 spaces between the columns)
> 1911.07.15       8.06 ##(7 spaces between the columns)
> 1911.07.16       8.13 ##(7 spaces between the columns)
> 1911.08.17      12.90 ##(6 spaces between the columns)
> 1911.08.18       5.45 ##(7 spaces between the columns)
> 1911.09.19       3.26 ##(7 spaces between the columns)
> 1911.09.20       5.70 ##(7 spaces between the columns)
> 1911.10.21       9.24 ##(7 spaces between the columns)
> 1911.10.22       7.60 ##(7 spaces between the columns)
> 1911.11.23      14.82 ##(6 spaces between the columns)
> 1911.12.24      14.10 ##(6 spaces between the columns)
> I want it to be in the following format, and afterwards I want to save
> that file in ".txt" format.
> 1911. 1. 1     7.87 ##(5 spaces between the columns)
> 1911. 1. 2     9.26 ##(5 spaces between the columns)
> 1911. 1. 3     8.06 ##(5 spaces between the columns)
> 1911. 1. 4     8.13 ##(5 spaces between the columns)
> 1911. 1. 5    12.90 ##(4 spaces between the columns)
> 1911. 2. 6     5.45 ##(5 spaces between the columns)
> 1911. 2. 7     3.26 ##(5 spaces between the columns)
> 1911. 3. 8     5.70 ##(5 spaces between the columns)
> 1911. 3. 9     9.24 ##(5 spaces between the columns)
> 1911. 4.10     7.60 ##(5 spaces between the columns)
> 1911. 5.11    14.82 ##(4 spaces between the columns)
> 1911. 5.12    14.10 ##(4 spaces between the columns)
> 1911. 6.13     7.87 ##(5 spaces between the columns)
> 1911. 6.14     9.26 ##(5 spaces between the columns)
> 1911. 7.15     8.06 ##(5 spaces between the columns)
> 1911. 7.16     8.13 ##(5 spaces between the columns)
> 1911. 8.17    12.90 ##(4 spaces between the columns)
> 1911. 8.18     5.45 ##(5 spaces between the columns)
> 1911. 9.19     3.26 ##(5 spaces between the columns)
> 1911. 9.20     5.70 ##(5 spaces between the columns)
> 1911.10.21     9.24 ##(5 spaces between the columns)
> 1911.10.22     7.60 ##(5 spaces between the columns)
> 1911.11.23    14.82 ##(4 spaces between the columns)
> 1911.12.24    14.10 ##(4 spaces between the columns)
> You can see that the spaces between the columns need to be reduced in the
> output file, and the zeros in the month and day parts of the date column
> need to be replaced with spaces.
> Thank you very much in advance.
> elisa
1911.01.01       7.87
1911.01.02       9.26
1911.01.03       8.06
1911.01.04       8.13
1911.01.05      12.90
1911.02.06       5.45
1911.02.07       3.26
1911.03.08       5.70
1911.03.09       9.24
1911.04.10       7.60
1911.05.11      14.82
1911.05.12      14.10
1911.06.13       7.87
1911.06.14       9.26
1911.07.15       8.06
1911.07.16       8.13
1911.08.17      12.90
1911.08.18       5.45
1911.09.19       3.26
1911.09.20       5.70
1911.10.21       9.24
1911.10.22       7.60
1911.11.23      14.82
1911.12.24      14.10
1911.01.01       4.87
1911.01.02       11.26
1911.01.03       6.06
1911.01.04       8.13
1911.01.05      11.90
1911.02.06       5.55
1911.02.07       3.16
1911.03.08       5.10
1911.03.09       9.34
1911.04.10       7.10
1911.05.11      14.92
1911.05.12      14.20
1911.06.13       7.77
1911.06.14       9.36
1911.07.15       8.66
1911.07.16       8.23
1911.08.17      11.90
1911.08.18       15.45
1911.09.19       13.26
1911.09.20       15.77
1911.10.21       19.34
1911.10.22       7.66
1911.11.23      14.84
1911.12.24      14.11
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.