On 15/08/2011 11:47 AM, R Saba wrote:
Reading data with variable column widths.
Here are several lines of a txt data set I would like to read.
The number of variables is fixed at 13. The problem is how to read the
first variable when it can contain blank spaces, for example "Alabama
(Seasonally Adjusted)", "St. Clair", etc.

I assume those commas are thousands separators.

I'd do it in several steps:
1. Use readLines() to read the data into a character vector, without parsing the lines.
2. Remove all the commas.
3. Replace the last 12 spaces on each line with some unique separator (e.g. a comma, now that the original commas are all gone).

This step is the hardest. There's likely a regular expression that does that; another way to do it would be to replace the last space, 12 times. The regular expression for the last space, with the rest of the line matched as well, is " ([^ ]*)$". So this should work:

for (i in 1:12)
  lines <- sub(" ([^ ]*)$", ",\\1", lines)

4.  Now read the text strings using read.csv() or whatever.
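
Putting the four steps together, a minimal sketch might look like the following. It assumes each record sits on a single line in the real file (the e-mail wrapped them), so two of the sample records are rejoined by hand here; the file name in the comment is hypothetical, and stripping the "%" signs at the end is an extra step not mentioned above.

```r
# Two sample records, rejoined onto one line each (assumption: the
# original file has one record per line; the e-mail wrapped them).
lines <- c(
  "Alabama (Seasonally Adjusted) 2,168,870 2,162,604 2,122,787 1,954,895 1,956,026 1,925,007 213,975 206,578 197,780 9.9% 9.6% 9.3%",
  "St. Clair 36,821 36,139 35,964 33,233 33,021 32,540 3,588 3,118 3,424 9.7% 8.6% 9.5%"
)
# Step 1 on a real file would be something like:
#   lines <- readLines("labor_force.txt")   # hypothetical file name

# Step 2: remove the thousands separators.
lines <- gsub(",", "", lines, fixed = TRUE)

# Step 3: turn the last 12 spaces into commas, one at a time.
# The result of sub() must be assigned back, or nothing changes.
for (i in 1:12)
  lines <- sub(" ([^ ]*)$", ",\\1", lines)

# Step 4: parse the now comma-separated strings.
dat <- read.csv(text = lines, header = FALSE,
                stringsAsFactors = FALSE, strip.white = TRUE)

# Extra step (not in the steps above): convert "9.9%" etc. to numbers.
pct <- 11:13
dat[pct] <- lapply(dat[pct], function(x) as.numeric(sub("%", "", x, fixed = TRUE)))

str(dat)
```

The first column keeps its embedded spaces ("St. Clair") because only the last 12 spaces on each line are replaced; any spaces before those stay part of the name.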

Duncan Murdoch

Alabama (Seasonally Adjusted) 2,168,870 2,162,604 2,122,787 1,954,895
1,956,026 1,925,007 213,975 206,578 197,780 9.9% 9.6% 9.3%
Alabama (Not Seasonally Adjusted) 2,185,690 2,155,322 2,135,467 1,955,512
1,951,696 1,930,257 230,178 203,626 205,210 10.5% 9.4% 9.6%
Autauga 24,743 24,472 24,234 22,355 22,373 22,394 2,388 2,099 1,840 9.7%
8.6% 7.6%
Baldwin 86,185 84,039 83,698 78,160 76,934 76,736 8,025 7,105 6,962 9.3%
8.5% 8.3%
Barbour 9,954 9,706 9,737 8,611 8,546 8,588 1,343 1,160 1,149 13.5% 12.0%
11.8%
......
St. Clair 36,821 36,139 35,964 33,233 33,021 32,540 3,588 3,118 3,424 9.7%
8.6% 9.5%
.......
Winston 9,150 8,986 9,295 7,779 7,717 7,933 1,371 1,269 1,362 15.0% 14.1%
14.7%
United States (Seasonally Adjusted) 153,421,000 153,693,000 153,684,000
139,334,000 139,779,000 139,092,000 14,087,000 13,914,000 14,593,000 9.2%
9.1% 9.5%
United States (Not Seasonally Adj.) 154,538,000 153,449,000 154,767,000
140,129,000 140,028,000 139,882,000 14,409,000 13,421,000 14,885,000 9.3%
8.7% 9.6%

Thanks,
Richard Saba

--
View this message in context: 
http://r.789695.n4.nabble.com/Read-variable-column-width-data-tp3744922p3744922.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
