On Feb 2, 2010, at 1:33 PM, trece por ciento wrote:

Thanks again, David
I think that this could work.
Final questions:
1. I have read that read.fwf could be slow for big tables (my tables have approx. 160000 records, with a record length of 176 characters, almost 28 MBytes). Could that be a problem?

I don't know. Too many details are missing.

2. If using read.fwf is not a problem, wouldn't it be better to read all the records with read.fwf into a data frame with the Type 1 structure, and then process the Type 2 records in the data frame, adding new fields for these records (NULL valued for Type 1)?

If they had the same dividing points that might work, but I am guessing that it would not. What about reading it in for one record type, outputting the good records, then redoing the input with the other record type?
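A minimal sketch of that two-pass idea, assuming the record-type field sits at the same position in both layouts. The toy file, the widths and the column names are all made up for illustration; the real survey layout is not given in this thread.

```r
# Toy file standing in for the real survey file (hypothetical data)
path <- tempfile(fileext = ".txt")
writeLines(c("011125678", "011136779", "011286711", "011135628"), path)

# Pass 1: parse every line with the (assumed) Type 1 layout, keep Type 1 rows
all1 <- read.fwf(path, widths = c(3, 1, 2, 3),
                 col.names = c("id", "type", "a", "b"),
                 colClasses = "character")
type1 <- all1[all1$type == "1", ]

# Pass 2: re-read the same file with the (assumed) Type 2 layout, keep Type 2 rows
all2 <- read.fwf(path, widths = c(3, 1, 5),
                 col.names = c("id", "type", "c"),
                 colClasses = "character")
type2 <- all2[all2$type == "2", ]
```

Reading everything as character keeps leading zeros intact; columns can be converted afterwards with as.numeric() where needed.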

--
David.

Hug

--- On Mon, 2/1/10, David Winsemius <dwinsem...@comcast.net> wrote:

From: David Winsemius <dwinsem...@comcast.net>
Subject: Re: [R] Import fixed-format ascii file with mixed record types
To: "trece por ciento" <el13porcie...@yahoo.com>
Cc: r-help@r-project.org
Date: Monday, February 1, 2010, 2:23 PM

On Feb 1, 2010, at 2:33 PM, trece por ciento wrote:

Thanks David, but can read.fwf cope with different
record types?
For example, if recordtype is the 4th character, I
could have:

011125678 ---> This is record Type 1
011136779 ---> This is record Type 1
011124943 ---> This is record Type 1
011286711 ---> This is record Type 2
011234872 ---> This is record Type 2
011135628 ---> This is record Type 1

So, how can I tell read.fwf to take the correct type
into account?

You may need to separate the line-types first. If the
numbers of lines are not too large then this would exemplify
a strategy:

txt <- "011125678
011136779
011124943
011286711
011234872
011135628"

# Read the lines once, then extract the record type (4th character)
recs <- readLines(textConnection(txt))
rectype <- substr(recs, 4, 4)
rectype
# [1] "1" "1" "1" "2" "2" "1"

# Split the lines by record type
file1 <- recs[rectype == "1"]
file2 <- recs[rectype == "2"]
file1
# [1] "011125678" "011136779" "011124943" "011135628"
file2
# [1] "011286711" "011234872"

Then these text objects could be processed with
read.fwf(textConnection(file1), widths = ...), using the widths for
that record type, and the same for file2.
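As a rough sketch of that last step on the toy data: the field widths and column names below are assumptions chosen only so the example runs, not the survey's actual layout.

```r
txt <- "011125678
011136779
011124943
011286711
011234872
011135628"

# Split the lines by record type (4th character)
recs <- readLines(textConnection(txt))
rectype <- substr(recs, 4, 4)
file1 <- recs[rectype == "1"]
file2 <- recs[rectype == "2"]

# Hypothetical Type 1 layout: id (3), type (1), a (2), b (3)
con1 <- textConnection(file1)
df1 <- read.fwf(con1, widths = c(3, 1, 2, 3),
                col.names = c("id", "type", "a", "b"),
                colClasses = "character")
close(con1)

# Hypothetical Type 2 layout: id (3), type (1), c (5)
con2 <- textConnection(file2)
df2 <- read.fwf(con2, widths = c(3, 1, 5),
                col.names = c("id", "type", "c"),
                colClasses = "character")
close(con2)
```

Each record type ends up in its own data frame with its own columns, so no filler fields are needed.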

--David.

Thanks again,
Hug

--- On Mon, 2/1/10, David Winsemius <dwinsem...@comcast.net>
wrote:

From: David Winsemius <dwinsem...@comcast.net>
Subject: Re: [R] Import fixed-format ascii file with
mixed record types
To: "trece por ciento" <el13porcie...@yahoo.com>
Cc: r-help@r-project.org
Date: Monday, February 1, 2010, 12:01 PM


On Feb 1, 2010, at 11:40 AM, trece por ciento wrote:

I need to import several ascii files in fixed
format with two different record types. The data comes from
European Labor Force Surveys, which is a household survey.
The first record type is for people over 16 years, and the
second, much shorter, is for people aged 15 or less (this
record has a filler with several blanks to get the same
record length).
The files typically have 160000 records, with 176
characters per record; the data is numeric, corresponding to
102 variables, mostly integers (seven variables have two
decimals). My operating system is Windows XP.
My questions:
1. Which do you think is the best way to import the
files into R?


?read.fwf

2. Could you give me any references or examples?

There are examples in the help page.

Thanking you in advance,
Hug





______________________________________________
R-help@r-project.org
mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained,
reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT





