[R] Fixed Width EBCDIC Files in R

Brian Trautman Thu, 05 Feb 2015 13:20:07 -0800

I'm trying to read some mainframe data encoded as EBCDIC into R, and am at
a loss. I'd like to avoid using an external program to convert the files,
since I'm operating in a corporate environment.


You can find the example files at the link below, with both ASCII and
EBCDIC versions. Note that there are no linebreaks in the EBCDIC versions
of the file -- instead, I'd be specifying the width of each line manually.
R has the IBM500 encoding available in my environment, which should be the
correct one for these files.

However, when I run the following commands, R seems to fail entirely.  It
loads a single record with garbage characters, regardless of the encoding I
specified.


layout <- read.fwf("EBCDIC_LAYOUT", widths = c(80), fileEncoding='ibm500')

data   <- read.fwf("EBCDIC_ZIPCODE", widths = c(32), fileEncoding='ibm500')


Where might I go from here?

Related -- some of the files I expect to use will be fairly large (1 GB or
so). Preferably, I'd like a solution that scales reasonably well. (I tried
packages like LaF, but they don't have the option to select encoding.)
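
One direction that might be worth trying, given that the files have no
line breaks: read the raw bytes in fixed-size chunks, cut them into
records by width, and only then convert each record with iconv(). The
following is a minimal sketch, not a tested solution for these files. The
function names read_ebcdic_records and split_fields are made up for
illustration, the chunk size is an arbitrary guess, and it assumes every
field is plain text (packed-decimal or binary fields would not survive a
character conversion).

```r
# Hypothetical helper: read fixed-width records from a file that has no
# line breaks, converting each record from `encoding` to UTF-8.
# Reads `chunk_records` records at a time to bound the raw buffer size.
read_ebcdic_records <- function(path, record_width, chunk_records = 10000L,
                                encoding = "IBM500") {
  con <- file(path, open = "rb")
  on.exit(close(con))
  out <- list()
  repeat {
    bytes <- readBin(con, what = "raw", n = record_width * chunk_records)
    n_full <- length(bytes) %/% record_width
    if (n_full == 0L) break  # end of file (a trailing partial record is dropped)
    # Cut the chunk into one raw vector per record.
    recs <- lapply(seq_len(n_full), function(i) {
      bytes[((i - 1L) * record_width + 1L):(i * record_width)]
    })
    # iconv() accepts a list of raw vectors and returns a character vector.
    out[[length(out) + 1L]] <- iconv(recs, from = encoding, to = "UTF-8")
  }
  as.character(unlist(out))
}

# Hypothetical helper: split converted records into columns by field width.
split_fields <- function(records, widths) {
  ends <- cumsum(widths)
  starts <- ends - widths + 1L
  fields <- lapply(seq_along(widths), function(j) {
    substring(records, starts[j], ends[j])
  })
  names(fields) <- paste0("V", seq_along(widths))
  as.data.frame(fields, stringsAsFactors = FALSE)
}
```

For the zip code file this would presumably be called as
read_ebcdic_records("EBCDIC_ZIPCODE", record_width = 32L), with the field
widths for split_fields() taken from the layout file. For a 1 GB file the
records from each chunk would probably need to be processed or written
out as they are read rather than collected in memory as done here.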

Thank you very much!


Example files --
https://drive.google.com/open?id=0ByvX1v-WqaaASTdwV2ZYS0pBV00&authuser=0


