Re: [R] OT: batch processing XLS files to CSV

Prof Brian Ripley Thu, 29 May 2008 02:51:23 -0700

I think the issue here is that the Excel spreadsheets as described do notcontain simple tables of the form that the various .xls readers describedin 'R Data Import/Export' handle. Your example is not applicable (and ifit were there are several alternatives in that manual).


I would look into the Perl solutions for parsing .xls files.


On Thu, 29 May 2008, Pascal Pascal wrote:

Hello,

I'm not a guru but we have to do reporting with data in Excel files. We use
"xlsReadWrite" package.

Below an exemple, with an "Excel.R" file which import "MySheet" sheet from a
"MyFile.xls" workbook. First column are numeric, Second are integer. Third
character and so on.

library(xlsReadWrite)
tmp <- read.xls( "MyFile.xls", sheet="MySheet", colClasses = c("numeric",
"integer", "character", "integer"), colNames=T )

To extract all columns behind a definied column you can use a command like
this:
tmp <- tmp[,seq( which(colnames(tmp)=="StartColumn"),ncol(tmp))])      #

From column "StartColumn" to the end

tmp <- tmp[,seq(
which(colnames(tmp)=="StartColumn"),which(colnames(tmp)=="EndColumn")])
# From column "StartColumn" to "EndColumn"

You can call routinely this procedure by a simple batch file (".bat") under
windows by this command.
start /B /WAIT "Rcmd" "%Path to R%\bin\Rcmd.exe" BATCH "C:\R\excel.R"

You'll find more information on xlsReadWrite on CRAN.

You can also search in CPAN  and on the web. There are more than one
solution to import directly Excel files with Perl.


Indeed, and CRAN package gdata has an interface to one of them.

Be more precise and give example if you want more help.

Hope that helps,

Pascal

2008/5/28 Chris Evans <[EMAIL PROTECTED]>:

Dear R gurus, particularly those of generous M$ tolerance and diverse gifts
and knowledge!

I have an interesting challenge that I will end up crunching in R involving
service usage by patients.  Maybe I can do all of it in R but I can't see
how yet.

My situation is that our IT Department can give me loads of XLS files about
patients one of our services have seen.  The are one per patient per time
period.  All the data are in the first sheet of the XLS files and that sheet
contains four variable length but fixed format matrices of data:
1) demographics (actually, this is fixed length, one row!);
2) community contacts with services, variable length, rarely zero rows but
could be;
3) inpatient admissions, variable length, often zero rows;
and CPA information (don't ask what that is!), two rows, fixed format, just
to make things tricky, they're spearated by a fixed few junk rows in the xls
files. The column format of each block is different.

Each block starts with standard label rows so it will be easy to identify
these start points and know the format on the rows that follow each one.  I
could use perl to scan for these and then read the zero to many lines of the
data in the matrix and end on finding the next header.

I would be fairly happy to do this with perl but would need to convert the
xls (xls 2002) files to CSV to get at them in Perl (I think).

Anyone out there done anything like this and can give me any advice? I'm
sorry, I'm sure there are more specific lists or web resources but I think
the skills are here too and if someone can tell me how to do this all in R,
I'd be fascinated.

Many thanks,

Chris

--
Chris Evans <[EMAIL PROTECTED]> Skype: chris-psyctc
Professor of Psychotherapy, Nottingham University;
Consultant Psychiatrist in Psychotherapy, Notts PDD network;
Research Programmes Director, Nottinghamshire NHS Trust;
*If I am writing from one of those roles, it will be clear. Otherwise*
*my views are my own and not representative of those institutions    *

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] OT: batch processing XLS files to CSV

Reply via email to