Hello all: I'm hoping you can help me determine the source of this problem.

I've just used read.csv to bring a small dataset (581 rows, 9 variables) into R
(2.7.0, Mac OS 10.5.5). The dataset was created in Excel 2008 from a
data dump from an Oracle database. I've done this many times before and had
no problems.

When I subset the dataset ("a"), the result appears to contain extra rows
filled with NAs. For example:

> a[a$mmt.dose == 10, ]
       ID COHORT    F st.y st.m st.d days   md mmt.dose
NA     NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
NA.1   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
NA.2   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
NA.3   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
NA.4   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
NA.5   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
222    88      V   PC   NA   NA   NA   NA MOSE       10
NA.6   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
NA.7   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
NA.8   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
NA.9   NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
NA.10  NA   <NA> <NA>   NA   NA   NA   NA <NA>       NA
474   756      V    C 2004   10    1 1553 UNKN       10

I've examined the original CSV file and also exported the "a" dataset to a
CSV and found no source for these entries.
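
In case it helps, here is a minimal sketch that appears to reproduce the
symptom. It assumes the extra rows come from missing values in mmt.dose, which
I have not confirmed for the real data. The data frame "toy" and its values are
made up for illustration and are not my dataset.

```r
# Toy data frame (hypothetical values, NOT the real dataset "a").
toy <- data.frame(
  ID       = c(1, 2, 3, 4),
  mmt.dose = c(10, NA, 25, NA)
)

# Comparing a column that contains NA gives NA, not FALSE, at those positions.
cmp <- toy$mmt.dose == 10
print(cmp)

# Indexing rows with a logical vector that contains NA returns one
# all-NA row for each NA in the index.
with_na_rows <- toy[cmp, ]
print(with_na_rows)

# Count the missing values in the column being compared.
n_missing <- sum(is.na(toy$mmt.dose))
print(n_missing)

# Two ways to keep only rows where the comparison is actually TRUE:
# which() drops the NA positions; the second form tests is.na() explicitly.
clean_which <- toy[which(toy$mmt.dose == 10), ]
clean_isna  <- toy[!is.na(toy$mmt.dose) & toy$mmt.dose == 10, ]
print(clean_which)
```

If the same thing is happening with "a", then sum(is.na(a$mmt.dose)) should be
nonzero, and a[which(a$mmt.dose == 10), ] should return only the real rows.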

Any help would be much appreciated!


M-J


--

PhD student,
School of Population and Public Health,
University of British Columbia
Musqueam Territory, British Columbia

Research Assistant,
Urban Health Research Institute,
BC Centre for Excellence in HIV/AIDS
St. Paul's Hospital,
Vancouver, Canada


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.