On Feb 29, 2012, at 7:00 PM, Francesco Sarracino wrote:

Dear R listers,

I have a silly problem. I am trying to load a dta (Stata) file in R.
The dta is about 650 MB and contains the integrated World Values
Survey/ European Value Study data-set.
My problem is that I can't manage to load the file. Almost 1 hour
after issuing the following command:
data <- read.dta("http://www.stata-press.com/data/kkd/data1.dta",
                 convert.dates = TRUE, convert.factors = TRUE,
                 missing.type = FALSE,
                 convert.underscore = FALSE, warn.missing.labels = TRUE)

I get a MUCH smaller data.frame:

require(foreign)
...then your code:

(Almost instantaneous return to console prompt.)

> str(data)
'data.frame':   3340 obs. of  47 variables:
 $ persnr : int 2229 3994 6326 8660 10622 13277 15241 17852 19635 21501 ...
 $ intnr : int 145700 256862 166979 120826 154849 138118 13277 160539 194697 150495 ...
 $ state : Factor w/ 16 levels "Berlin","Schl.Hst",..: 6 15 2 6 6 10 6 6 10 9 ...
 $ gender : Factor w/ 2 levels "Maenner","Frauen": 1 1 2 1 1 1 1 1 2 2 ...
Snipped a few pages...


The column names don't really look like what you describe:

>  names(data)
 [1] "persnr"   "intnr"    "state"    "gender"   "ybirth"   "ymove"
 [7] "ybuild"   "hcond"    "sqm"      "rooms"    "fseval"   "kitchen"
[13] "shower"   "wc"       "heating"  "cellar"   "balcony"  "garden"
[19] "phone"    "renttype" "rent"     "renteval" "hhtype"   "htype"
[25] "area"     "np11701"  "np0105"   "np9401"   "np9402"   "np9403"
[31] "np9501"   "np9502"   "np9503"   "np9504"   "np9506"   "np9507"
[37] "hhpos"    "hhsize"   "marital"  "edu"      "voc"      "yedu"
[43] "emp"      "occ"      "hhinc"    "income"   "egph"

>  sessionInfo()
R version 2.14.0 Patched (2011-11-13 r57650)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     grDevices utils     datasets  graphics  methods
[7] base

other attached packages:
[1] foreign_0.8-47 sos_1.3-1      brew_1.0-6     lattice_0.20-0

loaded via a namespace (and not attached):
[1] grid_2.14.0  tools_2.14.0
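
One way to check whether memory is the bottleneck is to compare the size of a .dta file on disk with the size of the resulting object in R, and to time the read. What follows is only a minimal sketch: the data frame `toy` and its columns are hypothetical stand-ins (the real survey file is not used), and it assumes nothing beyond the `foreign` package already loaded above.

```r
library(foreign)  # provides read.dta() and write.dta()

# Hypothetical stand-in data; the real survey file is not used here.
set.seed(1)
n <- 10000
toy <- data.frame(
  id     = seq_len(n),
  score  = rnorm(n),
  gender = factor(sample(c("male", "female"), n, replace = TRUE))
)

# Write a Stata file to a temporary location, then time reading it back.
path <- tempfile(fileext = ".dta")
write.dta(toy, path)
timing <- system.time(
  back <- read.dta(path, convert.factors = TRUE)
)

# Compare on-disk size with in-memory size (both in bytes).
disk_bytes <- file.info(path)$size
mem_bytes  <- as.numeric(object.size(back))
print(timing)
cat("on disk:", disk_bytes, "bytes; in memory:", mem_bytes, "bytes\n")
```

Running the same comparison on a subset of the real file might show how much larger the data get once in memory. If that size approaches the 4 GB of RAM reported by the original poster, heavy swapping would be a plausible explanation for the unresponsive system, though that is a guess rather than a diagnosis.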



I still don't have my data loaded. Moreover, my system becomes very
slow and unresponsive.
I can't figure out what is going on.
Here are my specs:
Ubuntu Linux 11.10 x86_64-pc-linux-gnu (64-bit)
Intel Core i7, 4 GB RAM, 367 GB Free HD, 8 GB swap memory
R:
R version 2.14.1 (2011-12-22)

Can you please help me figure out what's wrong? I find it hard to
believe that R can't handle files of this size.
Thanks a lot,
f.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
