Peter Dalgaard wrote:
> g...@ucalgary.ca wrote:
>> We would like to load data from Statistics Canada
>> (http://www.statcan.gc.ca/) using R,
>> for example, Employment and unemployment rates.
>> It seems to me that the tables are displayed in HTML.
>> I was wondering if you know how to load these tables. Thanks,
>
> I suspect the answer is "with some difficulty". You can do stuff like
> this, based on using the clipboard. Go to

or maybe

    library(XML)
    document = htmlParse('http://www.statcan.gc.ca/daily-quotidien/090520/t090520b1-eng.htm')
    rows = xpathSApply(document, '//table/tbody/tr')

and then use further xpaths to extract the content of interest.

vQ

>
> http://www.statcan.gc.ca/daily-quotidien/090520/t090520b1-eng.htm
>
> mark the contents of the table, then
>
> > dd <- t(read.delim("clipboard", colClasses="character"))
> > dd1 <- dd[-1,] # 1st row are labels
> > dd2 <- as.numeric(gsub(",","",dd1)) # strip thousands separators
> Warning message:
> NAs introduced by coercion
> > dim(dd2) <- dim(dd1)
> > dd2
>      [,1]  [,2]  [,3]   [,4]    [,5]     [,6]  [,7] [,8] [,9]   [,10]  [,11]
> [1,]   NA 226.8 123.1 2948.0 11630.0 178768.0 122.5   NA 37.6 27822.0  1.760
> [2,]   NA 224.6 117.7 2945.0 10709.0 181862.0 121.7   NA 37.1 28822.0  1.750
> [3,]   NA 222.0 109.5 2932.0  9694.0 185068.0 121.1   NA 36.9 27801.0  1.730
> [4,]   NA 218.8 101.2 2924.0  8968.0 187636.0 120.6   NA 36.7 26560.0  1.690
> [5,]   NA 215.6  97.2 2920.0  8759.0 189702.0 120.1   NA 36.4 23762.0  1.640
> [6,]   NA 213.3  96.0 2918.0  8770.0 191343.0 119.7   NA 36.2 22029.0  1.600
> [7,]   NA  -1.1  -1.2   -0.1     0.1      0.9  -0.3   NA -0.5    -7.3 -0.045
>      [,12]  [,13]  [,14] [,15]
> [1,]    NA 2959.0 9637.0 221.8
> [2,]    NA 2963.0 9635.0 218.4
> [3,]    NA 2966.0 9587.0 217.1
> [4,]    NA 2939.0 9368.0 211.2
> [5,]    NA 2915.0 9325.0 209.4
> [6,]    NA 2879.0 9199.0 210.5
> [7,]    NA   -1.2   -1.4   0.5

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
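
For anyone who wants to try the clipboard recipe above without a browser at hand, here is a minimal sketch of the same steps in base R. The `clip` string is a made-up stand-in for what would be on the clipboard (the series names and values are illustrative assumptions, not Statistics Canada data), and `read.delim(text = ...)` is swapped in for `read.delim("clipboard")` so the example runs anywhere.

```r
# Hypothetical stand-in for the clipboard contents: a tab-separated table
# with a label column first, then one column per period.
clip <- paste(
  "Series\tJan\tFeb\tChange",
  "Employment\t2,948.0\t2,945.0\t-0.1",
  "Claims\t11,630\t10,709\t-7.9",
  sep = "\n"
)

# Read everything as character so values like "2,948.0" survive intact,
# then transpose as in the post above.
dd <- t(read.delim(text = clip, colClasses = "character"))
dd1 <- dd[-1, , drop = FALSE]   # first row holds the series labels

# Strip thousands separators and convert to numeric. as.numeric() drops
# the dimensions, so the matrix shape is rebuilt explicitly.
dd2 <- matrix(as.numeric(gsub(",", "", dd1)),
              nrow = nrow(dd1),
              dimnames = list(rownames(dd1), dd[1, ]))
dd2
```

Rebuilding with matrix() instead of assigning dim() is only a convenience here, since it lets the labels from the first row be kept as column names. With a real page, cells that are not numbers (section headings, footnote marks) would still come out as NA with the coercion warning shown above.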