I have been tinkering with this for a bit, and I am proud to share navel.gazer 1.0.

If no arguments are passed, it looks up the top 50 authors on the r-help list for the current month of the current year. You can also specify one or more months as a character vector (e.g., "August" or c("August", "September")). The same goes for years. Thanks to some help from Henrique (although I promise this was not the only reason I wanted it), it will not try to pull a month beyond the current one.

You can also choose a different list (such as r-devel). If you have the time, you can set the argument entire = TRUE, which looks up every month of whatever year(s) you specified (or of the current year if you did not). It returns a named list with each element corresponding to one month. By default it also creates a dotplot in lattice for each month (this can be turned off via plot = FALSE). Finally, you can specify how many authors you want via n; it defaults to 50.

It is also available here: http://gist.github.com/584910

#######################################
navel.gazer <- function(month = NULL, year = NULL, entire = FALSE,
                        list = "r-help", n = 50, plot = TRUE) {
  # Ben Bolker came up with most of the code
  # Henrique Dallazuanna provided an edit to the z <- line of code
  # Brian Diggs provided capwords() to properly count Peter Dalgaard
  # Joshua Wiley adapted all of it to one function

  if(is.null(month)) {
    month <- format(Sys.Date(), format = "%B")
  }
  if(isTRUE(entire)) {
    month <- unique(months(as.Date(1:365, "2000-01-01")))
  }
  if(is.null(year)) {
    year <- format(Sys.Date(), format = "%Y")
  }
  if(length(year) > 1) {
    tmp <- vector(mode = "list", length = length(year))
    for(i in seq_along(year)) {
      tmp[[i]] <- paste(year[i], month, sep = "-")
    }
    times <- unlist(tmp)
  } else {
    times <- paste(year, month, sep = "-")
  }

  require(zoo)
  times <- sort(as.yearmon(times, "%Y-%B"))
  current <- as.yearmon(Sys.Date(), "%Y-%m")
  times <- format(times[times <= current], "%Y-%B")

  # Function to extract the names
  # Originally by Ben Bolker
  namefun <- function(x) {
    gsub("\\n","",gsub("^.+<I>","",gsub("</I>.+$","",x)))
  }

  # Based on a suggestion by Brian Diggs
  # Capitalizes the first letter of each word
  capwords <- function(s, strict = FALSE) {
    cap <- function(s) paste(toupper(substring(s,1,1)),
                             {s <- substring(s,2); if(strict) tolower(s) else s},
                             sep = "", collapse = " " )
    sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
  }

  # Collects the author names for the relevant month and list
  # from the R archives
  # Originally by Ben Bolker
  grabber <- function(month, list, n) {
    baseurl <- "https://stat.ethz.ch/pipermail/"
    require(RCurl)
    # z <- getURL(paste(baseurl,list,"/",month,"/author.html",sep=""))
    z <- getURL(paste(baseurl,list,"/", month,"/author.html",sep=""),
                ssl.verifypeer = FALSE)
    zz <- strsplit(z,"<LI>")[[1]]
    cnames <- capwords(sapply(zz[3:(length(zz)-1)],namefun))
    rr <- rev(sort(table(cnames)))
    output <- rr[1:n]
    return(output)
  }

  # Create dot plots of the number of posts
  # lattice dotplot() code primarily by Ben Bolker
  plotter <- function(dat) {
    require(lattice)
    if(length(dat) > 1) {
      old.par <- par(no.readonly = TRUE)
      on.exit(par(old.par))
      par("ask" = TRUE)
    }
    for(i in seq_along(dat)) {
      print(dotplot(~rev(dat[[i]]), xlab = "Number of posts",
                    main = names(dat)[i]))
    }
    invisible()
  }

  numbers <- lapply(times, function(x) {grabber(month = x, list = list, n = n)})
  names(numbers) <- times

  if(plot) {
    plotter(dat = numbers)
  }
  return(numbers)
}
#######################################

Two examples:

navel.gazer(year = 2009, entire = TRUE)
navel.gazer(month = "September", year = 2007)

This basically works out to be a tribute to David Winsemius:

navel.gazer(n = 1, entire = TRUE)

Hope you get a bit of fun out of this. I certainly enjoyed writing it!

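Since the return value is a named list with one element per month, the monthly counts can be combined after the fact. Below is a minimal sketch of one way to total posts per author across months. It is not part of navel.gazer(); the list `numbers`, its author names, and its counts are invented purely for illustration (a real result would come from a call with plot = FALSE, and its elements would be tables rather than plain named vectors, though names() works the same way on both).

```r
# Hypothetical stand-in for the value returned by navel.gazer();
# authors and counts are made up for illustration only.
numbers <- list(
  "2010-August"    = c("Author A" = 30, "Author B" = 12),
  "2010-September" = c("Author A" = 5,  "Author C" = 20)
)

# Stack the per-month counts into one long data frame ...
long <- do.call(rbind, lapply(names(numbers), function(m) {
  data.frame(month  = m,
             author = names(numbers[[m]]),
             posts  = as.vector(numbers[[m]]),
             stringsAsFactors = FALSE)
}))

# ... then sum the posts per author across months, largest first.
totals <- sort(tapply(long$posts, long$author, sum), decreasing = TRUE)
print(totals)
# Author A: 35, Author C: 20, Author B: 12
```

The long data frame is kept as an intermediate step on purpose: it could also be handed to lattice for a single dotplot grouped by month, rather than one plot per month.
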
Josh

On Tue, Aug 17, 2010 at 1:10 PM, Henrique Dallazuanna <www...@gmail.com> wrote:
> I think that gsub example on help page is more clear:
>
> library(XML)
>
> # could be used the XML package to get the names
> cnames <- gsub('\n', '', head(tail(sapply(getNodeSet(htmlParse(z, asText =
> TRUE), "//i"), xmlValue), -3), -3))
> gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", cnames, perl=TRUE)
>
> On Tue, Aug 17, 2010 at 4:44 PM, Brian Diggs <dig...@ohsu.edu> wrote:
>
>> Since Peter Dalgaard is splitting his considerable contributions between
>> "Peter Dalgaard" and "peter dalgaard", I made the following changes (which
>> shouldn't be a problem unless e e cummings becomes a regular poster):
>>
>> # from base::chartr documentation
>> capwords <- function(s, strict = FALSE) {
>>   cap <- function(s) paste(toupper(substring(s,1,1)),
>>     {s <- substring(s,2); if(strict) tolower(s) else s},
>>     sep = "", collapse = " " )
>>   sapply(strsplit(s, split = " "), cap, USE.NAMES = !is.null(names(s)))
>> }
>>
>> cnames <- capwords(sapply(zz[3:(length(zz)-1)],namefun))
>>
>> On 8/17/2010 10:00 AM, Henrique Dallazuanna wrote:
>>
>>> Ben,
>>>
>>> I change the line:
>>>
>>> z<- getURL(paste(baseurl,list,"/", month,"/author.html",sep=""))
>>>
>>> to
>>>
>>> z<- getURL(paste(baseurl,list,"/", month,"/author.html",sep=""),
>>> ssl.verifypeer = FALSE)
>>>
>>> because don't work for me.
>>>
>>> Nice!
>>>
>>> On Tue, Aug 17, 2010 at 1:47 PM, Ben Bolker<bbol...@gmail.com> wrote:
>>>
>>>> month<- "2010-August"
>>>> list<- "r-help"
>>>> ##list<- "r-sig-ecology"
>>>> ##list<- "r-sig-mixed-models"
>>>> ## month<- "2010q3"
>>>> n<- 50
>>>> baseurl<- "https://stat.ethz.ch/pipermail/"
>>>> library(RCurl)
>>>> z<- getURL(paste(baseurl,list,"/",month,"/author.html",sep=""))
>>>> zz<- strsplit(z,"<LI>")[[1]]
>>>> namefun<- function(x) {
>>>>   gsub("\\n","",gsub("^.+<I>","",gsub("</I>.+$","",x)))
>>>> }
>>>>
>>>> cnames<- sapply(zz[3:(length(zz)-1)],namefun)
>>>> rr<- rev(sort(table(cnames)))
>>>>
>>>> library(lattice)
>>>> dotplot(~rev(rr[1:n]),xlab="Number of posts")
>>>>
>>>> dotplot(~rev(rr[1:n]),xlab="Number of posts",
>>>> scales=list(x=list(log=10)))
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>> --
>> Brian Diggs
>> Senior Research Associate, Department of Surgery, Oregon Health & Science
>> University
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>
> [[alternative HTML version deleted]]

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/