Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2024-01-06 Thread Andy
f the major sticking points I kept bumping up against. Thank you so much for this. All the best Andy On 05/01/2024 13:59, Howard, Tim G (DEC) wrote: Here's a simplified version of how I would do it, using `textreadr` but otherwise base functions. I haven't done it all, but have a

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2024-01-04 Thread Andy
ction and append part. If I can get it to work for one of these fields, I suspect that I can repeat the basic syntax to extract and append the remaining fields. Therefore, if someone can either suggest a syntax or point me to a useful tutorial, that would be splendid. Thank you in anticipation.

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-30 Thread Andy
t line   # which summarises it.   # the result is saved in a data frame object   # called content which we shall show some   # heading into from   head(content) } Results in this error now: Error in x$doc_obj : $ operator is invalid for atomic vectors Thank you. On 30/12/2023 12:12, Andy

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-30 Thread Andy
Hi Eric Thanks for that. That seems to fix one problem (the lack of a separator), but introduces a new one when I complete the function Calum proposed: Error in docx_summary() : argument "x" is missing, with no default The whole code so far looks like this: # Load libraries library(tcltk) libr

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-30 Thread Andy
ion and page number Length Byline Subject (only if the threshold of coverage for a specific subject is >=50% is reached (e.g. Greenwashing (51%)) - if not, enter 'nil' and move onto the next article in the folder This is the ambition. I am clearly a long way short of that though. Man
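
The subject rule described above (keep a subject only when its coverage is at least 50%, otherwise enter 'nil') could be sketched in base R as follows. This is only an illustration: the function name `extract_subjects` and the input format (a single line such as "Subject: Greenwashing (51%); Climate Change (30%)") are assumptions, not something taken from the thread.

```r
# Hypothetical helper: keep only subjects whose coverage meets the threshold.
# Assumes the subject field is one string of "Term (NN%)" items separated by ";".
extract_subjects <- function(subject_line, threshold = 50) {
  # Drop an optional leading "Subject:" label
  body <- sub("^Subject:[[:space:]]*", "", subject_line)
  # Pull out every "Term (NN%)" item
  hits <- regmatches(body, gregexpr("[^;()]+\\([0-9]+%\\)", body))[[1]]
  if (length(hits) == 0) return("nil")
  # Split each item into its percentage and its term
  pct  <- as.numeric(sub(".*\\(([0-9]+)%\\).*", "\\1", hits))
  term <- trimws(sub("[[:space:]]*\\([0-9]+%\\).*$", "", hits))
  keep <- term[pct >= threshold]
  if (length(keep) == 0) "nil" else paste(keep, collapse = "; ")
}

extract_subjects("Subject: Greenwashing (51%); Climate Change (30%)")
extract_subjects("Subject: Climate Change (30%)")
```

The returned string can then be stored as one column of a one-row data frame per article before appending it to the spreadsheet.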

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-30 Thread Andy
Thanks Ivan and Calum I continue to appreciate your support. Calum, I entered the code snippet you provided, and it returns 'file missing'. Looking at this, while the object 'full_filename' exists, what is happening is that the path from getwd() is being appended to the title of the article, b

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread Andy
en manipulate it. To be more specific, we might need an example of the DF [...] On Fri, Dec 29, 2023 at 10:14 AM Andy wrote: [...] I'd like to be able to accomplish the following: (1) Append the title, the month, the author, the number of words, and page number(s) to a spreadsheet

Re: [R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread Andy
n filetype %in% c("docx") && grepl("^([fh]ttp)", file) :'length = 38' in coercion to 'logical(1)' ## And so I am going around in circles and not at all clear on how I can make progress. I am sure that there must be a way, but the sugg

[R] Help request: Parsing docx files for key words and appending to a spreadsheet

2023-12-29 Thread Andy
ted to UTF-8 plain text, would that make the task easier? I am not a confident coder, and am really only just getting my head around R so appreciate a steep learning curve ahead, but of course, I don't know what I don't know, so any pointers in the right d

Re: [R] checkpointing

2021-12-14 Thread Andy Jacobson via R-help
cluster. While I have an answer for my particular task, it would still be useful to checkpoint using the scheme Henrik suggests. Thanks all for the interesting conversation! -Andy On 12/14/21 5:39 PM, Henrik Bengtsson wrote: On Tue, Dec 14, 2021 at 1:17 AM Andy Jacobson wrote: Those are good

Re: [R] checkpointing

2021-12-14 Thread Andy Jacobson
long-running processes like optim(). -Andy On 12/13/21 11:51 AM, Duncan Murdoch wrote: On 13/12/2021 12:58 p.m., Greg Minshall wrote: Jeff, This sounds like an OS feature, not an R feature... certainly not a portable R feature. i'm not arguing for it, but this seems to me like something

[R] checkpointing

2021-12-13 Thread Andy Jacobson via R-help
Has anyone ever considered what it would take to implement checkpointing in R, so that long-running processes could be interrupted and resumed later, from a different process or even a different machine? Thanks, Andy -- Andy Jacobson andy.jacob...@noaa.gov NOAA Global Monitoring Lab 325

Re: [R] levels

2020-07-17 Thread andy elprama
> It's about 27 minutes in. > > Chris Gordon-Smith > On 15/07/2020 17:16, Marc Schwartz via R-help wrote: > > On Jul 15, 2020, at 4:31 AM, andy elprama > wrote: > > Dear R-users, > > Something strange happened within the command "levels" > > R versio

[R] levels

2020-07-15 Thread andy elprama
Dear R-users, Something strange happened within the command "levels" R version 3.6.1 name <- c("a","b","c") values <- c(1,2,3) data <- data.frame(name,values) levels(data$name) [1] "a" "b" "c" R version 4.0 name <- c("a","b","c") values <- c(1,2,3) data <- data.frame(name,values) levels(data$nam
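
A likely explanation for the difference reported above is that R 4.0.0 changed the default of `stringsAsFactors` in `data.frame()` from TRUE to FALSE. The sketch below reuses the poster's variables; the object names `df_chr` and `df_fac` are made up for illustration.

```r
# Same data as in the post
name <- c("a", "b", "c")
values <- c(1, 2, 3)

# With the R >= 4.0.0 default, 'name' stays a character column,
# so it has no levels
df_chr <- data.frame(name, values)
levels(df_chr$name)

# Requesting factors explicitly gives the pre-4.0 behaviour on any version
df_fac <- data.frame(name, values, stringsAsFactors = TRUE)
levels(df_fac$name)
```

Converting a single column afterwards with `factor(df_chr$name)` is an equivalent alternative.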

Re: [R] regular expression, stringr::str_view, grep

2020-04-29 Thread Andy Spada
PCRE. Perhaps the regular expression should have been rewritten: desired_brackets <- "af+g[^m$][^A-Z]" grep(desired_brackets, aff, value = TRUE) ### correct result str_view(aff, desired_brackets) ### correct result Regards, Andy On 28.04.2020 18:41:50, David Winsemius wrote: On 4/28

[R] nlme::gls potential bug

2019-01-31 Thread Andy Beet via R-help
is specified on page 204, eq. (5.5). I have also calculated sigma based on (5.7), after the transformation documented in (5.2), and I do not get the same value as either the package or my implementation. Any advice would be most welcome. Is there a bug in the estimation of sigma in this package?

Re: [R] word stemming for corpus linguistics

2016-07-26 Thread Andy Wolfe
ne until I come across a better (read, more elegant) solution. Best Andy On 26/07/16 14:05, Paul Johnston wrote: Hi I use the tm_map() with stemDocument used as an argument Looking at a particular file before stemming writeLines(as.character(data_mined_volatile[[1]])) ## The European

Re: [R] word stemming for corpus linguistics

2016-07-26 Thread Andy Wolfe
on on that process, and whether that is applied before or after the text is transformed into a DTM because searching on-line hasn't (yet) thrown anything back. Thanks. Andy On 26/07/16 08:50, Paul Johnston wrote: Suggest look at http://www.inside-r.org/packages/cran/tm/docs/stemDocumen

[R] word stemming for corpus linguistics

2016-07-26 Thread Andy Wolfe
sis using the tm package as part of the whole text mining process? I appreciate any help. Thanks. Andy [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mail

Re: [R] Random Forest classification

2016-04-18 Thread Liaw, Andy
This is explained in the "Details" section of the help page for partialPlot. Best Andy > -Original Message- > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jesús Para > Fernández > Sent: Tuesday, April 12, 2016 1:17 AM > To: r-help@r-project.

[R] HELP - as.numeric changing column data

2016-01-06 Thread Andy Schneider
Hi - I'm trying to plot some data and having a lot of trouble! I have a simple dataset consisting of two columns - income_per_capita and mass_beauty_value. When I read the data in and plot it, I get the attached plot Mass Beauty Non-Numeric:

[R] Most appropriate function for the following optimisation issue?

2015-10-20 Thread Andy Yuan
Hello Please could you help me to select the most appropriate/fastest function to use for the following constrained optimisation issue? Objective function: Min: Sum( (X[i] - S[i])^2 ) Subject to constraint: Sum( B[i] x X[i] ) = 0 where i = 1..n and S[i] and B[i] are real numbers Need to
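
If the single equality constraint is the only restriction (the post is cut off, so there may be more), the problem has a closed-form solution and no optimiser is needed: the minimiser is the orthogonal projection of S onto the hyperplane Sum(B[i] x X[i]) = 0, that is X = S - B * (B.S)/(B.B). A minimal sketch, with a hypothetical function name and made-up data:

```r
# Hypothetical helper: project S onto the hyperplane sum(B * X) == 0.
# Derived from the Lagrangian of min sum((X - S)^2) subject to sum(B * X) = 0.
project_onto_constraint <- function(S, B) {
  stopifnot(length(S) == length(B), any(B != 0))
  S - B * sum(B * S) / sum(B * B)
}

set.seed(1)
S <- rnorm(5)
B <- rnorm(5)
X <- project_onto_constraint(S, B)

sum(B * X)       # zero up to floating-point error
sum((X - S)^2)   # the minimised objective
```

With additional inequality or bound constraints, a quadratic programming solver would be needed instead.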

[R] Login

2014-05-27 Thread Andy Siddaway
Dear R help, I cannot login to my account. I am keen to remove the posting I made to R help from google web searches - see http://r.789695.n4.nabble.com/R-software-installation-problem-td4659556.html Thanks, Andy Dr Andy Siddaway Registered Clinical Psychologist/ MRC Clinical Research

Re: [R] rpart and randomforest results

2014-04-07 Thread Liaw, Andy
, you really want to make sure the settings in the two are as close as possible. Also, how did you compute the pseudo R2, on test set, or some other way? Best, Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Schillo, Sonja

Re: [R] randomForest warning: The response has five or fewer unique values. Are you sure you want to do regression?

2014-03-24 Thread Liaw, Andy
he response has fewer than five distinct values. It may be legitimate regression data, and if so you can safely ignore the warning (that's why it's not an error). It's there to catch the cases when people try to do classification with class labels 1, 2, ..., k and forgot to make

Re: [R] Variable importance - ANN

2013-12-04 Thread Liaw, Andy
You can try something like this: http://pubs.acs.org/doi/abs/10.1021/ci050022a Basically a similar idea to what is done in random forests: permute predictor variables one at a time and see how much that degrades prediction performance. Cheers, Andy -Original Message- From: r-help-boun
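
The permutation idea described above can be sketched in base R. Everything here is illustrative: the data are simulated, and `lm()` stands in for whatever model (for example a neural network) is actually being assessed.

```r
set.seed(42)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 3 * dat$x1 + 0.5 * dat$x2 + rnorm(n, sd = 0.5)  # x3 is pure noise

train <- dat[1:150, ]
test  <- dat[151:200, ]

# Stand-in model; any object with a predict() method would do
fit <- lm(y ~ x1 + x2 + x3, data = train)

mse <- function(model, newdata) mean((newdata$y - predict(model, newdata))^2)
baseline <- mse(fit, test)

# Permute one predictor at a time in the held-out data and record
# how much the prediction error increases over the baseline
perm_importance <- sapply(c("x1", "x2", "x3"), function(v) {
  shuffled <- test
  shuffled[[v]] <- sample(shuffled[[v]])
  mse(fit, shuffled) - baseline
})
perm_importance
```

Repeating the permutation several times per variable and averaging would give a more stable ranking.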

Re: [R] interpretation of MDS plot in random forest

2013-12-02 Thread Liaw, Andy
Yes, that's part of the intention anyway. One can also use them to do clustering. Best, Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Massimo Bressan Sent: Monday, December 02, 2013 6:34 AM To: r-help@r-projec

Re: [R] How do I extract Random Forest Terms and Probabilities?

2013-12-02 Thread Liaw, Andy
#2 can be done simply with predict(fmi, type="prob"). See the help page for predict.randomForest(). Best, Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun Sent: Tuesday, November 26, 2013 6:57 PM To: R help S

Re: [R] Split type in the RandomForest package

2013-11-20 Thread Liaw, Andy
Classification trees use the Gini index, whereas the regression trees use sum of squared errors. They are "hard-wired" into the C/Fortran code, so not easily changeable. Best, Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
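
To make the two split criteria concrete, here is a small base R illustration of Gini impurity and sum of squared errors for one candidate split. It is not the package's C/Fortran implementation; the function names and the toy data are assumptions for the example.

```r
# Gini impurity of a set of class labels
gini_impurity <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Sum of squared errors around the mean, used for regression nodes
sse <- function(y) sum((y - mean(y))^2)

# Size-weighted Gini impurity of the two children of a split
split_gini <- function(labels, left) {
  n <- length(labels)
  (sum(left) / n) * gini_impurity(labels[left]) +
    (sum(!left) / n) * gini_impurity(labels[!left])
}

x      <- c(1, 2, 3, 4, 5, 6)
labels <- c("a", "a", "a", "b", "b", "b")
y      <- c(1.0, 1.2, 0.9, 5.0, 5.1, 4.8)

gini_impurity(labels)           # impurity before splitting
split_gini(labels, x < 3.5)     # impurity after splitting at x = 3.5

sse(y)                          # SSE before splitting
sse(y[x < 3.5]) + sse(y[x >= 3.5])  # SSE after the same split
```

A tree-growing routine would evaluate such quantities over many candidate splits and pick the one with the largest decrease.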

Re: [R] What is the difference between Mean Decrease Accuracy produced by importance(foo) vs foo$importance in a Random Forest Model?

2013-11-19 Thread Liaw, Andy
The difference is importance(..., scale=TRUE). See the help page for detail. If you extract the $importance component from a randomForest object, you do not get the scaling. Best, Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On

Re: [R] FW: Nadaraya-Watson kernel

2013-11-07 Thread Liaw, Andy
="epan", bandwidth=.1) > plot(x, y) > lines(f, lwd=2) Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ms khulood aljehani Sent: Tuesday, November 05, 2013 9:49 AM To: r-h...@stat.math.ethz.ch Subject: [R] FW:

[R] Override setClass and setMethod in a package R 3.0.1

2013-07-11 Thread Andy Pranata
l="numeric", type="character", desc="character" ) ) setMethod("*", signature(e1 = "numeric", e2 = "AAA"), definition=function (e1, e2) { if (e2@type == "double"){ e2@val = e2@val * (

[R] Lattice different colours for bars

2013-06-12 Thread Andrew McFadden (Andy)
Hi all Perhaps this is torturous methodology. I was trying to use lattice to produce a barchart showing the number positive and negative over time. I wasn't quite sure how to create a different colour for values of arbo$Ikeda in the example below, i.e. red for ikeda and green for neg. library(resha

Re: [R] SVD on very large data matrix

2013-04-08 Thread Andy Cooper
als with rather smaller data matrices than the ones I have, so it is unclear how it would perform and scale. So, no one has direct experience running irlba on a data matrix as large as 500,000 x 1,000 or larger? kind regards Andy From: Berend Hasselman

[R] SVD on very large data matrix

2013-04-08 Thread Andy Cooper
advice on what R-packages are available to perform such a task, what the RAM requirement is, and indeed what would be the state-of-the-art in terms of numerical algorithms and programming language to use to accomplish this task. with many thanks in advance, Andy Cooper

[R] Maximum likelihood estimation of ARMA(1,1)-GARCH(1,1)

2013-04-08 Thread Andy Yeh
Hello Following some standard textbooks on ARMA(1,1)-GARCH(1,1) (e.g. Ruey Tsay's Analysis of Financial Time Series), I try to write an R program to estimate the key parameters of an ARMA(1,1)-GARCH(1,1) model for Intel's stock returns. For some random reason, I cannot decipher what is wrong with

Re: [R] Creating 3d partial dependence plots

2013-03-20 Thread Liaw, Andy
It needs to be done "by hand", in that partialPlot() does not handle more than one variable at a time. You need to modify its code to do that (and be ready to wait even longer, as it can be slow). Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help

Re: [R] R software installation problem

2013-02-25 Thread Andy Siddaway
rface to help. Type 'q()' to quit R. This ‘Trick or Treat’ message also appears (in the Console box) when I downloaded RStudio. Any tips or guidance on resolving this problem would be really appreciated! Many thanks, Andy Siddaway On 25 February 2013 00:10, Sarah Gosl

[R] R software installation problem

2013-02-24 Thread Andy Siddaway
what I've done clearer. Basically, R doesn't seem to be installing correctly and I can't figure out why. It's probably a simple error which a non-(complete)-novice would notice. Thanks very much, Andy Siddaway Trainee Clinical Psycholo

[R] Getting WinBUGS Leuk example to work from R using R2winBUGS

2013-02-17 Thread Andy Cox
I am trying to learn to use winBUGS from R, I have experience with R. I have managed to successfully run a simple example from R with no problems. I have been trying to run the Leuk: Survival from winBUGS examples Volume 1. I have managed to run this from winBUGS GUI with no problems. My problem is

Re: [R] Different results from random.Forest with test option and using predict function

2012-12-04 Thread Liaw, Andy
he randomForest objects to see if they are the same. At least the first tree in both should be identical. Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of tdbuskirk Sent: Monday, December 03, 2012 6:31 PM To: r-help@r-project.

Re: [R] How do I make R randomForest model size smaller?

2012-12-04 Thread Liaw, Andy
uld be avoided with large datasets. Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of John Foreman Sent: Monday, December 03, 2012 3:43 PM To: r-help@r-project.org Subject: [R] How do I make R randomForest model size smaller?

Re: [R] Partial dependence plot in randomForest package (all flat responses)

2012-11-26 Thread Liaw, Andy
Not unless we have more information. Please read the Posting Guide to see how to make it easier for people to answer your question. Best, Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Oritteropus Sent: Thursday, November

Re: [R] Random Forest for multiple categorical variables

2012-10-17 Thread Liaw, Andy
How about taking the combination of the two? E.g., gamma = factor(paste(alpha, beta1, sep=":")) and use gamma as the response. Best, Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Gyanendra Pokharel Sen
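
A small worked version of the suggestion above, using made-up values for `alpha` and `beta1` (the names come from the reply; the data do not):

```r
# Two categorical responses (toy data)
alpha <- factor(c("x", "x", "y", "y"))
beta1 <- factor(c("p", "q", "p", "q"))

# Combine them into a single response with one level per observed pair
gamma <- factor(paste(alpha, beta1, sep = ":"))
levels(gamma)

# After prediction, the combined label can be split back into its parts
parts <- do.call(rbind, strsplit(as.character(gamma), ":", fixed = TRUE))
parts
```

This assumes the separator does not occur inside any of the original level names.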

Re: [R] Random Forest - Extract

2012-10-03 Thread Liaw, Andy
set type="votes" and norm.votes=FALSE, you will get the counts instead of proportions. Best, Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Lopez, Dan Sent: Wednesday, September 26, 2012 9:05 PM To: R help (r-help@r-pr

Re: [R] interpret the importance output?

2012-08-29 Thread Liaw, Andy
then divide by the SD of these differences. With that, I hope it's clear that only v2 and v4 in your example are potentially "important". Best, Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Johnathan Mercer S

[R] Significance of interaction depends on factor reference level - lmer/AIC model averaging

2012-06-30 Thread Andy Robertson
Dear R users, I am using lmer combined with AIC model selection and averaging (in the MuMIn package) to try and assess how isotope values (which indicate diet) vary within a population of animals. I have multiple measures from individuals (variable 'Tattoo') and multiple individuals within

Re: [R] DCC-GARCH model

2012-06-28 Thread andy
Hello Marcin, did you get the answer to your questions? I have the same questions and would appreciate your help if you found the answers. Thanks, Ankur -- View this message in context: http://r.789695.n4.nabble.com/DCC-GARCH-model-tp3524387p4634776.html Sent from the R help mailing list archi

[R] Multivariate P-GARCH Model

2012-06-27 Thread andy
Hi, I am trying to estimate a multivariate P-GARCH model for two factors x&y. I have selected p-garch to study the leverage effects. Is there any toolkit in R that can help me do this? Thanks, Andy -- View this message in context: http://r.789695.n4.nabble.com/Multivariate-P-GARCH-M

Re: [R] Stratified Sampling with randomForest Regression

2012-06-01 Thread Liaw, Andy
Yes, you need to modify both the R and the underlying C code. It's in the source package on CRAN (the .tar.gz file). Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Josh Browning Sent: Friday, June 01, 2012 10:48

Re: [R] Random Forest Classification_ForestCombination

2012-05-29 Thread Liaw, Andy
As long as you can remember that the summaries such as variable importance, OOB predictions, and OOB error rates are not applicable, I think that should be fine. Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Nikita Desai

Re: [R] Question about random Forest function in R

2012-05-29 Thread Liaw, Andy
Hi Kelly, The function has a limitation that it cannot handle any column in your "x" that is a categorical variable with more than 32 categories. One possibility is to see if you can "bin" some of the categories into one to get below 32 categories. Andy -Original
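
One way to do the binning suggested above is to keep the most frequent categories and lump the rest together. The sketch below is only one possible scheme; the helper name `lump_rare_levels`, the label "Other", and the simulated data are all assumptions.

```r
# Hypothetical helper: keep the (max_levels - 1) most frequent levels
# and merge everything else into a single catch-all level
lump_rare_levels <- function(f, max_levels = 32, other = "Other") {
  f <- factor(f)
  if (nlevels(f) <= max_levels) return(f)
  counts <- sort(table(f), decreasing = TRUE)
  keep <- names(counts)[seq_len(max_levels - 1)]
  out <- as.character(f)
  out[!(out %in% keep)] <- other
  factor(out)
}

set.seed(1)
x <- factor(sample(sprintf("cat%02d", 1:50), 1000, replace = TRUE))
nlevels(x)

x_binned <- lump_rare_levels(x)
nlevels(x_binned)
```

Grouping by subject-matter knowledge (for example merging related categories) is often preferable to grouping purely by frequency.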

[R] lattice: add a marginal histogram on top of the colorkey of a levelplot?

2012-05-29 Thread Andy Bunn
Lattice experts: Can you think of a way to produce a levelplot as below and then add a histogram of the z variable to the top margin of the plot that would sit on top of the color key? x <- seq(pi/4, 5 * pi, length.out = 100) y <- seq(pi/4, 5 * pi, length.out = 100) r <- as.vector(sqrt(outer(

Re: [R] Random forests prediction

2012-05-14 Thread Liaw, Andy
That's not how RF works at all. The setting of mtry is irrelevant to this. Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of matt Sent: Monday, May 14, 2012 10:22 AM To: r-help@r-project.org Subject: Re: [R] Random fo

Re: [R] No Data in randomForest predict

2012-05-14 Thread Liaw, Andy
.Length = 1, : missing values in newdata Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jennifer Corcoran Sent: Saturday, May 05, 2012 5:17 PM To: r-help@r-project.org Subject: [R] No Data in randomForest predict I would li

Re: [R] Random forests prediction

2012-05-14 Thread Liaw, Andy
it seems to be worth repeating: Don't use the training set for evaluating models: that almost never makes sense. Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of matt Sent: Friday, May 11, 2012 3:43 PM To: r-help@r-proj

[R] Hmisc::xYplot - text on xaxis

2012-04-26 Thread Andy Bunn
Hello, I'm making a simple plot using xYplot in the Hmisc library and having problems with labeling the values on the x-axis. Using the reproducible example below, how can I have the text (jan, feb,mar, etc.) in place of 1:12. Thanks, AB x <- c(seq(0,0.5,by=0.1),seq(0.5,0,by=-0.1)) ci <- rnor

Re: [R] Partial Dependence and RandomForest

2012-04-17 Thread Liaw, Andy
. Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of jmc Sent: Friday, April 13, 2012 11:20 AM To: r-help@r-project.org Subject: Re: [R] Partial Dependence and RandomForest Thank you Andy. I obviously neglected to read into the

Re: [R] Execution speed in randomForest

2012-04-13 Thread Liaw, Andy
Without seeing your code, it's hard to say much more, but do avoid using formula when you have large data. Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jason & Caroline Shaw Sent: Friday, April 06, 2012 1:20 P

Re: [R] Partial Dependence and RandomForest

2012-04-13 Thread Liaw, Andy
Please read the help page for the partialPlot() function and make sure you learn about all its arguments (in particular, "which.class"). Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of jmc Sent: Wednesday, April

Re: [R] loess function take

2012-04-13 Thread Liaw, Andy
Alternatively, use only a subset to run loess(), either a random sample or something like every other k-th (sorted) data value, or the quantiles. It's hard for me to imagine that that many data points are going to improve your model much at all (unless you use tiny span). Andy From: r
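
The two subsetting strategies mentioned above (a random sample, or every k-th sorted value) can be sketched as follows on simulated data. The sample sizes, the span, and the object names are arbitrary choices for illustration.

```r
set.seed(123)
n <- 20000
x <- runif(n, 0, 10)
y <- sin(x) + rnorm(n, sd = 0.3)

# Option 1: fit on a random sample of the data
idx <- sample(n, 2000)
fit_sample <- loess(y ~ x, data = data.frame(x = x[idx], y = y[idx]), span = 0.1)

# Option 2: fit on every 10th observation after sorting by x
ord  <- order(x)
keep <- ord[seq(1, n, by = 10)]
fit_kth <- loess(y ~ x, data = data.frame(x = x[keep], y = y[keep]), span = 0.1)

# Compare the two smooths on a common grid
grid <- data.frame(x = seq(0.5, 9.5, by = 0.5))
max(abs(predict(fit_sample, grid) - predict(fit_kth, grid)))
```

If the two fits agree closely, as one would hope here, fitting on the full data is unlikely to change the smooth much.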

Re: [R] Imputing missing values using "LSmeans" (i.e., population marginal means) - advice in R?

2012-04-05 Thread Liaw, Andy
Don't know how you searched, but perhaps this might help: https://stat.ethz.ch/pipermail/r-help/2007-March/128064.html > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Jenn Barrett > Sent: Tuesday, April 03, 2012 1:23 AM > To

Re: [R] Question about randomForest

2012-04-04 Thread Liaw, Andy
o make predictions on the training set. > Opens another issue, which is if newdata is close but not > exactly oldata, > then you get overfitted results? Possibly, depending on how "close" the new data are to the training set. This applies to nearly _ALL_ methods, not just RF.

Re: [R] Memory limits for MDSplot in randomForest package

2012-03-30 Thread Liaw, Andy
s taking up the time. My suggestion is to see if you can find some efficient ways of doing eigen decomposition on such large matrices. You might be able to make the proximity matrix sparse (e.g., by thresholding), and see if there are packages that can do the decomposition on the spar

Re: [R] fitted values with locfit

2012-03-28 Thread Liaw, Andy
th function, not the sum of two univariate smooths. If the latter is what you want, use packages that fit additive models. Best, Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Soberon > Velez, Alexandr

[R] job opening at Merck Research Labs, NJ USA

2012-03-20 Thread Liaw, Andy
The Biometrics Research department at the Merck Research Laboratories has an open position to be located in Rahway, New Jersey, USA: This position will be responsible for imaging and bio-signal biomarkers projects including analysis of preclinical, early clinical, and experimental medicine imag

Re: [R] Using caegorical variables in package randomForest.

2012-03-13 Thread Liaw, Andy
The way to represent categorical variables is with factors. See ?factor. randomForest() will handle factors appropriately, as do most modeling functions in R. Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of

Re: [R] Help on reshape function

2012-03-06 Thread Liaw, Andy
Just using the reshape() function in base R: df.long = reshape(df, varying=list(names(df)[4:7]), direction="long") This also gives two extra columns ("time" and "id") that can be dropped. Andy > -Original Message- > From: r-help-boun...@r-pro
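
A runnable version of the call above on a made-up data frame whose columns 4 to 7 hold the repeated measurements. The data and the `v.names = "value"` argument are additions for illustration; without `v.names`, the stacked column takes the name of the first varying column.

```r
# Toy wide-format data: three identifying columns, then four measurements
df <- data.frame(
  subject = c("s1", "s2"),
  group   = c("ctl", "trt"),
  age     = c(30, 41),
  w1 = c(1.1, 2.1), w2 = c(1.2, 2.2), w3 = c(1.3, 2.3), w4 = c(1.4, 2.4)
)

# Stack columns 4:7 into one column; "time" and "id" are added automatically
df.long <- reshape(df, varying = list(names(df)[4:7]),
                   v.names = "value", direction = "long")
df.long

# Drop the extra bookkeeping columns if they are not needed
df.long$id <- NULL
```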

Re: [R] Good and modern Kernel Regression package in R with auto-bandwidth?

2012-02-29 Thread Liaw, Andy
That's why I said you need the book. The details are all in the book. From: Michael [mailto:comtech@gmail.com] Sent: Thursday, February 23, 2012 1:49 PM To: Liaw, Andy Cc: r-help Subject: Re: [R] Good and modern Kernel Regression package in R with

Re: [R] Good and modern Kernel Regression package in R with auto-bandwidth?

2012-02-23 Thread Liaw, Andy
el object, followed by predicting new data using that fitted model object" very well because of its local nature. Think of k-nn classification, which has a similar problem: the "model" needs to be computed for every data point you want to predict. Andy

Re: [R] Good and modern Kernel Regression package in R with auto-bandwidth?

2012-02-23 Thread Liaw, Andy
ok to get most mileage out of it though. Andy From: Michael [mailto:comtech@gmail.com] Sent: Thursday, February 23, 2012 12:25 AM To: Liaw, Andy Cc: Bert Gunter; r-help Subject: Re: [R] Good and modern Kernel Regression package in R with auto-bandwidth?

Re: [R] Good and modern Kernel Regression package in R with auto-bandwidth?

2012-02-22 Thread Liaw, Andy
ing bandwidth, using plug-in methods or CV-type. Last I checked, the jury is still out. Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter > Sent: Wednesday, February 22, 2012 6:03 PM > To: Mic

Re: [R] indexing by empty string (was RE: Error in predict.randomForest ... subscript out of bounds with NULL name in X)

2012-02-01 Thread Liaw, Andy
don't want a function to do that, either. That's why I need to look for a workaround. Using which() seems rather clumsy for the purpose, as I need to combine those with the non-empty ones, and preserving ordering would be a mess. Andy > -Original Message- > From: r-

Re: [R] randomForest: proximity for new objects using an existing rf

2012-02-01 Thread Liaw, Andy
with nodes=TRUE, then compute the proximity "by hand" by counting how often any given pair landed in the same terminal node of each tree. Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Kilian > S
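
The "by hand" counting described above might look like the sketch below. It assumes a matrix of terminal-node IDs with one row per observation and one column per tree, which is the shape of the "nodes" attribute returned when predicting with nodes=TRUE; the matrix used here is made up, and the function name is hypothetical.

```r
# Hypothetical helper: proximity = fraction of trees in which two
# observations fall into the same terminal node
proximity_from_nodes <- function(nodes) {
  n <- nrow(nodes)
  prox <- matrix(0, n, n)
  for (t in seq_len(ncol(nodes))) {
    # TRUE where observations i and j share a terminal node in tree t
    prox <- prox + outer(nodes[, t], nodes[, t], "==")
  }
  prox / ncol(nodes)
}

# Toy node matrix: 4 observations, 3 trees
nodes <- cbind(tree1 = c(3, 3, 5, 5),
               tree2 = c(2, 4, 4, 4),
               tree3 = c(6, 6, 6, 7))
prox <- proximity_from_nodes(nodes)
prox
```

For proximities between new and training objects, the same comparison would be made between the rows of two node matrices rather than within one.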

Re: [R] Random Forest Package

2012-02-01 Thread Liaw, Andy
You should be able to use the Rgui menu to install packages. Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Niratha > Sent: Wednesday, February 01, 2012 5:16 AM > To: r-help@r-project.org > Subject:

[R] indexing by empty string (was RE: Error in predict.randomForest ... subscript out of bounds with NULL name in X)

2012-01-31 Thread Liaw, Andy
I'm not exactly sure if this is a problem with indexing by name; i.e., is the following behavior by design? The problem is that names or dimnames that are empty seem to be treated differently, and one can't index by them: R> junk = 1:3 R> names(junk) = c("a", "b", "") R> junk a b 1 2 3 R> j
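
The behaviour in question can be reproduced with the poster's own vector. As documented in ?Extract, an empty-string index never matches any name, so a logical comparison on the names is one workaround (shown here as a sketch, not necessarily the one adopted in the thread):

```r
junk <- 1:3
names(junk) <- c("a", "b", "")

junk["a"]   # ordinary lookup by name works
junk[""]    # NA: "" does not match the element whose name is ""

# Workaround: select by comparing names, which keeps the original order
junk[names(junk) == ""]
junk[names(junk) %in% c("a", "")]
```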

Re: [R] Bivariate Partial Dependence Plots in Random Forests

2012-01-31 Thread Liaw, Andy
were used. Best, Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Lucie Bland > Sent: Friday, January 27, 2012 5:01 AM > To: r-help@r-project.org > Subject: [R] Bivariate Partial Dependence Plots

Re: [R] Variable selection based on both training and testing data

2012-01-30 Thread Liaw, Andy
Variable selection is part of the training process-- it chooses the model. By definition, test data is used only for testing (evaluating the chosen model). If you find a package or function that does variable selection on test data, run from it! Best, Andy > -Original Message- > F

[R] contour(): Thickness contour labels

2012-01-21 Thread Andy Richling
, but the size is too big, so I can't see anything ;) Is there any command to increase the thickness of contour labels, like the "lwd=??" command for the width of lines? Thanks for help :) Andy

Re: [R] tm package, custom reader

2012-01-14 Thread Andy Adamiec
On Sat, Jan 14, 2012 at 12:41 PM, Milan Bouchet-Valat wrote: > Le samedi 14 janvier 2012 à 12:24 -0600, Andy Adamiec a écrit : > > Hi Milan, > > > > > > The xml solr files are not in a typical format, here is an example > > http://www.omegahat.org/RSXML/solr.xml

Re: [R] tm package, custom reader

2012-01-14 Thread Andy Adamiec
On Sat, Jan 14, 2012 at 12:41 PM, Milan Bouchet-Valat wrote: > Le samedi 14 janvier 2012 à 12:24 -0600, Andy Adamiec a écrit : > > Hi Milan, > > > > > > The xml solr files are not in a typical format, here is an example > > http://www.omegahat.org/RSXML/solr.xml

Re: [R] What is the function for "smoothing splines with the smoothing parameter selected by generalized maximum likelihood?

2012-01-09 Thread Liaw, Andy
See the gss package on CRAN. Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of ali_protocol > Sent: Monday, January 09, 2012 7:13 AM > To: r-help@r-project.org > Subject: [R] What is the function for

Re: [R] explanation why RandomForest don't require a transformations (e.g. logarithmic) of variables

2011-12-05 Thread Liaw, Andy
You should see no differences beyond what you'd get by running RF a second time with a different random number seed. Best, Andy From: gianni lavaredo [mailto:gianni.lavar...@gmail.com] Sent: Monday, December 05, 2011 2:19 PM To: Liaw, Andy Cc: r-h

Re: [R] explanation why RandomForest don't require a transformations (e.g. logarithmic) of variables

2011-12-05 Thread Liaw, Andy
data (although difference should be slight). Transformation of the response variable is quite another thing. RF needs it just as much as others if the situation calls for it. Cheers, Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.

Re: [R] Random Forests in R

2011-12-01 Thread Liaw, Andy
. Currently the only Fortran part is the node splitting in classification trees. Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Peter Langfelder > Sent: Thursday, December 01, 2011 12:33 AM > To: Axel Urbiz

Re: [R] Question about randomForest

2011-11-28 Thread Liaw, Andy
ld drop to (near) 0 rather quickly, as each tree is intentionally overfitting its training set. Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Weidong Gu > Sent: Sunday, November 27, 2011 10:56 AM > To

Re: [R] tuning random forest. An unexpected result

2011-11-23 Thread Liaw, Andy
the MSE estimates are within a few percent of each other, you're likely just chasing noise in the evaluation process. Just my $0.02... Best, Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of gian

Re: [R] arima.sim: innov querry

2011-11-22 Thread Andy Bunn
> On 22/11/11 13:04, Andy Bunn wrote: > > Apologies for thickness - I'm sure that this operates as documented > and with good reason. However... > > > > My understanding of arima.sim() is obviously imperfect. In the > example below I assume that x1 and x2 are simila

[R] arima.sim: innov querry

2011-11-21 Thread Andy Bunn
Apologies for thickness - I'm sure that this operates as documented and with good reason. However... My understanding of arima.sim() is obviously imperfect. In the example below I assume that x1 and x2 are similar white noise processes with a mean of 5 and a standard deviation of 1. I thought x

Re: [R] equal spacing of the polygons in levelplot key (lattice)

2011-11-16 Thread Andy Bunn
> -Original Message- > From: Dennis Murphy [mailto:djmu...@gmail.com] > Sent: Wednesday, November 16, 2011 11:22 AM > To: Andy Bunn > Cc: r-help@r-project.org > Subject: Re: [R] equal spacing of the polygons in levelplot key > (lattice) > > OK, how about t

Re: [R] equal spacing of the polygons in levelplot key (lattice)

2011-11-16 Thread Andy Bunn
> -Original Message- > From: Dennis Murphy [mailto:djmu...@gmail.com] > Sent: Tuesday, November 15, 2011 8:54 PM > To: Andy Bunn > Cc: r-help@r-project.org > Subject: Re: [R] equal spacing of the polygons in levelplot key > (lattice) > > Hi: > > Does

[R] equal spacing of the polygons in levelplot key (lattice)

2011-11-15 Thread Andy Bunn
Given the example:
R> (levs <- quantile(volcano,c(0,0.1,0.5,0.9,0.99,1)))
  0%  10%  50%  90%  99% 100%
  94  100  124  170  189  195
R> levelplot(volcano,at=levs)
How can I make the key categorical with the size of the divisions equally spaced in the key? E.g., five equal size rectangles wit

Re: [R] gsDesign

2011-11-15 Thread Liaw, Andy
cular case, I'm quite sure the package maintainer for gsDesign doesn't keep up with R-help.) Best, Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Dongli Zhou > Sent: Monday, November 14, 2011

Re: [R] randomForest - NaN in %IncMSE

2011-09-23 Thread Liaw, Andy
You are not giving anyone much to go on. Please read the posting guide and see how to ask your question in a way that's easier for others to answer. At the _very_ least, show what commands you used, what your data looks like, etc. Andy > -Original Message- > From: r-hel

Re: [R] class weights with Random Forest

2011-09-13 Thread Liaw, Andy
ced data (say 1:100 or worse). If using weighted Gini helps in your situation, by all means do it. I can only say that in the past it didn't give us the result we were expecting. Best, Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-bou

Re: [R] randomForest memory footprint

2011-09-08 Thread Liaw, Andy
an 10 terminal nodes per tree). Best, Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of John Foreman > Sent: Wednesday, September 07, 2011 2:46 PM > To: r-help@r-project.org > Subject: [R] randomForest memory foo

Re: [R] convert a splus randomforest object to R

2011-08-09 Thread Liaw, Andy
()/source() as the R Data Import/Export manual suggests? Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Zhiming Ni > Sent: Tuesday, August 02, 2011 8:11 PM > To: r-help@r-project.org > Subject: [

Re: [R] randomForest partial dependence plot variable names

2011-08-09 Thread Liaw, Andy
[i]), ylim=c(30, 70)) } par(op) Andy > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Katharine Miller > Sent: Thursday, August 04, 2011 4:38 PM > To: r-help@r-project.org > Subject: [R] randomForest p

Re: [R] squared "pie chart" - is there such a thing?

2011-07-25 Thread Liaw, Andy
Has anyone suggested mosaic displays? That's the closest I can think of as a "square pie chart"... > -Original Message- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Naomi Robbins > Sent: Sunday, July 24, 2011 7:09 AM > To: Thomas Levine > Cc

Re: [R] Bounding ellipse for any set of points

2011-07-20 Thread Andy Lyons
The mvee() function is intended to be released under the BSD license. Copyright (c) 2009, Nima Moshtagh Copyright (c) 2011, Andy Lyons All rights reserved. http://www.opensource.org/licenses/bsd-license.php Redistribution and use in source and binary forms, with or without modification, are
