As Bert advised correctly, this is not an R programming question. There is
some misunderstanding about how training/test data work together
in predictions. Suppose your test data has only one class. Therefore, you can
get the following rate by betting on the majority class every time, again
using dat
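A minimal sketch of the majority-class baseline described above. The vector `test_labels` is hypothetical and only stands in for the labels of a test split; it is not data from the thread.

```r
# Hypothetical test-set labels (assumption: not taken from the original post).
test_labels <- factor(c("no", "no", "yes", "no", "no", "yes", "no", "no"))

# Count each class and find the most frequent one.
class_counts <- table(test_labels)
majority_class <- names(class_counts)[which.max(class_counts)]

# Accuracy obtained by betting on the majority class every time.
baseline_accuracy <- as.numeric(max(class_counts)) / length(test_labels)

majority_class
baseline_accuracy
```

Any classifier evaluated on the same test set should be compared against this rate rather than against 50%.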
Thank you for your comment! This tree function is from the tree package.
Although it might be a pure statistical question, it could be related to
how the tree function is used. I will explore the site that you suggested.
But if there is anyone who can figure it out off the top of their head, I'd
ve
Purely statistical questions -- as opposed to R programming queries -- are
generally off topic here.
Here is where they are on topic: https://stats.stackexchange.com/
Suggestion: when you post, do include the package name where you get tree()
from, as there might be
more than one with this functi
Dear all R experts,
I have a question about using cross-validation to assess results estimated
from a classification tree model. I annotated what each line does in the R
code chunk below. Basically, I split the data, named usedta, into 70% vs.
30%, with the training set having 70% and the test set
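A minimal sketch of a 70%/30% split of the kind described above. The poster's `usedta` is not shown, so the built-in `iris` data is used as a stand-in, and the seed is arbitrary.

```r
set.seed(1)          # arbitrary seed, only for reproducibility
usedta <- iris       # stand-in data (assumption); the poster's usedta is not shown

n <- nrow(usedta)
# Draw 70% of the row indices at random for the training set.
train_idx <- sample(seq_len(n), size = round(0.7 * n))

train_set <- usedta[train_idx, ]
test_set  <- usedta[-train_idx, ]   # the remaining 30%

c(train = nrow(train_set), test = nrow(test_set))
```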
I was impressed by Jim's effort.
So, I thought I'd try to produce an exploratory plot.
I've adapted some of his code.
The following script produces a heatmap for a cylindrical density estimate.
Bright areas are (mathematical) regions of high density.
However, the interpretation is complicated by t
Sorry, I should know better:
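rollmean<-function(x,width=2) {
 # result[i] is the mean of x[i] through x[i+width-1]
 lenx<-length(x)
 result<-rep(NA,lenx)
 for(i in 1:lenx) {
  chunk<-i:(i+width-1)
  # near the end of x, pad the window with copies of the last index
  if(i>(lenx-width)) chunk<-c(i:lenx,rep(lenx,i-(width-1)))
  result[i]<-mean(x[chunk])
 }
 return(result)
}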
I forgot to replace this with:
library(zoo)
r
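For readers without the zoo package, a rolling mean can also be sketched in base R. This is not the code from the thread; `roll_mean_base` is a hypothetical name, and unlike the function above it returns only the complete windows instead of padding the end.

```r
# Rolling mean over complete windows only, using base R.
roll_mean_base <- function(x, width = 2) {
  stopifnot(width >= 1, width <= length(x))
  # embed() puts each run of `width` consecutive values into one row,
  # so the row means are the window means.
  rowMeans(embed(x, width))
}

roll_mean_base(c(1, 3, 5, 7), width = 2)
```

The result has `length(x) - width + 1` elements, one per complete window.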
? source("../rollmean.R") ?
On May 18, 2020 4:11:52 AM PDT, Jim Lemon wrote:
>Hi Stefano,
>If I understand your request, this may also help. It uses the same data
>transformations as my previous email.
>
>png("SS_foehn.png")
>plot(mydf$data_POSIX,
> ifelse(mydf$main_dir %in% c("WSW","SW"),mydf$max_s
---
>
>
>From: Jeff Newmiller [jdnew...@dcn.davis.ca.us]
>Sent: Saturday 16 May 2020 21:04
>To: Stefano Sofia; Jim Lemon; r-help mailing list
>Subject: RE: [R] Classification of wind events
>
>Please run your code before posting it... y
Hi Stefano,
If I understand your request, this may also help. It uses the same data
transformations as my previous email.
png("SS_foehn.png")
plot(mydf$data_POSIX,
ifelse(mydf$main_dir %in% c("WSW","SW"),mydf$max_speed,NA),
type="b",main="Wind speed (WSW or SW) by time",
xlab="Time of day",ylab="W
: Stefano Sofia; Jim Lemon; r-help mailing list
Subject: RE: [R] Classification of wind events
Please run your code before posting it... you forgot the quotes in your
main_dir column.
first_day_POSIX <- as.POSIXct("2020-02-19-00-00", format="%Y-%m-%d-%H-%M")
last_day_POSIX
t_day_POSIX, by="10
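A short sketch of the point about quoting, under stated assumptions: the end time below is hypothetical (the value used in the thread is not shown), `tz = "UTC"` is added only to make the example reproducible, and only the first seven directions are used.

```r
first_day_POSIX <- as.POSIXct("2020-02-19-00-00",
                              format = "%Y-%m-%d-%H-%M", tz = "UTC")
# Hypothetical end time: one hour later (assumption, not from the thread).
last_day_POSIX <- first_day_POSIX + 60 * 60

# One row every 10 minutes between the two time stamps.
mydf <- data.frame(data_POSIX = seq(first_day_POSIX, last_day_POSIX, by = "10 min"))

# Directions must be quoted strings; a bare WSW would be looked up as an
# R object and fail with "object 'WSW' not found".
mydf$main_dir <- c("WSW", "WSW", "SW", "SW", "W", "WSW", "WSW")

mydf
```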
> > > min"))
> > >
> > > mydf$main_dir <- c(WSW, WSW, SW, SW, W, WSW, WSW, WSW, W, W, SW, WSW,
> > > SSW, S, SW, SW, WSW, WNW, W, WSW, WSW, SE, SE, SE, NW, NNE, ENE, SE, NNW,
> > > NW, NW, NW, NW, NW, NW, NE, NW, NW, NW,
ur attention
> > Stefano
> >
> >
> > (oo)
> > --oOO--( )--OOo----
> > Stefano Sofia PhD
> > Civil Protection - Marche Region
> > Meteo Section
> > Snow Section
> > Via del Colle Ameno 5
> > 60126 Torrette di Ancona, A
Colle Ameno 5
> 60126 Torrette di Ancona, Ancona
> Uff: 071 806 7743
> E-mail: stefano.so...@regione.marche.it
> ---Oo-oO
>
>
> From: Jim Lemon [drjimle...@gmail.com]
> Sent: Wednesday 13 May 2020 11:01
>
, NW, NW, NW, N, WNW, NW, NNW,
>NNW, NW, NW, NW, WNW, ESE, W, WSW, SW, SW, SW, WSW, SW, S, S, SSW, SW,
>WSW, WSW, WSW, WSW, WSW, WSW, WSW, SW, WSW, WSW, WSW, WSW, SW, SW, WSW,
>WSW, WSW, WSW, WSW, SW, SW, SW, SW, SW, SW, SW, SW, SW, WSW, WSW, WSW,
>WSW, SW, SW, SW, SW, WSW, SW, SW, SW, SW, SW, WSW, SW, SW, W, WSW, WSW,
>SSW, S, WNW, SW
__
From: Jim Lemon [drjimle...@gmail.com]
Sent: Wednesday 13 May 2020 11:01
To: Stefano Sofia; r-help mailing list
Subject: Re: [R] Classification of wind events
Hi Stefano,
Given only one observation point you will find it difficult. If your
automatic weather station is in the low area wher
Hi Stefano,
Given only one observation point you will find it difficult. If your
automatic weather station is in the low area where the foehn wind is
felt, it can only be distinguished from a dry katabatic wind if the
upwind conditions are known. There is a similar but milder version of
this in eas
Please make a reproducible R example of input and output.
On May 12, 2020 1:11:41 AM PDT, Stefano Sofia
wrote:
>Dear R list users,
>I am aware that this question is not strictly related, at the present
>moment, to R code and it is more general. Please forgive me, but I need
>to share my thoughts
Dear R list users,
I am aware that this question is not strictly related, at the present moment,
to R code and it is more general. Please forgive me, but I need to share my
thoughts with you.
Foehn conditions on the southern slope of the Alps happen with strong northerly
flows that impact perpendic
On Tue, 13 Jun 2017, Dimitrie Siriopol via R-help wrote:
I am trying to use the CART in a survival analysis. I have three variables of
interest (all 3 ordinal - x, y and z, each of them with 5 categories) from
which I want to make smaller groups (just an example 1st category from X
variable w
1. Please read and follow the posting guide below. Your post does not
meet the guidelines.
2. Search before posting!
e.g. on rseek.org: "Regression trees survival analysis"
in which you will find:
https://cran.r-project.org/web/views/MachineLearning.html
-- Bert
Bert Gunter
"The trouble with
I am trying to use the CART in a survival analysis. I have three variables of
interest (all 3 ordinal - x, y and z, each of them with 5 categories) from
which I want to make smaller groups (just an example 1st category from X
variable with the 2nd and 3rd categories from the Y category and 2, 3
Dear Dr. José Faria,
I think that the best category for polynomial regressions is single
regression. Although polynomial regressions have more than one term, as
multiple regressions do, this is a consequence of the fit, not of the
design. So, to me this is sufficient to justify
Dear list,
I'm posting in the R-help list due to:
- Not knowing a better place for it;
- I would like to know the opinion of more specialized people.
What is the best place to classify polynomial regressions (Y = b0 +
b1X + b2X^2 + ... + bnX^n): single or multiple linear regression?
Regards,
--
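To make the question concrete, here is a minimal sketch with simulated data (the coefficients and noise level are arbitrary assumptions). It shows only the mechanics: a polynomial in one predictor X is fitted with `lm()` as a model that is linear in the coefficients b0, b1, b2, with one design-matrix column per power of X.

```r
set.seed(42)                          # arbitrary seed
x <- seq(-2, 2, length.out = 50)
# Simulated response: quadratic in x plus a little noise.
y <- 1 + 2 * x - 0.5 * x^2 + rnorm(50, sd = 0.1)

# One predictor, several terms: I(x^2) enters as its own column.
fit <- lm(y ~ x + I(x^2))

coef(fit)                 # estimates of b0, b1, b2
head(model.matrix(fit))   # columns: intercept, x, x^2
```

Computationally this is handled like a regression with several columns, even though all of them derive from the single variable X.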
Look at:
State-Space Discrimination and Clustering of Atmospheric Time Series Data
Based on Kullback Information Measures, by Thomas Bengtsson.
If you Google the topic, there are a host of other papers too, but this one
meshes with existing state-space methods.
-Roy
On May 27, 2013, at 4:34 AM, L
Did you have a look at Dynamic Time Warping and dtw package?
Best, E.
On Mon, May 27, 2013 at 01:34:42PM +0200, Lorenzo Isella wrote:
> Dear All,
> Apologies for not posting a code snippet, but I really need a pointer about
> a methodology to look at my data and possibly some R package which can
Dear All,
Apologies for not posting a code snippet, but I really need a pointer about
a methodology to look at my data and possibly some R package which can ease
my task.
I am given a set consisting of several multivariate noisy time series,
let's call it {A}.
Each A_i in {A}, in turn, consists of
Hi,
We have an actuarial question that cannot be solved in Excel, so we are
wondering if R can help us with it.
As in the sample table below, variable X has 50 different values and the
weighted Y has a lognormal distribution.
We want to make X into four or five classes, based on the standard
deviation
e Gestão
Universidade Católica Portuguesa / Porto
www.feg.porto.ucp.pt
Date: Mon, 19 Nov 2012 20:53:10 +0100
From: Peter Kupfer
To: Max Kuhn
Cc: "r-help@r-project.org"
Subject: Re: [R] Classification methods - which one?
Message-ID:
Content-Type: text/plain; CHARSET=US-ASCII
Dear
Dear Max,
First: thanks a lot for your suggestion and the frank words about methods in
real life. I guess that's my problem.
Regarding my analysis: yes, that's the problem, and I am forced to do this
analysis for lack of time to start something else or try other methods.
So you suggest Linear Discr
Dear all,
I searched for some classification methods and I have no clue whether I picked the
right ones.
My problem: I have a matrix with 17000 rows and 33 columns (genes and patients).
The patients are grouped into 3 diseases.
Now I want to classify the patients and, of course, I want to know which rows are
Hi all
I'm dealing with a supervised binary classification issue. I'd like to use
the GBM package to classify individuals as uninfected/infected. I have 15
times more uninfected than infected individuals.
I was wondering if GBM models suffer in the case of imbalanced class sizes?
I didn't find an
Kai Ying iastate.edu> writes:
>
> Hi,
>I want to use a zero-inflated negative binomial regression model to
> classify data (a vector of data); that is, I want to know whether each observed value is
> more likely to belong to the "zero" or the "count" distribution (preferably with a
> relative probability). My data is som
Hi,
I want to use a zero-inflated negative binomial regression model to
classify data (a vector of data); that is, I want to know whether each observed value is
more likely to belong to the "zero" or the "count" distribution (preferably with a
relative probability). My data is something like:
count site samp
1290911
Dear R-Help,
I'm dealing with a supervised binary classification issue. My dataset is
composed of 1500 individuals, living in 600 households. I have
approximately 4000 variables to classify my subjects as
"infected/uninfected".
I was wondering how would it be possible to account for the hierarchi
On Mar 8, 2012, at 3:14 PM, Ajay Askoolum wrote:
Given
studentNumbers<-10;
subjEnglish<-sample(-1:100,studentNumbers,replace=TRUE);
when subjEnglish <=0, 'U'
<=39, 'F'
<=49, 'D'
<=59, 'C'
<=69, 'B'
?findInterval
> subjEnglish<-sample(-1:100,studentNumbers,replace=TRUE);
> grade <- c(-Inf, 39, 49, 59, 69, 79, 100) # grade break points
> let <- c("U", "F", "D", "C", "B", "A", "A+")[findInterval(subjEnglish, grade)]
> cbind(subjEnglish, let)
subjEnglish let
[1,] "77" "B"
[2,] "9
Given
studentNumbers<-10;
subjEnglish<-sample(-1:100,studentNumbers,replace=TRUE);
when subjEnglish <=0, 'U'
<=39, 'F'
<=49, 'D'
<=59, 'C'
<=69, 'B'
<=79, 'A'
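The grading rule above can be sketched with `cut()`, whose `right = TRUE` intervals match the "<=" bounds directly. Two assumptions: marks above 79 are labelled 'A+' (the label comes from the reply in this thread, since the rule is cut off here), and `grade_letter` is a hypothetical helper name.

```r
studentNumbers <- 10
set.seed(123)   # arbitrary seed, only for reproducibility
subjEnglish <- sample(-1:100, studentNumbers, replace = TRUE)

# Map marks to letters; each interval is (lower, upper], i.e. "<= upper".
grade_letter <- function(marks) {
  cut(marks,
      breaks = c(-Inf, 0, 39, 49, 59, 69, 79, Inf),
      labels = c("U", "F", "D", "C", "B", "A", "A+"),  # "A+" is an assumption
      right = TRUE)
}

data.frame(subjEnglish, grade = grade_letter(subjEnglish))
```

With these breaks a mark of 77 is graded 'A' and a mark of 0 is graded 'U', as the rule states.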
Hello everyone!
I'm working with decision trees and I have a question about one of the arguments
of the "plot.rpart" function:
When we use "uniform=F", the vertical spacing of the nodes is proportional
to the error in the fit.
However, I want to draw a scale next to my classification tree to show this.
So, how could I
Hello, I am so glad to write to you.
I am currently writing my M.Sc. thesis in Applied Statistics, titled
"Data Mining Classifiers and Predictive Models Validation and Evaluation".
I am planning to compare several DM classifiers like "NN, kNN, SVM, Dtree, and
Naïve Bayes" according to their
This is probably a daft question, but I would appreciate some help.
I want to attempt to separate groups in a dataset using discriminant
function analysis, and have been using linear discriminant analysis
(lda(klaR)) and canonical discriminant analysis (candisc(candisc)).
# CDA:
iris.mod <- lm(
?predict.rpart
Weidong Gu
On Mon, Aug 8, 2011 at 6:08 PM, Jose Bustos Melo wrote:
> Hello Everyone,
>
> I'm building classification trees with categorical explanatory variables using
> library rpart and I would like to make predictions for some input data. I
> don't know where's a function or h
Hello Everyone,
I'm building classification trees with categorical explanatory variables using
library rpart and I would like to make predictions for some input data. I don't
know whether there's a function for this or how I can do it. Can someone help?
Here's the code that I'm using.
library(rp
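The poster's code is cut off here, so the following is only a minimal sketch of prediction with `predict()` on an rpart fit, not a reconstruction of it. It uses the `kyphosis` data that ships with rpart as a stand-in; `new_cases` and its values are hypothetical.

```r
library(rpart)

# Stand-in data (assumption): kyphosis ships with the rpart package.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Hypothetical new cases; column names must match the predictors in the fit.
new_cases <- data.frame(Age = c(12, 100), Number = c(3, 6), Start = c(15, 5))

pred_class <- predict(fit, newdata = new_cases, type = "class")  # class labels
pred_prob  <- predict(fit, newdata = new_cases, type = "prob")   # class probabilities

pred_class
pred_prob
```

With categorical predictors the call is the same, provided the factor columns in the new data use the same levels as the training data.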
Dear all, this is not a pure R question, but really about how to set up a
multinomial logistic regression model to do a multi-class classification. I
would really appreciate if any of you would give me some of your thoughts and
recommendation.
Let's say we have a 3-class classification problem: A
Hi.
Working with a data set like:
"age", "demographic data (n fields)", "interests(n fields)" has
performed X actions for event Y.
I want to ask how likely is it for another person with his/her age,
demographic data and interests to perform actions for that event.
My query set might be part
People who speak only English and Hebrew (like myself), can't help you.
Consider reposting in English.
Tal
Contact
Details:---
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatis
# Classification Tree with rpart
library(rpart)
# grow tree
fit <- rpart(y ~ x1 + x2 + x3 + x4 + x5, method="class", data=data)
printcp(fit) # display the results
plotcp(fit) # visualize cross-validation results
summary(fit) # detailed summary of splits
# plot tree
plot(fit, uniform=TRUE,main="Clas
On Mon, Jun 7, 2010 at 9:05 AM, sidahmed BENABDERRAHMANE
wrote:
> Dear all,
>
> I have a problem when using some classification functions (Kmeans, PAM,
> FANNY...) with a distance matrix, and I would like to understand how they proceed
> when positioning centroids after one execution step.
>
> In
Dear all,
I have a problem when using some classification functions (Kmeans, PAM,
FANNY...) with a distance matrix, and I would like to understand how they
proceed when positioning centroids after one execution step.
In fact, in the classical formulation of the algorithm, after each step,
t
Hi,
I have a problem with growing a classification tree. I have 26427 observations
divided into 4 groups.
A=17866
B=6873
C=1556
D=132
The problem is that when I plot the tree, the result shows no
split nodes. What should I do now? Are there any ideas on how to build
a
Hi all,
I thought I'd just point out, to those not having yet seen this, that
today there was a classification challenge posted for astronomy.
The web-site is http://www.hep.anl.gov/SNchallenge/
[I have nothing to do with this project so don't ask me any details!]
Basically the idea behind is t
Thanks to both of you. Problem's solved. Greatly appreciated. :]
Chris
Chris Li wrote:
>
> Hi all,
>
> I have got a dataset like the following:
>
> 3
> 5
> 7
> 3
> 9
> 7
>
>
> i.e. random numbers with some repeats.
>
> I want R to classify them for me. E.g. every row that has a value of 3
>
On Fri, 20 Nov 2009 10:43:19 +0100 smu wrote:
> x <- c(3,5,7,3,9,7)
> > as.numeric(as.factor(x))
> [1] 1 2 3 1 4 3
While that is my preferred solution too, this may be easier to
understand:
match(x,sort(unique(x)))
(It is basically what 'factor' does.)
The question wasn't quite clear, though
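The two approaches quoted above can be checked side by side. A minimal sketch using the vector from the thread:

```r
x <- c(3, 5, 7, 3, 9, 7)

# Rank of each value among the sorted distinct values, via a factor ...
via_factor <- as.numeric(as.factor(x))

# ... and via an explicit lookup in the sorted unique values.
via_match <- match(x, sort(unique(x)))

via_factor
via_match
all(via_factor == via_match)
```

Both assign codes by the sorted order of the distinct values, not by order of first appearance.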
Hello,
x <- c(3,5,7,3,9,7)
> as.numeric(as.factor(x))
[1] 1 2 3 1 4 3
regards,
stefan
On Fri, Nov 20, 2009 at 12:02:59AM -0800, Chris Li wrote:
>
> Hi all,
>
> I have got a dataset like the following:
>
> 3
> 5
> 7
> 3
> 9
> 7
>
>
> i.e. random numbers with some repeats.
>
> I want R to
Hi all,
I have got a dataset like the following:
3
5
7
3
9
7
i.e. random numbers with some repeats.
I want R to classify them for me. E.g. every row that has a value of 3 will
be assigned a value of 1, and every row that has a value of 5 will be assigned a
value of 2, etc.
I want R to return the fol
Hello everybody,
I'm looking for a way to build an RBF classification network with R but I
can't find any.
I know there is the 'neural' package, but apparently the RBF networks I can
build with that are for approximation tasks only. Is there any package I can
use to build an RBF network for a class
Thanks a lot!
Yet is there a way to incorporate the lifting score into Cross
Validation, not just a plot?
Thanks again!
On Wed, Jun 24, 2009 at 9:07 AM, Tobias Sing wrote:
> Michael,
>
> a lift chart for evaluating binary scoring classifiers, as I
> understand it, plots...
>
> lift score: P(Yhat
Michael,
a lift chart for evaluating binary scoring classifiers, as I
understand it, plots...
lift score: P(Yhat = + | Y = +)/P(Yhat = +)
against
rate of positive predictions: P(Yhat = +).
...across the continuum of possible cutoffs. If you want to do this,
here is how you would do this
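The reply is cut off at this point. What follows is not the original code but a base-R sketch of the two quantities defined above, computed over a grid of cutoffs on simulated scores; `lift_at` is a hypothetical helper, and the data and seed are arbitrary assumptions.

```r
set.seed(7)                       # arbitrary seed
n <- 200
score <- runif(n)                 # simulated classifier scores in [0, 1]
y <- rbinom(n, 1, prob = score)   # simulated labels, 1 = positive

# For one cutoff, compute the rate of positive predictions and the lift.
lift_at <- function(cutoff, score, y) {
  yhat <- score >= cutoff          # predicted positive at this cutoff
  rpp  <- mean(yhat)               # P(Yhat = +)
  tpr  <- mean(yhat[y == 1])       # P(Yhat = + | Y = +)
  c(cutoff = cutoff, rpp = rpp, lift = tpr / rpp)
}

cutoffs <- seq(0.1, 0.9, by = 0.1)
lift_tab <- t(sapply(cutoffs, lift_at, score = score, y = y))
lift_tab
```

Plotting `lift_tab[, "lift"]` against `lift_tab[, "rpp"]` gives the lift chart described above. At a cutoff of 0 every case is predicted positive, so the lift is exactly 1.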
Maybe the packages caret, RWeka and ROCR are useful starting points.
Cheers, Christian
Hi all,
Could anybody give me some pointers to Cross Validation using Lifting
Score as error function, as commonly used in data-mining and
classification field in marketing and e-commerce research?
Thanks!
Hi all,
Could anybody give me some pointers to Cross Validation using Lifting
Score as error function, as commonly used in data-mining and
classification field in marketing and e-commerce research?
Thanks!
__
R-help@r-project.org mailing list
https://s
Frank E Harrell Jr wrote
>Armida,
>
>I regret putting CTABLE as an option on the old SAS PROC LOGIST which was
>a basis for PROC LOGISTIC. Classification tables are arbitrary and
>misleading so I would stay away from them.
>
>You might build a model with and without the variable of interest and
Carbajal, Armida J wrote:
Prof. Harrell,
My name is Armida Carbajal, I'm a graduate student intern at Sandia National
Laboratories (SNL) and am conducting some research for my thesis project at the
University of New Mexico in Statistics for SNL.
My project entails a logistic regression and I
Prof. Harrell,
My name is Armida Carbajal, I'm a graduate student intern at Sandia National
Laboratories (SNL) and am conducting some research for my thesis project at the
University of New Mexico in Statistics for SNL.
My project entails a logistic regression and I wanted to create a
classifi
Hi List,
I want to do classification using a neural network (e.g. the packages neural, AMORE,
etc.). How do these packages handle nominal variables? Is there any specific
coding we have to use, or do we have to do dummy coding for each nominal
variable? Any help will be appreciated. Does anybody know how to do
c
Achim Zeileis wrote:
On Thu, 20 Nov 2008, David Kaplan wrote:
Hi all,
I'm looking for a program that will take the predicted probabilities
from a logistic regression using glm{stats}, dichotomize them
according to a threshold that I can control, and then use them to form
sensitivity, specif
On Thu, 20 Nov 2008, David Kaplan wrote:
Hi all,
I'm looking for a program that will take the predicted probabilities from a
logistic regression using glm{stats}, dichotomize them according to a
threshold that I can control, and then use them to form sensitivity,
specificity, false pos and f
Hi all,
I'm looking for a program that will take the predicted probabilities
from a logistic regression using glm{stats}, dichotomize them according
to a threshold that I can control, and then use them to form
sensitivity, specificity, false pos and false neg rates.
Thanks in advance.
David
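A minimal base-R sketch of what is asked for above, on simulated data. `class_rates` is a hypothetical helper, not a function from any package; here the false positive rate is taken as FP/(FP+TN) and the false negative rate as FN/(FN+TP), which is one common convention and an assumption about what the poster intends.

```r
set.seed(2008)                                # arbitrary seed
n <- 300
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.5 * x))     # simulated binary outcome

fit <- glm(y ~ x, family = binomial)
p_hat <- predict(fit, type = "response")      # predicted probabilities

# Dichotomize at `threshold` and tabulate the four rates.
class_rates <- function(p, truth, threshold = 0.5) {
  pred <- as.integer(p >= threshold)
  tp <- sum(pred == 1 & truth == 1)
  fn <- sum(pred == 0 & truth == 1)
  tn <- sum(pred == 0 & truth == 0)
  fp <- sum(pred == 1 & truth == 0)
  c(sensitivity    = tp / (tp + fn),
    specificity    = tn / (tn + fp),
    false_pos_rate = fp / (fp + tn),
    false_neg_rate = fn / (fn + tp))
}

class_rates(p_hat, y, threshold = 0.3)
```

Changing `threshold` trades sensitivity against specificity; the rates depend entirely on that choice.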