Oh and don't forget:
# First line of code: load dplyr into memory once the package has been installed.
library(dplyr)
On Wednesday, November 27th, 2024 at 12:05 PM, Tom Woolman
wrote:
>
>
> Check out the dplyr package, specifically the mutate function.
>
> # Cre
Check out the dplyr package, specifically the mutate function.
# Create new column based on existing column value
df <- df %>% mutate(FirstDay = ifelse(ID == 2, 5, NA))  # NA where ID is not 2
df
Repeat as needed to capture all of the day/firstday combinations you want to
account for.
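If there are several ID/FirstDay combinations, one way to avoid repeating the mutate call is dplyr's case_when. This is only a sketch: the data frame and the ID/FirstDay pairs below are made up for illustration.

```r
library(dplyr)

# Hypothetical example data: the ID values and FirstDay numbers are made up.
df <- data.frame(ID = c(1, 2, 3, 2))

# One mutate() call covering several ID/FirstDay combinations at once.
df <- df %>%
  mutate(FirstDay = case_when(
    ID == 1 ~ 3,
    ID == 2 ~ 5,
    TRUE    ~ NA_real_   # any ID not listed above stays missing
  ))
df
```

Conditions are checked top to bottom, so the first matching line wins.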
Like everything else in R, there are
Imagine that it's the year 2022 and you don't know how to look up
information about performing a Kruskal-Wallis H test.
It would take you longer to join the listserv and then write such a
cockamamie email than to open the stats textbook you are supposed to have
for the course, much less doing
Some ideas:
You could create a cluster model with k=3 for each of the 3 variables,
to determine what constitutes high/medium/low centroid values for each
of the 3 plant types. Centroid values could then be used as the
upper/lower boundary ranges for high/med/low.
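A rough sketch of the clustering idea for a single variable, using base R's kmeans. The data are simulated, and cutting halfway between adjacent centroids is just one assumed way of turning centroids into boundaries.

```r
# Hypothetical measurements for one variable; real data would replace this.
set.seed(42)
x <- c(rnorm(20, mean = 10), rnorm(20, mean = 20), rnorm(20, mean = 30))

# k-means with k = 3 on the single variable.
km <- kmeans(x, centers = 3, nstart = 10)

# Sort the centroids so that the smallest is "low" and the largest is "high".
centroids <- sort(as.numeric(km$centers))

# One possible rule (an assumption): cut halfway between adjacent centroids.
breaks <- c(-Inf, (centroids[1] + centroids[2]) / 2,
            (centroids[2] + centroids[3]) / 2, Inf)
level <- cut(x, breaks = breaks, labels = c("low", "medium", "high"))
table(level)
```

The same steps would be repeated for each of the other variables.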
Or utilize a hist
Everyone needs to speak English exactly like I do or else they're doing
it wrong
:)
By the way, I pronounce CRAN the same way that I pronounce the first half of
cranberry.
On 2022-05-04 20:24, Avi Gross via R-help wrote:
Extended discussion may be a waste but speaking for myself, I found it
highl
Have you looked at the merge function in base R?
https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/merge
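For what it's worth, a minimal sketch of merge on two shared columns. The data frames and the column names id and year are hypothetical stand-ins for the two common fields.

```r
# Hypothetical data frames sharing two key columns, "id" and "year".
a <- data.frame(id = c(1, 2, 3), year = c(2020, 2020, 2021), x = c(10, 20, 30))
b <- data.frame(id = c(1, 2, 4), year = c(2020, 2020, 2021), y = c("p", "q", "r"))

# Inner join on both shared columns (merge's default keeps only matching rows).
inner <- merge(a, b, by = c("id", "year"))

# all = TRUE keeps unmatched rows from both sides and fills the gaps with NA.
outer <- merge(a, b, by = c("id", "year"), all = TRUE)
```

all.x = TRUE or all.y = TRUE would give a left or right join instead.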
On 2022-03-19 21:15, Jeff Reichman wrote:
R-Help Community
I'm trying to combine two data.frames, each containing 10 columns, of
which they share two common fiel
I concur on both of Eric's suggestions below. I love R but I couldn't
imagine using it on a daily basis without "key" packages for various
regression and classification modeling problems, etc. Likewise on
being able to embed images (within reason... maybe establish a max KB or
MB file size fo
Graham Williams has a book titled "Data Mining with Rattle and R", which
has a chapter on association rules and the arules package. Williams'
Rattle GUI package for R also lets you define an association rules model
using a graphical interface (which creates the R code for you in the log
file for
Apologies, I left out 3 critical lines of code after the randomized
sample dataframe is created:
group_a <- d[ which(d$label =='A'), ]
group_b <- d[ which(d$label =='B'), ]
group_c <- d[ which(d$label =='C'), ]
On 2021-08-03 18:56, Tom Woolman wro
# Resending this message since the original email was held in queue by
the listserv software because of a "suspicious" subject line, and/or
because of attached .png histogram chart attachments. I'm guessing that
the listserv software doesn't like multiple image file attachments.
Hi everyon
ngton State University
> Graduate Advocate, American Association of University Professors (OR)
>
> Recent work
(https://www.researchgate.net/profile/Nathan_Parsons3/publications)
> Schedule an appointment (https://calendly.com/nate-parsons)
>
> > On Wednesday, Jul 21, 2021 at
On Wednesday, Jul 21, 2021 at 8:30 PM, Tom Woolman
Couldn't you convert the date columns to character type data in a data
frame, and then convert those strings to factors in a 2nd step?
The only downside I think to treating dates as factor levels is that
you might have an awful lot of levels if you have a large enough
dataset.
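A small sketch of the two-step conversion, assuming a data frame with a Date column. The column name when and the dates are made up.

```r
# Hypothetical data frame with a Date column.
df <- data.frame(when = as.Date(c("2021-07-01", "2021-07-02", "2021-07-01")))

# Step 1: Date -> character. Step 2: character -> factor.
df$when_chr <- as.character(df$when)
df$when_fac <- factor(df$when_chr)

nlevels(df$when_fac)  # one level per distinct date
```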
Quoti
Hi Brian. I assume you're interested in some kind of classification of
the theme or the contents within each document?
In which case I would direct you to natural language processing for
multinomial classification of unstructured data. Basically an NLP
(natural language processing) classifica
In Windows versions of R/RStudio when referring to filename paths, you
need to either use two "\\" characters instead of one, OR use the
forward slash "/" as used in Linux/Unix. It's an unfortunate conflict
between R and Windows in that a single \ character by itself is
treated as an esc
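To illustrate the two path styles, here is a short sketch. The folder and file names are placeholders, and nothing is read from disk.

```r
# Hypothetical Windows path; the folder and file names are placeholders.
p_double  <- "C:\\Users\\me\\data.csv"   # each "\\" is one literal backslash
p_forward <- "C:/Users/me/data.csv"      # forward slashes also work on Windows

# Both strings name the same file once the separators are normalized.
identical(gsub("\\", "/", p_double, fixed = TRUE), p_forward)

# file.path() builds paths with "/" so no escaping is needed.
file.path("C:", "Users", "me", "data.csv")
```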
Hi Dr. Pedersen.
I haven't used Cook's distance on an aov object but I do it all the time from
an lm (general linear model) object, i.e.:
# y and x are placeholder column names; substitute your own formula
mod <- lm(y ~ x, data=dataframe)
cooksdistance <- cooks.distance(mod)
I *think* you might be able to simulate an aov using the lm function by
selecting the parameter in
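As a runnable sketch of the lm route, the built-in PlantGrowth data (one numeric response, one factor) stands in for a one-way ANOVA design. The 4/n cutoff is only a rule of thumb and an assumption here, not something from the original post.

```r
# PlantGrowth ships with R: weight (numeric) by group (3-level factor).
mod <- lm(weight ~ group, data = PlantGrowth)

# Cook's distance for every observation in the fitted model.
cooksdistance <- cooks.distance(mod)

# A common rule of thumb (an assumption, not a hard cutoff): flag values above 4/n.
flagged <- which(cooksdistance > 4 / nrow(PlantGrowth))

# The same fit yields the one-way ANOVA table.
anova(mod)
```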
ount
dupAcctID<-colSums(table(Data)>0)
Data$dupAcct<-NA
# fill in the new column
for(i in seq_along(dupAcctID))
  Data$dupAcct[Data$AcctID == names(dupAcctID[i])] <- dupAcctID[i]
Jim
On Wed, Nov 18, 2020 at 8:20 AM Tom Woolman
wrote:
Hi everyone. I have a dataframe that is a collectio
in his "Bloom County" comic strip)
On Tue, Nov 17, 2020 at 3:29 PM Tom Woolman
wrote:
Hi Bill. Sorry to be so obtuse with the example data, I was trying
(too hard) not to share any actual values so I just created randomized
values for my example; of course I should have specified th
"))
?
Must each vendor have only one account? If not, what should the result be
for
Data2 <- data.frame(Vendor=c("V1","V2","V3","V1","V4","V2"),
Account=c("A1","A2","A2","A2","A3","
Hi everyone. I have a dataframe that is a collection of Vendor IDs
plus a bank account number for each vendor. I'm trying to find a way
to count the number of duplicate bank accounts that occur in more than
one unique Vendor_ID, and then assign the count value for each row in
the dataframe
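One possible base R approach to the per-row count, sketched with made-up data. The column names Vendor and Account, and the new column dupAcct, are assumptions rather than the poster's actual names.

```r
# Hypothetical data; the Vendor and Account values are placeholders.
Data <- data.frame(Vendor  = c("V1", "V2", "V3", "V1", "V4"),
                   Account = c("A1", "A2", "A2", "A1", "A3"),
                   stringsAsFactors = FALSE)

# Number of distinct vendors that use each account.
vendors_per_account <- tapply(Data$Vendor, Data$Account,
                              function(v) length(unique(v)))

# Look up the count for every row by its account number.
Data$dupAcct <- as.integer(vendors_per_account[as.character(Data$Account)])
Data
```

Rows with a count above 1 are the accounts shared by more than one vendor.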
Hi everyone.
I'd like to perform RIDIT scoring of a column that consists of ordinal
values, but I don't have a comparison dataset to use against it as
required by the Ridit::ridit function.
As a question of best practice, could I use a normally distributed
frequency distribution table gen
Hi Leslie and all.
You may want to investigate using sparklyr on a cloud environment like
AWS, where you have more packages that are designed to work on cluster
computing environments and you have more control over those types of
parallel operations.
V/r,
Tom W.
Quoting Leslie Rutkows
.
Quoting Tom Woolman:
Hi everyone. I'm using the kernlab ksvm function with the rbfdot
kernel for a binary classification problem and getting a strange
result back. The predictions seem to be very accurate judging by the
training results provided by the algorithm, but I'm unable to
Hi everyone. I'm using the kernlab ksvm function with the rbfdot
kernel for a binary classification problem and getting a strange
result back. The predictions seem to be very accurate judging by the
training results provided by the algorithm, but I'm unable to generate
a confusion matrix
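For reference, a generic sketch of building a confusion matrix in base R with table(). The label vectors are hypothetical; with a fitted model the predicted labels would presumably come from a predict() call on held-out data.

```r
# Hypothetical labels: in practice `predicted` would come from something like
# predict(model, newdata) and `actual` from the held-out data.
actual    <- factor(c("yes", "no", "yes", "yes", "no", "no"), levels = c("no", "yes"))
predicted <- factor(c("yes", "no", "no",  "yes", "no", "yes"), levels = c("no", "yes"))

# Cross-tabulate predictions against the truth.
cm <- table(Predicted = predicted, Actual = actual)
cm

# Overall accuracy: correct predictions on the diagonal over all cases.
accuracy <- sum(diag(cm)) / sum(cm)
```

Giving both factors the same levels keeps the table square even if one class is never predicted.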
train[,1:29], nperm=99, ntree=500)
Thanks in advance.
Tom Woolman
PhD student, Indiana State University
I am using R with the nnet package to perform a multinomial logistic
regression on a training dataset with ~5800 records and 45 predictor
variables. Predictor variables
were chosen as a subset of all ~120 available variables based on PCA
analysis. My t
I have a data frame with 10 variables of integer data for various
attributes about each row of data, and I need to know the highest 5
variables for each row in this data frame and output that to a new data
frame. In addition to the 5 highest variable names, I also need to kn
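A sketch of one way to pull the 5 highest variables per row with apply. The data are random, and the column names V1 to V10 are just R's defaults. The sketch also keeps the matching values, which is an assumption since the message is cut off.

```r
# Hypothetical data: 4 rows, 10 integer variables named V1..V10.
set.seed(1)
df <- as.data.frame(matrix(sample(1:100, 40, replace = TRUE), nrow = 4))

# For each row, order the values from highest to lowest and keep 5 names.
top_names <- t(apply(df, 1, function(r) names(r)[order(r, decreasing = TRUE)[1:5]]))

# Same idea for the values themselves.
top_values <- t(apply(df, 1, function(r) sort(r, decreasing = TRUE)[1:5]))

result <- data.frame(top_names, top_values)
names(result) <- c(paste0("name", 1:5), paste0("value", 1:5))
result
```

Ties are broken by column order, so the leftmost of two equal values is listed first.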