= 0, upper =
10)
summary(GA)
-----
Thank you,
Barry King
I've recently had a research manuscript rejected by an editor. The
manuscript showed that for a real-life data set, random forest
outperformed multiple linear regression with respect to predicting the
target variable. The editor's objection was that random forest is a
black box where the random ass
red data Y of dimensions: 16 1
No variable selection.
Available components:
loading vectors: see object$loadings
variates: see object$variates
variable names: see object$names
#= End console output =
Barry King
Associate Professor of Information Technology
I am attempting to predict tomorrow's rainfall, RISK_MM, with LASSO using
a data set that I have partitioned into a train data set and a test data
set. The structures of the two data sets are shown below and appear to be
identical except for the number of observations:
str(train)
'data.frame': 262 ob
ex1 <- c('Y', 'N', 'Y')
ex2 <- c('Y', 'N', 'Y')
ex3 <- c('N', 'N', 'Y')
ex4 <- c('N', 'N', 'Y')
status <- array(NA, dim=3)
# I am trying to return 'Okay' if any of the values
# in a column above is 'Y' but I am not constructing
# the function correctly. Any assistance is
# greatly appreciated.
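
One possible approach, offered as a sketch rather than a definitive answer: it assumes that "column" means a position across ex1 to ex4 (so the four vectors are stacked as rows, which matches status having length 3), and that columns without any 'Y' should stay NA, since the post does not say what to return in that case. The name ex is introduced here only for illustration.

```r
ex1 <- c('Y', 'N', 'Y')
ex2 <- c('Y', 'N', 'Y')
ex3 <- c('N', 'N', 'Y')
ex4 <- c('N', 'N', 'Y')

# Stack the vectors as rows so each position becomes a column
# (assumption: this is what "column" refers to in the question).
ex <- rbind(ex1, ex2, ex3, ex4)

# For each column, test whether any entry equals 'Y'.
has_y <- apply(ex == 'Y', 2, any)

# 'Okay' where a 'Y' was found; NA otherwise (the fallback value is an assumption).
status <- ifelse(has_y, 'Okay', NA_character_)
print(status)
```

apply() with MARGIN = 2 runs any() once per column of the logical matrix, so no explicit loop or hand-written function is needed.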
# Here are sample data, sample vectors, and a for loop
# that I am using now. I wish to get rid of the for loop
# and use functions and one of the apply functions to
# perform the work without needing a many-iteration for loop.
A_01 <- 1:5
A_02 <- 6:10
A_03 <- 11:15
A_04 <- 16:20
B_01 <- 101:105
B
Is there a way to get around R’s memory-bound limitation by interfacing
with a Hadoop database or should I look at products like SAS or JMP to work
with data that has hundreds of thousands of records? Any help is
appreciated.
--
__
*Barry E. King, Ph.D.*
Analytics Modeler
I have divided my data into a training set and a test set.
I have then applied a logistic transformation to the variables
in the training set and have used pam to assign the observations
to one of four clusters.
My question is: how do I score the test observations now that I have the
training set w
I am reading parameters from an Excel file but am having trouble using them
in tapply. Here is a mini-version of my problem.
sorted <- data.frame(c("Sing
rowID==x); indx <-
> with(x1,cut(Price,breaks=c(-Inf,upperLimit[x],Inf),labels=FALSE));
> x1[indx>1,]}))
>
> acceptdf <- do.call(rbind,lapply(names(upperLimit),function(x) {x1 <-
> subset(Prices.df,rowID==x); indx <-
> with(x1,cut(Price,breaks=c(-Inf,upperLimit[x],In
mit
Full Season Half Season
779.12 231.11
> Prices.df
rowID Price
1 Full Season 417.95
2 Full Season 679.43
3 Full Season 839.79
4 Half Season 159.39
5 Half Season 256.93
Any assistance you can provide is greatly appreciated.
Barry King
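
For the upperLimit and Prices.df values shown above, a vectorized lookup may be a simpler alternative to the quoted subset/cut approach. This is only a sketch: the data are retyped from the printed output, and the names limit, over_limit and within_limit are hypothetical, chosen here for illustration. It treats a price as "over" when it is strictly greater than its row's limit, which is what indx > 1 selects with cut()'s default right-closed intervals.

```r
# Values retyped from the printed output in the post.
upperLimit <- c("Full Season" = 779.12, "Half Season" = 231.11)
Prices.df <- data.frame(
  rowID = c("Full Season", "Full Season", "Full Season",
            "Half Season", "Half Season"),
  Price = c(417.95, 679.43, 839.79, 159.39, 256.93),
  stringsAsFactors = FALSE
)

# Look up each row's limit by name, dropping the names afterwards.
limit <- unname(upperLimit[Prices.df$rowID])

# Split the rows by comparing each price with its own limit.
over_limit   <- Prices.df[Prices.df$Price >  limit, ]
within_limit <- Prices.df[Prices.df$Price <= limit, ]

print(over_limit)
print(within_limit)
```

Indexing the named vector upperLimit by the rowID column repeats the right limit for every row, so the comparison is done in one step without lapply or do.call.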
When using XLConnect's readWorksheet, instead of it correctly reading
string and numeric columns, I receive NA's with the following message:
" Error when trying to evaluate cell A2 - not implemented yet"
I do not know what this means. Can anyone please assist?
"ItemColumn"]], parameters[["PriceColumn"]])
>
> On 28/05/2013 07:06, Barry King wrote:
>
>> I have an Excel worksheet with 20 rows. Using XLConnect I successfully
>> read the data into 'indata'. In order to sort it on the
his does not work. Only one row appears in 'sortedData'. I've tried
unlisting the two arguments to 'order' but this does not correct the
problem.
Can anyone suggest a solution to my problem? Your assistance is
appreciated.
- Barry King
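
One possible cause, sketched under assumptions because the original code is cut off: if parameters holds column names and those names are passed straight to order(), then order() sorts the single name strings and returns 1, so only one row is selected. The indata and parameters objects below are hypothetical stand-ins for the Excel data, not the poster's actual worksheet.

```r
# Hypothetical stand-ins for the worksheet and the parameter list.
indata <- data.frame(
  Item  = c("B", "A", "B", "A"),
  Price = c(2, 5, 1, 3),
  stringsAsFactors = FALSE
)
parameters <- list(ItemColumn = "Item", PriceColumn = "Price")

# Ordering the names themselves yields a single index, hence a single row.
one_row <- indata[order(parameters[["ItemColumn"]],
                        parameters[["PriceColumn"]]), ]

# Use the names to pull the columns out of indata, then order on the columns.
item_col  <- indata[[parameters[["ItemColumn"]]]]
price_col <- indata[[parameters[["PriceColumn"]]]]
sortedData <- indata[order(item_col, price_col), ]

print(sortedData)
```

The double-bracket lookup indata[[name]] turns a column name stored as a string into the column itself, which is what order() needs.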
I have a large Excel file with SKU numbers (stock keeping units) and
forecasts which can be mimicked with the following:
Period <- c(1, 2, 3, 1, 2, 3, 4, 1, 2)
SKU <- c("A1","A1","A1","X4","X4","X4","X4","K2","K2")
Forecast <- c(99, 103, 128, 63, 69, 72, 75, 207, 201)
PeriodSKUForecast <- data.fra
I need to extract labels from Excel input data to use as dimnames later on.
I can successfully read the Excel data into three matrices:
capacity <- read.csv("c:\\R\\data\\capacity.csv")
price.lookup <- read.csv("c:\\R\\data\\price lookup.csv")
sales <- read.csv("c:\\R\\data\\sales.csv")
The value