Hi David,
>> I think he may also need to add the header=TRUE argument:
No! The argument header= is not required in this case.
##
> TDat <- read.csv("small.txt", sep="\t")
> str(TDat[,1:3])
'data.frame': 10 obs. of 3 variables:
$ Placename: Factor w/ 10 levels "Aankoms","Aapieshoek",..: 1 2
Hi,
Thanks for the continued support.
I've been working on this all night, and have learned some things:
1) Since I'm really committed to using an SVM, I need to skip the
examples with missing data. I have a training set of approximately
22,000 examples of which about 500 have missing values
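Skipping the incomplete examples is a one-liner with complete.cases() (or na.omit()); a toy illustration:

```r
# Toy stand-in for the 22,000-row training set
train <- data.frame(label = c(1, -1, 1, -1),
                    x1    = c(0.5, NA, 0.2, 0.9),
                    x2    = c(1.1, 0.3, NA, 0.7))

clean <- train[complete.cases(train), ]  # keep only rows with no NA anywhere
nrow(clean)                              # 2 of the 4 toy rows survive
```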
Hi,
More questions in my ongoing quest to convert from RapidMiner to R.
One thing has become VERY CLEAR: None of the issues I'm asking about
here are addressed in RapidMiner. How it handles missing values,
scaling, etc. is hidden within the "black box". Using R is forcing me
to take a much
Murray Jorgensen wrote:
I am wondering how to interpret the parameter estimates that lm()
reports in this sort of situation:
y = round(rnorm(n=24,mean=5,sd=2),2)
A = gl(3,2,24,labels=c("one","two","three"))
B = gl(4,6,24,labels=c("i","ii","iii","iv"))
# Make both observations for A=1, B=4 missing
Hi,
Suppose a binomial GLM with both continuous as well as categorical
predictors (sometimes referred to as GLM-ANCOVA, if I remember
correctly). For the categorical predictors = indicator variables, is
then there a suggested minimum frequency of each level ? Would such a
rule/ recommendation
Thanks, I had a look at mlogit. It seems it does fit a multinomial logit
regression but - just as nnet and VGAM do - it has a function that
tells you the fitted value, not the value that you get with an arbitrary set of
parameters (which might not be the optimal ones). Or am I wrong on this?
Rong
Thanks for the book suggestion. I'll check it out tomorrow when the library
opens up.
Yes, it is a multilevel model, but its likelihood function is the sum of the
likelihood functions for the individual levels (i.e. a simple multinomial
logits) and some other terms (the priors). It is, essential
To add to Rolf's point, a tool for imputation in R is aregImpute in
Frank Harrell's Hmisc package.
I am not sure if the discussion of past GPA as the missing variable is
literal or merely illustrative. If literal, is the gpa missing because
it was not reported (ie, it exists but was not rep
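As a rough sketch of aregImpute in use (the data, sizes, and the linear-model choice nk = 0 are all invented for illustration; the Hmisc package must be installed):

```r
library(Hmisc)
set.seed(1)
n <- 40
sat <- round(rnorm(n, 1200, 100))
gpa <- round(2 + 0.0015 * sat + rnorm(n, 0, 0.2), 2)
gpa[sample(n, 5)] <- NA                   # 5 students did not report a GPA

imp <- aregImpute(~ gpa + sat, data = data.frame(gpa, sat),
                  n.impute = 5, nk = 0)   # nk = 0 keeps the fit linear
imp$imputed$gpa                           # 5 x 5 matrix of imputed values
```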
Hi -
I've been using the option
main=bquote(paste(mu==.(mu), ", ", lambda==.(lambda), ", ",
                  truncation==.(truncation), ", ", N[T]==.(n)))
to produce a title when using the "plot" command - a title which includes
variable names (two Greek)
along with their values.
The above option, however, does no
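A self-contained sketch of the working pattern (variable values invented), in case the truncated part concerns getting the substitution right:

```r
mu <- 2; lambda <- 0.5; truncation <- 10; n <- 100
ttl <- bquote(paste(mu == .(mu), ", ", lambda == .(lambda), ", ",
                    truncation == .(truncation), ", ", N[T] == .(n)))
plot(1:10, main = ttl)  # Greek mu/lambda render alongside their values
```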
I am wondering how to interpret the parameter estimates that lm()
reports in this sort of situation:
y = round(rnorm(n=24,mean=5,sd=2),2)
A = gl(3,2,24,labels=c("one","two","three"))
B = gl(4,6,24,labels=c("i","ii","iii","iv"))
# Make both observations for A=1, B=4 missing
y[19] = NA
y[20] = NA
d
I am using rpart in classification mode and am confused about this
particular model's predictions.
> predict(fit, train[8,])
-1 1
8 0.5974089 0.4025911
> predict(fit, train[8,], type="class")
1
Levels: -1 1
So, it seems like there is a 60% chance of being class -1 according th
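type = "class" simply reports the level with the largest fitted probability; a quick check with rpart on built-in data:

```r
library(rpart)
fit <- rpart(Species ~ ., data = iris)
p <- predict(fit, iris[1, ])              # class probabilities
predict(fit, iris[1, ], type = "class")   # the level with the largest probability
colnames(p)[which.max(p)]                 # the same answer, picked by hand
```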
Search on http://finzi.psych.upenn.edu/search.html
gives lots of hits on "pie chart".
See for example ?pie3D in the plotrix package.
Or, be more specific about what you are looking for.
Remko
-
Remko Duursma
Post-Doctoral Fellow
Centre for Plants
On 3/08/2009, at 3:43 PM, David Winsemius wrote:
On Aug 2, 2009, at 10:46 PM, Rolf Turner wrote:
On 3/08/2009, at 1:48 PM, David Winsemius wrote:
On Aug 2, 2009, at 7:29 PM, Rolf Turner wrote:
On 3/08/2009, at 11:14 AM, David Winsemius wrote:
On Aug 2, 2009, at 7:02 PM, Noah Silver
On Aug 2, 2009, at 10:46 PM, Rolf Turner wrote:
On 3/08/2009, at 1:48 PM, David Winsemius wrote:
On Aug 2, 2009, at 7:29 PM, Rolf Turner wrote:
On 3/08/2009, at 11:14 AM, David Winsemius wrote:
On Aug 2, 2009, at 7:02 PM, Noah Silverman wrote:
Hi,
It seems as if the problem was cau
On 3/08/2009, at 1:48 PM, David Winsemius wrote:
On Aug 2, 2009, at 7:29 PM, Rolf Turner wrote:
On 3/08/2009, at 11:14 AM, David Winsemius wrote:
On Aug 2, 2009, at 7:02 PM, Noah Silverman wrote:
Hi,
It seems as if the problem was caused by an odd quirk of the
"scale"
function.
So
Hi R users,
I am wondering if there is an R function that can help integrate a venn
diagram and a pie chart to compare two related datasets.
I know the package limma has something built-in for making venn diagram, but
I guess it would be very painful to use line and text to specify the
proporti
I think perturb::colldiag implements the condition index as well as
variance decomposition proportions.
Ronggui
2009/7/21 Stephan Kolassa :
> Hi Alex,
>
> I personally have had more success with the (more complicated) collinearity
> diagnostics proposed by Belsley, Kuh & Welsch in their book "Regress
On Aug 2, 2009, at 7:29 PM, Rolf Turner wrote:
On 3/08/2009, at 11:14 AM, David Winsemius wrote:
On Aug 2, 2009, at 7:02 PM, Noah Silverman wrote:
Hi,
It seems as if the problem was caused by an odd quirk of the "scale"
function.
Some of my data have NA entries.
So, I substitute 0 for
You may refer to mlogit for ordinary multinomial regression. As
far as I know, there are no functions for multilevel multinomial
regression.
Ronggui
2009/8/2 nikolay12 :
>
> Hi,
>
> I would like to apply the L-BFGS optimization algorithm to compute the MLE
> of a multilevel multinomial Logist
Briefly, the steps to get the data.frame are:
1. Read an image as a data.frame with readGDAL - this is the first loop.
2. The data frame is classified with predict(lda()).
3. It's converted back to a SPDF called pixID.
4. Pixels (regions) of a specific class of the SPDF pixID are dissolved into
single reg
Hi Thomas,
On Sun, Aug 2, 2009 at 11:02 AM, Thomas Steiner wrote:
> Is there a R-package with a function that returns me the timezone, if
> I hand over longitude and latitude?
> I know online services like
> http://ws.geonames.org/timezone?lat=-38.01&lng=147 and
> http://www.earthtools.org/webserv
On 3/08/2009, at 11:32 AM, Noah Silverman wrote:
Rolf,
Point taken.
However, some of the variables in the experiment simply don't have
data for some of the examples.
Since I'm training an SVM that will complain about an NA, how do
you suggest I handle this.
Imagine a model predicting
Should "dwtest" and "durbin.watson" be giving me the same DW statistic and
p-value for these two fits?
library(lmtest)
library(car)
X <- c(4.8509E-1,8.2667E-2,6.4010E-2,5.1188E-2,3.4492E-2,2.1660E-2,
3.2242E-3,1.8285E-3)
Y <- c(2720,1150,1010,790,482,358,78,35)
W <- 1/Y^2
fit <- lm(Y ~
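For an ordinary unweighted fit the two should return the same DW statistic; the p-values differ because dwtest computes an analytic p-value while durbin.watson bootstraps by default. A quick check on built-in data (assuming both packages are installed):

```r
library(lmtest)  # dwtest
library(car)     # durbin.watson
fit <- lm(dist ~ speed, data = cars)
dwtest(fit)          # DW statistic plus analytic p-value
durbin.watson(fit)   # same DW statistic; p-value from bootstrap by default
```

With weights, as in the fit above, discrepancies can appear if one function works with raw residuals and the other with weighted ones, so it is worth comparing residuals(fit) by hand.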
On Aug 2, 2009, at 6:02 PM, Waichler, Scott R wrote:
Martyn,
The maintainer of the RHEL RPMs no longer has an i386 machine
running EL4, and cross-building on an x86_64 machine did not
work, so I did not distribute them.
As noted in a previous thread, there is a project to port the
Fedora R RP
On Aug 2, 2009, at 7:25 PM, Noah Silverman wrote:
Just tried your suggestion.
rawdata[is.na(rawdata), ] <- 0
It FAILS with the following error:
Error in `[<-.data.frame`(`*tmp*`, is.na(rawdata), , value = 0) :
non-existent rows not allowed
Since we don't have the data, it is not your duty
Rolf,
Point taken.
However, some of the variables in the experiment simply don't have data
for some of the examples.
Since I'm training an SVM that will complain about an NA, how do you
suggest I handle this.
Imagine a model predicting student performance/grades/whatever.
One variable might
On 3/08/2009, at 11:14 AM, David Winsemius wrote:
On Aug 2, 2009, at 7:02 PM, Noah Silverman wrote:
Hi,
It seems as if the problem was caused by an odd quirk of the "scale"
function.
Some of my data have NA entries.
So, I substitute 0 for any NA with:
rawdata[is.na(rawdata)] <- 0
Perhap
Just tried your suggestion.
rawdata[is.na(rawdata), ] <- 0
It FAILS with the following error:
Error in `[<-.data.frame`(`*tmp*`, is.na(rawdata), , value = 0) :
non-existent rows not allowed
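The error arises because is.na(rawdata) is a logical matrix, and with the comma it is being used as a row index. Dropping the comma makes it a cell-wise selector, which is what was intended; a minimal reproduction:

```r
rawdata <- data.frame(a = c(1, NA, 3), b = c(NA, 5, 6))

# Logical-matrix indexing without the comma replaces individual cells
rawdata[is.na(rawdata)] <- 0
rawdata
#   a b
# 1 1 0
# 2 0 5
# 3 3 6

# rawdata[is.na(rawdata), ] <- 0  # error: matrix treated as a row selector
```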
__
R-help@r-project.org mailing list
https://stat.ethz.ch/m
Interesting,
For some of the test cases, we don't have data for a particular field.
We have a training set of 20,000 entries. For example, imagine the
column "Average age of children". If the person has no children, then
the data is "NA". However, I can't train an SVM with any NA data (at
l
On Aug 2, 2009, at 7:02 PM, Noah Silverman wrote:
Hi,
It seems as if the problem was caused by an odd quirk of the "scale"
function.
Some of my data have NA entries.
So, I substitute 0 for any NA with:
rawdata[is.na(rawdata)] <- 0
Perhaps this would have done what you intended:
rawdata[is
Liviu Andronic wrote:
> other missing dependencies), fetch the source archive [1] from CRAN,
I tried this already, but this source archive contains (besides some
C code for the starter) only one .jar file without any Java sources in
it, only .class files. I also tried to check out the project
Hi,
It seems as if the problem was caused by an odd quirk of the "scale"
function.
Some of my data have NA entries.
So, I substitute 0 for any NA with:
rawdata[is.na(rawdata)] <- 0
I then scale the data.
For some reason that I don't understand, I find some NA back in the data
after the scale
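One way scale() can hand back NaN after the substitution: if a column ends up constant (for instance all NAs replaced by 0), its standard deviation is 0 and the division yields NaN. A minimal reproduction:

```r
m <- cbind(x = c(NA, NA, NA), y = c(1, 2, 3))
m[is.na(m)] <- 0   # x is now the constant column 0, 0, 0
scale(m)           # x scales to NaN (0/0); y is fine
```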
Martyn,
> The maintainer of the RHEL RPMs no longer has an i386 machine
> running EL4, and cross-building on an x86_64 machine did not
> work, so I did not distribute them.
>
> As noted in a previous thread, there is a project to port the
> Fedora R RPMs to Enterprise Linux:
>
> On Thursday 2
Does this help at all?
> contrasts(A)
      two three
one     0     0
two     1     0
three   0     1
> contrasts(B)
    ii iii iv
i    0   0  0
ii   1   0  0
iii  0   1  0
iv   0   0  1
> contrasts(A:B)
one:ii one:iii one:iv two:i two:ii two:iii two:iv three:i
three:ii three:iii t
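With the A=one, B=iv cell empty, the full interaction model is rank-deficient by one, and lm() reports the aliased coefficient as NA; a sketch reproducing the setup (the seed is invented):

```r
set.seed(42)
y <- round(rnorm(n = 24, mean = 5, sd = 2), 2)
A <- gl(3, 2, 24, labels = c("one", "two", "three"))
B <- gl(4, 6, 24, labels = c("i", "ii", "iii", "iv"))
y[19:20] <- NA         # empties the A=one, B=iv cell
fit <- lm(y ~ A * B)   # na.omit drops the two incomplete rows
coef(fit)              # exactly one interaction coefficient comes back NA
```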
On Sunday 02 August 2009 02:34:43 pm Noah Silverman wrote:
> The column names have to be obfuscated, but here are 10 rows of the data.
>
> label c0 c1 c2 c3 c4 c5 c6 c7
> c8 c9 c10 c11 c12 c13
> c14 c15 c16 c17 c18
I am wondering how to interpret the parameter estimates that lm()
reports in this sort of situation:
y = round(rnorm(n=24,mean=5,sd=2),2)
A = gl(3,2,24,labels=c("one","two","three"))
B = gl(4,6,24,labels=c("i","ii","iii","iv"))
# Make both observations for A=1, B=4 missing
y[19] = NA
y[20] = NA
d
Hello,
Something you can do is save your strings in an external text file
(using cat, for instance). That way, you would not require much
memory while extracting your data.
Once you have extracted it, you can always have a look at your external
file to see if it is too big, what to do with it
Seems to read in fine; what errors were you getting?
> x <- read.table('/small.txt', sep='\t', header=TRUE)
> str(x)
'data.frame': 10 obs. of 7 variables:
$ Placename : Factor w/ 10
levels "Aankoms","Aapieshoek",..: 1 2 3 4 5 6 7 8 9 10
$ X_coord
Hi: John Fox's CAR book has some very nice examples of how the multinomial
likelihood is estimated, computationally speaking. But you mentioned
multilevel earlier, which sounds more complex?
On Aug 2, 2009, nikolay12 wrote:
Thanks a lot. The info about computing the gradient
You can use save/save.image to save the objects in your workspace that
you might need to recover from. I don't think setting environment
variable will carry over to the next execution of an R session. It is
probably best to create a parameter file that you can read in to
determine what to do next
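A sketch of that parameter-file idea using saveRDS/readRDS as a checkpoint (the file name and loop body are invented), so a restarted session picks up where the last one stopped:

```r
ckpt <- file.path(tempdir(), "loop_state.rds")   # invented checkpoint file

# Resume from the checkpoint if one exists, otherwise start fresh
state <- if (file.exists(ckpt)) readRDS(ckpt) else list(i = 0, results = list())

for (i in seq_len(5 - state$i) + state$i) {      # only the iterations left to do
  state$results[[i]] <- i^2                      # stand-in for the real work
  state$i <- i
  saveRDS(state, ckpt)                           # checkpoint after every step
}
unlist(state$results)
```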
Somehow, my data is still getting mangled.
Running the SVM gives me the following error:
"names" attribute[1994] must me the same length as the vector[1950]
Any thoughts?
-N
On 8/2/09 2:35 PM, (Ted Harding) wrote:
> On 02-Aug-09 21:10:12, Noah Silverman wrote:
>
>> Hi,
>> I am reading in
Is there a R-package with a function that returns me the timezone, if
I hand over longitude and latitude?
I know online services like
http://ws.geonames.org/timezone?lat=-38.01&lng=147 and
http://www.earthtools.org/webservices.htm#timezone and wonder if this
exists for R too.
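For completeness: nowadays the lutz package (my assumption, not something mentioned in the thread) does this lookup offline:

```r
library(lutz)  # assumed installed; not part of base R
tz_lookup_coords(lat = -38.01, lon = 147, method = "fast")
```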
Thanks for helping,
th
Ok, I feel pretty stupid... I've been trying to read a text file that contains
a heading in the first line into R but can't. All I need to do is make a
contour plot for a friend, but after weeks I feel like giving up. I've included
the first few lines of the file. Any help will be great
Thanks
Hannes http:
Thanks a lot. The info about computing the gradient will be helpful.
I admit that I am somewhat confused about the likelihood function itself. It
is often said that you need to set a reference category. However, I found
two different implementations in Matlab for which setting the reference
categ
The tooltip error is a known issue - old versions of GTK (or perhaps RGtk2)
did not have the function I am now using to install tooltips. You need to
follow Felix's advice and install a new version of GTK (nothing to do with
R), then install R - the order used to be important.
I would suggest star
On 02-Aug-09 21:10:12, Noah Silverman wrote:
> Hi,
> I am reading in a dataframe from a CSV file. It has 70 columns.
> I do not have any kind of unique "row id".
>
> rawdata <- read.table("r_work/train_data.csv", header=T, sep=",",
> na.strings=0)
>
> When training an svm,
The column names have to be obfuscated, but here are 10 rows of the data.
label c0 c1 c2 c3 c4 c5 c6 c7 c8
c9 c10 c11 c12 c13
c14 c15 c16 c17 c18 c19 c20 c21 c22 c23
c24 c25 c26 c27
Thank you very much.
The instructions you suggested allow the script itself to decide whether to
exit spontaneously.
What I am still missing is how to prevent the script from restarting from
scratch.
I'll try to explain my problem a little bit better.
Please, assume I have 3 huge data.frames cal
You need to post the first 10 lines of your data so that we can see
what it is doing. Most likely you have a format problem, comment
characters, or mismatched quotes.
On Sun, Aug 2, 2009 at 5:24 PM, Noah Silverman wrote:
> Jim,
>
> The "write.table" was simply a diagnostic step.
>
> My problem is
Jim,
The "write.table" was simply a diagnostic step.
My problem is that R is automatically adding row_names and then shifting
my column labels over. (The shifting creates a bunch of related problems.)
Thanks for the help.
-Noah
On 8/2/09 2:22 PM, jim holtman wrote:
> try 'row.names=FALSE' i
How are you creating the dataframes? You did not provide an example
of the code. Can you use 'lapply' instead of a 'for' loop and then
use 'do.call(rbind, lappy_result)' to create your dataframe? Adding
each time through the loop can get resource consuming if the
dataframes are large, but you ga
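The lapply/do.call pattern the paragraph describes, in miniature:

```r
# Growing a data.frame inside a loop copies it every iteration;
# building the pieces with lapply and binding once is much cheaper.
pieces <- lapply(1:3, function(i) data.frame(id = i, value = i * 10))
result <- do.call(rbind, pieces)
result
#   id value
# 1  1    10
# 2  2    20
# 3  3    30
```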
try 'row.names=FALSE' in the write.table.
On Sun, Aug 2, 2009 at 5:10 PM, Noah Silverman wrote:
> Hi,
>
> I am reading in a dataframe from a CSV file. It has 70 columns. I do
> not have any kind of unique "row id".
>
>
> rawdata <- read.table("r_work/train_data.csv", header=T, sep=",",
> na.stri
Hi,
I am reading in a dataframe from a CSV file. It has 70 columns. I do
not have any kind of unique "row id".
rawdata <- read.table("r_work/train_data.csv", header=T, sep=",",
na.strings=0)
When training an svm, I keep getting an error
So, as an experiment, I wrote the data back out to a
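A likely culprit: write.table writes row names by default but gives them no header field, so the header row has one fewer entry and everything appears shifted. A minimal demonstration with row.names = FALSE:

```r
d <- data.frame(a = 1:2, b = 3:4)
f <- tempfile()
write.table(d, f, sep = ",", row.names = FALSE)
readLines(f)   # "\"a\",\"b\"" "1,3" "2,4" -- header and rows line up
```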
I'm not sure if there are better methods to create objects such as
data.frames other than the rbind function.
I usually combine the data.frames created at each loop iteration with rbind(),
especially when I don't know the dimension of the data.frame that will be
created.
Binding the new to an existing data.
I want to print characters from the symbol font (or perhaps even Wingdings)
in an rgl 3d plot, but I am having no luck. So, what do I have to do in
order to get this snippet to print out a character from the symbol font?
library(rgl)
open3d()
text3d(1,1,1,"a",adj=c(0.5,0.5),cex=10,family="symbol"
You can use 'try' to catch errors and take corrective action.
'memory.size' and 'proc.time' will give you information on the memory
usage of your application and the CPU time that has been used.
On Sun, Aug 2, 2009 at 2:02 PM, wrote:
> I am submitting this problem to the R forum , rather than th
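A minimal shape for that advice (the failing step is simulated):

```r
t0 <- proc.time()
res <- try(stop("simulated database timeout"), silent = TRUE)
if (inherits(res, "try-error")) {
  message("step failed, taking corrective action")  # e.g. retry or skip
}
(proc.time() - t0)["elapsed"]   # wall-clock seconds used so far
```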
Hello,
On 8/2/09, Bernd Kreuss wrote:
> I would even make the changes on my own (i probably would already have
> done it) and supply patches if i only could find any hint on how to
> build JGR from sources. (where to place the source files, what command
> to start the build process and where
I think he may also need to add the header=TRUE argument:
tdat <- read.csv("http://www.nabble.com/file/p24777697/small.txt",
header=TRUE, sep="\t")
Note: read.table with those arguments should have worked as well.
And then use names(tdat) <- c()
Perhaps along these lines:
tdnames <- names(
Hannes,
>> been trying to read a text file that contains heading in the first line
>> in to R but cant.
You want the following:
##
TDat <- read.csv("small.txt", sep="\t")
TDat
str(TDat)
See ?read.csv
Regards, Mark.
hannesPretorius wrote:
>
> Ok i feel pretty stupid.. been trying to read a
I am submitting this problem to the R forum , rather than the Bioconductor
forum, because its nature is closer to programming style than any
Bioinformatic contents.
I have implemented an R script that extracts many strings by querying 3
Bioinformatic databases in the same loop cycle. Ideal
I'm still not sure that I understand what you are looking for. However
building on David Winsemius does this give you what you want?
#==
pstate<-read.table(textConnection("Changes State1 State2 State3 State4
a Pa1 Pa2 Pa3 Pa
Hi Barry,
This is great! Thanks for doing this.
>> Maybe I should get a life.
Please don't!
Michael
--
Michael Denslow
Graduate Student
I.W. Carpenter Jr. Herbarium [BOON]
Department of Biology
Appalachian State University
Boone, North Carolina U.S.A.
-- AND --
Communications Manager
South
Hello,
Isn't it totally counter-intuitive that if you penalize the error less,
the tree finds it?
See:
experience <- as.factor(c(rep("good",90), rep("bad",10)))
cancel <- as.factor(c(rep("no",85), rep("yes",5),
rep("no",5),rep("yes",5)))
foo <- function( i ){
tmp <- rpart(cancel ~ experience
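The penalty in rpart enters through the loss matrix in parms (rows index the true class, columns the predicted class, in the order of levels(cancel)); a hedged sketch of how the truncated snippet above presumably continues:

```r
library(rpart)
experience <- as.factor(c(rep("good", 90), rep("bad", 10)))
cancel <- as.factor(c(rep("no", 85), rep("yes", 5),
                      rep("no", 5), rep("yes", 5)))
d <- data.frame(cancel, experience)

# Make a missed "yes" five times as costly as a false alarm
L <- matrix(c(0, 5,    # column "no":  true no = 0, true yes = 5
              1, 0),   # column "yes": true no = 1, true yes = 0
            nrow = 2)
fit <- rpart(cancel ~ experience, data = d, parms = list(loss = L))
fit   # with a large enough penalty the tree may keep the split
```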
You might want to check how large the dataframe you are creating is
after 1000 images. Normally a single object should not be larger than
30% of the available physical memory so that you can make copies as
you are processing. Do a 'gc()' periodically while processing to see
how memory is growing.
I would like to do so as well, but I have faced some problems too..
livia wrote:
>
> Hi everyone, I would like to write VBA macros for accessing R and it is my
> first attempt. I really could use some help here.
>
> I am trying to use the following code to read data from Access. The R code
package of boolean has been removed from CRAN. Anyone know why?
Thanks.
--
HUANG Ronggui, Wincent
PhD Candidate
Dept of Public and Social Administration
City University of Hong Kong
Home page: http://asrr.r-forge.r-project.org/rghuang.html
Hi,
Providing the gradient function is generally a good idea in optimization;
however, it is not necessary. Almost all optimization routines will compute
this using a simple finite-difference approximation, if they are not
user-specified. If your function is very complicated, then you are more
Hi,
You can use `model.matrix' to create the apropriate design matrix for factor
variables.
set.seed(10)
ftime <- rexp(200)
fstatus <- sample(0:2,200,replace=TRUE)
gg <- factor(sample(1:3,200,replace=TRUE),1:3, c('a','b','c'))
cov <- matrix(runif(600),nrow=200)
dimnames(cov)[[2]] <- c('x1',
You've already been pointed to options(digits=);
here's another way: since your data appear to be limited to 2 decimals,
why not select your noise from UNIF(0, 0.001)?
More importantly, are you really trying to do correlation between the
values you're showing us? What do you hope to learn from su
Bernd Kreuss wrote:
> I would like to add a few points to this list [...]
I would even make the changes on my own (i probably would already have
done it) and supply patches if i only could find any hint on how to
build JGR from sources. (where to place the source files, what command
to start the
> This issue was addressed in a recent discussion [1].
> Liviu
>
> [1]
> http://mailman.rz.uni-augsburg.de/pipermail/stats-rosuda-devel/2009q2/001106.html
I would like to add a few points to this list, some of which I personally
find even more annoying than some of those mentioned there:
- add th
Thanks for your help!
2009/8/2 David Winsemius
> Here's what I would do. Let's assume that you are presenting the results of
> the example on the cph help page. I agree with you that the results should
> be presented on the hazard ratio scale. The Design package provides
> appropriate plotting
Here's what I would do. Let's assume that you are presenting the
results of the example on the cph help page. I agree with you that
the results should be presented on the hazard ratio scale. The Design
package provides appropriate plotting tools for creation of
publication quality graphics
I guess I will broaden the question to include any Earth Geomagnetic Model,
e.g. DoD World Magnetic Model (WMM). Does there exist an R package that
includes any geomagnetic model?
Here is an example of a couple of others provided by NASA, e.g.
http://www.ngdc.noaa.gov/geomag/models.shtml
Un
I got this error message.
Error in .local(.Object, ...) :
GDAL Error 2: CPLRealloc(): Out of memory allocating 16 bytes
I was trying to read 1388 jpg images (993x993) with rgdal in a loop, to store
the data in a data.frame. It fails to read the 1338th image.
Isn't R recycling the object?
I'
Or this;
remember <- function(expr, value, ok, visible) {
assign( ".Last.expr", expr, .GlobalEnv )
invisible( TRUE )
}
addTaskCallback(remember)
> x <- rnorm(
+ 10 )
> .Last.expr
x <- rnorm(10)
Romain
On 08/02/2009 01:11 PM, Gabor Grothendieck wrote:
Try this:
x<- 4
x*x+3
[1] 19
save
Try this:
> x <- 4
> x*x+3
[1] 19
> savehistory(".Rhistory")
> c(parse(text = tail(readLines(".Rhistory"), 2)[1]))
expression(x * x + 3)
On Sun, Aug 2, 2009 at 5:02 AM, Daniel Haase wrote:
> Hi,
>
> I am looking for a way to find out the last expression that was entered by
> the user, similar to
Hello,
to call each variable separately, you can coerce your sp object back to a
dataframe. The code below should to the job:
DF.utm <- as.data.frame(SP.utm)
long.diff<-diff(DF.utm$Long)
Nicolas
Tim Clark wrote:
>
>
> Dear List,
>
> I am trying to determine the speed an animal is traveling
Hi,
I am looking for a way to find out the last expression that was
entered by the user, similar to ".Last.value", but for the unevaluated
expression instead of the evaluated one.
Example:
x <- 4
x*x + 3
[1] 19
.Last.value # that's the evaluated last expression
[1] 19
# but I am looking
In practice, it is not easy to make such a decision. One example is the
size of social ties in social-network studies. It is very common to use
OLS even though it is a count variable rather than normal. I think AIC is
suggestive as well.
Ronggui
2009/8/2 Alain Zuur :
>
>
>
> Mark Na wrote:
>>
>> Dear R-helper
I guess you're trying to make a Windows computer work something like a
Unix computer when using R.
I tried for many weeks to get cygwin to work, but all sorts of
details never worked anywhere nearly as simply as they do under Linux.
What you might find useful to know is that the Windows ver
Hello,
I have a question regarding competing-risks regression using the cmprsk package
(function crr()). I am using R 2.9.1. How can I assess the effect of a
qualitative predictor (gg) with more than two categories (a, b, c)? Category c is
the reference category. See the results above; gg is considered
Does options()$digits tell you anything useful?
HTH
On Sat, 01-Aug-2009 at 02:39PM -0400, Manisha Brahmachary wrote:
|> Hello,
|>
|>
|>
|> I am trying to do a spearman correlation. My data has tied values. To
|> overcome this issue, I am adding some random noise (values) to my original
|> d
Hi all,
I feel, with the great advice I am getting from Felix & Graham, that I am
finally making progress, and I feel that finally I have identified the REAL
PROBLEM. Of course I am unsure how I will solve it, but that is another story.
The "trace" follows after following Graham's latest advice
Error in
Hi,
I would like to apply the L-BFGS optimization algorithm to compute the MLE
of a multilevel multinomial Logistic Regression.
The likelihood formula for this model has as one of the summands the formula
for computing the likelihood of an ordinary (single-level) multinomial logit
regression. S
Hello,
I am trying to do a spearman correlation. My data has tied values. To
overcome this issue, I am adding some random noise (values) to my original
data. However when I add the random noise to the data, the final matrix does
not show the new values. I guess the reason being that the noise I
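Note that Spearman's rho already copes with ties through average ranks, so the noise may be unnecessary; a quick check:

```r
x <- c(1, 2, 2, 3, 4)   # tied values
y <- c(2, 1, 3, 3, 5)
cor(x, y, method = "spearman")         # defined despite the ties
# cor.test(x, y, method = "spearman")  # warns about ties but still reports rho
```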