[R] Creating R file

2016-05-28 Thread jay28 via R-help
 Hi. I am new to R and confused by some conflicting and contradictory 
information about it. Where and how do I create a numeric data file with .csv 
extension for use in R? So numbers meaning numeric data will be separated by 
commas and will consist of one line of numbers randomly chosen from 1 to 40. 
Thanks to all who reply. jay28.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Creating R file

2016-05-28 Thread David Winsemius

> On May 27, 2016, at 4:40 PM, jay28 via R-help  wrote:
> 
>  Hi. I am new to R and confused by some conflicting and contradictory 
> information about it. Where and how do I create a numeric data file with .csv 
> extension for use in R? So numbers meaning numeric data will be separated by 
> commas and will consist of one line of numbers randomly chosen from 1 to 40. 
> Thanks to all who reply. jay28.
> 

I don’t understand how the calls to the help function would not answer both 
aspects:

?write.csv
?sample

— 
David.
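
A minimal sketch of how the two help pages fit together, assuming the goal is a single comma-separated line of ten numbers. The count of ten, the seed, and the temporary file name are assumptions for illustration and are not part of the original question.

```r
set.seed(1)                                      # assumption: fixed seed for repeatability
nums <- sample(1:40, size = 10, replace = TRUE)  # ten random integers from 1 to 40

out_file <- file.path(tempdir(), "numbers.csv")  # hypothetical file name

# write.csv() always writes a header row, so write.table() (documented on the
# same help page) is used here to get exactly one line of numbers
write.table(t(nums), file = out_file, sep = ",",
            row.names = FALSE, col.names = FALSE)

readLines(out_file)                              # a single comma-separated line
back <- scan(out_file, sep = ",", quiet = TRUE)  # read the numbers back as a vector
```

Transposing with t() turns the vector into a one-row matrix, which is what makes write.table() emit one line rather than one number per line.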



Re: [R] Creating R file

2016-05-28 Thread Hong Yu

When you need numerical data as input to an R program, you can create the 
.csv file in Excel.  When you need to output calculation results, you can write 
a .csv file from within R.

Yes, the most common .csv file format is comma-separated numerical values.  You 
can create a .csv file in Excel and view its contents with a text editor.



From: jay28 via R-help 
Sent: Saturday, May 28, 2016 2:59 PM
To: r-help@r-project.org 
Subject: [R] Creating R file

Hi. I am new to R and confused by some conflicting and contradictory 
information about it. Where and how do I create a numeric data file with .csv 
extension for use in R? So numbers meaning numeric data will be separated by 
commas and will consist of one line of numbers randomly chosen from 1 to 40. 
Thanks to all who reply. jay28.


Re: [R] Creating R file

2016-05-28 Thread Jeff Newmiller
This sounds like homework, which has been determined to be off-topic on this 
help list. Please read the Posting Guide before posting. 

That said, it would appear the OP may need to read about data frames in, say, 
the Introduction to R... and perhaps about matrices... and using the as.* 
functions to convert between them... in addition to the below-mentioned help 
pages. 
-- 
Sent from my phone. Please excuse my brevity.
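
A short sketch of the conversions mentioned in this reply, assuming a small matrix of random integers as the starting point; the object names and the 3 x 4 shape are hypothetical.

```r
set.seed(2)                                   # assumption: fixed seed for repeatability
m <- matrix(sample(1:40, 12, replace = TRUE), nrow = 3)

df     <- as.data.frame(m)   # matrix -> data frame, columns named V1..V4
m_back <- as.matrix(df)      # data frame -> matrix (keeps the column names)
v      <- as.vector(m)       # matrix -> plain vector, column by column

str(df)
```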

On May 28, 2016 12:22:44 AM PDT, David Winsemius  wrote:
>
>> On May 27, 2016, at 4:40 PM, jay28 via R-help 
>wrote:
>> 
>>  Hi. I am new to R and confused by some conflicting and contradictory
>information about it. Where and how do I create a numeric data file
>with .csv extension for use in R? So numbers meaning numeric data will
>be separated by commas and will consist of one line of numbers randomly
>chosen from 1 to 40. Thanks to all who reply. jay28.
>> 
>
>I don’t understand how the calls to the help function would not answer
>both aspects:
>
>?write.csv
>?sample
>
>— 
>David.
>
>

[R] code to provoke a crash running rterm.exe on windows

2016-05-28 Thread Anthony Damico
hi, here's a minimal reproducible example that crashes my R 3.3.0 console
on a powerful windows server.  below the example, i've put the error (not
crash) that occurs on R 3.2.3.

should this be reported to http://bugs.r-project.org/ or am i doing
something silly?  thanx





# C:\Users\AnthonyD>"c:\Program Files\R\R-3.3.0\bin\x64\Rterm.exe"

# R version 3.3.0 (2016-05-03) -- "Supposedly Educational"
# Copyright (C) 2016 The R Foundation for Statistical Computing
# Platform: x86_64-w64-mingw32/x64 (64-bit)

# R is free software and comes with ABSOLUTELY NO WARRANTY.
# You are welcome to redistribute it under certain conditions.
# Type 'license()' or 'licence()' for distribution details.

  # Natural language support but running in an English locale

# R is a collaborative project with many contributors.
# Type 'contributors()' for more information and
# 'citation()' on how to cite R or R packages in publications.

# Type 'demo()' for some demos, 'help()' for on-line help, or
# 'help.start()' for an HTML browser interface to help.
# Type 'q()' to quit R.

sessionInfo()
# R version 3.3.0 (2016-05-03)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows Server 2012 R2 x64 (build 9600)

# locale:
# [1] LC_COLLATE=English_United States.1252
# [2] LC_CTYPE=English_United States.1252
# [3] LC_MONETARY=English_United States.1252
# [4] LC_NUMERIC=C
# [5] LC_TIME=English_United States.1252

# attached base packages:
# [1] stats graphics  grDevices utils datasets  methods   base

memory.limit()
# [1] 229247

# works fine
grpsize = ceiling(10^5/26)

# simple data.frame
my_df <-
  data.frame(
  x=rep(LETTERS,each=grpsize),
  v=runif(grpsize*26),
  stringsAsFactors=FALSE
  )

# mis-match the number of elements
my_df <-
  data.frame(
  x=rep(LETTERS,each=26*grpsize),
  v=runif(grpsize*26),
  stringsAsFactors=FALSE
  )

# make this much bigger
grpsize = ceiling(10^8/26)

# simple data.frame
my_df <-
  data.frame(
  x=rep(LETTERS,each=grpsize),
  v=runif(grpsize*26),
  stringsAsFactors=FALSE
  )

# mis-match the number of elements
my_df <-
  data.frame(
  x=rep(LETTERS,each=26*grpsize),
  v=runif(grpsize*26),
  stringsAsFactors=FALSE
  )

# CONSOLE CRASH WITHOUT EXPLANATION
C:\Users\AnthonyD>



# # # # # running the exact same commands on r version 3.2.3 on windows:

C:\Users\AnthonyD>"C:\Program Files\R\R-3.2.3\bin\x64\Rterm.exe"

memory.limit()
# [1] 229247

grpsize = ceiling(10^8/26)

# mis-matched number of elements
my_df <-
  data.frame(
  x=rep(LETTERS,each=26*grpsize),
  v=runif(grpsize*26),
  stringsAsFactors=FALSE
  )
# Error in if (mirn && nrows[i] > 0L) { :
  # missing value where TRUE/FALSE needed
# In addition: Warning message:
# In as.data.frame.vector(x, ..., nm = nm) :
  # NAs introduced by coercion to integer range

# # # # but console does not crash # # # #



Re: [R] colored table

2016-05-28 Thread Bert Gunter
Hi Naresh:

I shall be brief, as discussions of what statistical/graphical techniques
to use are largely OT.

IMO, this is a bad idea. I think the table entries will be very difficult
to read and grok. If the tables are unrelated, use 2 tables. If you think
they might be related, plot the entries of one versus the other in a
scatter plot. Another possibility would be to plot the values as separate bars
in a trellis plot with your table x and y categorical values as conditioning
factors. Judging and comparing bar lengths is much more accurate than
trying to quantify shading density.

Cheers,
Bert
On Sat, May 28, 2016 at 9:12 AM Naresh Gurbuxani <
naresh_gurbux...@hotmail.com> wrote:

> I want to print a table where table elements are colored according to the
> frequency of the bin.  For example, consider below table.
>
> Function values that I would like to print in the table
>
>              x.eq.minus1  x.eq.zero  x.eq.plus1
> y.eq.minus1          -20         10          -5
> y.eq.zero            -10          6          22
> y.eq.plus1            -8         10         -14
>
>
> Frequency table to color the above table
>
>              x.eq.minus1  x.eq.zero  x.eq.plus1
> y.eq.minus1         0.05       0.15         0.1
> y.eq.zero           0.07        0.3        0.08
> y.eq.plus1          0.05       0.15        0.05
>
>
> In the resulting table, the element for (x = 0, y = 0) will be 6.  This
> will be printed with a dark color background.  The element for (x = -1, y =
> -1) will be -20.  This will be printed with a light color background.  And
> so on.
>
> Thanks for your help,
> Naresh
>


Re: [R] colored table

2016-05-28 Thread Jeff Newmiller
If you don't mix the text and color, heatmaps are pretty standard presentation 
techniques. 
-- 
Sent from my phone. Please excuse my brevity.
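
A minimal base-graphics sketch of the heatmap idea, using the frequency table from the original question and colour only, with no text in the cells. The grey palette, the output file name, and the choice of a PDF device are assumptions.

```r
freq <- matrix(c(0.05, 0.15, 0.10,
                 0.07, 0.30, 0.08,
                 0.05, 0.15, 0.05),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("y.eq.minus1", "y.eq.zero", "y.eq.plus1"),
                               c("x.eq.minus1", "x.eq.zero", "x.eq.plus1")))

out_pdf <- file.path(tempdir(), "frequency_heatmap.pdf")  # hypothetical file name
pdf(out_pdf)
nr <- nrow(freq)
nc <- ncol(freq)
# image() draws z[i, j] at (x[i], y[j]): transpose so the x categories run
# left to right, and reverse the rows so the first table row ends up on top
image(x = seq_len(nc), y = seq_len(nr), z = t(freq[nr:1, ]),
      col = gray(seq(1, 0.3, length.out = 20)),  # higher frequency = darker
      axes = FALSE, xlab = "", ylab = "")
axis(1, at = seq_len(nc), labels = colnames(freq), tick = FALSE)
axis(2, at = seq_len(nr), labels = rev(rownames(freq)), tick = FALSE, las = 1)
box()
invisible(dev.off())
```

The function values could in principle be overlaid with text(), but that mixes text and colour, which is the combination this thread advises against.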



Re: [R] code to provoke a crash running rterm.exe on windows

2016-05-28 Thread Ben Bolker
Anthony Damico  gmail.com> writes:

> 
> hi, here's a minimal reproducible example that crashes my R 3.3.0 console
> on a powerful windows server.  below the example, i've put the error (not
> crash) that occurs on R 3.2.3.
> 
> should this be reported to http://bugs.r-project.org/ or am i doing
> something silly?  thanx


From the R FAQ (9.1):

If R executes an illegal instruction, or dies with an operating system
error message that indicates a problem in the program (as opposed to
something like “disk full”), then it is certainly a bug.

  So you could submit a bug report, *or* open a discussion on
r-de...@r-project.org  (which I'd have said was a more appropriate
venue for this question in any case) ...

  Ben Bolker

Re: [R] colored table

2016-05-28 Thread Duncan Murdoch

On 28/05/2016 9:10 AM, Naresh Gurbuxani wrote:
> I want to print a table where table elements are colored according to the
> frequency of the bin.  For example, consider below table.

How to do this depends on how you want to print the result.  Are you 
looking for a LaTeX table, HTML, Word, or what?

Duncan Murdoch

> Function values that I would like to print in the table
>
>              x.eq.minus1  x.eq.zero  x.eq.plus1
> y.eq.minus1          -20         10          -5
> y.eq.zero            -10          6          22
> y.eq.plus1            -8         10         -14
>
> Frequency table to color the above table
>
>              x.eq.minus1  x.eq.zero  x.eq.plus1
> y.eq.minus1         0.05       0.15         0.1
> y.eq.zero           0.07        0.3        0.08
> y.eq.plus1          0.05       0.15        0.05
>
> In the resulting table, the element for (x = 0, y = 0) will be 6.  This will be
> printed with a dark color background.  The element for (x = -1, y = -1) will be
> -20.  This will be printed with a light color background.  And so on.
>
> Thanks for your help,
> Naresh



Re: [R] asking for large memory - crash running rterm.exe on windows

2016-05-28 Thread Martin Maechler
> Ben Bolker 
> on Sat, 28 May 2016 15:42:45 + writes:

> Anthony Damico  gmail.com> writes:
>> 
>> hi, here's a minimal reproducible example that crashes my
>> R 3.3.0 console on a powerful windows server.  below the
>> example, i've put the error (not crash) that occurs on R
>> 3.2.3.
>> 
>> should this be reported to http://bugs.r-project.org/ or
>> am i doing something silly?  thanx


> From the R FAQ (9.1):

> If R executes an illegal instruction, or dies with an
> operating system error message that indicates a problem in
> the program (as opposed to something like “disk full”),
> then it is certainly a bug.

>   So you could submit a bug report, *or* open a discussion
> on r-de...@r-project.org (which I'd have said was a more
> appropriate venue for this question in any case) ...

Indeed.
In this case, this is a known problem -- not just of R, but of
many programs that you can run --
You are requesting (much) more memory than your computer has
RAM, and in this situation -- depending on the OS --
your computer will kill R (what you saw) or it will become
very slow trying to shove all memory to R and start swapping
(out to disk other running / sleeping processes on the
computer).

Both are very unpleasant...
But it is you as R user who asked R to allocate an object of
about 41.6 Gigabytes (26 * 1.6, see below).

As Ben mentioned, this may be worth a discussion on R-devel ...
or you could rather follow up the existing thread opened by Marius
Hofert  three weeks ago, with subject
 "[Rd] R process killed when allocating too large matrix (Mac OS X)"

  -->  https://stat.ethz.ch/pipermail/r-devel/2016-May/072648.html
 
His simple command to "crash R" was

   matrix(0, 1e5, 1e5)

which for some of us gives an error such as

> x <- matrix(0, 1e5,1e5)
Error: cannot allocate vector of size 74.5 Gb

but for others it had the same effect as your example.
BTW: I repeat it here in a functionalized form with added
 comments which makes apparent what's going on:


## Make simple data.frame
mkDf <- function(grpsize, wrongSize = FALSE) {
    ne <- (if(wrongSize) 26 else 1) * grpsize
    data.frame(x = rep(LETTERS, each = ne),
               v = runif(grpsize*26), stringsAsFactors=FALSE)
}

g1 <- ceiling(10^5/26)
d1 <- mkDf(g1) # works fine
str(d1)
## 'data.frame': 100022 obs. of  2 variables:

dP <- mkDf(g1, wrongSize=TRUE) # mis-matching the number of elements

str(dP) # is 26 times larger
## 'data.frame': 2600572 obs. of  2 variables: .


# make this much bigger
gLarge <- ceiling(10^8/26)

dL <- mkDf(gLarge) # works "fine" .. (well, takes time!!)
str(dL)
## 'data.frame': 100000004 obs. of  2 variables:
as.numeric(print(object.size(dL)) / 1e6)
## 1600002088 bytes
## [1] 1600.002  Mega  i.e.,  1.6 GBytes

## Well, this will be 26 times larger than already large ==> your R may crash *OR*
## your computer may basically slow down to a crawl, when R requests all its memory...
if(FALSE) ## ==> do *NOT* evaluate the following lightly !!
dLL <- mkDf(gLarge, wrongSize=TRUE)
# CONSOLE CRASH WITHOUT EXPLANATION
# C:\Users\AnthonyD>
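
The sizes quoted in this message can be estimated before anything is allocated. This is a back-of-the-envelope sketch, assuming 64-bit R, where a double takes 8 bytes and each element of a character vector is an 8-byte pointer to a cached string; the helper name df_gb is hypothetical.

```r
bytes_per_double <- 8

# matrix(0, 1e5, 1e5): 1e10 doubles, reported by R in units of 2^30 bytes
matrix_gib <- 1e5 * 1e5 * bytes_per_double / 2^30

g_large <- ceiling(10^8 / 26)
n_ok    <- 26 * g_large   # rows in the "simple" data frame
n_bad   <- 26 * n_ok      # rows when the sizes are mis-matched

# one double plus one string pointer per row (assumption stated above)
df_gb <- function(n) n * 16 / 1e9

round(matrix_gib, 1)   # the "74.5 Gb" in the error message
df_gb(n_ok)            # about 1.6
df_gb(n_bad)           # about 41.6

# the mis-matched row count does not fit into a 32-bit integer, which is
# presumably what the "coercion to integer range" warning in R 3.2.3 refers to
n_bad > .Machine$integer.max
```

Running an estimate like this first is much cheaper than finding out from a killed process.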


Re: [R] Getting Rid of NaN in ts Object

2016-05-28 Thread Martin Maechler

> Perfect!
> Exactly what I was looking for.
> Thanks

> Lorenzo

> On Fri, May 27, 2016 at 01:50:03PM +0200, Christian Brandstätter wrote:
>> Hi Lorenzo,
>> 
>> Try:
>> 
>> tt[is.nan(tt)] <- NA
>> tt <- na.omit(tt)
>> 

or simply  na.omit(tt)

as it omits both NA and NaN (and *does* keep the 'ts' properties
as you have noted).

Martin
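
A small sketch of the na.omit() behaviour described here, using a made-up monthly series with NaN values at both ends; the data and object names are assumptions.

```r
# hypothetical monthly series starting January 2015, NaN at both ends
tt <- ts(c(NaN, NaN, 3, 5, 4, 6, NaN), start = c(2015, 1), frequency = 12)

tt2 <- na.omit(tt)   # drops leading and trailing NA/NaN, keeps the 'ts' class

is.ts(tt2)
start(tt2)           # the series now starts in March 2015
frequency(tt2)

# caveat: for a 'ts' object, missing values in the middle of the series are
# not dropped; na.omit() signals an error instead
internal <- try(na.omit(ts(c(1, NaN, 3))), silent = TRUE)
inherits(internal, "try-error")
```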

>> Best,
>> 
>> Christian
>>



[R] sandwich package: HAC estimators

2016-05-28 Thread T.Riedle
Dear R users,

I am running a logistic regression using the rms package and the code looks as 
follows:

crisis_bubble4<-lrm(stock.market.crash~crash.MA+bubble.MA+MP.MA+UTS.MA+UPR.MA+PPI.MA+RV.MA,data=Data_logitregression_movingaverage)

Now, I would like to calculate HAC robust standard errors using the sandwich 
package assuming the NeweyWest estimator which looks as follows:


coeftest(crisis_bubble4,df=Inf,vcov=NeweyWest)

Error in match.arg(type) :

  'arg' should be one of "li.shepherd", "ordinary", "score", "score.binary", 
"pearson", "deviance", "pseudo.dep", "partial", "dfbeta", "dfbetas", "dffit", 
"dffits", "hat", "gof", "lp1"

As you can see, it doesn't work. Therefore, I did the same using the glm() 
instead of lrm():


crisis_bubble4<-glm(stock.market.crash~crash.MA+bubble.MA+MP.MA+UTS.MA+UPR.MA+PPI.MA+RV.MA,family=binomial("logit"),data=Data_logitregression_movingaverage)



If I use the coeftest() function, I get the following results.

coeftest(crisis_bubble4,df=Inf,vcov=NeweyWest)

z test of coefficients:

              Estimate Std. Error z value Pr(>|z|)
(Intercept)   -5.26088    5.01706 -1.0486  0.29436
crash.MA       0.49219    2.41688  0.2036  0.83863
bubble.MA     12.12868    5.85228  2.0725  0.03822 *
MP.MA        -20.07238  499.37589 -0.0402  0.96794
UTS.MA       -58.18142   77.08409 -0.7548  0.45038
UPR.MA      -337.57985  395.35639 -0.8539  0.39318
PPI.MA       729.37693  358.60868  2.0339  0.04196 *
RV.MA        116.00106   79.52421  1.4587  0.14465
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


I am unsure whether the coeftest from the lmtest package is appropriate in case 
of a logistic regression. Is there another function for logistic regressions? 
Furthermore, I would like to present the regression coefficients, the 
F-statistic and the HAC estimators in one single table. How can I do that?

I thought it would be useful to incorporate the HAC consistent covariance 
matrix into the logistic regression directly and generate an output of 
coefficients and the corresponding standard errors. Is there such a function in 
R?

Thanks for your support.

Kind regards



Re: [R] read.fortran format

2016-05-28 Thread William Dunlap via R-help
>The bit about the decimal leading to a shift in the decimal place
>pointed out by Bill is a bit obscure, though it too is mentioned in the
>help file.

I don't think that is how real Fortran formats work.  My memory is that
you only put a dot in the format if there were no dots in your data file
(so you could avoid wasting one of the 80 columns on the punched card
on a dot).

With current gfortran, putting a dot in the data overrides the decimal
place specification in the format.

% cat format.f
  double precision d1, d2
  integer*4 i1
  integer*4 i

 10   format(I2F7.1F7.6)
  do 20 i=1,100
read(*,10) i1, d1, d2
write(*,*) i1, d1, d2
 20   continue
  stop
  end

% gfortran format.f -o format.exe
% ./format.exe
1234012340987654
  12   340123.402   0.987654003
123.0123409.7e20
  12   3.012340009.7000E+020
123.012340976e20
  12   3.0123400097600.
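
For comparison, the read.fortran() behaviour discussed in this thread can be reproduced in R with the sample line quoted further down. This is a sketch only: the helper read_with is hypothetical, and the "A8" variant mirrors the read-as-text-then-convert workaround reported in the thread, applied to the last field.

```r
txt <- "1950. .614350 .026834 .087227 .006821 .180001 4.56E-2"

# hypothetical helper: read the sample line with a given format vector
read_with <- function(format) {
  con <- textConnection(txt)
  on.exit(close(con))
  read.fortran(con, format)
}

no_dot   <- read_with(c("F5", "6F8"))        # values are taken as written
with_dot <- read_with(c("F5", "6F8.3"))      # values are scaled by 10^-3
as_text  <- read_with(c("F5", "5F8", "A8"))  # last field kept as text

last_val <- as.numeric(as_text$V7)           # convert the text field afterwards

print(no_dot, digits = 10)
print(with_dot, digits = 10)
```

Unlike the gfortran run above, a dot in the data does not override the decimal specification in read.fortran(), so leaving the ".d" part out of the format is the safer choice when the file already contains decimal points.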






Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, May 27, 2016 at 10:14 PM, Jeff Newmiller 
wrote:

> Your rather sarcastic comment about knowledge given by John's mother seems
> inappropriate, given that he told you where his information came from and
> it is the first place you should have looked.
>
> The bit about the decimal leading to a shift in the decimal place pointed
> out by Bill is a bit obscure, though it too is mentioned in the help file.
>
> The "D" format is broken though... the regex template in the processFormat
> embedded function is missing that option. Bill's use of 'F' instead with no
> decimal is the easy workaround, but that is a bug.
> --
> Sent from my phone. Please excuse my brevity.
>
> On May 27, 2016 8:30:49 PM PDT, Steven Yen  wrote:
> >That's great, John. Your mother told you when you were born? How am I
> >supposed to know? Thank you both.
> >The following format statement did it!! I just change F5.3 to F5, 5F8.4
> >
> >to 5F8. I also change 2E15.9 to 2A9, and then use the following
> >as.numeric to convert the alphanumerical to numerical. Thank you!!!
> >
> >mydata<-read.fortran("GROUPC.DAT",
> > list(c("1X","F6","5F8"),
> >  c("1X","5F8","F10"),
> >  c("1X","2F6","3A15","F8","F5","F5"),
> >  c("1X","F7","2A15","F9","F5")),
> >col.names=c("year","w1","w2","w3","w4","w5","v1","v2","v3",
> >"v4","v5","m","chyes","chno","ec","vc","cvc",
> >"pop","ahs","fah","tnh","eq","vq","ups","zm1"))
> >mydata$ec <-as.numeric(mydata$ec)
> >
> >On 5/27/2016 6:33 PM, William Dunlap wrote:
> >> It has been a while since I used Fortran formatted input, but the
> >> following,
> >> without dots in the format, works:
> >>
> >> > txt <- "1950. .614350 .026834 .087227 .006821 .180001 4.56E-2"
> >> > print(read.fortran(textConnection(txt), c("f5", "6f8")),
> >digits=10)
> >> V1  V2   V3   V4   V5   V6 V7
> >> 1 1950 0.61435 0.026834 0.087227 0.006821 0.180001 0.0456
> >>
> >>
> >> If I recall correctly, a dot in the format pushes the decimal point:
> >>
> >> > print(read.fortran(textConnection(txt), c("f5", "6f8.3")),
> >> digits=10)
> >>     V1         V2         V3         V4        V5          V6       V7
> >> 1 1950 0.00061435 2.6834e-05 8.7227e-05 6.821e-06 0.000180001 4.56e-05
> >>
> >>
> >>
> >> Bill Dunlap
> >> TIBCO Software
> >> wdunlap tibco.com 
> >>
> >> On Fri, May 27, 2016 at 3:15 PM, Steven Yen  >> > wrote:
> >>
> >> Thanks John. That helped, but I got a mixed of good thing and bad
> >> thing.
> >> Good is R does not like the scientific number format "3E15.9" but
> >> I was
> >> able to read with alphanumerical format "3A15" (and convert to
> >> numerical). Bad is R does not like the numbers .1234, .2345
> >> without the
> >> zeros before the decimal points. My data look like:
> >>
> >>1950. .614350 .026834 .087227 .006821 .180001 .084766
> >>
> >> The first variable was read correctly, followed by six 0's.
> >>
> >> As the instructions say, this fortran format is an approximation at best
> >> and in this case, a poor approximation.
> >>
> >> On 5/27/2016 2:21 PM, John McKown wrote:
> >> > On Fri, May 27, 2016 at 12:56 PM, Steven Yen  >> 
> >> > >>wrote:
> >> >
> >> > Dear fellow R users:
> >> > I am reading a data (ascii) file with fortran fixed format,
> >> containing
> >> > multiple records. R does not recognize fortran's record
> >break (a
> >> > slash).
> >> > I tried to do the following but it does not work. Help
> >> appreciated.
> >> >
> >> >   60
> >> >
> >FORMAT(1X,F6.0,5F8.6/1X,5F8.4,F10.6/1X,2F6.0,3E15.9,F8.0,F5.2,F5.3
> >> >   *  /1X,F7.0,2E15.9,F9.4,F5.3)
> >> >
> >> >
> >>
> >mydata<-read.fortran("G:/Journals/D

[R] read multiple sheets of excel data into R

2016-05-28 Thread li li
Hi all,
  I tried to use the package "XLConnect" to read excel data into R.  I got
the following error message:

Error : .onLoad failed in loadNamespace() for 'rJava', details:
  call: fun(libname, pkgname)
  error: No CurrentVersion entry in Software/JavaSoft registry! Try
re-installing Java and make sure R and Java have matching
architectures.


  I tried read.xls and got the following error  message:

> library(gdata)
> one <- read.xls("one.xlsx", sheet=1)
Error in findPerl(verbose = verbose) :
  perl executable not found. Use perl= argument to specify the correct path.
Error in file.exists(tfn) : invalid 'file' argument


  Can anyone give me some input on this?
  Thanks.
Hanna



Re: [R] read multiple sheets of excel data into R

2016-05-28 Thread Jeff Newmiller
Apparently you need to get your Java runtime setup, or install Perl, depending 
which of these tools you want to use. 

Or if your data are laid out simply, you might be able to use the readxl 
package. 
-- 
Sent from my phone. Please excuse my brevity.
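
A minimal sketch of the readxl route for reading every sheet of a workbook, which needs neither Java nor Perl. It assumes the readxl package is installed; the function name read_all_sheets is hypothetical, and "one.xlsx" is the file name from the original post.

```r
# hypothetical helper: returns a named list with one data frame per sheet
read_all_sheets <- function(path) {
  if (!requireNamespace("readxl", quietly = TRUE)) {
    stop("the 'readxl' package is required: install.packages(\"readxl\")")
  }
  sheets <- readxl::excel_sheets(path)   # sheet names, in workbook order
  out <- lapply(sheets, function(s) readxl::read_excel(path, sheet = s))
  names(out) <- sheets
  out
}

# usage, assuming the workbook from the question is in the working directory:
# all_sheets <- read_all_sheets("one.xlsx")
# str(all_sheets, max.level = 1)
```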



Re: [R] sandwich package: HAC estimators

2016-05-28 Thread Achim Zeileis

On Sat, 28 May 2016, T.Riedle wrote:


> Dear R users,
>
> I am running a logistic regression using the rms package and the code 
> looks as follows:
>
> crisis_bubble4<-lrm(stock.market.crash~crash.MA+bubble.MA+MP.MA+UTS.MA+UPR.MA+PPI.MA+RV.MA,data=Data_logitregression_movingaverage)
>
> Now, I would like to calculate HAC robust standard errors using the 
> sandwich package assuming the NeweyWest estimator which looks as 
> follows:
>
> coeftest(crisis_bubble4,df=Inf,vcov=NeweyWest)
>
> Error in match.arg(type) :
>   'arg' should be one of "li.shepherd", "ordinary", "score", 
> "score.binary", "pearson", "deviance", "pseudo.dep", "partial", 
> "dfbeta", "dfbetas", "dffit", "dffits", "hat", "gof", "lp1"
>
> As you can see, it doesn't work.


Yes. The "sandwich" package relies on two methods being available: bread() 
and estfun(). See vignette("sandwich-OOP", package = "sandwich") for the 
background details.


For objects of class "lrm" no such methods are available. But as "lrm" 
objects inherit from "glm" the corresponding methods are called. However, 
"lrm" objects are actually too different from "glm" objects (despite the 
inheritance) resulting in the error.


It is easy to add these methods, though, because "lrm" brings all the 
necessary information:


bread.lrm <- function(x, ...) vcov(x) * nobs(x)
estfun.lrm <- function(x, ...) residuals(x, "score")


> Therefore, I did the same using the glm() instead of lrm():
>
> crisis_bubble4<-glm(stock.market.crash~crash.MA+bubble.MA+MP.MA+UTS.MA+UPR.MA+PPI.MA+RV.MA,family=binomial("logit"),data=Data_logitregression_movingaverage)
>
> If I use the coeftest() function, I get following results.
>
> coeftest(crisis_bubble4,df=Inf,vcov=NeweyWest)
>
> z test of coefficients:
>
>               Estimate Std. Error z value Pr(>|z|)
> (Intercept)   -5.26088    5.01706 -1.0486  0.29436
> crash.MA       0.49219    2.41688  0.2036  0.83863
> bubble.MA     12.12868    5.85228  2.0725  0.03822 *
> MP.MA        -20.07238  499.37589 -0.0402  0.96794
> UTS.MA       -58.18142   77.08409 -0.7548  0.45038
> UPR.MA      -337.57985  395.35639 -0.8539  0.39318
> PPI.MA       729.37693  358.60868  2.0339  0.04196 *
> RV.MA        116.00106   79.52421  1.4587  0.14465
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Some of these coefficients and standard errors are suspiciously large. It 
might make sense to check for quasi-complete separation.
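
As a rough illustration of what such a check could look like, here is a minimal base R sketch on made-up data (x and y are hypothetical, not the poster's variables). It shows the typical symptom of complete separation, a very large slope with an even larger standard error, and a crude one-variable range check. Quasi-complete separation (ties at the boundary) produces similar symptoms, and separation caused by a combination of predictors would not be caught by the range check.

```r
# Toy data (hypothetical): y is 0 for every x below 4.5 and 1 above it,
# so x separates the two outcomes completely.
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(0, 0, 0, 0, 1, 1, 1, 1)

# glm() warns about fitted probabilities of 0 or 1; the warnings are
# suppressed here only to keep the output short.
fit <- suppressWarnings(glm(y ~ x, family = binomial("logit")))

# Symptom: a very large slope with an even larger standard error.
est <- coef(summary(fit))[, c("Estimate", "Std. Error")]
print(est)

# Crude one-variable check: do the ranges of x for y == 0 and y == 1 overlap?
rng0 <- range(x[y == 0])
rng1 <- range(x[y == 1])
separated <- rng0[2] < rng1[1] || rng1[2] < rng0[1]
print(separated)
```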


I am unsure whether the coeftest from the lmtest package is appropriate 
in case of a logistic regression.


Yes, this is ok. (Whether or not the application of HAC standard errors is 
the best way to go is a different matter though.)


Is there another function for logistic regressions? Furthermore, I would 
like to present the regression coefficients, the F-statistic and the HAC 
estimators in one single table. How can I do that?


Running first coeftest() and then lrtest() should get you close to what 
you want - even though it is not a single table.
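
A minimal sketch of that two-step approach on simulated data follows. The data frame d and its columns are invented for illustration, and the sandwich and lmtest packages are assumed to be installed. With a single model, lrtest() compares the fit against the intercept-only model.

```r
library(sandwich)  # NeweyWest()
library(lmtest)    # coeftest(), lrtest()

# Simulated data (hypothetical), standing in for the real series.
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * d$x1))

fit <- glm(y ~ x1 + x2, family = binomial("logit"), data = d)

# Step 1: coefficient table with Newey-West (HAC) standard errors.
ct <- coeftest(fit, df = Inf, vcov. = NeweyWest)
print(ct)

# Step 2: likelihood ratio test of the model against the intercept-only model.
lr <- lrtest(fit)
print(lr)
```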


I thought it would be useful to incorporate the HAC consistent 
covariance matrix into the logistic regression directly and generate an 
output of coefficients and the corresponding standard errors. Is there 
such a function in R?


Not with HAC standard errors, I think.


Thanks for your support.

Kind regards

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





Re: [R] Trimming time series to only include complete years

2016-05-28 Thread Jeff Newmiller

# read about POSIXlt at ?DateTimeClasses
# note that the "mon" element is 0-11
isPartialWaterYear <- function( d ) {
  dtl <- as.POSIXlt( d )  # use the argument, not the global "dat"
  n <- length( d )
  # counter that increments on each Oct 1 (mon 9, mday 1)
  wy1 <- cumsum( ( 9 == dtl$mon ) & ( 1 == dtl$mday ) )
  ( 0 == wy1  # first partial year
  | ( !( 8 == dtl$mon[ n ] & 30 == dtl$mday[ n ] ) # end partial year
    & wy1[ n ] == wy1
    )
  )
}

dat2 <- dat[ !isPartialWaterYear( dat$Date ), ]

The above assumes that, as you said, the data are continuous at one-day 
intervals, such that the only partial years will occur at the beginning 
and end. The "diff" function could be used to identify irregular data 
within the data interval if needed.
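
A small sketch of that diff() check, using a made-up ten-day series with two days deleted (the data and the names step and gap_after are hypothetical):

```r
# Hypothetical daily series with two days removed to create a gap.
dat <- data.frame(Date = seq(as.Date("2010-01-01"), as.Date("2010-01-10"),
                             by = "day"))
dat <- dat[-c(4, 5), , drop = FALSE]

# Step between consecutive dates, in days; anything other than 1 is irregular.
step <- as.numeric(diff(dat$Date))

# Dates that are followed by a gap.
gap_after <- dat$Date[which(step != 1)]
print(gap_after)
```

If gap_after is empty, the series is regular and the trimming function above can be applied as is.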


On Fri, 27 May 2016, Morway, Eric wrote:


In bulk processing streamflow data available from an online database, I'm
wanting to trim the beginning and end of the time series so that daily data
associated with incomplete "water years" (defined as extending from Oct 1st
to the following September 30th) is trimmed off the beginning and end of
the series.

For a small reproducible example, the time series below starts on
2010-01-01 and ends on 2011-11-05.  So the data between 2010-01-01 and
2010-09-30 and also between 2011-10-01 and 2011-11-05 is not associated
with a complete set of data for their respective water years.  With the
real data, the initial date of collection is arbitrary, could be 1901 or
1938, etc.  Because I'm cycling through potentially thousands of records, I
need help in designing a function that is efficient.

dat <-
data.frame(Date=seq(as.Date("2010-01-01"),as.Date("2011-11-05"),by="day"))
dat$Q <- rnorm(nrow(dat))

dat$wyr <- as.numeric(format(dat$Date,"%Y"))
is.nxt <- as.numeric(format(dat$Date,"%m")) %in% 1:9
dat$wyr[!is.nxt] <- dat$wyr[!is.nxt] + 1


function(dat) {
  ...
  returns a subset of dat such that dat$Date > -09-30 & dat$Date <
-10-01
  ...
}

where the years between - are "complete" (no missing days).  In the
example above, the returned dat would extend from 2010-10-01 to 2011-09-30

Any offered guidance is very much appreciated.




---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k



Re: [R] Application of "merge" and "within"

2016-05-28 Thread Jeff Newmiller
Why do you want to do this?
-- 
Sent from my phone. Please excuse my brevity.

On May 27, 2016 4:00:14 PM PDT, Santosh  wrote:
>Dear Rxperts!
>
>Is there a way to compute relative values.. using within().. function?
>
>Any assistance/suggestions are highly welcome!!
>Thanks again,
>Santosh...
>___
>A sample dataset and the computation "outside" within()  function is
>shown..
>
>q <- data.frame(GL = rep(paste("G",1:3,sep = ""),each = 50),
>G  = rep(1:3,each = 50),
>D = rep(paste("D",1:5,sep = ""),each = 30),
>a = rep(1:15,each = 10),
>t = rep(seq(10),15),
>b = round(runif(150,10,20)))
>r <- subset(q,!duplicated(paste(G,a)),sel=c(G,a,b))
>names(r)[3] <- "bl"
>s <- merge(q,r)
> s$db <- s$b-s$bl
>
>> head(s,5)
>G  a GL  D  t  b bl db
>1   1  1 G1 D1  1 13 13  0
>2   1  1 G1 D1  2 16 13  3
>3   1  1 G1 D1  3 19 13  6
>4   1  1 G1 D1  4 12 13 -1
>5   1  1 G1 D1  5 19 13  6
>



Re: [R] Application of "merge" and "within"

2016-05-28 Thread Duncan Murdoch

On 27/05/2016 7:00 PM, Santosh wrote:

Dear Rxperts!

Is there a way to compute relative values.. using within().. function?

Any assistance/suggestions are highly welcome!!
Thanks again,
Santosh...
___
A sample dataset and the computation "outside" within()  function is shown..

q <- data.frame(GL = rep(paste("G",1:3,sep = ""),each = 50),
G  = rep(1:3,each = 50),
D = rep(paste("D",1:5,sep = ""),each = 30),
a = rep(1:15,each = 10),
t = rep(seq(10),15),
b = round(runif(150,10,20)))
r <- subset(q,!duplicated(paste(G,a)),sel=c(G,a,b))
names(r)[3] <- "bl"
s <- merge(q,r)
 s$db <- s$b-s$bl


head(s,5)

G  a GL  D  t  b bl db
1   1  1 G1 D1  1 13 13  0
2   1  1 G1 D1  2 16 13  3
3   1  1 G1 D1  3 19 13  6
4   1  1 G1 D1  4 12 13 -1
5   1  1 G1 D1  5 19 13  6


Just use

 s <- within(s, db <- b - bl)
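
If the goal is to skip the subset() and merge() steps as well, one possible sketch is to compute the baseline inside within() using ave(). This assumes the baseline is the first value of b within each (G, a) group, as in the posted example; the set.seed() call is only there to make the random data repeatable.

```r
set.seed(42)
q <- data.frame(G = rep(1:3, each = 50),
                a = rep(1:15, each = 10),
                t = rep(seq(10), 15),
                b = round(runif(150, 10, 20)))

s <- within(q, {
  # baseline: first b in each (G, a) group, repeated for every row of the group
  bl <- ave(b, G, a, FUN = function(v) v[1])
  # value relative to the baseline
  db <- b - bl
})
head(s, 5)
```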

Duncan Murdoch



Re: [R] Creating R file

2016-05-28 Thread Rolf Turner

On 28/05/16 19:37, Hong Yu wrote:





When you need numerical data input in R programs, you can use EXCEL
to create a .csv file.  When you need to output calculation results, you
can write out a .csv file in R programs.

Yes, the most common .csv file format is comma separated numerical
values.  You can use EXCEL to create a .csv file, and view the content
with a text editor.


There are myriad ways to create *.csv files, including text editors and 
R itself, and other sound statistical packages.  There is no need to use 
the abomination that is known as Excel.
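
For the original question (one line of numbers chosen at random from 1 to 40), a minimal base R sketch might look like this. The count of 10 numbers and the file name numbers.csv are assumptions, since the poster gave neither.

```r
set.seed(28)                         # only to make the draw repeatable
x <- sample(1:40, size = 10)         # 10 is an assumed count

# Hypothetical file name, written to R's temporary directory.
csv_file <- file.path(tempdir(), "numbers.csv")

# One line, comma separated, without row or column names.
write.table(t(x), file = csv_file, sep = ",",
            row.names = FALSE, col.names = FALSE)

readLines(csv_file)                           # the raw line in the file
y <- scan(csv_file, sep = ",", quiet = TRUE)  # read the numbers back
```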


cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276



Re: [R] read multiple sheets of excel data into R

2016-05-28 Thread jim holtman
Try the 'openxlsx' package.  I gave up using XLConnect because of the Java
requirement and its speed on larger tables.  'openxlsx' has its access routines
written in C++, so you don't need any other outside dependencies.
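
A minimal sketch of reading every sheet of a workbook with openxlsx, assuming the package is installed. The workbook example.xlsx and its sheet names are made up here so that the example is self-contained; with real data, point path at the existing file instead.

```r
library(openxlsx)

# Create a small two-sheet workbook (hypothetical) to read back.
path <- file.path(tempdir(), "example.xlsx")
write.xlsx(list(first = head(cars), second = head(iris)), path)

# Read each sheet into a list of data frames, named by sheet.
sheets <- getSheetNames(path)
all_sheets <- lapply(sheets, function(s) read.xlsx(path, sheet = s))
names(all_sheets) <- sheets
str(all_sheets, max.level = 1)
```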


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Sat, May 28, 2016 at 2:34 PM, Jeff Newmiller 
wrote:

> Apparently you need to get your Java runtime setup, or install Perl,
> depending which of these tools you want to use.
>
> Or if your data are laid out simply, you might be able to use the readxl
> package.
> --
> Sent from my phone. Please excuse my brevity.
>
> On May 28, 2016 10:55:50 AM PDT, li li  wrote:
> >Hi all,
> >I tried to use the package "XLConnect" to read excel data into R.  I
> >got
> >the following error message:
> >
> >Error : .onLoad failed in loadNamespace() for 'rJava', details:
> >  call: fun(libname, pkgname)
> >  error: No CurrentVersion entry in Software/JavaSoft registry! Try
> >re-installing Java and make sure R and Java have matching
> >architectures.
> >
> >
> >  I tried read.xls and got the following error  message:
> >
> >> library(gdata)> one <- read.xls ("one.xlsx", sheet=1)Error in
> >findPerl(verbose = verbose) :
> >  perl executable not found. Use perl= argument to specify the correct
> >path.Error in file.exists(tfn) : invalid 'file' argument
> >
> >
> >  Can anyone give me some input on this?
> >  Thanks.
> >Hanna
> >
>
>



Re: [R] Creating R file

2016-05-28 Thread Hong Yu

You are right that there is no need to involve Excel.  I explained it this way 
because I often tutor people with little programming experience, and many of 
them have no concept of “text editors”.



From: Rolf Turner 
Sent: Sunday, May 29, 2016 5:57 AM
To: Hong Yu 
Cc: infinite2...@yahoo.com ; r-help@r-project.org 
Subject: Re: [R] Creating R file

On 28/05/16 19:37, Hong Yu wrote:
>

> When you need numerical data input in R programs, you can use EXCEL
> to create a .csv file.  When you need to output calculation results, you
> can write out a .csv file in R programs.
>
> Yes, the most common .csv file format is comma separated numerical
> values.  You can use EXCEL to create a .csv file, and view the content
> with a text editor.

There are myriad ways to create *.csv files, including text editors and 
R itself, and other sound statistical packages.  There is no need to use 
the abomination that is known as Excel.

cheers,

Rolf Turner

-- 
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276


Re: [R] asking for large memory - crash running rterm.exe on windows

2016-05-28 Thread Anthony Damico
hi, thanks to you both!  note the large memory.limit() on the machine
before the crash (200+ gb) so i'm not sure it's a simple overloading
explosion?  i've filed a bug report..

https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16927



On Saturday, May 28, 2016, Martin Maechler 
wrote:

> > Ben Bolker 
> > on Sat, 28 May 2016 15:42:45 + writes:
>
> > Anthony Damico  gmail.com> writes:
> >>
> >> hi, here's a minimal reproducible example that crashes my
> >> R 3.3.0 console on a powerful windows server.  below the
> >> example, i've put the error (not crash) that occurs on R
> >> 3.2.3.
> >>
> >> should this be reported to http://bugs.r-project.org/ or
> >> am i doing something silly?  thanx
>
>
> > From the R FAQ (9.1):
>
> > If R executes an illegal instruction, or dies with an
> > operating system error message that indicates a problem in
> > the program (as opposed to something like “disk full”),
> > then it is certainly a bug.
>
> >   So you could submit a bug report, *or* open a discussion
> > on r-de...@r-project.org (which I'd have said was a more
> > appropriate venue for this question in any case) ...
>
> Indeed.
> In this case, this is a known problem -- not just of R, but of
> many programs that you can run ---
> You are requesting (much) more memory than your computer has
> RAM, and in this situation -- depending on the OS ---
> your computer will kill R (what you saw) or it will become
> very slow trying to shove all memory to R and start swapping
> (out to disk other running / sleeping processes on the
> computer).
>
> Both are very unpleasant...
> But it is you as R user who asked R to allocate an object of
> about 41.6 Gigabytes (26 * 1.6, see below).
>
> As Ben mentioned this may be worth a discussion on R-devel ...
> or you rather follow up the existing thread opened by Marius
> Hofert  three weeks ago, with subject
>  "[Rd] R process killed when allocating too large matrix (Mac OS X)"
>
>   -->  https://stat.ethz.ch/pipermail/r-devel/2016-May/072648.html
>
> His simple command to "crash R" was
>
>matrix(0, 1e5, 1e5)
>
> which for some of us gives an error such as
>
> > x <- matrix(0, 1e5,1e5)
> Error: cannot allocate vector of size 74.5 Gb
>
> but for others it had the same effect as your example.
> BTW: I repeat it here in a functionalized form with added
>  comments which makes apparent what's going on:
>
>
> ## Make simple data.frame
> mkDf <- function(grpsize, wrongSize = FALSE) {
> ne <- (if(wrongSize) 26 else 1) *grpsize
> data.frame(x = rep(LETTERS, each = ne),
>v = runif(grpsize*26), stringsAsFactors=FALSE)
> }
>
> g1 <- ceiling(10^5/26)
> d1 <- mkDf(g1) # works fine
> str(d1)
> ## 'data.frame':100022 obs. of  2 variables:
>
> dP <- mkDf(g1, wrong=TRUE)# mis-matching the number of elements
>
> str(dP) # is 26 times larger
> ## 'data.frame': 2600572 obs. of  2 variables: .
>
>
> # make this much bigger
> gLarge <- ceiling(10^8/26)
>
> dL <- mkDf(gLarge) # works "fine" .. (well, takes time!!)
> str(dL)
> ## 'data.frame': 100000004 obs. of  2 variables:
> as.numeric(print(object.size(dL)) / 1e6)
> ## 1600002088 bytes
> ## [1] 1600.002  Mega  i.e.,  1.6 GBytes
>
> ## Well, this will be 26 times larger than already large ==> your R may crash *OR*
> ## your computer may basically slow down to a crawl, when R requests all its memory...
> if(FALSE) ## ==> do *NOT* evaluate the following lightly !!
> dLL <- mkDf(gLarge, wrong=TRUE)
> # CONSOLE CRASH WITHOUT EXPLANATION
> # C:\Users\AnthonyD>
>
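
The sizes quoted above can be estimated before allocating anything. The sketch below assumes 8 bytes per element for a numeric (double) vector or matrix and ignores the small fixed header; approx_gib is a hypothetical helper, not part of R.

```r
# Approximate size in GiB of a numeric (double) vector or matrix.
approx_gib <- function(n_elements, bytes_per_element = 8) {
  n_elements * bytes_per_element / 2^30
}

# matrix(0, 1e5, 1e5) has 1e10 elements: about 74.5 GiB,
# matching the "cannot allocate vector of size 74.5 Gb" message.
big <- approx_gib(1e5 * 1e5)
print(big)

# Sanity check against a real, small object: about 8 MB for 1e6 doubles.
small_bytes <- as.numeric(object.size(numeric(1e6)))
print(small_bytes)
```

Comparing such an estimate with the installed RAM gives an early hint that a request is likely to fail or to start swapping.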
