Re: [R] Loop to check for large dataset

2016-10-10 Thread PIKAL Petr
Hi

See inline.

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Christoph
> Puschmann
> Sent: Sunday, October 9, 2016 1:27 AM
> To: Adrian Dușa 
> Cc: r-help@r-project.org; Christoph Puschmann
> 
> Subject: Re: [R] Loop to check for large dataset
>
> Dear Adrian,
>
> Yes it is a cyclical data set and theoretically it should repeat this
> interval until 61327. The data set itself is divided into 2 Parts:
> 1. Product category (column 10)
> 2. Number of Stores Participating (column 01)
> Overall there are 22 different products and in each you have 19 different
> stores participating. And theoretically each store over each product
> category should have a 1 - 157 week interval.

That is not much clearer, and it is definitely not reproducible.

From what I understand, you have 22*19 = 418 combinations of product/store. How
do you want to put these 418 combinations into 157 rows?

It seems to me that this could be done with the aggregate function; however,
without a small reproducible example we are fishing in murky water.

Try to post data with, let's say, 3 stores and 4 products to explain how your
data is structured and what is or is not correct.
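
A minimal sketch of what such a small example could look like. The column names STORE, PRODUCT and WEEK, the store IDs and the 5-week range are all assumptions made for illustration, since the real layout has not been posted:

```r
# Toy data: 3 stores x 4 products x 5 weeks (all values invented).
toy <- expand.grid(STORE = c(8, 12, 15), PRODUCT = 1:4, WEEK = 1:5)

# Remove two rows on purpose so that there is something to detect.
toy <- toy[!(toy$STORE == 12 & toy$PRODUCT == 2 & toy$WEEK %in% c(3, 4)), ]

# dput() gives a text representation that can be pasted into a post.
dput(head(toy))

# Number of weeks observed for each store (rows) and product (columns).
table(toy$STORE, toy$PRODUCT)
```

Posting the dput() output of a few dozen rows is usually enough for others to reproduce the structure.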

Cheers
Petr

>
> The part I am struggling with is how do I run a loop over the whole data set,
> while checking if all stores participated 157 weeks over the different
> products.
>
> So far I came up with this:
>
> n = 61327   # Generate matrix to check for values
> Control = matrix(
>   0,
>   nrow = n,
>   ncol = 1)
>
> s <- seq(from = 1, to = 157, by = 1)
> CW = matrix(
>   s,
>   nrow = 157,
>   ncol = 1
> )
>
> colnames(CW)[1] <- "s"
>
> CW = as.data.frame(CW)
>
> for (i in 1:nrow(FD)) {   # Run through all the rows
>   for (j in 1:157) {
>     if (FD$WEEK[j] == CW$s[j]) {
>       Control[i] = 1 # corresponding control row = 1
>     } else {
>       Control[i] = 0 # corresponding control row = 0
>     }
>   }
> }
>
> I coded an MRE and attached a sample of my data set.
>
> MRE:
>
> #MRE
>
> dat <- data.frame(
>   Store = c(rep(8, times = 157), rep(12, times = 157)),  # Number of stores
>   WEEK = rep(seq(from=1, to = 157, by = 1), times = 2)
> )
>
>
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.


This e-mail and any documents attached to it may be confidential and are 
intended only for its intended recipients.
If you received this e-mail by mistake, please immediately inform its sender. 
Delete the contents of this e-mail with all attachments and its copies from 
your system.
If you are not the intended recipient of this e-mail, you are not authorized to 
use, disseminate, copy or disclose this e-mail in any manner.
The sender of this e-mail shall not be liable for any possible damage caused by 
modifications of the e-mail or by delay with transfer of the email.

In case that this e-mail forms part of business dealings:
- the sender reserves the right to end negotiations about entering into a 
contract in any time, for any reason, and without stating any reasoning.
- if the e-mail contains an offer, the recipient is entitled to immediately 
accept such offer; The sender of this e-mail (offer) excludes any acceptance of 
the offer on the part of the recipient containing any amendment or variation.
- the sender insists on that the respective contract is conclude

Re: [R] Loop to check for large dataset

2016-10-10 Thread Christoph Puschmann
Dear Petr,

I attached a sample file, which contains the first 4 products.

To be more precise, I have 157 weeks, 19 different stores and 22 products:
157*19*22 = 65,626 rows. And as I stated, I have roughly 63,127 rows (so some
have to be missing).
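
The arithmetic behind that statement, written out as a quick check (the variable names are only illustrative):

```r
expected <- 157 * 19 * 22   # weeks x stores x products
observed <- 63127           # approximate row count quoted above
expected                    # 65626
expected - observed         # 2499 rows unaccounted for
```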

All the best,

Christoph



Re: [R] Loop to check for large dataset

2016-10-10 Thread PIKAL Petr
Hi

I named your data test.

res <- xtabs(~STORE+WEEK+Description, data=test)

should give you a table of counts in which a zero marks a combination of STORE,
WEEK and Description that is missing.

You can select the missing week and store by

which(res[, , 1] == 0, arr.ind = TRUE)

for description 1, and so on.

Another option is to generate the full set of STORE, WEEK and Description
combinations and merge it with the original data by merge.
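
A rough sketch of the merge option, under stated assumptions: the data frame below is a small invented stand-in for the real data, SALES is a hypothetical payload column, and the weeks are assumed to run from 1 to 5 rather than 1 to 157:

```r
# Invented stand-in for the real data (column names as in the xtabs call).
test <- expand.grid(STORE = 1:3, WEEK = 1:5, Description = c("A", "B"),
                    stringsAsFactors = FALSE)
test$SALES <- seq_len(nrow(test))   # hypothetical payload column
test <- test[-c(4, 17), ]           # drop two rows to create gaps

# Full set of combinations that should be present.
full <- expand.grid(STORE = unique(test$STORE),
                    WEEK = seq_len(5),
                    Description = unique(test$Description),
                    stringsAsFactors = FALSE)

# Left join on the shared columns; absent combinations get NA in SALES.
merged <- merge(full, test, all.x = TRUE)
missing <- merged[is.na(merged$SALES), c("STORE", "WEEK", "Description")]
missing
```

This only works as shown if the payload column has no genuine NA values of its own; otherwise a dedicated marker column would be a safer choice.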

Cheers
Petr



[R] turning comma separated string from multiple choices into

2016-10-10 Thread silvia giussani
Hi all,

Could you please tell me whether you found a solution to this problem (see
Subject)?

June Kim wrote:

> Hello,
>
> I use Google Docs' Forms to conduct surveys online. Multiple-choice
> questions are coded as comma-separated values.
>
> For example,
>
> if the question is like:
>
> 1. What magazines do you currently subscribe to? (you can choose
> multiple choices)
> 1) Fast Company
> 2) Havard Business Review
> 3) Business Week
> 4) The Economist
>
> And if the subject chose 1) and 3), the data is coded as a cell in a
> spreadsheet as,
>
> "Fast Company, Business Week"
>
> I read the data with read.csv into R. To analyze the data, I have to
> change that string into something like flags (indicator variables?).
> That is, there should be 4 variables, of which values are either 1 or
> 0, indicating chosen or not-chosen respectively.
>
> Suppose the data is something like,
>
> > survey1
>   age favorite_magazine
> 1  29 Fast Company
> 2  31 Fast Company, Business Week
> 3  32 Havard Business Review, Business Week, The Economist
>
> Then I have to chop the string in favorite_magazine column to turn
> that data into something like,
>
> > survey1transformed
>   age Fast Company Havard Business Review Business Week The Economist
> 1  29            1                      0             0             0
> 2  31            1                      0             1             0
> 3  32            0                      1             1             1
>
> Actually I have many more multiple choice questions in the survey.
>
> What is the easy elegant and natural way in R to do the job?

I'd look into something like as.data.frame(lapply(strings, grep,
x=favorite_magazine, fixed=TRUE)), where strings <- c("Fast Company",
"Havard Business Review", ...).

(I take it that the mechanism is such that you can rely on at least
having everything misspelled in the same way? If it is alternatingly
"Havard" and "Harvard", then things get a bit trickier.)

Thank you and regards,

Silvia Giussani
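
For reference, a minimal sketch of the suggestion quoted above, with one swap that should be stated plainly: it uses grepl (which returns TRUE/FALSE for every row) instead of grep (which returns the positions of the matching rows), because equal-length logical vectors convert directly into indicator columns. The data are the three example rows from the question:

```r
survey1 <- data.frame(
  age = c(29, 31, 32),
  favorite_magazine = c("Fast Company",
                        "Fast Company, Business Week",
                        "Havard Business Review, Business Week, The Economist"),
  stringsAsFactors = FALSE
)

# Spelling must match the data exactly, including "Havard".
strings <- c("Fast Company", "Havard Business Review",
             "Business Week", "The Economist")

# One logical column per choice; fixed = TRUE means literal matching.
flags <- sapply(strings, grepl, x = survey1$favorite_magazine, fixed = TRUE)

# Adding 0L turns TRUE/FALSE into 1/0 while keeping the column names.
survey1transformed <- data.frame(age = survey1$age, flags + 0L,
                                 check.names = FALSE)
survey1transformed
```

Literal matching assumes that no choice label is contained in another one; if that can happen, splitting the cell with strsplit on ", " and comparing whole entries would be the safer route.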



Re: [R] Loop to check for large dataset

2016-10-10 Thread Adrian Dușa
This is an example of what reproducible code looks like, assuming you have
three columns in your dataset named S (store), P (product) and W (week),
and also assuming they have integer values from 1 to 19, 1 to 22 and 1 to
157 respectively:

#

mydata <- expand.grid(seq(19), seq(22), seq(157))
names(mydata) <- c("S", "P", "W")

# randomly delete 65626 - 63127 = 2499 rows
set.seed(12345) # make it replicable

mydata <- mydata[-sample(seq(nrow(mydata)), nrow(mydata) - 63127), ]

#


Now the data frame mydata contains exactly 63127 rows, just as in your case.
The task is to find which weeks are missing, from which store and for which
product.
Below is one possible way to do that. Given that you have a small number of
stores and products, I'll keep it simple and stupid by using for loops:


#

result <- matrix(nrow = 0, ncol = 3)

for (i in seq(19)) {
    for (j in seq(22)) {
        miss <- setdiff(seq(157), mydata$W[mydata$S == i & mydata$P == j])
        if (length(miss) > 0) {
            result <- rbind(result, cbind(S = i, P = j, W = miss))
        }
    }
}

# The result matrix contains 2499 rows that are missing.

> head(result)
     S P   W
[1,] 1 1  10
[2,] 1 1  11
[3,] 1 1  82
[4,] 1 1 100
[5,] 1 1 117
[6,] 1 1 148

#


In this example, for S(tore) number 1 and P(roduct) number 1, you are
missing W(eek) 10, 11, 82 and so on.

Hoping you can adapt this code to your particular example,
Adrian
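
One possible adaptation, sketched under assumptions: if the store or product identifiers are not consecutive integers (the MRE in the original question uses stores 8 and 12), the loops can run over the values actually present instead of seq(19) and seq(22). The data below are invented for illustration:

```r
# Invented data with non-consecutive store IDs and two weeks removed.
mydata <- expand.grid(S = c(8, 12), P = 1:3, W = 1:157)
mydata <- mydata[!(mydata$S == 12 & mydata$P == 2 & mydata$W %in% c(10, 11)), ]

result <- matrix(nrow = 0, ncol = 3)

# Loop over the identifiers found in the data, not over 1..n.
for (i in sort(unique(mydata$S))) {
    for (j in sort(unique(mydata$P))) {
        miss <- setdiff(seq(157), mydata$W[mydata$S == i & mydata$P == j])
        if (length(miss) > 0) {
            result <- rbind(result, cbind(S = i, P = j, W = miss))
        }
    }
}

result
```

A store or product that is absent from the data altogether would not be reported this way; in that case the full list of expected identifiers has to be supplied by hand.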





--
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
Soseaua Panduri nr.90
050663 Bucharest sector 5
Romania


Re: [R] Loop to check for large dataset

2016-10-10 Thread PIKAL Petr
Hi

Given this example data, you can get the same answer with less typing and
without loops.

res <- xtabs(~W+P+S, mydata)
res1 <- which(res == 0, arr.ind = TRUE)
head(res1)
      W P S
10   10 1 1
11   11 1 1
82   82 1 1
100 100 1 1
117 117 1 1
148 148 1 1
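
One caveat worth sketching: which(..., arr.ind = TRUE) returns positions within the table, which coincide with the identifiers only when these run from 1 to n. A possible variant that reports the labels instead converts the table to a data frame. The data here are invented, with store IDs 8 and 12:

```r
# Invented data: 2 stores x 3 products x 4 weeks, one row removed.
mydata <- expand.grid(S = c(8, 12), P = 1:3, W = 1:4)
mydata <- mydata[-5, ]   # row 5 is S = 8, P = 3, W = 1

res <- xtabs(~ W + P + S, mydata)

# One row per cell of the table; Freq holds the count for that cell.
gaps <- subset(as.data.frame(res), Freq == 0)
gaps
```

The W, P and S columns of gaps are factors, so as.character() (followed by as.numeric() if needed) recovers the original values.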

Cheers
Petr



[R] [R-pkgs] aVirtualTwins available on CRAN

2016-10-10 Thread Francois vieille

[markdown format]

I'm glad to introduce the new package aVirtualTwins. This package is an
adaptation of the VirtualTwins method of subgroup identification from [Foster, J.
C., Taylor, J. M.G. and Ruberg, S. J.
(2011)](http://onlinelibrary.wiley.com/doi/10.1002/sim.4322/abstract).

### Explanation

Virtual Twins was created to find a subgroup of patients with an enhanced
treatment effect in a randomized clinical trial, if such a subgroup exists.
Theoretically, this method can be used for binary and continuous outcomes. This
package only deals with a binary outcome in a two-arm clinical trial.

Virtual Twins can, of course, also be adapted for A/B testing.

Virtual Twins is based on random forests and regression/classification trees.

### Quick preview

Here's an example of aVirtualTwins use for subgroup discovery with a well-known
dataset (_sepsis_):

_Sepsis_ contains simulated data on 470 subjects with a binary outcome,
survival, which stores the survival status of each patient after 28 days of
treatment: a value of 1 for subjects who died after 28 days and 0 otherwise.
There are 11 covariates, listed below, all of which are numerical variables.


```r
library(aVirtualTwins)

# Load data
data(sepsis)
# Format data
vt.obj <- vt.data(dataset         = sepsis,
                  outcome.field   = "survival",
                  treatment.field = "THERAPY",
                  interactions    = TRUE)
## "1" will be the favorable outcome
# view of data
head(sepsis)
##   survival THERAPY PRAPACHE    AGE BLGCS ORGANNUM   BLIL6  BLLPLAT
## 1        0       1       19 42.921    15        1  301.80 191.0000
## 2        1       1       48 68.818    11        2  118.90 264.1565
## 3        0       1       20 68.818    15        2   92.80 123.0000
## 4        0       1       19 33.174    14        2 1232.00 244.0000
## 5        0       1       48 46.532     3        4 2568.00  45.0000
## 6        0       0       21 56.098    14        1  162.65 137.0000
##    BLLBILI BLLCREAT TIMFIRST BLADL blSOFA
## 1 2.913416 1.000000    17.17     0   5.00
## 2 0.400000 1.100000    17.17     5  10.00
## 3 5.116471 1.000000    10.00     1   7.50
## 4 3.142092 1.200000    17.17     0   6.25
## 5 4.052668 3.000000    10.00     0  12.00
## 6 0.500000 4.662556    10.00     0   8.75
# Print Incidences of sepsis data
vt.obj$getIncidences()
## $table
##            trt
## resp        0    1     sum
##   0         101  188   289
##   1         52   129   181
##   sum       153  317   470
##   Incidence 0.34 0.407 0.385
##
## $rr
## [1] 1.197059

# First step : create random forest model
vt.for <- vt.forest(forest.type  = "one",
                    vt.data      = vt.obj,
                    interactions = TRUE,
                    ntree        = 500)
# Second step : find rules in data
vt.trees <- vt.tree(tree.type = "class",
                    vt.difft  = vt.for,
                    threshold = quantile(vt.for$difft, seq(.5, .8, .1)),
                    maxdepth  = 2)
# Print results
vt.sbgrps <- vt.subgroups(vt.trees)
knitr::kable(vt.sbgrps)
```

        Subgroup                     Subgroup size  Treatement event rate  Control event rate  Treatment sample size  Control sample size  RR (resub)  RR (snd)
------  ---------------------------  -------------  ---------------------  ------------------  ---------------------  -------------------  ----------  --------
tree1   PRAPACHE>=26.5               157            0.752                  0.327               105                    52                   2.300       1.774
tree3   PRAPACHE>=26.5 & AGE>=51.74  120            0.897                  0.31                78                     42                   2.894       1.924

aVirtualTwins can be found on 
[CRAN](https://cran.r-project.org/package=aVirtualTwins) and 
[github](https://github.com/prise6/aVirtualTwins). Feel free to contribute.

Francois.

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] [R-pkgs] New Package: Plotluck

2016-10-10 Thread Stefan Schrödl

Dear useRs,

I am happy to announce that my package "plotluck" is now on CRAN:
[1]https://cran.r-project.org/web/packages/plotluck/

The aim of the package is to let the user focus on what to plot, rather than on
the "how", during exploratory data analysis. Based on the characteristics of a
data frame and a formula, it tries to automatically choose the most suitable
type of plot (supported options are scatter, violin, box, bar, density, hexagon
bin, spine plot, and heat map). It also automates handling of observation
weights, logarithmic axis scaling, reordering of factor levels, and overlaying
smoothing curves and median lines. Plots are drawn using 'ggplot2'. Please see
the vignette for some examples.

I welcome all feedback, suggestions, bug reports and feature requests.
Thank you!


   - Stefan

References

   1. https://cran.r-project.org/web/packages/plotluck/


[R] [R-pkgs] Announcing qualpalr 0.2.1

2016-10-10 Thread Johan Larsson
Dear R users,

I would like to announce an updated version of qualpalr:

https://cran.r-project.org/package=qualpalr

qualpalr uses color difference equations to generate distinct qualitative color 
palettes for use in R graphics. Version 0.2.1 has been redesigned to use a 
better, more efficient optimization method and moreover introduces methods to 
adapt palettes to color blindness.

Please see the vignette 
(https://cran.r-project.org/web/packages/qualpalr/vignettes/introduction.html) 
if you'd like to learn more or visit the repository on GitHub 
(https://github.com/jolars/qualpalr) if you want to contribute.
All the best,
Johan



Re: [R] Finding starting values for the parameters using nls() or nls2()

2016-10-10 Thread ProfJCNash
The key lines are

library(nlmrt)
test <- nlxb(expf2, start= c(b0=.1, b1=1, th=.1), trace=TRUE, data=cl)

Thus I started with .1, 1 and .1. The "solution" from nlxb, which uses analytic
derivatives and a very aggressive Marquardt code to keep trying even in bad
situations, was as you included below. Note that the singular values of the
Jacobian are given (they are recorded in the same table as the parameters, but
do NOT correspond to the parameters; the placement was simply a tidy place to
put these numbers).

The ratio of these sv's is 1.735e+16/0.004635, or approx. 4E+18, so the
condition number of the traditional Gauss-Newton approach is about 1E+37. Not a
nice problem!
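
The step from about 4E+18 to about 1E+37 appears to follow from the Gauss-Newton normal equations working with J'J, whose 2-norm condition number is the square of that of the Jacobian J. A small worked check using the three singular values reported in the nlxb output (the variable names are illustrative only):

```r
# JSingval column from the nlxb output quoted in this thread.
sv <- c(1.735e+16, 11518, 0.004635)

cond_J   <- max(sv) / min(sv)   # condition number of the Jacobian J
cond_JtJ <- cond_J^2            # condition number of J'J (normal equations)

format(c(cond_J, cond_JtJ), digits = 3)
```

Both figures are far beyond what double precision (roughly 16 significant digits) can resolve, which is consistent with the advice to reformulate the model.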

You probably should reformulate.

JN




On 16-10-10 10:41 AM, Pinglei Gao wrote:
> Thank you very much for your kind help. I ran your script, which produced
> a lot of output, and I also studied the solution you posted. Forgive my
> ignorance, but I still can't find suitable starting values. Did I
> misunderstand something?
> 
> Best,
> 
> Pinglei Gao
> 
> -Original Message-
> From: ProfJCNash [mailto:profjcn...@gmail.com]
> Sent: 10 October 2016 10:41
> To: Gabor Grothendieck; Pinglei Gao
> Subject: Re: [R] Finding starting values for the parameters using nls() or
> nls2()
>
> I forgot to post the "solution" found by nlmrt:
>
> nlmrt class object: x
> residual sumsquares =  1086.8  on  15 observations
>     after  5001    Jacobian and  6991 function evaluations
>   name         coeff    SE   tstat   pval     gradient    JSingval
> b0        5.3274e-14    NA      NA     NA   -6.614e+13   1.735e+16
> b1           33.5574    NA      NA     NA       -3.466       11518
> th       -0.00721203    NA      NA     NA       -740.8    0.004635
> 
> 
> Note the singular values -- this is the worst SV(max)/SV(min) ratio I've
> observed!
> 
> JN
> 
> 
>


[R] Stats course in Montreal

2016-10-10 Thread Highland Statistics Ltd

We would like to announce the following statistics course:

Course: Data exploration, regression, GLM & GAM with R

Where:  Montreal, Canada

When:   9-13 January 2017

Course website: http://www.highstat.com/statscourse.htm

Course flyer: 
http://highstat.com/Courses/Flyers/Flyer2017_01Montreal_RGG.pdf


Kind regards,

Alain Zuur


Other open courses in 2017:

Data exploration, regression, GLM & GAM with introduction to R. 13-17 
February 2017. Lisbon.
Introduction to Regression Models with Spatial and Temporal Correlation. 
20-24 February 2017. Lisbon.
Introduction to Regression Models with Spatial and Temporal Correlation. 
8-12 May 2017. Genoa.
Linear Mixed Effects Models and GLMM with R. Frequentist and Bayesian 
approaches. 9-13 October 2017. Trondheim.
Introduction to Regression Models with Spatial and Temporal Correlation. 
23-27 October 2017. Southampton


-- 
Dr. Alain F. Zuur

First author of:
1. Beginner's Guide to GAMM with R (2014).
2. Beginner's Guide to GLM and GLMM with R (2013).
3. Beginner's Guide to GAM with R (2012).
4. Zero Inflated Models and GLMM with R (2012).
5. A Beginner's Guide to R (2009).
6. Mixed effects models and extensions in ecology with R (2009).
7. Analysing Ecological Data (2007).

Highland Statistics Ltd.
9 St Clair Wynd
UK - AB41 6DZ Newburgh
Tel:   0044 1358 788177
Email: highs...@highstat.com
URL: www.highstat.com




[R] Having trouble with running ldBands in Hmisc package. Posted in StackExchange

2016-10-10 Thread Pitch Mandava via R-help
I am trying to run the example from the Hmisc package in the RStudio environment
under Windows 10 and downloaded ld98.exe.

> .libPaths()

produces the following output:

[1] "C:/Users/username/X1_Carbon/Documents/R/win-library/3.2"
[2] "C:/Program Files/R/R-3.2.5/library"

I moved ld98.exe to C:/Users/username/X1_Carbon/Documents/R/win-library/3.2,
then installed Hmisc and ran the following:

> require(Hmisc)
> b <- ldBands(5, pr=FALSE)

which produces the following:

Error: could not find function "ldBands"

To see whether ld98.exe works in that directory, I ran ld98.exe in the Windows
environment and got the following output:

Program for computations related to group sequential boundaries using spending functions.
Is this an interactive session? (1=yes,0=no)
yes interactive = 1
etc.
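
One possible first check, offered as a sketch rather than a diagnosis: the error says R cannot find the function itself, so it may be worth confirming whether the installed package version exports a function of that name at all. The helper has_export below is hypothetical and written only for this illustration:

```r
# Hypothetical helper: does package `pkg` export an object named `fun`?
# Returns NA if the package is not installed.
has_export <- function(pkg, fun) {
  if (!requireNamespace(pkg, quietly = TRUE)) return(NA)
  fun %in% getNamespaceExports(pkg)
}

has_export("stats", "median")    # a function that certainly exists
has_export("Hmisc", "ldBands")   # the function from the question
```

If the second call returns FALSE, the installed Hmisc does not provide ldBands, and the location of ld98.exe would not be the cause of the error.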


Re: [R] Extending sparklyr

2016-10-10 Thread Javier Luraschi
For versions 1.6.1 and 2.0.0 of Spark, the GaussianMixture is under the mllib
namespace, not ml; try this instead:

envir$model <- "org.apache.spark.mllib.clustering.GaussianMixture"

Best, Javier

On Sun, Oct 9, 2016 at 1:47 PM, Axel Urbiz  wrote:

> Hi All,
>
> Just started to experiment with "sparklyr" and already loving it.
>
> I'm trying to build an extension by constructing an R wrapper to Spark's
> Gaussian Mixtures. My attempt is below, and so is the error message. Not
> sure if this is possible to do, and if so, what is wrong with my code.
>
> Any hints would be much appreciated.
>
> Best,
> Axel.
>
> -
>
> library(sparklyr)
> library(dplyr)
> sc <- spark_connect(master = "local")
>
> x <- copy_to(sc, iris)
> x <- x %>% select(Petal_Width, Petal_Length)
>
> # set params
> k <- 3
> iter.max <- 100
> features <- dplyr::tbl_vars(x)
> compute.cost <- TRUE
> tolerance <- 1e-4
> ml.options <- ml_options()
>
> df <- spark_dataframe(x)
> sc <- spark_connection(df)
> df <- ml_prepare_features(
>   x = df,
>   features = features,
>   envir = environment()
>   # ml.options = ml.options
> )
> envir <- new.env(parent = emptyenv())
> envir$id <- ml.options$id.column
> df <- df %>%
>   sdf_with_unique_id(envir$id) %>%
>   spark_dataframe()
> tdf <- ml_prepare_dataframe(df, features, ml.options = ml.options, envir =
> envir)
> envir$model <- "org.apache.spark.ml.clustering.GaussianMixture"
> gmm <- invoke_new(sc, envir$model)
> >Error: failed to invoke spark command
> >16/10/09 16:35:35 ERROR  on org.apache.spark.ml.clustering.GaussianMixture failed
>
>



[R] Reply: Finding starting values for the parameters using nls() or nls2()

2016-10-10 Thread Pinglei Gao
Thank you very much for taking the time to look at this; your assistance is
much appreciated. I am afraid I still have a question, though.

I am working on a paper about the dispersal of weed seeds by harvesting
machines. After reviewing the relevant literature, I found three general
models for seed dispersal and retention. All models were fitted by nonlinear
least squares using the nls function in R. The model that best describes the
data will be chosen by comparing Akaike Information Criterion (AIC) values:
the model with the lowest AIC will be selected.

The first general model incorporated simple exponential and power
exponential functions, and its starting values were easy to find. But I am
stuck with model 2 (which was mentioned previously) and model 3, which has
the form:  Retention = (b0*Area^th+1)^b1. Model 3 is quite different from
the others. I tried the approaches that you mentioned, but I still can't
find suitable starting values because of my limited knowledge. I hope you
can help me again. I can send you the draft when I have finished the paper,
if necessary. Perhaps you could give me some constructive suggestions about
the statistics and model construction, and I could name you as a coauthor
for your contributions.

Best,

Pinglei Gao

-Original Message-
From: peter dalgaard [mailto:pda...@gmail.com]
Sent: 10 October 2016 7:41
To: Pinglei Gao
Cc: Andrew Robinson; R help (r-help@r-project.org); Bert Gunter
Subject: Re: [R] Finding starting values for the parameters using nls() or
nls2()


> On 10 Oct 2016, at 00:40 , Bert Gunter  wrote:
> 
> Well... (inline -- and I hope this isn't homework!)
> 

Pretty much same as I thought. 

Fixing th=0.02 in the grid search looks wrong. Bert's plot is pretty linear,
so th=1 is a good guesstimate. There's a slight curvature but to reduce it,
you would increase th, not decrease it. Running the regression, as Bert
suggests, indicates that b0=5.16 and b1= -0.00024 could work as reasonable
starting values. Notice that the grid search had "b1 = seq(0.01, 4, by =
0.01)" which is wrong in both sign and scale.
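
The regression step described above can be sketched as follows. This is only an illustration under stated assumptions: th is fixed at 1, the data are the 15 observations posted earlier in the thread, and the object names (cl, lin, start, fit) are chosen here for the example. Taking logs twice of exp(b0*exp(b1*Area)) gives log(b0) + b1*Area, so the intercept has to be exponentiated to get b0.

```r
# Data as posted earlier in this thread.
cl <- data.frame(
  Area = c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91, 989.05, 1646.32,
           2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 2629.2),
  Retention = c(95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04,
                23.46, 9.72, 97.92, 71.44, 44.52, 24.44, 15.26)
)

# With th fixed at 1, log(log(Retention)) = log(b0) + b1 * Area is linear.
lin <- lm(log(log(Retention)) ~ Area, data = cl)

# Back-transform the intercept to get b0; the slope is b1 directly.
start <- list(b0 = exp(coef(lin)[["(Intercept)"]]),
              b1 = coef(lin)[["Area"]])
print(start)

# Attempt the two-parameter nonlinear fit from those starting values.
# try() keeps the script running should nls() fail to converge.
fit <- try(nls(Retention ~ exp(b0 * exp(b1 * Area)), data = cl, start = start),
           silent = TRUE)
```

The starting values come from a fit on the transformed scale, so they are a heuristic starting point rather than the least-squares solution on the original scale.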

Andrew's suggestion of dividing Retention by 100 is tempting, since it looks
like a percentage, but that would make all Y values less than 1 and the
double exponential function as written has values that are always bigger
than 1. (It is conceivable that the model itself is wrong, though. E.g. it
could be that Retention on a scale from 0 to 1 could be modeled as
exp(-something), but we really have no idea of the context here.)

(If this was in fact homework, you should now go and write a proper
SelfStart initializer routine for this model. Even if it isn't homework, you
do need to study the text again, because you have clearly not understood how
self-starting models work.)

-pd

> 
> 
> 
> On Sun, Oct 9, 2016 at 3:05 PM, Andrew Robinson 
>  wrote:
>> Here are some things to try.  Maybe divide Area by 1000 and retention 
>> by 100.  Try plotting the data and superimposing the line that 
>> corresponds to the 'fit' from nls2.  See if you can correct it with 
>> some careful guesses.
>> 
>> Getting suitable starting parameters for non-linear modeling is one 
>> of the black arts of statistical fitting. ...
>> 
>> Andrew
> 
> True. But it's usually worthwhile thinking about the math a bit before
> guessing.
> 
> Note that the model can be linearized to:
> 
> log(log(Retention)) = log(b0) + b1*Area^th
> 
> So a plot of log(log(Retention)) vs Area may be informative and useful 
> for finding starting values, e.g., for a grid of th's, do linear 
> regression fits.
> 
> However, when I look at that plot, it seems pretty linear with a 
> negative slope. This suggests that you may have an overparametrization 
> problem, i.e. fix th = 1 and use the b0 and b1 from the above 
> regression for starting values.
> 
> Do note that this strategy isn't foolproof, as it ignores that the 
> error term is additive in the above transformed metric, rather than 
> the original. This can sometimes mislead. But this is just a 
> heuristic.
> 
> Cheers,
> Bert
> 
> 
> 
> 
> 
> 
> 
>> 
>> On 9 October 2016 at 22:21, Pinglei Gao  wrote:
>>> Hi,
>>> 
>>> I have some data that I'm trying to fit a double exponential model to:
>>> data.frame(Area=c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91,
>>> 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74,
>>> 2629.2),
>>> Retention=c(95.3, 87.18, 44.94, 26.36, 18.12, 84.68, 37.24, 33.04,
>>> 23.46, 9.72, 97.92, 71.44, 44.52, 24.44, 15.26)) and the formula of
>>> the double exponential is: exp(b0*exp(b1*x^th)).
>>> 
>>> 
>>> 
>>> I failed to guess the initial parameter values and then I learned a 
>>> method to find starting values from Nonlinear Regression with R (pp. 25-27):
>>> 
>>> cl<-data.frame(Area =c(521.5, 689.78, 1284.71, 2018.8, 2560.46, 524.91,
>>> 989.05, 1646.32, 2239.65, 2972.96, 478.54, 875.52, 1432.5, 2144.74, 
>>> 2629.2),
>>> 
>>> + Retention =c(95.3, 87.18, 44.

Re: [R] Finding starting values for the parameters using nls() or nls2()

2016-10-10 Thread Pinglei Gao
Thanks very much for your kind help. I ran your script, which produced a
lot of output, and I also studied the solution you posted. Forgive my
ignorance, but I still can't find suitable starting values. Did I
misunderstand something?

Best,

Pinglei Gao

-Original Message-
From: ProfJCNash [mailto:profjcn...@gmail.com]
Sent: 10 October 2016 10:41
To: Gabor Grothendieck; Pinglei Gao
Subject: Re: [R] Finding starting values for the parameters using nls() or
nls2()

I forgot to post the "solution" found by nlmrt:

nlmrt class object: x
residual sumsquares =  1086.8  on  15 observations
    after  5001    Jacobian and  6991 function evaluations
  name       coeff         SE    tstat   pval    gradient     JSingval
b0       5.3274e-14        NA      NA     NA   -6.614e+13    1.735e+16
b1       33.5574           NA      NA     NA   -3.466        11518
th       -0.00721203       NA      NA     NA   -740.8        0.004635


Note the singular values -- this is the worst SV(max)/SV(min) ratio I've
observed!
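
For scale, the ratio in question can be worked out directly from the JSingval column above. This is just the arithmetic, with the three singular values typed in by hand; the variable names are chosen here for illustration.

```r
# Singular values of the Jacobian as reported in the JSingval column.
sv <- c(1.735e16, 11518, 0.004635)

# Ratio of largest to smallest singular value (the condition number).
cond <- max(sv) / min(sv)
print(cond)

# Compare with the reciprocal of double-precision machine epsilon: a ratio
# above this means the Jacobian is singular to working precision.
print(cond > 1 / .Machine$double.eps)
```

A ratio this far beyond 1/.Machine$double.eps means the reported coefficients carry essentially no information about at least one direction in parameter space, which is consistent with the NA standard errors.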

JN


[R] Recoding lists of categories of a variable

2016-10-10 Thread MACDOUGALL Margaret
Hello

The R code
mydata$newvar[oldvar = "topic1"] <- "parenttopic"

is intended to recode the category 'topic 1' of the old variable 'oldvar' to a 
new category label 'parenttopic' by defining the new variable 'newvar'.

Is there a convenient way to edit this code to allow me to recode a list of 
categories 'topic 1', 'topic 9' and 'topic 14', say, of the old variable 
'oldvar' as 'parenttopic' by means of the new variable 'newvar', while also 
mapping system missing values to system missing values?

Thanks in advance

Best wishes
Margaret

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

Re: [R] Recoding lists of categories of a variable

2016-10-10 Thread S Ellison
> Is there a convenient way to edit this code to allow me to recode a list of
> categories 'topic 1', 'topic 9' and 'topic 14', say, of the old variable 'oldvar'
> as 'parenttopic' by means of the new variable 'newvar', while also mapping
> system missing values to system missing values?

You could look at 'recode()' in the car package.

There's a fair description of other options at 
http://www.uni-kiel.de/psychologie/rexrepos/posts/recode.html

S Ellison






[R] [R-pkgs] new package glm.predict

2016-10-10 Thread kontakt
Dear R users,

I'm pleased to announce that my first package has been accepted on CRAN.

https://cran.r-project.org/web/packages/glm.predict/

With glm.predict it is possible to calculate discrete changes with
confidence intervals for glm(), glm.nb(), polr() and multinom() models.

It is possible to calculate many discrete changes with just one line of
code. The output is a data.frame.

The functions calculate the confidence intervals by simulation, so the
results are only true asymptotically.

Comments and suggestions are welcome.

Best

Benjamin

--

Benjamin Schlegel

University of Zurich

Institute of Political Science
Affolternstrasse 56
8050 Zurich

kont...@benjaminschlegel.ch

+41 44 634 62 08



___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] pure TCL run a command within the R via the tcltk package?

2016-10-10 Thread Cleber N.Borges via R-help

thanks John Fox!  :-)
my solution (partial and temporary) was as follows.
cleber
( for the  r-help history file )

library( tcltk )
# from  ?.Tcl
f <- function()cat("HI!\n")
.Tcl.callback(f)

sink("simpletest.tcl")
cat('toplevel .t\n')
cat('button .t.b -text "but" -command { ',  .Tcl.callback(f), ' }\n' )
cat('checkbutton .t.c -text "ck1" -variable "chkvar"\n' )
cat('pack .t.b\n')
cat('pack .t.c\n')
sink()

#tcl('set', 'argc', '0') # for use with code generated by vTcl
#tcl('set', 'argv', '0') # for use with code generated by vTcl
tcl('source', "simpletest.tcl" )
tclvalue('chkvar')

unlink('simpletest.tcl')


On 10/10/2016 11:02, Fox, John wrote:

Dear Cleber,

See ?.Tcl

I hope this helps,
  John

-
John Fox, Professor
McMaster University
Hamilton, Ontario
Canada L8S 4M4
Web: socserv.mcmaster.ca/jfox





-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Cleber
N.Borges via R-help
Sent: October 9, 2016 9:00 PM
To: r-help@r-project.org
Subject: [R] pure TCL run a command within the R via the tcltk package?

Dear all,
is there any way for a button in pure Tcl to run a command within R via the
tcltk package?
Thank you in advance for any information.
cleber
What I have in mind is something like the following:
##

sink("simpletest.tcl")
cat('
toplevel .t
button .t.b -text "but" -command {"some tcltk command for push data into R"}
checkbutton .t.c -text "ck1" -variable "chkvar"
pack .t.b
pack .t.c '
)
sink()

library( tcltk )
#tcl('set', 'argc', '0')
#tcl('set', 'argv', '0')
tcl('source', "simpletest.tcl" )

  > tclvalue('chkvar')
[1] "1"
  > tclvalue('chkvar') # after click in screen
[1] "0"
  >

### after click in button  get error:

invalid command name "some tcltk command for push data into R"
invalid command name "some tcltk command for push data into R"
  while executing
""some tcltk command for push data into R""
  invoked from within
".t.b invoke"
  ("uplevel" body line 1)
  invoked from within
"uplevel #0 [list $w invoke]"
  (procedure "tk::ButtonUp" line 24)
  invoked from within
"tk::ButtonUp .t.b"
  (command bound to event)

##

  > sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7600)

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252

attached base packages:
[1] tcltk     stats     graphics  grDevices utils     datasets  methods
[8] base
  >




Re: [R] Having trouble with running ldBands in Hmisc package. Posted in StackExchange

2016-10-10 Thread David Winsemius

> On Oct 10, 2016, at 7:03 AM, Pitch Mandava via R-help  
> wrote:
> 
> I am trying to run the example from Hmisc package in RStudio environment
> under Windows 10 and downloaded ld98.exe
>
> > .libPaths()
>
> Produces the following output
>
> [1] "C:/Users/username/X1_Carbon/Documents/R/win-library/3.2"
> [2] "C:/Program Files/R/R-3.2.5/library"
>
> I moved the ld98.exe to C:/Users/username/X1_Carbon/Documents/R/win-library/3.2
> Then installed Hmisc and ran the following
>
> > require(Hmisc)
> > b <- ldBands(5, pr=FALSE)
>
> Produces the following
>
> Error: could not find function "ldBands"
>
> To see if ld98.exe is working in the directory I ran the ld98.exe in the
> Windows environment I get the following output
>
> Program for computations related to group sequential boundaries using spending functions.
> Is this an interactive session? (1=yes,0=no)
> yes interactive = 1
> etc.
>   [[alternative HTML version deleted]]
> 

You should check the NEWS file. That function was removed in Hmisc version 3.14
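
One way to confirm this from within R is sketched below. has_export is a hypothetical helper name made up for this illustration; requireNamespace(), getNamespaceExports() and news() are standard R functions. The Hmisc lines are commented out because they need Hmisc to be installed.

```r
# Hypothetical helper: TRUE if the installed package `pkg` exports an
# object called `fn`, FALSE otherwise (including when `pkg` is missing).
has_export <- function(pkg, fn) {
  requireNamespace(pkg, quietly = TRUE) && fn %in% getNamespaceExports(pkg)
}

# Sanity check against a package that ships with R.
print(has_export("stats", "lm"))

# With Hmisc installed, these show whether ldBands is still exported
# and let you browse the package NEWS:
# has_export("Hmisc", "ldBands")
# news(package = "Hmisc")
```

If the helper returns FALSE for an installed package, the "could not find function" error comes from the package itself and not from where ld98.exe was placed.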

-- 
David,



David Winsemius
Alameda, CA, USA



Re: [R] Having trouble with running ldBands in Hmisc package. Posted in StackExchange

2016-10-10 Thread Marc Schwartz

> On Oct 10, 2016, at 11:39 AM, David Winsemius  wrote:
> 
> 
> 
> You should check the NEWS file. That function was removed in Hmisc version 
> 3.14
> 
> -- 
> David,



Hi,

Just to augment David's reply with some additional direction, if you 
specifically need the Lan-DeMets methodology.

The Clinical Trials Task View:

  https://cran.r-project.org/web/views/ClinicalTrials.html 


provides names of other packages that provide for group sequential boundary 
calculations, although it looks like the entry for Hmisc needs to be edited 
there. I am cc'ing the CT TV maintainer here as a heads up.

Regards,

Marc Schwartz




Re: [R] Recoding lists of categories of a variable

2016-10-10 Thread David L Carlson
Your code suggests that you do not understand R or what you are doing. The line

mydata$newvar[oldvar = "topic1"] <- "parenttopic"

does not recode cases where oldvar is "topic1". Inside the brackets a single 
equals sign is not a comparison: R reads oldvar = "topic1" as a named argument, 
so the index is just the string "topic1" and mydata$oldvar is never consulted. 
Two equals signs are needed to create a logical expression. If newvar does not 
exist yet, the result is that all values of mydata$newvar become "parenttopic". 
You did not give us any data, so it is not clear if oldvar is a character 
variable or a factor. Assuming it is a factor try the following and then spend 
a few hours reading some tutorials on R:

> # Create reproducible data
> set.seed(42)
> mydata <- data.frame(oldvar=paste("topic", sample(1:20, 200, replace=TRUE)))
> str(mydata)
'data.frame':   200 obs. of  1 variable:
 $ oldvar: Factor w/ 20 levels "topic 1","topic 10",..: 11 11 17 9 5 3 7 14 6 7 ...
> # Note factor levels are ordered alphabetically. Fix that with
> mydata$oldvar <- factor(mydata$oldvar, levels=c(paste("topic", 1:20)))
> str(mydata)
'data.frame':   200 obs. of  1 variable:
 $ oldvar: Factor w/ 20 levels "topic 1","topic 2",..: 19 19 6 17 13 11 15 3 14 15 ...
> levels(mydata$oldvar)
 [1] "topic 1"  "topic 2"  "topic 3"  "topic 4"  "topic 5"  "topic 6"  "topic 7"  "topic 8"
 [9] "topic 9"  "topic 10" "topic 11" "topic 12" "topic 13" "topic 14" "topic 15" "topic 16"
[17] "topic 17" "topic 18" "topic 19" "topic 20"
> mydata$newvar <- mydata$oldvar
> levels(mydata$newvar)[c(1, 9, 14)] <- "parenttopic"
> table(mydata$newvar)

parenttopic     topic 2     topic 3     topic 4     topic 5     topic 6     topic 7     topic 8
         26           6          14          10           8           7           7          11
   topic 10    topic 11    topic 12    topic 13    topic 15    topic 16    topic 17    topic 18
          8          10           9          13          19          12          11           3
   topic 19    topic 20
         18           8


-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352




Re: [R] Recoding lists of categories of a variable

2016-10-10 Thread Bert Gunter
Well, I think that's kind of overkill.

Assuming "oldvar" is a factor in the data frame mydata, then the
following shows how to do it:

> set.seed(27)
> d <- data.frame(a = sample(c(letters[1:3],NA),15,replace = TRUE))
> d
      a
1  <NA>
2     a
3  <NA>
4     b
5     a
6     b
7     a
8     a
9     a
10    a
11    c
12 <NA>
13    c
14    c
15 <NA>


> d$b <- factor(d$a,labels = LETTERS[3:1])
> d
      a    b
1  <NA> <NA>
2     a    C
3  <NA> <NA>
4     b    B
5     a    C
6     b    B
7     a    C
8     a    C
9     a    C
10    a    C
11    c    A
12 <NA> <NA>
13    c    A
14    c    A
15 <NA> <NA>


See ?factor for details.

Incidentally note that in the OP's post,

mydata$newvar[oldvar = "topic1"] <- "parenttopic"

is completely incorrect; it should probably be:

mydata$newvar[mydata$oldvar == "topic1"] <- "parenttopic";

This suggests to me that the OP would probably find it useful to spend
some time with one or more of the many good R tutorials on the web.

Cheers,
Bert











Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )




Re: [R] OPeNDAP access / OPeNDAP subsetting with R

2016-10-10 Thread Debasish Pai Mazumder
Hi Roy,
Thanks for your help. It works perfectly. If I want to read multiple
files in a similar way, how do I do that?
with regards
-Deb



On Fri, Sep 30, 2016 at 5:53 PM, Roy Mendelssohn - NOAA Federal <
roy.mendelss...@noaa.gov> wrote:

> Hi Deb:
>
> > > gribfile <- 'http://thredds.ucar.edu/thredds/ncss/grib/NCEP/GFS/Global_0p5deg/best?north=47.0126&west=-114.841&east=-112.641&south=44.8534&time_start=present&time_duration=PT3H&accept=netcdf&var=v-component_of_wind_height_above_ground,u-component_of_wind_height_above_ground'
> > > download.file(gribfile,'junk.nc',mode = "wb")
> > trying URL 'http://thredds.ucar.edu/thredds/ncss/grib/NCEP/GFS/Global_0p5deg/best?north=47.0126&west=-114.841&east=-112.641&south=44.8534&time_start=present&time_duration=PT3H&accept=netcdf&var=v-component_of_wind_height_above_ground,u-component_of_wind_height_above_ground'
> > Content type 'application/x-netcdf' length unknown
> > 
> > downloaded 4360 bytes
> >
> > > library(ncdf4)
> > > junkFile <- nc_open('junk.nc')
> > > str(junkFile)
> > List of 14
> >  $ filename   : chr "junk.nc"
> >  $ writable   : logi FALSE
> >  $ id : int 65536
> >  $ safemode   : logi FALSE
> >  $ format : chr "NC_FORMAT_CLASSIC"
> >  $ is_GMT : logi FALSE
> >  $ groups :List of 1
> >   ..$ :List of 7
> >   .. ..$ id   : int 65536
> >   .. ..$ name : chr ""
> >   .. ..$ ndims: int 4
> >   .. ..$ nvars: int 7
> >   .. ..$ natts: int 13
> >   .. ..$ dimid: int [1:4(1d)] 0 1 2 3
> >   .. ..$ fqgn : chr ""
> >   .. ..- attr(*, "class")= chr "ncgroup4"
> >  $ fqgn2Rindex:List of 1
> >   ..$ : int 1
> >  $ ndims  : num 4
> >  $ natts  : num 13
> >  $ dim:List of 4
> >
>
> 
>
> I cut off the rest as that is not important for your question.
>
> HTH,
>
> -Roy
>
> > On Sep 30, 2016, at 4:21 PM, Debasish Pai Mazumder 
> wrote:
> >
> > Hi
> > Now I am using netcdfSubset and I am able to download the file but not
> sure how to read the files. here my scripts
> > library("ncdf4")
> >
> > gribfile<-"http://thredds.ucar.edu/thredds/ncss/grib/NCEP/GFS/Pacific_40km/best/dataset.html"
> > download.file(gribfile,basename(gribfile),mode = "wb")
> > x<-nc_open(gribfile)
> >
> > gribfile<-"http://thredds.ucar.edu/thredds/ncss/grib/NCEP/GFS/Global_0p5deg/best?north=47.0126&west=-114.841&east=-112.641&south=44.8534&time_start=present&time_duration=PT3H&accept=netcdf&var=v-component_of_wind_height_above_ground,u-component_of_wind_height_above_ground"
> > download.file(gribfile,basename(gribfile),mode = "wb")
> > x<-nc_open(gribfile)
> >
> >
> > nc_open doesn't work.
> >
> > which command should I use?
> >
> > with regards
> > -Deb
> >
> >
> > On Tue, Sep 27, 2016 at 9:30 PM, Michael Sumner 
> wrote:
> > Opendap won't work on Windows CRAN build of ncdf4, though the rgdal
> build does work directly on grib.
> >
> > Summary: download the files wholus for use on Windows, or set your own
> system on Linux.
> >
> > Building ncdf4 on Windows is not too hard if you know about doing that.
> >
> > Cheers, Mike
> >
> > On Wed, 28 Sep 2016, 06:49 Roy Mendelssohn - NOAA Federal <
> roy.mendelss...@noaa.gov> wrote:
> > Please post the code of what you tried, as I have no idea otherwise what
> did or did not work for you.
> >
> > -Roy
> >
> > > On Sep 27, 2016, at 12:44 PM, Debasish Pai Mazumder 
> wrote:
> > >
> > > Hi Roy,
> > > Thanks for your response. I have tried according your suggestion but
> it doesn't work.
> > > the OPeNDAP link of the data
> > > http://nomads.ncdc.noaa.gov/thredds/dodsC/modeldata/cfsv2_forecast_ts_9mon/2014/201404/20140403/2014040312/
> > >
> > > datafile:
> > > tmax.01.2014040312.daily.grb2
> > >
> > > Thanks
> > > -Deb
> > >
> > > On Tue, Sep 27, 2016 at 11:51 AM, Roy Mendelssohn - NOAA Federal <
> roy.mendelss...@noaa.gov> wrote:
> > > Look at the package ncdf4.  You can use an OPeNDAP URL in place of the
> file name to perform subsets.,
> > >
> > > -Roy
> > >
> > > > On Sep 27, 2016, at 9:06 AM, Debasish Pai Mazumder <
> pai1...@gmail.com> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > I would like to access and subset following OpeNDAP files.
> > > > server:
> > > > http://nomads.ncdc.noaa.gov/thredds/dodsC/modeldata/cfsv2_forecast_ts_9mon/2014/201404/20140403/2014040312/
> > > >
> > > > file name: tmax.01.2014040312.daily.grb2
> > > >  forecast_ts_9mon/2014/201404/20140403/2014040312/catalog.
> html?dataset=modeldata/cfsv2_forecast_ts_9mon/2014/201404/
> 20140403/2014040312/tmax.01.2014040312.daily.grb2>
> > > > I would like to access and subset the file. Any help will be
> appreciated.
> > > >
> > > > with regards
> > > > -Deb
> > > >

Re: [R] Recoding lists of categories of a variable

2016-10-10 Thread Fox, John
Dear Margaret,

You've had one suggestion of an alternative for recoding variables, but in 
addition your code is in error (see below).

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> MACDOUGALL Margaret
> Sent: Monday, October 10, 2016 10:56 AM
> To: r-help@r-project.org
> Subject: [R] Recoding lists of categories of a variable
> 
> Hello
> 
> The R code
> mydata$newvar[oldvar = "topic1"] <- "parenttopic"

That should be

   mydata$newvar[oldvar == "topic1"] <- "parenttopic"

Moreover, the code assumes that oldvar is visible, which may not be the case if 
it lives in mydata and mydata isn't attach()ed.
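
For the original question (several categories mapped to one label, with missing values left missing), a minimal sketch along these lines may help. The small data frame and the name to_merge are made up for illustration, and oldvar is assumed to be a character column; a factor would need levels() instead.

```r
# Hypothetical example data: topic labels with one missing value.
mydata <- data.frame(
  oldvar = c("topic 1", "topic 2", NA, "topic 9", "topic 14", "topic 3"),
  stringsAsFactors = FALSE
)

# Categories to collapse into the parent label.
to_merge <- c("topic 1", "topic 9", "topic 14")

# Start from a copy, then overwrite only the matching rows.
# %in% returns FALSE (never NA) for missing values, so NAs stay NA.
mydata$newvar <- mydata$oldvar
mydata$newvar[mydata$oldvar %in% to_merge] <- "parenttopic"

print(mydata)
```

Referring to mydata$oldvar explicitly avoids depending on whether oldvar happens to be visible on the search path.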

Best,
 John

--
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
Web: socserv.mcmaster.ca/jfox

> 
> is intended to recode the category 'topic 1' of the old variable
> 'oldvar' to a new category label 'parenttopic' by defining the new variable
> 'newvar'.
> 
> Is there a convenient way to edit this code to allow me to recode a list
> of categories 'topic 1', 'topic 9' and 'topic 14', say, of the old
> variable 'oldvar' as 'parenttopic' by means of the new variable
> 'newvar', while also mapping system missing values to system missing
> values?
> 
> Thanks in advance
> 
> Best wishes
> Margaret



Re: [R] Recoding lists of categories of a variable

2016-10-10 Thread MACDOUGALL Margaret
Thank you for the valued suggestions in response to my query.

Margaret


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.




Re: [R] Recoding lists of categories of a variable

2016-10-10 Thread Jim Lemon
Hi Margaret,
This may be a misunderstanding of your request, but what about:

mydata<-data.frame(oldvar=paste("topic",sample(1:9,20,TRUE),sep=""))
mydata$newvar<-gsub("topic.","parenttopic",mydata$oldvar)

Jim




Re: [R] Loop to check for large dataset

2016-10-10 Thread Adrian Dușa
Granted, there are better solutions than my "KISS" (keep it simple and
stupid) example.

Hopefully, Christoph will have learned from both.
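
As a small sketch of how the two approaches agree, here is a toy data set
with 2 stores, 2 products and 5 weeks (far smaller than the real one),
using the column names S, P and W from the example quoted below. The toy
data and the two deleted rows are made up for illustration.

```r
# Toy data: every store/product/week combination, then drop two rows
toy <- expand.grid(S = 1:2, P = 1:2, W = 1:5)
toy <- toy[!(toy$S == 1 & toy$P == 2 & toy$W == 3), ]
toy <- toy[!(toy$S == 2 & toy$P == 1 & toy$W == 5), ]

# Loop approach: compare the expected weeks with the observed ones
by_loop <- matrix(nrow = 0, ncol = 3)
for (i in 1:2) {
  for (j in 1:2) {
    miss <- setdiff(1:5, toy$W[toy$S == i & toy$P == j])
    if (length(miss) > 0) {
      by_loop <- rbind(by_loop, cbind(S = i, P = j, W = miss))
    }
  }
}

# Table approach: cells with a zero count are the missing combinations
counts <- xtabs(~ W + P + S, data = toy)
by_table <- which(counts == 0, arr.ind = TRUE)

by_loop
by_table
```

Note that which(..., arr.ind = TRUE) returns positions in the table, not
the labels. They coincide with the actual week, product and store numbers
here only because every value from 1 upwards occurs at least once.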

Best,
Adrian

On 10 Oct 2016 13:44, "PIKAL Petr"  wrote:

> Hi
>
> Given this example data, you can get the same answer with less typing and
> without loops.
>
> res<-xtabs(~W+P+S,mydata)
> res1<-which(res==0, arr.ind=TRUE)
> head(res1)
>       W P S
> 10   10 1 1
> 11   11 1 1
> 82   82 1 1
> 100 100 1 1
> 117 117 1 1
> 148 148 1 1
>
> Cheers
> Petr
>
>
> *From:* dusa.adr...@gmail.com [mailto:dusa.adr...@gmail.com] *On Behalf
> Of *Adrian Dușa
> *Sent:* Monday, October 10, 2016 12:26 PM
> *To:* Christoph Puschmann 
> *Cc:* r-help@r-project.org; PIKAL Petr 
> *Subject:* Re: [R] Loop to check for large dataset
>
>
>
> This is an example of what reproducible code looks like, assuming you
> have three columns in your dataset named S (store), P (product) and W
> (week), and also assuming they have integer values from 1 to 19, 1 to 22
> and 1 to 157 respectively:
>
> #
>
> mydata <- expand.grid(seq(19), seq(22), seq(157))
> names(mydata) <- c("S", "P", "W")
>
> # randomly delete 65626 - 63127 = 2499 rows
> set.seed(12345) # make it replicable
>
> mydata <- mydata[-sample(seq(nrow(mydata)), nrow(mydata) - 63127), ]
>
> #
>
>
> Now the dataframe mydata contains exactly 63127 rows, just as in your
> case. The task is to find which weeks are missing, from which store and for
> which product.
>
> Below is one possible way to do that. Given you have a small number of
> stores and products, I'll keep it simple and stupid, by using for loops:
>
> #
>
> result <- matrix(nrow = 0, ncol = 3)
>
> for (i in seq(19)) {
>     for (j in seq(22)) {
>         miss <- setdiff(seq(157), mydata$W[mydata$S == i & mydata$P == j])
>         if (length(miss) > 0) {
>             result <- rbind(result, cbind(S = i, P = j, W = miss))
>         }
>     }
> }
>
> # The result matrix contains 2499 rows that are missing.
>
> > head(result)
>      S P   W
> [1,] 1 1  10
> [2,] 1 1  11
> [3,] 1 1  82
> [4,] 1 1 100
> [5,] 1 1 117
> [6,] 1 1 148
>
> #
>
> In this example, for S(tore) number 1 and P(roduct) number 1, you are
> missing W(eek) 10, 11, 82 and so on.
>
> Hoping you can adapt this code to your particular example,
> Adrian
>
> On Sun, Oct 9, 2016 at 2:26 AM, Christoph Puschmann <
> c.puschm...@student.unsw.edu.au> wrote:
> >
> > Dear Adrian,
> >
> > Yes it is a cyclical data set and theoretically it should repeat this
> > interval until 61327. The data set itself is divided into 2 Parts:
> > 1. Product category (column 10)
> > 2. Number of Stores Participating (column 01)
> > Overall there are 22 different products and in each you have 19
> > different stores participating. And theoretically each store over each
> > product category should have a 1 - 157 week interval.
> >
> > The part I am struggling with is how do I run a loop over the whole data
> > set, while checking if all stores participated 157 weeks over the different
> > products.
> >
> > So far I came up with this:
> >
> > n=61327   # Generate Matrix to check for values
> > Control = matrix(
> >   0,
> >   nrow = n,
> >   ncol = 1)
> >
> > s <- seq(from =1 , to = 157, by = 1)
> > CW = matrix(
> >   s,
> >   nrow = 157,
> >   ncol = 1
> > )
> >
> > colnames(CW)[1] <- 's'
> >
> > CW = as.data.frame(CW)
> >
> > for (i in 1:nrow(FD)) {   # Let it run through all the rows
> >   for (j in 1:157) {
> >     if(FD$WEEk[j] == C$s[j]) {
> >       Control[i] = 1 # corresponding control row = 1
> >     } else {
> >       Control[i] = 0 # corresponding control row = 0
> >     }
> >   }
> > }
> >
> > I coded an MRE and attached a sample of my data set.
> >
> > MRE:
> >
> > #MRE
> >
> > dat <- data.frame(
> >   Store = c(rep(8, times = 157), rep(12, times = 157)),  # Number of stores
> >   WEEK = rep(seq(from=1, to = 157, by = 1), times = 2)
> > )
> >
>
> --
> Adrian Dusa
> University of Bucharest
> Romanian Social Data Archive
> Soseaua Panduri nr.90
> 050663 Bucharest sector 5
> Romania
>

Re: [R] Recoding lists of categories of a variable

2016-10-10 Thread S Ellison
> Well, I think that's kind of overkill.
Depends whether you want to recode all or some, and how robust you want the 
answer to be. 
recode() allows you to recode a few levels of many, without dependence on level 
ordering; that's kind of neat. 

tbh, though, I don't use recode() a lot; I generally find myself needing to 
change a fair proportion of level labels. 

But I do get nervous about relying on specific ordering; it can break without 
visible warning if the data change (eg if you lose a factor level with a 
slightly different data set, integer indexing will give you apparently valid 
reassignment to the wrong new codes).  So I tend to go via named vectors even 
if it costs me a lot of typing. For example to change 
lcase<-c('a', 'b', 'c') 

to c('B', 'A', 'C') I'll use something like 

c(a='B', b='A', c='C')[lcase] 

or, if lcase were a factor, 
c(a='B', b='A', c='C')[as.character(lcase)] 

Unlike using the numeric levels, that doesn't fail if some of the levels I 
expect are absent; it only fails (and does so visibly) when there's a value in 
there that I haven't assigned a coding to. So it's a tad more robust.
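
A minimal sketch of that behaviour, with a made-up recoding table and made-up
data (the names recode_map, x1 and x2 are only for illustration). The final
check is an optional extra, not part of the approach described above:

```r
# Hypothetical recoding table: names are old codes, values are new codes
recode_map <- c(a = "B", b = "A", c = "C")

# Still works when an expected level ("c") is absent from the data
x1 <- factor(c("a", "b", "a"))
new1 <- unname(recode_map[as.character(x1)])
new1

# A value without an assigned coding ("d") shows up as NA
x2 <- c("a", "d", "c")
new2 <- unname(recode_map[x2])
new2

# Optional explicit check, so the problem cannot go unnoticed
unmapped <- setdiff(unique(x2), names(recode_map))
if (length(unmapped) > 0) {
  message("No coding assigned for: ", paste(unmapped, collapse = ", "))
}
```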

Steve E








[R] Question about ggplot2 symbol and legend change

2016-10-10 Thread Luis Fernando García
Dear R experts,

Maybe my question is too basic, and I apologize for that. I am having
trouble changing the symbols of the series manually. I need to set them
myself, instead of using the symbols that R gives by default, and produce
a plot with the classic style. For example, I need to use the symbols 16
and 2, but I have been unable to do so. I also need to remove the grey
background from the series, but I have been unable to do that either.

Any help you can provide will be really helpful.

Below, I am providing the script as well as the picture I got with it. If
necessary, I have added the dataset.

Many thanks


#


it<-read.table("immotime.txt",header=TRUE)
it
str(it)
names(it)
fit3<-lm(Time ~ Sp*Ratio, data=it)
anova(fit3)
plot(fit3)
summary(fit3)
# a$lPeso <- log(Peso)  # leftover line: 'a' and 'Peso' are not defined in this script
library(ggplot2)
p <- ggplot(it, aes(x=Ratio, y=Time)) + geom_point(aes(shape=factor(Sp)))
p <- p + geom_smooth(aes(linetype=factor(Sp)), colour="black", method='lm',
                     se=FALSE) +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.line = element_line(colour = "black"))
p



#

Plot: https://postimg.org/image/3vm2uleip/
dataset "it" http://textuploader.com/d593h
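
For reference, one way the requested changes might be made is sketched
below. The data frame it_demo is made up (the real immotime.txt is not
reproduced here); only the column names Sp, Ratio and Time are taken from
the script above. scale_shape_manual() sets the symbols by hand, and it is
an assumption that the remaining grey background is the legend key, which
legend.key = element_blank() removes.

```r
library(ggplot2)

# Made-up stand-in for the "it" data set: two species, ten ratios each
set.seed(1)
it_demo <- data.frame(
  Sp = rep(c("A", "B"), each = 10),
  Ratio = rep(seq(0.5, 5, by = 0.5), times = 2)
)
it_demo$Time <- 10 + 3 * it_demo$Ratio + rnorm(20)

p <- ggplot(it_demo, aes(x = Ratio, y = Time)) +
  geom_point(aes(shape = factor(Sp))) +
  geom_smooth(aes(linetype = factor(Sp)), colour = "black",
              method = "lm", formula = y ~ x, se = FALSE) +
  scale_shape_manual(values = c(16, 2)) +  # filled circle, open triangle
  theme_classic() +                        # white panel, black axis lines
  theme(legend.key = element_blank())      # no grey boxes behind the keys
p
```

The values in scale_shape_manual() are assigned in the order of the factor
levels, so the first level of Sp gets symbol 16 and the second gets symbol 2.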



Re: [R] Recoding lists of categories of a variable

2016-10-10 Thread Bert Gunter
Still overkill, I believe.


" Unlike using the numeric levels, that doesn't fail if some of the
levels I expect are absent; it only fails (and does so visibly) when
there's a value in there that I haven't assigned a coding to. So it's
a tad more robust. "


If you are concerned about missing levels -- which I agree is
legitimate -- then the following simple modification works (for
**factors** of course):

> d <- factor(letters[1:2],levels= letters[1:3])
> d
[1] a b
Levels: a b c
> f <- factor(d,levels = levels(d), labels = LETTERS[3:1])
> f
[1] C B
Levels: C B A

## No levels lost !

Does that allay your concerns?
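
Coming back to the original question, the same factor-based idea can be
sketched for merging several categories into one. The data below are made
up, and the category names are taken from Margaret's post; this relies on
the fact that assigning the same label to several levels merges them.

```r
# Hypothetical factor with one missing value
f <- factor(c("topic1", "topic2", NA, "topic9", "topic14"))

# Categories to be merged into the parent category
targets <- c("topic1", "topic9", "topic14")

# Giving several levels the same label merges them into one level
levels(f)[levels(f) %in% targets] <- "parenttopic"

f
levels(f)
```

Missing values are not levels, so they are left untouched by this.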

Cheers,
Bert
