Hello,

a) Thanks a lot for the answer relative to the dismissed coefficients. However, the result below has been obtained in the R console and there was no warning (please, see the full display below (***) if you wish).


"yi ~ X$V1 + X$V2 + X$V3 + X$V4 + X$V5 + X$V6 + X$V7 + X$V8 + X$V9 + X$V10"
               Estimate   Std. Error       t value Pr(>|t|)
(Intercept)  0.018062134 5.624517e-17  3.211322e+14        0
X$V1        -0.011104627 3.084989e-17 -3.599567e+14        0
X$V2        -0.018536614 3.241635e-17 -5.718291e+14        0
X$V3         0.107341157 4.884358e-17  2.197651e+15        0
X$V4         0.003258493 3.286878e-17  9.913643e+13        0
X$V6        -0.051599957 4.203840e-17 -1.227448e+15        0
X$V8        -0.057207683 3.049835e-17 -1.875763e+15        0
X$V10        0.134643229 3.849911e-17  3.497308e+15        0

b) Would it be wise to check if the predictors are colinear, prior to running the glm function ? How could it be performed, using the R language?

In additoon, in the present case, I would like to check for colinearity, because I am puzzled by the absence of warning inthe R console.

c) I now answer the question relative to the very low std errors and the way I generated the simulated data :

I ran the following code :

*****************
drawRegressionCoefficientsAndUpdateY <- function (target,
                                                  marks,
                                                  nbInd){
# X (individuals x marks)
# y (nbTargets x individuals)


 l <- length(marks)
 sigma <- runif(1,0.03,0.08)
# Values close to 0 are excluded from c0 simulation interval so that the effect of the predictors are detectable.
 c01    <- runif(1,-2,-0.5)
 c02    <- runif(1,0.5,2)
 i <- runif(1,1:2)
 if (i==1) c0 <- c01 else c0 <- c02

 coefficients <- c()
 for(r in 1:l){
  coefficient <- runif(1,0.2+sigma,1+sigma)
  coefficients <- c(coefficients, coefficient)
 }

 // TILL THAT POINT, I RELIED ON THE PROCESS DESCRIBED IN A THESIS.
 // THERE WAS NO INDICATION RELATIVE TO THE "ADJUSTMENT" OF EPSILON.
 epsilon <- rnorm(1, mean = 0, sd = 0.002)
 for (ind in 1:nbInd){
  y[target,ind]<<- c0 + coefficients %*% X[ind,SNPs] + epsilon
 }

Thank you in advance for your kind help.

C.S.
------------------------------------------------
------------------------------------------------

"yi ~ X$V1 + X$V2 + X$V3 + X$V4 + X$V5 + X$V6 + X$V7 + X$V8 + X$V9 + X$V10"
               Estimate   Std. Error       t value Pr(>|t|)
(Intercept)  0.018062134 5.624517e-17  3.211322e+14        0
X$V1        -0.011104627 3.084989e-17 -3.599567e+14        0
X$V2        -0.018536614 3.241635e-17 -5.718291e+14        0
X$V3         0.107341157 4.884358e-17  2.197651e+15        0
X$V4         0.003258493 3.286878e-17  9.913643e+13        0
X$V6        -0.051599957 4.203840e-17 -1.227448e+15        0
X$V8        -0.057207683 3.049835e-17 -1.875763e+15        0
X$V10        0.134643229 3.849911e-17  3.497308e+15        0


I am sure to have regressed the right number of variables, since I check
that the formula is correct:
"yi ~ X$V1 + X$V2 + X$V3 + X$V4 + X$V5 + X$V6 + X$V7 + X$V8 + X$V9 + X$V10"

Could somebody explain to me
1) why there are mismatches between the "true" coefficients for
predictors 4 and 6
and
2) why there is no information edited for predictors 5, 7 and 9 ?

Thanks in advance for your kind help.

C.S.
------------------------------------------------
------------------------------------------------
(***)
Full console display ;

> inputoutputdir="/home/sinoquet/recherche/mcmc/output/sim_sat_02_10_10/"
>  inputfilesnps="snps.txt"
>  inputfilepheno="pheno.txt"
>
> X <<- read.table(file=paste(inputoutputdir,inputfilesnps,sep=""),h=F) # data.frame > #genotype file (individuals x SNPs); code: 0/1/2 : number of minor alleles
>
> y <<- read.table(file=paste(inputoutputdir,inputfilepheno,sep=""),h=F) # data.frame
>
> #fit <- glm(yi ~ X$V1 + X$V2 + X$V3 + X$V4 + X$V5 + X$V6 + X$V7 + X$V8 + X$V9 + X$V10, family = gaussian)
>  #coefficients( summary(fit))
>
>  #*****************************
>  # BEGIN check real solution:
>  target <- 1
>  inputfilepredictors="predictor_descript.txt"
> f0 <- file(paste(inputoutputdir,inputfilepredictors,sep=""), open = "r")
>  # format of file f0:
>  #target <predictors>
>  #1 1 2 3 4 5 6 7 8 9 10
>  line <- readLines(f0, n = 1) # header
>  line <- readLines(f0, n = 1)
>  v <- unlist(strsplit(line, split=" "))
>  predictorsForThisTarget <- v[-1] # dismiss target number
>  print(paste("!",v,"!",sep=" "))
[1] "! 1 !" "! 1 !" "! 2 !" "! 3 !" "! 4 !" "! 5 !" "! 6 !" "! 7 !"
 [9] "! 8 !"  "! 9 !"  "! 10 !"
>
>  nbPredictors <- length(predictorsForThisTarget)
>  ch<-paste("X$V",predictorsForThisTarget[1],sep="")
>  for (i in 2:nbPredictors){
+   item<-paste("X$V",predictorsForThisTarget[i],sep="")
+   ch<-paste(ch,item,sep=" + ")
+  }
>
>  formulaString <- paste("yi",ch,sep=" ~ ")
>  print(formulaString)
[1] "yi ~ X$V1 + X$V2 + X$V3 + X$V4 + X$V5 + X$V6 + X$V7 + X$V8 + X$V9 + X$V10"
>  formula <- as.formula(formulaString)
>  yi <- unlist(y[target,])
>  fit <- glm(formula, family = gaussian)
>  print("*************BEGIN SOLUTION****************")
[1] "*************BEGIN SOLUTION****************"
>  print(coefficients(summary(fit)))
                Estimate   Std. Error       t value Pr(>|t|)
(Intercept)  0.018062134 5.624517e-17  3.211322e+14        0
X$V1        -0.011104627 3.084989e-17 -3.599567e+14        0
X$V2        -0.018536614 3.241635e-17 -5.718291e+14        0
X$V3         0.107341157 4.884358e-17  2.197651e+15        0
X$V4         0.003258493 3.286878e-17  9.913643e+13        0
X$V6        -0.051599957 4.203840e-17 -1.227448e+15        0
X$V8        -0.057207683 3.049835e-17 -1.875763e+15        0
X$V10        0.134643229 3.849911e-17  3.497308e+15        0
>  print("*************END SOLUTION****************")
[1] "*************END SOLUTION****************"
>  # END check real solution:
>  #*****************************

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to