On Mon, Jul 13, 2009 at 4:26 PM, saurav pathak <pathak.sau...@gmail.com> wrote:
> I am using R 2.9.1,

That's good!

> I am not sure about the version of sampleSelection and maxLik

This is important! Please check the version numbers, e.g. with
R> help(package="maxLik")
R> help(package="sampleSelection")

BTW: Did you install the development version of the maxLik package
from R-Forge? If yes, please use the stable version that is available
on CRAN.

> Let me explain. In my data the DV used as 's' in the formula has some
> missing values, which can lead to bias in our case, so I used
>
> adpopdata$s <- ifelse(is.na(ln_oy5_1),0,1)
>
> i.e. I convert all the missing values for the variable ln_oy5_1 to 0 and all
> non-missings to 1. So is the source of the missing values the IVs used in
> the following?
>
> myProbit <- glm(s ~ age + gender + gemedu + gemhinc + es_gdppc +
>     imf_pop + estbbo_m, family = binomial(link = "probit"))
>
> So I don't know where the missing values are coming from. Can you suggest
> how to correct it?

They might come from the explanatory variables. Please check, e.g. with
R> sum( is.na( adpopdata$s ) )
R> sum( is.na( adpopdata$age ) )
R> sum( is.na( adpopdata$gender ) )
...
R> sum( is.na( adpopdata$estbbo_m ) )

Please "reply to all" (i.e. including R-help) so that others who will
have similar questions and problems in the future could benefit from
our discussion.

Arne

> On Mon, Jul 13, 2009 at 3:09 PM, Arne Henningsen
> <arne.henning...@googlemail.com> wrote:
>>
>> On Mon, Jul 13, 2009 at 11:18 AM, Pathak,
>> Saurav <s.patha...@imperial.ac.uk> wrote:
>> > Dear Arne
>> > I have gone through the paper and I have tried it at my end, I would
>> > really appreciate if you could address the following:
>> >
>> > 1. Based upon your suggestion I used the following:
>> >
>> > regmod2 <- selection(s ~ age + gender + gemedu + gemhinc + es_gdppc +
>> >     imf_pop + estbbo_m, ln_oy5_1 ~ age + gender + fearfail + gemedu,
>> >     adpopdata, method = "2step")
>> > On trying the above (notice that I have changed "heckit" to "selection"
>> > in the above command), I get the following error message
>> >
>> > Error in coef.probit(result$probit) :
>> >   could not find function "coef.maxLik"
>>
>> That's weird. Which versions of R, sampleSelection, and maxLik do you use?
>>
>> > Before trying the above I tried the following:
>> >
>> > 2. When I tried to do the Heckman selection model in stages, the first
>> > run was successful, I mean, using the following:
>> >
>> > myProbit <- glm(s ~ age + gender + gemedu + gemhinc + es_gdppc +
>> >     imf_pop + estbbo_m, family = binomial(link = "probit"))
>> > > summary(myProbit)
>> >
>> > I am successful up to this point, but
>> >
>> > 3. When I try calculating the IMR using the following:
>> > adpopdata$IMR <- invMillsRatio(myProbit)$IMR1
>> >
>> > I get the error below
>> > Error in `$<-.data.frame`(`*tmp*`, "IMR", value = c(2.50039945424535, :
>> >   replacement has 257358 rows, data has 343251
>>
>> I guess that you have some NAs in the data so that you have the IMRs
>> not for all observations but only for the observations without NAs.
>>
>> R> myIMRs <- invMillsRatio(myProbit)$IMR1
>> should work.
>>
>> > Is there a code to calculate IMR by hand??
>>
>> Yes, inside invMillsRatio()
>> However, why do you want to do this?
>>
>> > what I see is that the number of rows of IMR calculated and the number
>> > of rows in the actual data set do not match (may be some missing
>> > value issues, I am not sure, if it is, how to fix it?) and hence IMR
>> > could not be added to my original data set, how do I fix this and then
>> > proceed to get correct IMR to use in my outcome equation (the OLS stage)
>> >
>> > This is really taking a lot of time, I am working on it for weeks, can
>> > you please help me kindly, If you wish I can send you the data set as
>> > well
>>
>> Please try to fix it yourself.
>>
>> Arne
>>
>> >
>> > -----Original Message-----
>> > From: Arne Henningsen [mailto:arne.henning...@googlemail.com]
>> > Sent: 13 July 2009 00:56
>> > To: Pathak, Saurav; r-help@r-project.org; otoo...@ut.ee
>> > Subject: Re: Heckman Selection MOdel Help in R
>> >
>> > Hi Saurav!
>> >
>> > On Sun, Jul 12, 2009 at 6:06 PM, Pathak,
>> > Saurav <s.patha...@imperial.ac.uk> wrote:
>> >> I am new to R, I have to do a 2 step Heckman model, my selection
>> >> equation is below which I was successful in running but I am unable
>> >> to proceed further,
>> >>
>> >> I have so far used the following command
>> >>
>> >> glm(formula = s ~ age + gender + gemedu + gemhinc + es_gdppc +
>> >>     imf_pop + estbbo_m, family = binomial(link = "probit"))
>> >>
>> >> My question is
>> >> 1. How do I discard the non-significant selection variables (one out
>> >> of the seven variables above is non-significant) and calculate the
>> >> Inverse Mills Ratio of the significant variables
>> >>
>> >> 2. I need the inverse Mills ratio from the above to run the outcome
>> >> equation model using OLS with the inverse Mills ratio calculated on
>> >> the basis of the above probit as the control in my outcome equation,
>> >> hence I need to get the IMR (Is there another direct way?)
>> >>
>> >> 3. How can this be done in R using my concept or otherwise does there
>> >> exist another way of doing what I wish to achieve
>> >>
>> >> On trying
>> >>
>> >> regmod <- heckit(s ~ age + gender + gemedu + gemhinc + es_gdppc +
>> >>     imf_pop + estbbo_m, ln_oy5_1 ~ age + gender + fearfail + gemedu,
>> >>     adpopdata, method = "2step")
>> >>
>> >> I get
>> >>
>> >> Error: could not find function "heckit"
>> >>
>> >> Error: could not find function "invMillsRatio"
>> >>
>> >> Am I missing out something, do I have to install something apart from
>> >> R also, so far I have used
>> >>
>> >> install.packages( "sampleSelection",
>> >>     repos="http://R-Forge.R-project.org" )
>> >>
>> >> install.packages("Rcmdr", dependencies=TRUE)
>> >>
>> >> Even then I am unable to run heckit, please help
>> >
>> > You have to install (only once) and *load* the package before you can
>> > use it:
>> > R> library( "sampleSelection" )
>> >
>> > I suggest that you do NOT use function "heckit" but function
>> > "selection", because the latter allows you to estimate the model both
>> > by the 2-step and the 1-step (ML) method (depending on argument
>> > "method").
>> >
>> > Our paper about the sampleSelection package published in the JSS could
>> > be also helpful for you:
>> > http://www.jstatsoft.org/v27/i07/
>> >
>> > Arne
>> >
>> > --
>> > Arne Henningsen
>> > http://www.arne-henningsen.name
>> >
>>
>>
>>
>> --
>> Arne Henningsen
>> http://www.arne-henningsen.name
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Dr.Saurav Pathak
> PhD, Univ.of.Florida
> Mechanical Engineering
> Doctoral Student
> Innovation and Entrepreneurship
> Imperial College Business School
> s.patha...@imperial.ac.uk
> 0044-7795321121
>

--
Arne Henningsen
http://www.arne-henningsen.name
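
A minimal sketch of the two checks discussed in this thread: counting
NAs per model variable, and obtaining inverse Mills ratios that line up
with the rows of the original data frame. Everything here is an
assumption for illustration: the data are simulated, "dat" stands in
for adpopdata, only two regressors are used, and the IMR is computed by
hand in base R as dnorm(xb)/pnorm(xb) rather than by calling
invMillsRatio(). Passing na.action = na.exclude to glm() is one
possible way to pad the dropped rows with NA, not necessarily the
approach the package authors would recommend.

```r
# Simulated stand-in for adpopdata (hypothetical values, not the poster's data)
set.seed(1)
n <- 200
dat <- data.frame(
  age    = rnorm(n, 40, 10),
  gender = rbinom(n, 1, 0.5)
)
# Selection indicator generated from a probit model
dat$s <- rbinom(n, 1, pnorm(-2 + 0.05 * dat$age + 0.3 * dat$gender))
# Inject missing values into one explanatory variable
dat$age[sample(n, 15)] <- NA

# 1. Count missing values in every variable of the selection equation
na_counts <- colSums(is.na(dat[, c("s", "age", "gender")]))
print(na_counts)

# 2. Fit the probit; na.exclude remembers which rows were dropped
fit <- glm(s ~ age + gender, data = dat,
           family = binomial(link = "probit"),
           na.action = na.exclude)

# 3. Linear predictor, padded with NA for the incomplete rows
xb <- predict(fit, type = "link")

# 4. Inverse Mills ratio for selected observations: phi(xb) / Phi(xb)
dat$IMR <- dnorm(xb) / pnorm(xb)

# The new column has one entry per row of the original data
stopifnot(length(xb) == nrow(dat))
```

With the default na.action (na.omit), the predictions cover only the
complete cases and are therefore shorter than the data frame, which is
presumably the situation behind the "replacement has 257358 rows, data
has 343251" error quoted above.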