I'm not sure what SAS is doing (or if you are using it correctly). In R you do not create marginal totals independent of the data and try to fit them to the data. In your first example you create a matrix called raw, but you do not use it for anything. Your loglin() call is for a table in which all cells are about 16.67, and then you fit a model in which the row and column marginal totals are used but not the row*column marginal. I'm not really sure what you are trying to accomplish with that.
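For readers following along, here is a minimal sketch of what the two-way loglin() call in the message quoted below appears to compute. It assumes that loglin() takes its target margins from its first argument and uses `start` as the seed for the iterations, so `raw` does enter the fit through `start`. The names `target`, `res`, and `odds_ratio` are introduced here for illustration only and are not from the thread.

```r
# Seed percentages and target margins, copied from the quoted message.
raw <- matrix(c(28.571, 14.286, 23.809, 4.762, 9.523, 19.049), 3, 2, byrow = TRUE)
rowmarg <- c(33.4, 33.3, 33.3)
colmarg <- c(50, 50)

# Table whose row sums equal rowmarg and whose column sums equal colmarg.
target <- rowmarg %o% colmarg / sum(colmarg)

# Iterative proportional fitting: start from raw, match the margins of target.
res <- loglin(target, margin = list(1, 2), start = raw, fit = TRUE,
              eps = 1e-5, iter = 100, print = FALSE)

rowSums(res$fit)   # compare with rowmarg
colSums(res$fit)   # compare with colmarg

# Helper (hypothetical name): odds ratio of the first two rows.
odds_ratio <- function(m) (m[1, 1] * m[2, 2]) / (m[1, 2] * m[2, 1])
odds_ratio(raw)
odds_ratio(res$fit)   # the row/column rescaling leaves this association unchanged
```

If the assumption holds, the fitted table reproduces the target margins while keeping the association pattern of `raw`, which seems to be the weighting the poster is after.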
In your second example you create three variables and then want to fit another set of marginal totals that seem to be a roughly equal distribution for rows/columns/pages, except that race has four categories but targ.reg has only three???

If the null hypothesis for these data is no interaction between the variables and that each category should have the same proportion of cases: age=c(1/3, 1/3, 1/3), gender=c(1/2, 1/2), race=c(1/4, 1/4, 1/4, 1/4), then try this:

mytable <- xtabs(~age+gender+race, rawdat) # table() loses the variable names
loglin(mytable, margin=list(0), fit=TRUE)

If you want to preserve the marginal totals for each variable, but not any interactions between them, use

loglin(mytable, margin=list(1, 2, 3), fit=TRUE)

If you want to fit the three two-way interactions, use

loglin(mytable, margin=list(c(1, 2), c(2, 3), c(1, 3)), fit=TRUE)

If you want to fit the saturated table (all interactions), use

loglin(mytable, margin=list(c(1, 2, 3)), fit=TRUE)

-------
David

From: Miao Zhang [mailto:mandyzhangpub...@gmail.com]
Sent: Monday, July 30, 2012 9:35 AM
To: dcarl...@tamu.edu
Subject: Re: [R] How can I use IPF function correctly?
Thanks David,

The purpose of doing this is that I am trying to weight the data to match target values (yes, I am using percentages instead of counts here). I can get what I need for 2-way tables using loglin() as below, where I have the row target and column target values:

raw<-matrix(c(28.571,14.286,23.809,4.762,9.523,19.049),3,2,byrow=TRUE)
rowmarg<-c(33.4,33.3,33.3)
colmarg<-c(50,50)
newmat1 <- loglin( rowmarg%o%colmarg/sum(colmarg), margin=list(1,2), start=raw, fit=TRUE, eps=1.e-05, iter=100)$fit

I am not sure how to extend this to 3 or more dimensions (I will need higher dimensions later), which is why I am considering iterative proportional fitting with ipf(). SAS can use an ipf call, but I am not sure how to apply it in R. Here we could use counts instead of percentages. Here is an example with three variables (age, gender and region), using frequencies:

### set up raw data and view the frequencies ####
age <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,1,1,2,1,2,2,2,3,3,3,3,1,1,
         1,1,2,2,2,2,2,2,2,2,3,3,3,3)
gender <- c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2,1,1,2,2,2,2,2,1,1,1,1,2,2,
            2,2,1,1,1,1,2,2,2,2,1,1,1,1)
race <- c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,2,2,2,3,4,1,2,3,4,1,2,
          3,4,1,2,3,4,1,2,3,4,1,2,3,4)
rawdat<-data.frame(age,gender,race)
View(rawdat)
mytable <- table(rawdat$age,rawdat$gender,rawdat$race) #generates a cross-tab of counts
View(mytable)

### set target values to weight the frequencies in 3 dimensions. NOTE: we are using counts here, not percentages, trying to fit the frequencies to the target values ####
targ.age<-c(17,17,17)
targ.gen<-c(24,26)
targ.reg<-c(13,13,12)
f2<-ipf(mytable, margins=c(1,2,0,1,3,0,2,3), eps = 1e-04, maxits = 50, showits = TRUE) #no 3 way interaction

Where and how should I input/set my target values here? Any suggestions? Or do I have to write my own function?

Many thanks,
Mandy

On Fri, Jul 27, 2012 at 5:11 PM, David L Carlson <dcarl...@tamu.edu> wrote:

It is not clear what you are trying to do.
The ipf() function you are using seems to be the one included in package cat for imputing missing values for categorical variables. For ipf() you have not read the instructions carefully, because you have entered the marginal values, not their dimensions, and you have given ipf() a 2-way table but misspecified a three-way model. No wonder it is confused.

Function loglin(), which is part of the included stats package, also does iterative proportional fitting. Iterative proportional fitting (ipf) is used for fitting models for categorical data when there are three or more variables. There is no need for ipf on a table with two variables since the values can be directly calculated.

Your example data does not include the raw data counts (as it should), but percentages for each of the 3 x 2 cells (I assume, since they sum to 100). The marginal values you list (again percentages) are for a model assuming equal margins. That is easily computed as 1/3*1/2*100 (one third in each row by one half in each column times 100). So each cell should be 16.667 percent of the total. Using loglin() that would be specified as follows:

> loglin(raw, margin=list(0), fit=TRUE)
0 iterations: deviation
$lrt
[1] 25.87661

$pearson
[1] 23.80933

$df
[1] 5

$margin
[1] 0

$fit
         [,1]     [,2]
[1,] 16.66667 16.66667
[2,] 16.66667 16.66667
[3,] 16.66667 16.66667

The lrt and pearson statistics are not valid because you are not using original counts. Note that the number of iterations is 0 because in a 2-way model the values are directly computed.

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Miao Zhang
> Sent: Friday, July 27, 2012 6:52 AM
> To: r-help@r-project.org
> Subject: [R] How can I use IPF function correctly?
>
> Hi All,
> I am trying to create a simple example using the ipf function in R, but I
> could not get it to work. I am very new to R. Could anyone help by
> explaining how to use this ipf function?
> This is what I mean:
>
>           50   |  50
> ----------------------
> 33.4| 28.57 | 14.29
> 33.3| 23.81 | 4.762
> 33.3| 9.523 | 19.05
> ----------------------
> A 3*2 matrix:
> raw<-matrix(c(28.571,14.286,23.809,4.762,9.523,19.049),3,2,byrow=TRUE)
> The margin sums (the values I am setting as the target):
> m<-c(33.4,50,0,33.3,50,0,33.3,50)
> Then call the ipf function:
> fit1<-ipf(table, margins=m,start=raw,eps = 1e-04, maxits = 50, showits = TRUE)
> I could calculate it by hand in 7 iterations, but in the end I am hoping to
> get R's built-in ipf function to do it. What should I put for "table" here?
> Thanks in advance!
> Mandy
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
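On the open question of where target values go with three variables, one possible approach is sketched here with loglin() rather than cat's ipf(). The idea is the same as in the two-way call earlier in the thread: build an array whose one-way margins equal the targets, and pass the observed table as `start`. The gender and race targets below are hypothetical stand-ins, chosen only so that every target sums to the 51 cases and race has four categories (the targets posted in the thread sum to 51, 50, and 38). The names `targ.race`, `target`, `res`, and `weighted` are introduced for illustration.

```r
# Data as posted in the thread (51 cases).
age <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,1,1,2,1,2,2,2,3,3,3,3,1,1,
         1,1,2,2,2,2,2,2,2,2,3,3,3,3)
gender <- c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2,1,1,2,2,2,2,2,1,1,1,1,2,2,
            2,2,1,1,1,1,2,2,2,2,1,1,1,1)
race <- c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,2,2,2,3,4,1,2,3,4,1,2,
          3,4,1,2,3,4,1,2,3,4,1,2,3,4)
rawdat <- data.frame(age, gender, race)
mytable <- xtabs(~ age + gender + race, rawdat)

# Hypothetical targets (assumptions, not the posted values); each sums to 51.
targ.age  <- c(17, 17, 17)
targ.gen  <- c(25, 26)
targ.race <- c(13, 13, 13, 12)
N <- sum(targ.age)

# 3 x 2 x 4 array whose one-way margins equal the three target vectors.
target <- targ.age %o% targ.gen %o% targ.race / N^2

# Start from the observed counts and adjust them to the one-way target margins.
res <- loglin(target, margin = list(1, 2, 3), start = as.numeric(mytable),
              fit = TRUE, eps = 1e-5, iter = 100, print = FALSE)
weighted <- res$fit

apply(weighted, 1, sum)   # compare with targ.age
apply(weighted, 2, sum)   # compare with targ.gen
apply(weighted, 3, sum)   # compare with targ.race
```

Dividing each cell of `weighted` by the matching cell of `mytable` would give a per-cell weight. As noted earlier in the thread, the lrt and pearson values returned by such a call should not be read as test statistics, since `target` is not a table of observed counts.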