Dear devellopeRs, I've altered factor.scope() from the stats package so it can take marginality into account. Therefore I've added the argument marginal. When marginal is NULL, factor.scope() works like the original version. The marginality is to be specified as a named list. The name of the elements is the name of a variable. The content of the element is a character vector with the variable(s) that must be in the model.
E.g. marginal = list(D = c("B", "C"), A3 = "A2", A2 = "A1") Means that - D can only enter the model if B and C are present. - A2 can only enter the model if A1 is present. - A3 can only enter the model if A2 is present. Since the list contains A2 = "A1", this implies that A1 needs to be present too. - B can only leave the model if D is not present. - C can only leave the model if D is not present. - A2 can only leave the model if A3 is not present. - A1 can only leave the model if A2 and A3 are not present. It copes with interactions too. A1:B needs by default A1 and B due to marginality none A2:B needs by default A2 and B due marginality A1 and A1:B Comments are welcome, Cheers, Thierry factor.scope <- function (factor, scope, marginal = NULL) { drop <- scope$drop add <- scope$add if (length(factor) && !is.null(drop)) { nmdrop <- colnames(drop) facs <- factor if (length(drop)) { nmfac <- colnames(factor) nmfac0 <- sapply(strsplit(nmfac, ":", fixed = TRUE), function(x) paste(sort(x), collapse = ":")) nmdrop0 <- sapply(strsplit(nmdrop, ":", fixed = TRUE), function(x) paste(sort(x), collapse = ":")) where <- match(nmdrop0, nmfac0, 0) if (any(!where)){ stop(gettextf("lower scope has term(s) %s not included in model", paste(sQuote(nmdrop[where == 0]), collapse = ", ")), domain = NA) } facs <- factor[, -where, drop = FALSE] nmdrop <- nmfac[-where] } else { nmdrop <- colnames(factor) } if (ncol(facs) > 1) { keep <- rep.int(TRUE, ncol(facs)) f <- crossprod(facs > 0) for (i in seq(keep)){ keep[i] <- max(f[i, -i]) != f[i, i] } nmdrop <- nmdrop[keep] } keep <- unlist(lapply(strsplit(nmdrop, ":", fixed = TRUE), function(x){ whereMarginal <- match(x, names(marginal)) if(!all(is.na(whereMarginal))){ if(length(x) == 1){ marginal[[whereMarginal]] } else { lowerOrderTerms <- lapply(seq_along(whereMarginal), function(z){ if(is.na(whereMarginal[z])){ x[z] } else { c(x[z], marginal[[whereMarginal[z]]]) } }) lowerOrder <- apply(expand.grid(lowerOrderTerms), 1, function(z){ paste(sort(z), collapse = ":") }) lowerOrder[!lowerOrder %in% paste(sort(x), collapse = ":")] } } })) nmdrop <- nmdrop[!nmdrop %in% keep] } else { nmdrop <- character(0) } if (!length(add)){ nmadd <- character(0) } else { nmfac <- colnames(factor) nmadd <- colnames(add) if (!is.null(nmfac)) { nmfac0 <- sapply(strsplit(nmfac, ":", fixed = TRUE), function(x) paste(sort(x), collapse = ":")) nmadd0 <- sapply(strsplit(nmadd, ":", fixed = TRUE), function(x) paste(sort(x), collapse = ":")) where <- match(nmfac0, nmadd0, 0) if (any(!where)){ stop(gettextf("upper scope does not include model term(s) %s", paste(sQuote(nmfac[where == 0]), collapse = ", ")), domain = NA) } nmadd <- nmadd[-where] add <- add[, -where, drop = FALSE] } if (ncol(add) > 1) { keep <- rep.int(TRUE, ncol(add)) f <- crossprod(add > 0) for (i in seq(keep)){ keep[-i] <- keep[-i] & (f[i, -i] < f[i, i]) } nmadd <- nmadd[keep] } if(length(nmadd)){ keep <- sapply(strsplit(nmadd, ":", fixed = TRUE), function(x){ whereMarginal <- match(x, names(marginal)) if(!all(is.na(whereMarginal))){ if(length(x) == 1){ all(!marginal[[whereMarginal]] %in% nmadd) } else { lowerOrderTerms <- lapply(seq_along(whereMarginal), function(z){ if(is.na(whereMarginal[z])){ x[z] } else { c(x[z], marginal[[whereMarginal[z]]]) } }) lowerOrder <- apply(expand.grid(lowerOrderTerms), 1, function(z){ paste(sort(z), collapse = ":") }) lowerOrder <- lowerOrder[!lowerOrder %in% paste(sort(x), collapse = ":")] all(!lowerOrder %in% nmadd) } } else { TRUE } }) nmadd <- nmadd[keep] } } list(drop = nmdrop, add = nmadd) } ------------------------------------------------------------------------ ---- ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 [EMAIL PROTECTED] www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel