Dear All, I would like to use the simprof function (clustsig package) but the available distances do not include Jaccard distance, which is the most appropriate for pres/abs community data. Here is the core of the function: > simprof function (data, num.expected = 1000, num.simulated = 999, method.cluster = "average", method.distance = "euclidean", method.transform = "identity", alpha = 0.05, sample.orientation = "row", const = 0, silent = TRUE, increment = 100) { if (!is.matrix(data)) data <- as.matrix(data) rawdata <- data if (sample.orientation == "column") rawdata <- t(rawdata) if (is.function(method.distance)) rawdata.dist <- method.distance(rawdata) else if (method.distance == "braycurtis") { rawdata.dist <- braycurtis(rawdata, const) if (!is.null(rownames(rawdata))) attr(rawdata.dist, "Labels") <- rownames(rawdata) } else rawdata.dist <- dist(rawdata, method = method.distance) if (!method.transform == "identity") rawdata <- trans(rawdata, method.transform) hclust.results <- hclust(rawdata.dist, method = method.cluster) pMatrix <- cbind(hclust.results$merge, matrix(data = NA, nrow = nrow(hclust.results$merge), ncol = 1)) simprof.results <- simprof.body(rawdata = rawdata, num.expected = num.expected, num.simulated = num.simulated, method.cluster = method.cluster, method.distance = method.distance, originaldata = rawdata, alpha = alpha, clust.order = hclust.results$merge, startrow = nrow(hclust.results$merge), pMatrix = pMatrix, const = const, silent = silent, increment = increment) results <- list() results[["numgroups"]] <- length(simprof.results$samples) results[["pval"]] <- simprof.results$pval results[["hclust"]] <- hclust.results if (!is.null(rownames(rawdata))) { for (i in 1:length(simprof.results$samples)) { for (j in 1:length(simprof.results$samples[[i]])) { simprof.results$samples[[i]][j] <- rownames(rawdata)[as.numeric(simprof.results$samples[[i]][j])] } } } results[["significantclusters"]] <- simprof.results$samples return(results) } <environment: namespace:clustsig>
I tried to trick the function by bypassing the input of a raw community matrix (sites x species) by inputing instead a distance matrix already calculated with vegdist (vegan package) (Jaccard distance is available in vegdist, but not in the function dist used in simprof...) I therefore modified the function as follow: simprof2=function (data, num.expected = 1000, num.simulated = 999, method.cluster = "average", alpha = 0.05, sample.orientation = "row", const = 0, silent = TRUE, increment = 100) { hclust.results <- hclust(data, method = method.cluster) pMatrix <- cbind(hclust.results$merge, matrix(data = NA, nrow = nrow(hclust.results$merge), ncol = 1)) simprof.results <- simprof.body(data = data, num.expected = num.expected, num.simulated = num.simulated, method.cluster = method.cluster,originaldata = data, alpha = alpha, clust.order = hclust.results$merge, startrow = nrow(hclust.results$merge), pMatrix = pMatrix, const = const, silent = silent, increment = increment) results <- list() results[["numgroups"]] <- length(simprof.results$samples) results[["pval"]] <- simprof.results$pval results[["hclust"]] <- hclust.results if (!is.null(rownames(data))) { for (i in 1:length(simprof.results$samples)) { for (j in 1:length(simprof.results$samples[[i]])) { simprof.results$samples[[i]][j] <- rownames(data)[as.numeric(simprof.results$samples[[i]][j])] } } } results[["significantclusters"]] <- simprof.results$samples return(results) } But upon running it on my vegdist object, I get the following error: > simprof2(G3_jaccard) Error in simprof2(G3_jaccard) : could not find function "simprof.body" I guess this function is part of the initial environment in which simprof runs, but how do I access it in order to call it when I run my own function outside of the initial environment? Another possible but maybe more complex way would be to define the jaccard distance (as J= 2B/(1+B) B beeing BrayCurtis distance already used in the initial function) Many thanks in advance for your help! -- Maïa Berman, MSc PhD candidate [[alternative HTML version deleted]]
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.