> On Tue, Apr 8, 2008 at 1:50 PM, Hans-Jörg Bibiko <[EMAIL PROTECTED]> wrote:
>> I was sent a text file containing a distance matrix à la:
>>
>> 1
>> 2 3
>> 4 5 6

Thanks a lot for your hints. In the end they all come down, more or less, to my "stony" way of doing it. Let me summarize.

The clean way is to set up a matrix containing my distance matrix and generate a dist object with as.dist(mat). Fine. But how do I read the (triangular) text data into a matrix?

#1 approach - using 'read.table'

  mat = read.table("test.txt", fill=TRUE)

The problem here is that the first lines don't contain the full number of columns of my matrix. 'read.table' determines the number of columns from the first five lines of input only, so for a triangular file it ends up with 5 columns. Ergo I have to know the number of columns (num_cols) beforehand in order to do this:

  mat = read.table("test.txt", fill=TRUE, col.names=rep('', num_cols))

By definition the last line of "test.txt" contains the correct number of columns. On UNIX/Mac you can do the following:

  num_cols <- as.numeric(system("tail -n 1 'test.txt' | wc -w", intern=TRUE))

In other words: read the last line of 'test.txt' and count the number of words, assuming the delimiter is whitespace. Or one could use 'readLines' and split the last element of the result to get num_cols.

#2 approach - using 'scan()'

  mat = matrix(0, num_cols, num_cols)
  mat[row(mat) >= col(mat)] <- scan("test.txt")

But this leads back to my original problem: the lower triangle is filled column by column, so I get

  1
  2 4
  3 5 6

instead of

  1
  2 3
  4 5 6

==== one solution ============

Approach #2 has two advantages: it's faster than read.table AND I can calculate num_cols from the number of values read. The only problem is the order. But this is solvable by reading the data into the upper triangle and transposing the matrix:

  mat <- matrix(0, num_cols, num_cols)
  mat[row(mat) <= col(mat)] <- scan("test.txt")
  mat <- t(mat)

Next. If I know that my text file really contains a distance matrix (i.e. the diagonal has been removed), then I can do the following:

  data <- scan("test.txt")
  num_cols <- (1 + sqrt(1 + 8*length(data)))/2 - 1
  mat <- matrix(0, num_cols, num_cols)
  mat[row(mat) <= col(mat)] <- data
  mat <- t(mat)

  # Finally, to get a 'dist' object, add the missing first row and last column:
  mat <- rbind(0, mat)
  mat <- cbind(mat, 0)
  dobj <- as.dist(mat)

Again, thanks a lot!

--Hans
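
P.S. The steps summarized above can be folded into one helper. This is only a minimal sketch, not a definitive implementation: the function name read_lower_tri and its has_diag argument are made up for illustration, and it assumes the file holds a row-wise lower triangle of whitespace-separated numbers. It needs no system call, since the size is derived from the number of values read.

```r
# Hypothetical helper (name and arguments are illustrative only).
# Reads a row-wise lower triangle from 'file' and returns a 'dist' object.
read_lower_tri <- function(file, has_diag = FALSE) {
  vals <- scan(file, quiet = TRUE)

  # Number of rows in the file: solve k * (k + 1) / 2 == length(vals) for k.
  k <- (sqrt(1 + 8 * length(vals)) - 1) / 2
  if (abs(k - round(k)) > 1e-8) {
    stop("the number of values does not form a triangle")
  }
  k <- as.integer(round(k))

  # Fill the upper triangle column by column, then transpose, so that the
  # values end up row by row in the lower triangle.
  mat <- matrix(0, k, k)
  mat[row(mat) <= col(mat)] <- vals
  mat <- t(mat)

  if (!has_diag) {
    # The file has no diagonal: pad a first row and a last column of zeros.
    mat <- cbind(rbind(0, mat), 0)
  }
  as.dist(mat)
}

# Usage with the example from the original question.
tmp <- tempfile()
writeLines(c("1", "2 3", "4 5 6"), tmp)
d <- read_lower_tri(tmp)
```

With has_diag = TRUE the same function handles a file that still includes the diagonal; as.dist() then simply ignores the diagonal entries.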