Hi

The short version of my question is this:

How can I run a chi-square test over a matrix (table) to get the distances
between rows and then run single linkage (or another fusion algorithm) over
the resulting table?

------------

The long version of my question:

My data consist of several variables per country, for example how many
people can read or write in countries X, Y, Z, given as percentages for
each country. I want to find out which countries are similar by doing a
cluster analysis.

So first I want to take the data which would look something like this:

             Plastikbecher Kartonbox Papier
Rama                    24        65     12
Homa                    83        30     21
Flora                   75        28     22
SB                      35        55     21
Holl. Butter            20        40     75

And then run a chi-square test over it (I think that makes the most sense,
or does anybody think otherwise?).

For that I would put each pair of rows into a separate matrix (mat1) and
run chisq.test on it.

So mat1 would, for example, look like this:

             Plastikbecher Kartonbox Papier
Rama                    24        65     12
Flora                   75        28     22

And then I would run matResult[1,3] <- sqrt(chisq.test(mat1)[[1]])

So in the end I would get a matrix like this:
            Rama  Homa Flora    SB HollButter
Rama       0.000 6.642 6.470 2.209      6.931
Homa       6.642 0.000 0.430 4.994      8.387
Flora      6.470 0.430 0.000 4.754      7.941
SB         2.209 4.994 4.754 0.000      5.901
HollButter 6.931 8.387 7.941 5.901      0.000
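
The pairwise procedure described above could be sketched roughly as
follows. This is only a minimal sketch: the object name "counts" and the
loop layout are my own assumptions, the numbers are copied from the example
table, and the row label "Holl. Butter" is shortened to "HollButter".

```r
# Example counts, copied from the table above ("counts" is an assumed name)
counts <- matrix(
  c(24, 65, 12,
    83, 30, 21,
    75, 28, 22,
    35, 55, 21,
    20, 40, 75),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("Rama", "Homa", "Flora", "SB", "HollButter"),
    c("Plastikbecher", "Kartonbox", "Papier")
  )
)

n <- nrow(counts)

# Square result matrix; the diagonal stays 0 (distance of a row to itself)
matResult <- matrix(0, n, n,
                    dimnames = list(rownames(counts), rownames(counts)))

# For every pair of rows, build the 2 x 3 table, run chisq.test on it and
# store the square root of the test statistic in both triangles
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    mat1 <- counts[c(i, j), ]
    d <- sqrt(unname(chisq.test(mat1)$statistic))
    matResult[i, j] <- d
    matResult[j, i] <- d
  }
}

round(matResult, 3)
```

Here chisq.test(mat1)$statistic is the same value as chisq.test(mat1)[[1]],
just accessed by name.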

So here is my question:
How can I run a single linkage algorithm over this matrix?

I thought a good starting point might be "hclust":

hclust(d, method = "complete", members=NULL)

But the R reference says d must be "a dissimilarity structure as produced by
dist."

But the dist function does not have a chi-squared method or anything
similar.

So does anybody have an idea how I can do a cluster analysis with a
chi-squared test and then use a fusion algorithm to join the clusters?
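
For what it is worth, one possible route might be to convert the finished
matrix with as.dist(), which turns a symmetric matrix into the "dist"
structure that hclust expects, and then ask hclust for single linkage.
This is only a minimal sketch, assuming the matrix is symmetric with a zero
diagonal; the values are copied from the result matrix above, and the names
"labs", "d", "hc" and the choice k = 2 are just for illustration.

```r
labs <- c("Rama", "Homa", "Flora", "SB", "HollButter")

# Distance matrix copied from the result table above
matResult <- matrix(
  c(0.000, 6.642, 6.470, 2.209, 6.931,
    6.642, 0.000, 0.430, 4.994, 8.387,
    6.470, 0.430, 0.000, 4.754, 7.941,
    2.209, 4.994, 4.754, 0.000, 5.901,
    6.931, 8.387, 7.941, 5.901, 0.000),
  nrow = 5, byrow = TRUE,
  dimnames = list(labs, labs)
)

# as.dist() reads the lower triangle and returns an object of class "dist"
d <- as.dist(matResult)

# Single linkage instead of the default "complete"
hc <- hclust(d, method = "single")

# Cut the tree into k groups (k = 2 is an arbitrary choice here)
groups <- cutree(hc, k = 2)
groups
```

plot(hc) would then draw the dendrogram, and other fusion algorithms can be
tried by changing the method argument, for example "average" or "complete".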

Thanks

Thorsten
