Re: [R] closeness of codes

Jean V Adams Tue, 20 Sep 2011 05:42:56 -0700

Jim Lemon wrote on 09/20/2011 04:15:46 AM:
> 
> On 09/19/2011 04:46 PM, Henri-Paul Indiogine wrote:
> > Greetings!
> >
> > I am using the R library RQDA to assign certain codes to paragraphs of
> > documents in a collection.   Several paragraphs are assigned more than
> > 1 code.  E.g. often the codes "poverty" and "education" will be
> > assigned to the same paragraph.   Often also "math" and "career" will
> > be given to the same paragraphs.  Other codes are never given to the
> > same paragraphs.
> >
> > I would like to calculate the relationship or "closeness" of certain
> > codes.  RQDA will generate a cross-codes table.  It has the form of an
> > upper triangular matrix where the upper triangle has the number of
> > cross occurrences of 2 codes at their intersection.  The lower
> > triangle is filled with NA.  The diagonal simply has the number of
> > occurrences of the codes by themselves.
> >
> > The row names are the names of the codes and the column names are the
> > IDs of the codes.  E.g.
> >
> >         1   2   3   4
> > code1   3   0   2   1
> > code2  NA   4   1   0
> > code3  NA  NA   2   0
> > code4  NA  NA  NA   3
> >
> > We can see that code1 is associated 2 out of 3 times with code3.
> > Code2 is present 1 out of 4 times with code3.  Code2 is never assigned
> > to the same paragraph as code1 or code4, and so on.
> >
> > I am trying to understand how to create some sort of graph or diagram
> > to represent this.  Should I use a cluster diagram or a network graph?
> >   Also, what sort of R code could I use?
> 
> Hi Henri,
> The intersectDiagram function in the plotrix package displays the
> intersections of sets as rectangles with widths (and areas) proportional
> to the number of members of each set intersection. This may be a way for
> you to represent your codes. For your example, you could proceed like
> this. Create a file ("hp.csv") containing the following:
> 
> paragraph,attribute
> p1,code1
> p1,code3
> p2,code1
> p2,code3
> p3,code1
> p3,code4
> p4,code2
> p5,code2
> p6,code2
> p7,code2
> p7,code3
> p8,code3
> p9,code3
> p10,code4
> p11,code4
> p12,code4
> 
> then:
> 
> library(plotrix)
> hp<-read.csv("hp.csv")
> intersectDiagram(hp,main="Combinations of codes")
> 
> There are other ways to represent your original data that 
> intersectDiagram will read in that you might like to try.
> 
> Jim
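
As a side note on the long format Jim describes: a co-occurrence table can
also be tabulated from it directly in base R. A minimal sketch, assuming the
same paragraph/attribute rows as in hp.csv are typed in by hand (the object
names inc and co are just illustrative):

```r
# Long-format data: one row per (paragraph, code) assignment
hp <- data.frame(
  paragraph = c("p1", "p1", "p2", "p2", "p3", "p3", "p4", "p5",
                "p6", "p7", "p7", "p8", "p9", "p10", "p11", "p12"),
  attribute = c("code1", "code3", "code1", "code3", "code1", "code4",
                "code2", "code2", "code2", "code2", "code3", "code3",
                "code3", "code4", "code4", "code4")
)

# Paragraph-by-code incidence matrix (1 if the code was assigned)
inc <- unclass(table(hp$paragraph, hp$attribute))

# Code-by-code co-occurrence counts: off-diagonal cells count shared
# paragraphs, the diagonal counts all paragraphs carrying each code
co <- crossprod(inc)
print(co)
```

The diagonal here counts every paragraph that carries a code, so it need not
match the diagonal of the cross-codes table in the original post.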


Another approach would be to redefine the cross-codes table as distances.
For example, if the cross-codes table is a matrix called m ...

# convert to "distances": divide each row by that code's own count
d <- 1 - m/diag(m)

# fill in the complete matrix by mirroring the upper triangle
d[lower.tri(d)] <- t(d)[lower.tri(d)]

# use multidimensional scaling to represent the distances in two dimensions
twodim <- cmdscale(d)
plot(twodim, type="n")
text(twodim, rownames(twodim))
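
To try this on the four-code example from the original post, here is a
minimal sketch. The matrix is typed in by hand rather than exported from
RQDA, and the lower triangle is filled through t() so that each cell takes
the value from its transposed position.

```r
# Cross-codes table from the original post: upper triangle filled, lower NA
m <- matrix(c( 3,  0,  2, 1,
              NA,  4,  1, 0,
              NA, NA,  2, 0,
              NA, NA, NA, 3),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("code", 1:4), 1:4))

# Row i is divided by the count of code i, so d[i, j] = 1 - m[i, j] / m[i, i]
d <- 1 - m / diag(m)

# Mirror the upper triangle into the lower one
d[lower.tri(d)] <- t(d)[lower.tri(d)]

# Classical multidimensional scaling into two dimensions
twodim <- cmdscale(d)
plot(twodim, type = "n")
text(twodim, rownames(twodim))
```

Codes that often share a paragraph (code1 and code3 here) should end up
plotted closer together than codes that never do.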

Jean
