Re: [R] Summing identical IDs

David Winsemius Sat, 31 Oct 2009 16:05:17 -0700


On Oct 30, 2009, at 3:29 PM, PDXRugger wrote:

David,
You are correct. I think the frist two assumptions can be thrownout andonly the latter two (c,d) can be considered. So how would i combineAcres
for matching Bldgids based on assumptions c,d?

I also think your requested output failed to add the ImpValue of rows10 and 11 (but did add for cases 5 & 6 as well as 8 & 9. So it's notexactly clear what you want.


David Winsemius wrote:



On Oct 29, 2009, at 5:23 PM, PDXRugger wrote:


Terrific help thank you.
dupbuild<-aggregate(DF$Acres, list(Bldgid), sum)
This line worked best.

Now im going to challenge everyone (i think?)

Consider the following:


Acres<-c(100,101,100,130,156,.5,293,300,.09,100,12.5)
Bldgid<-c(1,2,3,4,5,5,6,7,7,8,8)
Year<-c(1946,1952,1922,1910,1955,1955,1999,1990,1991,2000,2000)
ImpValue<-c(1000,1400,1300,900,5000,1200,500,1000,300,1000,1000)
DF=cbind(Acres,Bldgid,Year,ImpValue)
DF<-as.data.frame(DF)

I would like to do the same, except there are some rules i want to
follow.
I only want to aggregate the Acres if :
a) The Years are not identical
b) The ImpValues are not identical

So we are striking out a and b and we only want to sum values wherethe following obtain:

c) The Years are identical and the ImpValue are not
d)The ImpValues are identical and the Years are not


Do an outer join on the DF:
mrgDF <- merge(DF, DF, by="Bldgid", all=T)

# and identify cases within that construct that satisfy the non-equalities:

> with( mrgDF, mrgDF[ImpValue.x != ImpValue.y | Acres.y !=Acres.x , ] )

   Bldgid Acres.x Year.x ImpValue.x Acres.y Year.y ImpValue.y
6       5  156.00   1955       5000    0.50   1955       1200
7       5    0.50   1955       1200  156.00   1955       5000
11      7  300.00   1990       1000    0.09   1991        300
12      7    0.09   1991        300  300.00   1990       1000
15      8  100.00   2000       1000   12.50   2000       1000
16      8   12.50   2000       1000  100.00   2000       1000

You should be able to take it from here ... after you figure out whatit is that you want.


--
David.


As I review your Boolean logic, I run into serious problems.

c) and d) cannot be true if a and b) are true.

So no cases satisfy all 4 specs. In particular both of the pairs you
say you want aggregated (5+6) and 10+11) violate rule a) and the
second pair also violates b).

--
David


but if the Acres and ImpValues are identical i would still like to
add the
Acres together and form one case.
If the cases are put together i would also like to add the ImpValues
together.  So the below

  Acres Bldgid Year ImpValue
1  100.00      1 1946     1000
2  101.00      2 1952     1400
3  100.00      3 1922     1300
4  130.00      4 1910      900
5  156.00      5 1955     5000
6    0.50      5 1955     1200
7  293.00      6 1999      500
8  300.00      7 1990     1000
9    0.09      7 1991      300
10 100.00      8 2000     1000
11  12.50      8 2000     1000

would become

  Acres Bldgid Year ImpValue
1  100.00      1 1946     1000
2  101.00      2 1952     1400
3  100.00      3 1922     1300
4  130.00      4 1910      900
5  156.50     5 1955     6200
7  293.00      6 1999      500
8  300.09     7 1990     1300
10 112.50      8 2000     1000

Thanks, i gave it a bunch of shots but nothing worth posting.





PDXRugger wrote:


Hello All,
 I would like to select records with identical IDs then sum an
attribute
then and return them to the data frame as a single record.  Please
consider


Acres<-c(100,101,100,130,156,.5,293,300,.09)
Bldgid<-c(1,2,3,4,5,5,6,7,7)

DF=cbind(Acres,Bldgid)
DF<-as.data.frame(DF)

So that:

Acres Bldgid
1 100.00      1
2 101.00      2
3 100.00      3
4 130.00      4
5 156.00      5
6   0.50      5
7 293.00      6
8 300.00      7
9   0.09      7

Becomes

Acres Bldgid
1 100.00      1
2 101.00      2
3 100.00      3
4 130.00      4
5 156.50      5
7 293.00      6
8 300.09      7

dup<-unique(DF$Bldgid[duplicated(Bldgid)])
dupbuild<-DF[DF$Bldgid %in% dup,]
dupbuild..dupareasum<-sum(dupbuild$Acres[duplicated(dupbuild
$Bldgid)])

This sums the unique Ids of the duplicated records, not whati want.
Thanks ahead of time

JR


--
View this message in context:
http://www.nabble.com/Summing-identical-IDs-tp26118922p26121056.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
View this message in context: 
http://old.nabble.com/Summing-identical-IDs-tp26118922p26135732.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Summing identical IDs

Reply via email to