On Aug 12, 2010, at 6:15 PM, David Winsemius wrote:
On Aug 12, 2010, at 5:20 PM, Toby Gass wrote:
Hi,
I do want to look only at slope.
If there is one negative slope measurement for a given day and a
given chamber, I would like to remove all other slope measurements
for that day and that chamber, even if they are positive.
On one day, I will have 20 slope measurements for each chamber. If
one is negative, I would like to delete the other 19 for that chamber
on that day, even if they are positive. I have measurements for
every day of the year, for 4 years and multiple chambers.
I know I could make some awful nested loop with a vector of day and
chamber numbers for each occurrence of a negative slope and then run
that against the whole data set but I hope not to have to do that.
Here is the rationale, if that helps. These are unattended outdoor
chambers that measure soil carbon efflux. When the numbers go
negative during part of the day but otherwise look normal, it usually
means a plant has sprouted in the chamber and is using the carbon
dioxide. That means the measurements are all lower than they should
be and I need to discard all measurements collected on that day,
whether positive or negative.
It might have been a little clearer if I'd made the toy dataframe a
bit larger.
I think the fault was all mine. Failure to read for meaning. Here's
an alternate strategy, although I think Schwartz's might be cleaner:
> toy$ch.day.cat <- with(toy, paste(CH, DAY, sep="."))
> negs.idxs <- tapply(toy$SLOPE, toy$ch.day.cat, function(x) any(x < 0))
> negs.idxs
  3.4   3.5   4.4   4.5   5.4   5.5
FALSE FALSE FALSE FALSE FALSE  TRUE
> toy[-which(negs.idxs), ]
CH DAY SLOPE ch.day.cat
1 3 4 0.2 3.4
2 4 4 0.3 4.4
3 5 4 0.4 5.4
4 3 4 0.5 3.4
5 4 4 0.6 4.4
7 3 5 0.1 3.5
8 4 5 0.0 4.5
9 5 5 -0.1 5.5
I think I should give up today. I saw that the above code eliminates
#6 and only after posting saw that #9 was left in:
require(rms)  # for %nin%, or use the %w/o% operator defined on the match help page:
> toy[toy$ch.day.cat %nin% names(negs.idxs[negs.idxs]), ]
CH DAY SLOPE ch.day.cat
1 3 4 0.2 3.4
2 4 4 0.3 4.4
3 5 4 0.4 5.4
4 3 4 0.5 3.4
5 4 4 0.6 4.4
7 3 5 0.1 3.5
8 4 5 0.0 4.5
Now I am really sure that the ave( , , any) strategy is superior.
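
For comparison, here is a minimal sketch of the same group-key filter using only base R, so that no extra package is needed. It assumes the toy data frame from the original post; the names key, has.neg, bad and clean are illustrative and not from the thread.

```r
toy <- data.frame(CH = rep(3:5, 3),
                  DAY = c(rep(4, 5), rep(5, 4)),
                  SLOPE = c(seq(0.2, 0.6, 0.1), seq(0.2, -0.1, -0.1)))

# One key per row identifying its chamber/day group (illustrative name).
key <- with(toy, paste(CH, DAY, sep = "."))

# One logical per GROUP: does the group contain a negative slope?
# This has one element per group, not per row, so its positions
# cannot be used directly as row numbers.
has.neg <- tapply(toy$SLOPE, key, function(x) any(x < 0))

# Names of the groups to discard, then drop every row in those groups.
bad <- names(has.neg)[has.neg]
clean <- toy[!(key %in% bad), ]
clean
```

Matching rows to flagged groups by name drops both row 6 and row 9, which share the key "5.5".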
--
David
Thanks again for the assistance.
Toby
On 12 Aug 2010 at 16:39, David Winsemius wrote:
On Aug 12, 2010, at 4:06 PM, Toby Gass wrote:
Thank you all for the quick responses. So far as I've checked,
Marc's solution works perfectly and is quite speedy. I'm still
trying to figure out what it is doing. :)
Henrique's solution seems to need some columns somewhere. David's
solution does not find all the other measurements, possibly with
positive values, taken on the same day.
I assumed you only wanted to look at what appeared to be a data
column, SLOPE. If you want to look at all columns for negatives then
try:
toy[ which( apply(toy, 1, function(x) all(x >= 0)) ), ] # or
toy[ apply(toy, 1, function(x) all(x >= 0)) , ]
This is how they differ w.r.t. their handling of NAs:
toy[3,2] <- NA
toy[ apply(toy, 1, function(x) all(x >= 0)) , ]
CH DAY SLOPE
1 3 4 0.2
2 4 4 0.3
NA NA NA NA
4 3 4 0.5
5 4 4 0.6
6 5 5 0.2
7 3 5 0.1
8 4 5 0.0
toy[ which(apply(toy, 1, function(x) all(x >= 0)) ), ]
CH DAY SLOPE
1 3 4 0.2
2 4 4 0.3
4 3 4 0.5
5 4 4 0.6
6 5 5 0.2
7 3 5 0.1
8 4 5 0.0
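
The difference between the two forms can be seen on a small vector. This is only an illustrative sketch; keep and x are made-up names, not part of the toy data.

```r
keep <- c(TRUE, NA, FALSE, TRUE)
x <- c("a", "b", "c", "d")

# Logical indexing: an NA in the index produces an NA element
# in the result (for a data frame, a row of NAs).
by.logical <- x[keep]

# which() returns only the positions that are TRUE, so the NA
# position is silently dropped.
pos <- which(keep)
by.which <- x[pos]

by.logical
by.which
```

Which behaviour is preferable depends on whether rows with missing values should be flagged or quietly excluded.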
Thank you again for your efforts.
Toby
On 12 Aug 2010 at 14:32, Marc Schwartz wrote:
On Aug 12, 2010, at 2:24 PM, Marc Schwartz wrote:
On Aug 12, 2010, at 2:11 PM, Toby Gass wrote:
Dear helpeRs,
I have a dataframe (14947 x 27) containing measurements collected
every 5 seconds at several different sampling locations. If one
measurement at a given location is less than zero on a given day, I
would like to delete all measurements from that location on that
day.
Here is a toy example:
toy <- data.frame(CH = rep(3:5,3), DAY = c(rep(4,5), rep(5,4)),
SLOPE = c(seq(0.2,0.6, .1),seq(0.2, -0.1, -0.1)))
In this example, row 9 has a negative measurement for Chamber 5, so I
would like to delete row 6, which is the same Chamber on the same
day, but not row 3, which is the same chamber on a different day. In
the full dataframe, there are, of course, many more days.
Is there a handy R way to do this?
Thank you for the assistance.
Toby
Not fully tested, but here is one possibility:
toy
CH DAY SLOPE
1 3 4 0.2
2 4 4 0.3
3 5 4 0.4
4 3 4 0.5
5 4 4 0.6
6 5 5 0.2
7 3 5 0.1
8 4 5 0.0
9 5 5 -0.1
subset(toy, ave(SLOPE, CH, DAY, FUN = function(x) any(x < 0)) == 0)
CH DAY SLOPE
1 3 4 0.2
2 4 4 0.3
3 5 4 0.4
4 3 4 0.5
5 4 4 0.6
7 3 5 0.1
8 4 5 0.0
This can actually be slightly shortened to:
subset(toy, !ave(SLOPE, CH, DAY, FUN = function(x) any(x < 0)))
CH DAY SLOPE
1 3 4 0.2
2 4 4 0.3
3 5 4 0.4
4 3 4 0.5
5 4 4 0.6
7 3 5 0.1
8 4 5 0.0
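
For anyone wondering what the ave() call is doing, the one-liner can be unpacked into steps. This is a sketch of how it appears to work on the toy data; flag and kept are illustrative names.

```r
toy <- data.frame(CH = rep(3:5, 3),
                  DAY = c(rep(4, 5), rep(5, 4)),
                  SLOPE = c(seq(0.2, 0.6, 0.1), seq(0.2, -0.1, -0.1)))

# ave() applies FUN within each CH x DAY group and returns a vector
# as long as SLOPE, with the group's result repeated for every row
# of that group. Because SLOPE is numeric, the logical value from
# any() is stored as 0 (FALSE) or 1 (TRUE).
flag <- with(toy, ave(SLOPE, CH, DAY, FUN = function(x) any(x < 0)))
flag

# Rows whose group has no negative slope have flag == 0; keep those.
kept <- toy[flag == 0, ]
kept
```

Here flag is 1 for rows 6 and 9 (chamber 5, day 5) and 0 elsewhere, so negating it, as in the shortened version, selects the same rows as comparing it to 0.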
HTH,
Marc
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT