My guess (since we still have no data on which to test these ideas) is that you need either to merge() or to use a matrix built from the date and quarter-hour entries in "gw", because matching on dates and quarter-hours separately will not tie the good quarter-hours to their own dates. You want a structure (or a matching process) that takes:
        qhr1    qhr2    qhr3    qhr4 .......
date1   good    bad     good    bad
date2   bad     good    good    good
date3   bad     bad     bad     good
.
.
.
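and lets you use the values in "arr" to look up values in "gw". Notice that arr$Date %in% gw$date & arr$qtrhr %in% gw$qtrhr simply will not accomplish anything correct: each test is made against the whole column, so a quarter-hour that was good on any date counts as good on every date that appears in "gw".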
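To make the failure concrete, here is a small sketch on made-up data. The data frames arr_demo and gw_demo and their contents are hypothetical stand-ins, not the poster's data. It compares the two separate %in% tests with a single test on a pasted date-plus-quarter key.

```r
# Hypothetical toy data: good weather on Jan 1 at quarter 60 and on Jan 2 at quarter 61
gw_demo  <- data.frame(Date = as.Date(c("2009-01-01", "2009-01-02")),
                       quarter = c(60L, 61L))
arr_demo <- data.frame(Date = as.Date(c("2009-01-01", "2009-01-01", "2009-01-02")),
                       quarter = c(60L, 61L, 61L))

# Separate tests: each column is checked against the whole of the other column,
# so Jan 1 / quarter 61 is wrongly flagged (61 was good, but only on Jan 2)
separate <- arr_demo$Date %in% gw_demo$Date & arr_demo$quarter %in% gw_demo$quarter

# Joint test: date and quarter are pasted into one key, so whole pairs are compared
joint <- paste(arr_demo$Date, arr_demo$quarter) %in%
         paste(gw_demo$Date, gw_demo$quarter)

print(separate)
print(joint)
```

Only the joint test leaves the second arrival unflagged.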

Merging on multiple criteria (with the merge function) would do that, or you could construct a matrix whose entries are the categories good/bad. The table function could create that matrix, which you could then index, if you are dead set against the merge concept.
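Both routes might look roughly like this. It is an untested-on-your-data sketch: arr_demo, gw_demo, the flight labels and the helper column "good" are invented for illustration, and the dates are kept as plain character strings.

```r
# Hypothetical toy data with the same key columns as "arr" and "gw"
gw_demo  <- data.frame(Date = c("1/1/09", "1/1/09", "1/2/09"),
                       quarter = c(60L, 61L, 61L),
                       stringsAsFactors = FALSE)
arr_demo <- data.frame(Date = c("1/1/09", "1/1/09", "1/1/09", "1/2/09", "1/2/09"),
                       quarter = c(59L, 60L, 60L, 60L, 61L),
                       Flight = c("F1", "F2", "F3", "F4", "F5"),
                       stringsAsFactors = FALSE)

# Option 1: merge on both keys; all.x = TRUE keeps arrivals with no weather match
gw_demo$good <- TRUE
merged <- merge(arr_demo, gw_demo, by = c("Date", "quarter"), all.x = TRUE)
merged$good[is.na(merged$good)] <- FALSE

# Option 2: a date-by-quarter logical matrix built with table().
# Shared factor levels make sure every arrival's date and quarter has a cell.
dates    <- union(arr_demo$Date, gw_demo$Date)
quarters <- as.character(union(arr_demo$quarter, gw_demo$quarter))
good_mat <- unclass(table(factor(gw_demo$Date, levels = dates),
                          factor(gw_demo$quarter, levels = quarters))) > 0

# A two-column character matrix picks one (date, quarter) cell per arrival
arr_demo$gw <- as.vector(good_mat[cbind(arr_demo$Date, as.character(arr_demo$quarter))])

print(merged)
print(arr_demo)
```

Note that merge() may reorder the rows, whereas the indexed version keeps the arrivals in their original order.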




On Jan 17, 2010, at 4:47 PM, James Rome wrote:

Thank you Dennis.
arr$gw <- as.numeric(weather$Date == arr$Date & arr$quarter %in%
weather$quarter)
seems to be what I want to do, but in fact, with the full data set, it
misidentifies the rows, so I think the error message must mean something.

arrr$Date <- as.Date(as.character(ewr$Date),format="%m/%d/%y")
weather$Date <- as.Date(as.character(weather$Date),format="%m/%d/%y")
gw = c(length(arrr))
gw[1:length(arrr[,1])]=FALSE
gw[arrr$Date==weather$Date & weather$quarter %in% arr$quarter]
Warning in `==.default`(arr$Date, weather$Date) :
 longer object length is not a multiple of shorter object length
Warning in arr$Date == weather$Date & weather$quarter %in% arr$quarter :
 longer object length is not a multiple of shorter object length
 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
[38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
[75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
[112] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
[149] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
[186] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
[223] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0
[260] 0 0 0 0 0 0 0 0

There are many many more matches in the 99k line arrival data set.
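The warning itself can be reproduced on a toy case. The vectors a and b below are made up and merely stand in for a long and a short column. With ==, the shorter vector is recycled and compared position by position, which is not a membership test.

```r
a <- c(2, 1, 2, 1, 2)   # stands in for the long column (e.g. arr$Date)
b <- c(1, 2)            # stands in for the short column (e.g. weather$Date)

# b is recycled to 1 2 1 2 1 and compared position by position;
# 5 is not a multiple of 2, hence the "longer object length" warning
positional <- suppressWarnings(a == b)

# %in% asks a different question: does each element of a occur anywhere in b?
membership <- a %in% b

print(positional)
print(membership)
```

Every element of a occurs in b, yet the positional comparison finds no match at all.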

Thanks a bunch,
Jim


On 1/17/10 3:21 PM, Dennis Murphy wrote:
Hi:

To read a data set from a R-help message into R, one uses
read.table(textConnection("<verbatim text>"), ...)

Your weather data set had
(a) a variable name with a space in it, which R misread and which had to be altered manually;
(b) a missing value with no NA, which R interpreted as an incomplete line; again, it had to be altered manually.

This is why David suggested the use of dput(), so that these vagaries don't have to be dealt with by those who are trying to help.
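As a rough illustration of the textConnection() route, the sketch below reads the weather sample after the two manual fixes described above (header renamed to "efficiency", missing value written as NA). The object name weather_demo is made up for the example.

```r
# Weather sample from this thread, with a space-free header and an explicit NA
# (both edits made by hand before reading)
txt <- "Date minute hour quarter efficiency
1/1/09 5 15 60 NA
1/1/09 15 15 61 72
1/1/09 30 15 62 63.3
1/1/09 45 15 63 85.4"

con <- textConnection(txt)
weather_demo <- read.table(con, header = TRUE, stringsAsFactors = FALSE)
close(con)   # textConnection() hands back an open connection, so close it

str(weather_demo)
```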

That being said, for the example that you gave and the desired value
that you wanted, try

arr$gw <- as.numeric(weather$Date == arr$Date & arr$quarter %in%
weather$quarter)

(I changed DateTime to Date in the arr data frame...)

You'll get warnings like

Warning messages:
1: In is.na(e1) | is.na(e2) :
 longer object length is not a multiple of shorter object length

but it seems to do the right thing. The first equality is there to constrain matches for quarter to be within the same day.

For future reference,

dput(weather)
structure(list(Date = structure(c(1L, 1L, 1L, 1L), .Label = "1/1/09",
class = "factor"),
   minute = c(5L, 15L, 30L, 45L), hour = c(15L, 15L, 15L, 15L
   ), quarter = 60:63, efficiency = c(NA, 72, 63.3, 85.4)), .Names =
c("Date",
"minute", "hour", "quarter", "efficiency"), class = "data.frame",
row.names = c(NA,
-4L))
dput(arr)
structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1/1/09",
class = "factor"),
   weekday = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
   5L, 5L, 5L, 5L, 5L, 5L, 5L), month = c(1L, 1L, 1L, 1L, 1L,
   1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
   quarter = c(59L, 59L, 60L, 60L, 60L, 60L, 60L, 60L, 60L,
   60L, 60L, 60L, 60L, 61L, 61L, 61L, 61L, 66L, 67L), ICAO =
structure(c(6L,
   8L, 7L, 3L, 6L, 3L, 5L, 3L, 3L, 1L, 3L, 5L, 3L, 3L, 6L, 6L,
   2L, 4L, 3L), .Label = c("AAL", "AWE", "BTA", "CHQ", "CJC",
   "COA", "JBU", "NWA"), class = "factor"), Flight = structure(c(15L,
   19L, 18L, 6L, 17L, 8L, 12L, 5L, 4L, 1L, 3L, 13L, 9L, 10L,
   14L, 16L, 2L, 11L, 7L), .Label = c("AAL842", "AWE307", "BTA1234",
   "BTA2064", "BTA2085", "BTA2347", "BTA2405", "BTA2916", "BTA3072",
   "BTA3086", "CHQ5312", "CJC3225", "CJC3359", "COA1166", "COA349",
   "COA855", "COA886", "JBU554", "NWA9934"), class = "factor"),
   gw = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,
   TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE,
   FALSE)), .Names = c("Date", "weekday", "month", "quarter",
"ICAO", "Flight", "gw"), row.names = c(NA, -19L), class = "data.frame")

These can be copied and pasted directly into an R session without
modification.

HTH,
Dennis

On Sun, Jan 17, 2010 at 10:51 AM, James Rome <jamesr...@gmail.com> wrote:




   On 1/17/10 1:06 PM, David Winsemius wrote:

On Jan 17, 2010, at 12:37 PM, James Rome wrote:

I don't think it is that simple because it is not a one-to-one match. In the arr data frame, there are many arrivals in a quarter hour with good weather on a given day. So I need to match the date and the quarter hour.

And all of the rows in the weather data frame are times with good weather--unique date + quarter hour. That is why I needed the loop. For each date and quarter hour in weather, I want to mark all the entries with the corresponding date and weather as TRUE in the arr$gw column.

I did convert the dates to POSIXlt dates and rewrote my function as
gooddates = function(all, good) {
 la = length(all)   # All the arrivals
lw = length(good)  # The good 15-minute periods
for(j in 1:lw) {
  d=good$Date[j]
  q=good$quarter[j]
  all$gw[all$Date==d && all$quarter==q]=TRUE


You are attempting a vectorized test and assignment with "&&" which
seems unlikely to succeed, but even then I am not sure your problems
would be over. (I'm also guessing that you might not have reported a
warning.)
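For what it is worth, a version of the loop using the element-wise "&" might look like the sketch below. The names mark_good, all_df and good_df are hypothetical, and the toy data are invented. "&&" looks at a single value on each side (older versions of R silently used only the first element; R 4.3.0 and later stop with an error for longer vectors), so it cannot produce one TRUE/FALSE per row.

```r
# Hypothetical rewrite of the loop; all_df and good_df stand in for "all" and "good"
mark_good <- function(all_df, good_df) {
  all_df$gw <- FALSE
  # nrow() counts rows; length() on a data frame counts its columns
  for (j in seq_len(nrow(good_df))) {
    d <- good_df$Date[j]
    q <- good_df$quarter[j]
    # "&" works element by element, giving one TRUE/FALSE per arrival
    all_df$gw[all_df$Date == d & all_df$quarter == q] <- TRUE
  }
  all_df   # return the modified copy; the caller's data frame is not changed
}

# Made-up data for a quick check
good_df <- data.frame(Date = as.Date(c("2009-01-01", "2009-01-01")),
                      quarter = c(60L, 61L))
all_df  <- data.frame(Date = as.Date(rep("2009-01-01", 4)),
                      quarter = c(59L, 60L, 60L, 61L))

marked <- mark_good(all_df, good_df)
print(marked$gw)
```

The function has to return the data frame and the result has to be assigned, since R modifies a copy inside the function.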

Why shouldn't the && succeed? You are correct there: I do get items if I use either part of this "and" test on its own, but when I insert the &&, I get no hits. And I got no warnings.

Why not merge arr to gw by date and quarter?
The sets contain different data, and the only thing I want from the weather set is the fact that it has an entry for a given date and time.

Answering these questions would be greatly speeded up with a small
sample dataset. Are you aware of the virtues of the dput function?


   What I want is for a 1 to be in the gw column for quarters 60, 61, 62, 63, ...

   For example, here is some data from the good weather set:
   Date    minute  hour    quarter         Efficiency Val
   1/1/09  5       15      60
   1/1/09  15      15      61      72
   1/1/09  30      15      62      63.3
   1/1/09  45      15      63      85.4



   And this is from the arrivals set:
   DateTime        weekday         month   quarter         ICAO    Flight  gw

   1/1/09  5       1       59      COA     COA349          0
   1/1/09  5       1       59      NWA     NWA9934         0
   1/1/09  5       1       60      JBU     JBU554          0
   1/1/09  5       1       60      BTA     BTA2347         0
   1/1/09  5       1       60      COA     COA886          0
   1/1/09  5       1       60      BTA     BTA2916         0
   1/1/09  5       1       60      CJC     CJC3225         0
   1/1/09  5       1       60      BTA     BTA2085         0
   1/1/09  5       1       60      BTA     BTA2064         0
   1/1/09  5       1       60      AAL     AAL842          0
   1/1/09  5       1       60      BTA     BTA1234         0
   1/1/09  5       1       60      CJC     CJC3359         0
   1/1/09  5       1       60      BTA     BTA3072         0
   1/1/09  5       1       61      BTA     BTA3086         0
   1/1/09  5       1       61      COA     COA1166         0
   1/1/09  5       1       61      COA     COA855          0
   1/1/09  5       1       61      AWE     AWE307          0
   1/1/09  5       1       66      CHQ     CHQ5312         0
   1/1/09  5       1       67      BTA     BTA2405         0




   ______________________________________________
   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
   http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.




David Winsemius, MD
Heritage Laboratories
West Hartford, CT
