Yeah, sometimes the vocabulary we bring to a task does not match up (or "merge" properly) with the vocabulary that the developers use. In this case the merge operation is one that has a precise meaning in database lingo, which apparently you do not have background in. My experience in trying to "append" objects ran into similar frustrations early in my R endeavors. For the life of me, I could not find any instances of "append" in the index of the references I was using.

I am glad that you found that material helpful, but I think its use of the terms "join" and "merge" is incorrect in a database framework as well, so I do not think it could be used as an unambiguous guide. Your use of "combine" was likewise ambiguous. In composing questions to R-help, it is advised that you post a small example and illustrate what you want to see as a result.
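
To make the distinction concrete, here is a minimal sketch of the two operations. The data frames a, b and a2 are made up for illustration and are not from the poster's data; the point is only that merge() matches rows on key columns and adds columns, while rbind() stacks rows.

```r
# Hypothetical toy data; the names a, b and a2 are illustrative only.
a <- data.frame(ID = 1:3, x = c(10, 20, 30))
b <- data.frame(ID = 2:4, y = c("p", "q", "r"))

# merge() is a database join: rows are matched on the common key column(s)
# and the result gains columns. The default is an inner join.
inner <- merge(a, b, by = "ID")
print(inner)    # only the IDs present in both a and b

# all = TRUE requests a full outer join; unmatched cells are filled with NA.
outer <- merge(a, b, by = "ID", all = TRUE)
print(outer)    # every ID found in either a or b

# rbind() stacks rows; both data frames need the same column names.
a2 <- data.frame(ID = 4:5, x = c(40, 50))
stacked <- rbind(a, a2)
print(stacked)  # rows of a followed by rows of a2
```

In database terms the first two calls are joins and the last is closer to a union of rows, which is why the help page for merge does not describe row stacking.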

--
David.



On Feb 2, 2010, at 1:47 PM, James Rome wrote:

On 2/1/2010 5:51 PM, David Winsemius wrote:
I figured this out finally. I really believe that the R help write-ups are sorely lacking.

You should ponder whether you actually know enough to criticize the help page when it describes the merge function as performing "database join operations". My guess is that you don't. The help pages are not designed to teach basic computer programming concepts.



As soon as I looked at http://www.statmethods.net/management/merging.html , it was obvious:
Adding Columns
To merge two dataframes (datasets) horizontally, use the merge function. In most cases, you join two dataframes by one or more common key variables (i.e., an inner join).

# merge two dataframes by ID
total <- merge(dataframeA,dataframeB,by="ID")

# merge two dataframes by ID and Country
total <- merge(dataframeA,dataframeB,by=c("ID","Country"))

Adding Rows
To join two dataframes (datasets) vertically, use the rbind function. The two dataframes must have the same variables, but they do not have to be in the same order.

total <- rbind(dataframeA, dataframeB)

I needed to add rows, and had to use rbind. If the help for merge said "To merge two dataframes (datasets) horizontally" I would have known right away that it was the wrong function to use.
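
For readers following the thread, a minimal sketch of the row-stacking approach described above. The small arrgnd and rwy objects here are hypothetical stand-ins for the poster's data, which was never posted in reproducible form.

```r
# Hypothetical stand-in for arrgnd; the real data frame has many more columns.
arrgnd <- data.frame(
  Flight = c("TAI570", "AAL22", "DAL1", "JBU2"),
  Runway = c("31R", "31L", "22L", "31R"),
  stringsAsFactors = FALSE
)
rwy <- c("31R", "31L")  # user-selected runways

# One piece per runway, then stack the pieces with rbind().
# which() drops any NA comparison results instead of producing NA rows.
pieces <- lapply(rwy, function(r) arrgnd[which(arrgnd$Runway == r), ])
arrw <- do.call(rbind, pieces)
print(arrw)

# The same rows in one step, without a loop; %in% never returns NA.
arrw2 <- arrgnd[arrgnd$Runway %in% rwy, ]
print(arrw2)
```

The two results contain the same flights; only the row order differs, since the first groups rows by runway and the second keeps the original order.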

Thanks for the help,
Jim Rome


On Feb 1, 2010, at 5:30 PM, David Winsemius wrote:


On Feb 1, 2010, at 5:16 PM, James Rome wrote:

Dear kind R helpers,

I have a vector of runway names in rwy ("31R", "31L", ...; the number is user selectable). arrgnd is a data frame with data for all flights and all runways, with a Runway column. I am trying to subset arrgnd into a data frame for each selected runway, and then combine them back together using the following code:

for (j in 1:nr) {    # nr = number of user-selected runways

Safer would be:

for (j in seq_along(rwy)) {

  ar4rw = arrgnd[arrgnd$Runway==rwy[j],]

Clearer would be :

ar4rw <- subset(arrgnd, Runway == rwy[j]) # and I think the NA lines will also disappear.


  if (j == 1) {
      arrw = ar4rw
  }
  else {
      arrw = merge(arrw, ar4rw)
  }
}

You really should give us something like:

dput(rwy)
dput( head(arrgnd, 10) )
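
A minimal sketch of one plausible source of the all-NA row reported in the question, assuming (this is not confirmed anywhere in the thread) that arrgnd$Runway contains missing values. The three-row data frame is invented for illustration.

```r
# Hypothetical data: one flight has a missing Runway value.
arrgnd <- data.frame(
  Flight = c("TAI570", "AAL22", "XXX1"),
  Runway = c("31R", "31L", NA),
  stringsAsFactors = FALSE
)

# NA == "31R" evaluates to NA, and an NA in a logical row index
# produces a row in which every column is NA.
with_na <- arrgnd[arrgnd$Runway == "31R", ]
print(with_na)

# subset() treats NA in the condition as FALSE, so that row is dropped.
clean <- subset(arrgnd, Runway == "31R")
print(clean)

# dput() writes out R code that recreates the object, so readers can
# paste it into their own session.
dput(head(arrgnd, 2))
```

The pasted dput output can be assigned back to a name on the reader's side, which avoids the layout problems of printed data frames in email.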

but, the merge step gives me a data frame with all NAs. In addition, ar4rw always gets a row with NAs at the start, which I do not understand. There are no rows with all NAs in the arrgnd data frame.
> ar4rw[1:2,]  # first time through for 31R
        DateTime       Date month hour minute quarter weekday IATA ICAO Flight
NA          <NA>       <NA>    NA   NA     NA      NA      NA <NA> <NA>   <NA>
529 1/1/09 21:46 2009-01-01     1   21     46      87       5   TA  TAI TAI570
    AircraftType   Tail  Arrived   STA Runway     FromTo Delay
NA          <NA>   <NA>     <NA>  <NA>   <NA>       <NA>    NA
529         A320 N496TA 21:46:58 22:30    31R MSLP /KJFK     0
                       Operator            dq gw
NA                         <NA>          <NA> NA
529 TACA INTERNATIONAL AIRLINES 2009-01-01 87  1

> ar4rw[1:2,]   # second time through for 31L
        DateTime       Date month hour minute quarter weekday IATA ICAO Flight
NA          <NA>       <NA>    NA   NA     NA      NA      NA <NA> <NA>   <NA>
552 1/1/09 23:03 2009-01-01     1   23      3      92       5   AA  AAL  AAL22
    AircraftType   Tail  Arrived   STA Runway   FromTo Delay          Operator
NA          <NA>   <NA>     <NA>  <NA>   <NA>     <NA>    NA              <NA>
552         B762 N329AA 23:03:35 23:10    31L LAX /JFK     0 AMERICAN AIRLINES
               dq gw
NA           <NA> NA
552 2009-01-01 92  1

But after the merge, I get all NAs. What am I doing wrong?

The data layout gets mangled and I cannot tell what rows are being matched to what. Use dput to convey an unambiguous, and easily replicated example.

Thanks,
Jim Rome

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
