Hi:

I'm just ideating here (think IBM commercial...) but perhaps a graphical model approach might be worth looking into. It seems to me that Mr. Rhodes is looking for clusters of bank branches that fall under the same controlling main branch. That information is not directly available in a single variable, but can evidently be inferred from the matches between the two variables: B[i] controls A[i] if B[i] is nonempty.

In the bank 144 -> 147 -> 149 example, 144 controls 147 and 147 controls 149, so it appears that some transitive relation holds among the set of matches as well. (Why is PacMan going through my head? :)

I know next to nothing about graphical models, but I'm thinking about igraph and some of the tools in the statnet bundle to tackle this problem. Does that make sense to anyone? Alternatives?
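To make the transitive idea concrete, here is a minimal base R sketch (not igraph, just plain vectors) that follows each "controlled by" link upward until it reaches a branch with no controller. It uses the A/B values from Mike's second message. The function name find_root and the argument names are made up for illustration, and the sketch assumes every branch has at most one controller and that the links contain no cycles.

```r
# Branch codes (A) and the code of the controlling branch (B);
# NA means the branch is not controlled by any other branch.
A <- c(144, 145, 146, 147, 148, 149, 151)
B <- c(NA, NA, NA, 144, 145, 147, 148)

# Hypothetical helper: climb from one branch to its top-level controller.
# Assumes no cycles; a cycle in the data would make this loop forever.
find_root <- function(code, branch, controller) {
  parent <- controller[match(code, branch)]
  while (!is.na(parent)) {
    code <- parent
    parent <- controller[match(code, branch)]
  }
  code
}

# Top-level controller of every branch, then one group per controller.
root <- vapply(A, find_root, numeric(1), branch = A, controller = B)
groups <- split(A, root)
print(groups)
```

Each element of groups holds one main branch followed by the branches that hang off it, in the order they appear in A. If a main branch controls several separate chains, they all end up in the same group, so this gives the clusters rather than the individual chains.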
FWIW,
Dennis

On Wed, Aug 25, 2010 at 2:24 AM, Mike Rhodes <mike_simpso...@yahoo.co.uk> wrote:

> Dear Mr Petr PIKAL
>
> After reading the R code provided by you, I realized that I would never
> have figured out how this could have been done. I am going to re-read
> your code again and again to understand the logic and the commands you
> have provided.
>
> Thanks again from the heart for your kind advice.
>
> Regards
> Mike
>
> --- On Wed, 25/8/10, Petr PIKAL <petr.pi...@precheza.cz> wrote:
>
> From: Petr PIKAL <petr.pi...@precheza.cz>
> Subject: Re: [R] Odp: Finding pairs
> To: "Mike Rhodes" <mike_simpso...@yahoo.co.uk>
> Cc: r-help@r-project.org
> Date: Wednesday, 25 August, 2010, 9:01
>
> Hm
>
> r-help-boun...@r-project.org wrote on 25.08.2010 09:43:26:
>
> > Dear Mr Petr Pikal
> >
> > I am extremely sorry for the manner in which I raised the query.
> > Actually that was my first post to this R forum, and in fact I was
> > also a bit confused while drafting the query, for which I owe an
> > apology to all for consuming their precious time. Perhaps I will try
> > to redraft my query in a better way as follows.
> >
> > I have two datasets "A" and "B" containing the names of branch
> > offices of a particular bank, say XYZ plc bank. The XYZ bank has a
> > number of main branch offices (say Parent) and some small branch
> > offices falling under the purview of these main branch offices (say
> > Child).
> >
> > The datalists "A" and "B" consist of these main branch office names
> > as well as small branch office names. B is a subset of A and these
> > branch names are coded. Thus we have two datasets A and B as (again I
> > am using only a portion of a large database just to give some idea)
> >
> > A     B
> > 144
>
> ^^^^ what is here in B? Empty space?
>
> > 145
> > 146
> > 147   144
>
> How do you know that 144 from B relates to 147 in A? Is it according to
> its position? I.e. the 4th item in B belongs to the 4th item in A?
>
> > 148   145
> > 149   147
> > 151   148
> >
> > Now the branch 144 appears in A as well as in B, and in B it is
> > mapped with 147. This means branch 147 comes under the purview of
> > main branch 144. Again, 147 is controlling the branch 149 (since 147
> > also has appeared in B and is mapped with 149 of A).
> >
> > Similarly, branch 145 is controlling branch 148, which further
> > controls operations of bank branch 151, and likewise.
>
> Well, as you did not say anything about the structure of your data
>
> A<-144:151
> B<-c(rep(NA,3),144:148)   # padded with NA so both columns have 8 rows
> data.frame(A,B)
>     A   B
> 1 144  NA
> 2 145  NA
> 3 146  NA
> 4 147 144
> 5 148 145
> 6 149 146
> 7 150 147
> 8 151 148
> DF<-data.frame(A,B)
> main<-DF$A[is.na(DF$B)]
> branch1<-DF[!is.na(DF$B),]
> selected.branch1<-branch1$A[branch1$B%in%main]
> branch2<-branch1[!branch1$B%in%main,]
> selected.branch2<-branch2$A[branch2$B%in%selected.branch1]
>
> and for cbinding your data, which has an uneven number of values, see
> Jim Holtman's answer to this
>
> How to cbind DF:s with differing number of rows?
>
> Regards
> Petr
>
> > So in the end I need an output something like -
> >
> > Main Branch   Branch office1   Branch office2
> > 144           147              149
> > 145           148              151
> > 146           NA               NA
> > ......
> > ......
> >
> > I understand that again I am not able to put forward my query
> > properly. But I must thank all of you for giving a patient reading to
> > my query and for reverting back earlier. Thanks once again.
> >
> > With warmest regards
> >
> > Mike
> >
> > --- On Wed, 25/8/10, Petr PIKAL <petr.pi...@precheza.cz> wrote:
> >
> > From: Petr PIKAL <petr.pi...@precheza.cz>
> > Subject: Odp: [R] Finding pairs
> > To: "Mike Rhodes" <mike_simpso...@yahoo.co.uk>
> > Cc: r-help@r-project.org
> > Date: Wednesday, 25 August, 2010, 6:39
> >
> > Hi
> >
> > without other details it is probably impossible to give you any
> > reasonable advice. Do you have your data already in R? What is their
> > form? Are they in 2 columns in a data frame? How did you get them
> > paired?
> >
> > So without some more information probably nobody will invest his time,
> > as it seems not trivial to me.
> >
> > Regards
> > Petr
> >
> > r-help-boun...@r-project.org wrote on 24.08.2010 20:28:42:
> >
> > > Dear R Helpers,
> > >
> > > I am a newbie and recently got introduced to R. I have a large
> > > database containing the names of bank branch offices along with
> > > other details. I am into Operational Risk as envisaged by the BASEL
> > > II Accord.
> > >
> > > I am trying to express my problem and I am using only indicative
> > > data, which comes in coded format.
> > >
> > > A (branch)   B (controlled by)
> > > 144
> > > 145
> > > 146
> > > 147          144
> > > 148          145
> > > 149          147
> > > 151          146
> > > ......       .......
> > > ......       .......
> > >
> > > where 144 etc. are branch codes in a given city and B is a subset
> > > of A.
> > >
> > > If a branch code appearing in "A" also appears in "B" (where it is
> > > paired with some other element of A, e.g. 144 appearing in A also
> > > appears in "B" and is paired with 147 of "A", and likewise), then
> > > that means 144 is controlling operations of bank office 147. Again,
> > > 147 itself appears again in B and is paired with bank branch coded
> > > 149. Thus, 149 is controlled by 147 and 147 is controlled by 144.
> > > Likewise there are more than 700 hundred branch name codes
> > > available.
> > >
> > > My objective is to group them as follows -
> > >
> > > Bank Branch
> > >
> > > 144 147 149
> > >
> > > 145
> > >
> > > 146 151
> > >
> > > 148
> > > .....
> > >
> > > or even the following output will do.
> > >
> > > 144
> > > 147
> > > 149
> > >
> > > 145
> > >
> > > 146
> > > 151
> > >
> > > 148
> > > 151
> > > ......
> > >
> > > I understand I should be writing some R code to begin with, which I
> > > had tried also, but as of now I am helpless. Please guide me.
> > >
> > > Mike
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.