All, I have found (from using the Admin/Analysis page) that if I append unique initials (that don't match any other word or acronym) to each pronoun (e.g. I-WCN, she-WCN, my-WCN, etc.), the default parsing and tokenization for the text field in SOLR might actually do the trick -- it parses down to I, wcn, IWCN, i, idgn -- all at the same word position -- so that is perfect. I haven't exhaustively tested all capitalization nuances, but am not too worried about that.
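To make the idea concrete, here is a minimal sketch in Python of what "all at the same word position" buys you. It is only a toy imitation of the behaviour described above, not SOLR's actual analysis chain; the function names `analyze` and `build_index` and the sample sentence are made up for illustration.

```python
import re
from collections import defaultdict

def analyze(text):
    """Toy analyzer returning (position, term) pairs.

    A hyphenated token such as "she-WCN" yields each of its parts plus
    their concatenation, all lowercased and all at the same position.
    This imitates the described output; it is NOT the real SOLR analyzer.
    """
    out = []
    tokens = re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", text)
    for pos, raw in enumerate(tokens):
        parts = raw.split("-")
        terms = {p.lower() for p in parts}
        if len(parts) > 1:
            terms.add("".join(parts).lower())
        for term in sorted(terms):
            out.append((pos, term))
    return out

def build_index(text):
    """Map each term to the list of word positions where it occurs."""
    index = defaultdict(list)
    for pos, term in analyze(text):
        index[term].append(pos)
    return index

index = build_index("Then she-WCN gave my-WCN book to him-DGN")
print(index["wcn"])  # [1, 3]: every pronoun tagged WCN
print(index["she"])  # [1]: the plain pronoun is still searchable
```

Because the pronoun and the initials share one position, the tag does not shift the positions of the surrounding words, which is what phrase lookups depend on.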
If I want to do an exhaustive search for person WCN, I just have to enter his/her initials and then can get all references, including pronouns? Anybody see any holes in this? (Sounds alarmingly easy so far.)

Dave

----- Original Message ----
From: David Neubert <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Monday, November 12, 2007 3:04:20 PM
Subject: Re: Associating pronouns instances to proper nouns?

Attempting to answer my own question, which I should probably just try, assuming I can doctor the indexed text -- I suppose I could do something like change all instances of I, he, etc. that refer to one person to IJBA, HEJBA, HIMJBA (making sure they would never equal a normal word) -- then use the synonym feature to link IJBA, HEJBA, HIMJBA, Joe Book Author, J.B.Author. Even if this were a good approach, though, I don't know if you can link synonyms for phrases as opposed to a single word. And of course this would require a correlative translation mechanism at display time to render I, he, him instead of the indexed acronym.

----- Original Message ----
From: David Neubert <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Monday, November 12, 2007 2:54:11 PM
Subject: Associating pronouns instances to proper nouns?

All,

I am working with very exact text and search over permanent documents (books). It would be great to associate pronouns like he, she, him, her, I, my, etc. with the actual author or person the pronoun refers to. I can see how I could get pretty darn close with the synonym feature in Lucene. Unfortunately though, as I understand it, this would associate all instances of I, he, she, etc. instead of particular instances. I have come up with a crude mechanism, adding the initials for the referred person immediately after the pronoun ... him{DGN}, but this of course complicates word counts and potential phrase lookups, etc. (which I could probably live with and work around).
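A minimal sketch of how the him{DGN} marking could be worked around for display and word counts. The marker format is taken from the example above; the regular expression, the helper names `for_display` and `references`, and the sample sentence are assumptions made only for illustration.

```python
import re

# Assumed marker format, following the him{DGN} example: a word
# followed directly by upper-case initials in curly braces.
TAG = re.compile(r"(\w+)\{([A-Z]+)\}")

def for_display(text):
    """Strip the {INITIALS} markers so the reader sees plain pronouns."""
    return TAG.sub(r"\1", text)

def references(text):
    """Return (pronoun, initials) pairs in order of appearance."""
    return TAG.findall(text)

sample = "I{DGN} gave him{WCN} my{DGN} notes"
print(for_display(sample))               # I gave him my notes
print(len(for_display(sample).split()))  # 5: word count unaffected by the tags
print(references(sample))                # [('I', 'DGN'), ('him', 'WCN'), ('my', 'DGN')]
```

Keeping the tagged text as the stored form and deriving the display form from it means the word counts can always be taken from the stripped version.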
But after understanding how easy it is to add synonyms for any particular word in a document, is there any standard practical way to add synonyms to a particular word instance within a document? That would really do the trick.

Dave

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com