Hello Gert,

I think you'd have to apply a custom heuristic that looks at the top N hits for each query and measures the percentage overlap between the two hit lists.
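A rough sketch of that heuristic in Python, in case it helps. It assumes you have already run both queries and collected the ranked document ids of each (for example by requesting only the id field with rows set to N). The function names, the default N of 100, and the 0.8 threshold are all made up for illustration and would need tuning against real data.

```python
def top_n_overlap(ids_a, ids_b, n=100):
    """Return the fraction of top-n document ids shared by two ranked hit lists."""
    top_a = set(ids_a[:n])
    top_b = set(ids_b[:n])
    if not top_a and not top_b:
        # Assumption: two queries that both return nothing count as identical.
        return 1.0
    # Divide by the larger set so a short hit list cannot look like a full match.
    return len(top_a & top_b) / max(len(top_a), len(top_b))


def is_similar(ids_a, ids_b, n=100, threshold=0.8):
    """Hypothetical decision rule: treat queries as similar above a chosen overlap."""
    return top_n_overlap(ids_a, ids_b, n) >= threshold


if __name__ == "__main__":
    saved_query_hits = ["d1", "d2", "d3", "d4"]
    new_query_hits = ["d2", "d3", "d4", "d5"]
    print(top_n_overlap(saved_query_hits, new_query_hits, n=4))
    print(is_similar(saved_query_hits, new_query_hits, n=4))
```

This only compares which documents appear in the top N, not their order. If ranking matters for the digest, a rank-aware measure would be a reasonable next step.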
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: "Villemos, Gert" <gert.ville...@logica.com>
> To: solr-user@lucene.apache.org
> Sent: Fri, April 23, 2010 10:20:54 AM
> Subject: Comparing two queries
>
> We want to let a user register interest in information, based on a
> query he has defined himself. For example, he types in a query, presses
> a save button, and provides his email address, and the system then emails
> him a daily digest.
>
> As part of this, it would be nice to be able to tell the user that the
> same or a similar query is already being monitored by another user, as
> the two users will likely have the same interests. I would therefore like
> to evaluate whether two queries will return (almost) the same set of
> results.
>
> But how can I compare two queries to determine whether they will return
> (almost) the same set of results?
>
> Thanks,
> Gert.