Leading wildcard search is called grep ;-) Ditto on the suggestion to index reversed words.
Can you create a second field in Solr that contains /only/ the words from the fields you care to reverse? Once you do that, you could pre-process the query, look for leading wildcards, and run those (after reversing the query term) only against your special reversed-metadata field.

The *foo* case really is grep! Almost by definition you have to scan the index linearly unless some magic is added. One option is to extend Otis' n-gram suggestion and turn a word like "baffoonery" into the following suffixes (stored in a "meta field"):

baffoonery
affoonery
ffoonery
foonery
oonery
onery
nery
ery
ry

Now you can take a query like "*foo*", drop the leading wildcard, and the remaining prefix query "foo*" will hit on 'foonery'. Make sense? You are trading index size for not doing a linear scan like grep.

It's not advisable to do this for every word in your document set ;-)

- Neal Richter

On Wed, Jan 28, 2009 at 12:19 AM, Jana, Kumar Raja <kj...@ptc.com> wrote:
> Hi,
>
> Thanks Otis, Newton and everyone else for the help on this issue.
>
> Most of the data I index are documents like pdfs, word Docs, open office
> documents, etc. I store the content of the document in a field called
> content and the remaining metadata of the document like name, id,
> created by, modified by, created on, etc in a copy field called
> metadata. I am not particularly interested in enabling leading wildcard
> characters in the content (although such a possibility would be a
> bonus). For this, I've tried implementing the suggestion to store
> reverse strings as well as the correct strings for the metadata field.
> All leading wildcard queries like "*abc" are searched as "cba*" against
> the reversed metadata field. So far so good. Thank you :)
>
> But now, I ran into the scenario where the query string is *abc* :( and
> the whole thing came down crashing again. I cannot ignore such queries.
> I would rather take the risk of Solr OOMing by enabling the leading
> wildcard query searches.
>
> Can someone please tell me the steps to turn on this feature in Lucene
> QueryParser? I am sure it will be helpful to many to document such a
> procedure on the Wiki or somewhere else. (I am definitely going to do
> that once I fix this. Too much trouble this seems to be)
> Also, which queryParser does Solr use by default?
>
> Thanks,
> Kumar
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> Sent: Thursday, January 15, 2009 10:18 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Customizing Solr to handle Leading Wildcard queries
>
> Hi ramuK,
>
> I believe you can turn that "on" via the Lucene QueryParser, but of
> course such searches will be slo(oo)w. You can also index reversed
> tokens (e.g. *kumar --> ramuk*) or you could index n-grams with
> begin/end delim characters (e.g. kumar -> ^ k u m a r $, *kumar -> "k u
> m a r $")
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
>> From: "Jana, Kumar Raja" <kj...@ptc.com>
>> To: solr-user@lucene.apache.org
>> Sent: Thursday, January 15, 2009 9:49:24 AM
>> Subject: RE: Customizing Solr to handle Leading Wildcard queries
>>
>> Hi Erik,
>>
>> Thanks for the quick reply.
>> I want to enable leading wildcard query searches in general. The case
>> mentioned in the earlier mail is just one of the many instances I use
>> this feature.
>>
>> -Kumar
>>
>> -----Original Message-----
>> From: Erik Hatcher [mailto:e...@ehatchersolutions.com]
>> Sent: Thursday, January 15, 2009 7:59 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Customizing Solr to handle Leading Wildcard queries
>>
>>
>> On Jan 15, 2009, at 8:23 AM, Jana, Kumar Raja wrote:
>> > Not being able to perform Leading Wildcard queries is a major
>> > handicap.
>> > I want to be able to perform searches like *.pdf to fetch all pdf
>> > documents from Solr.
>>
>> For this particular case, I recommend indexing the document type as a
>> separate field. Something like type:pdf (or use a MIME type string).
>> Then you can do a very direct and fast query to search or facet by
>> document types.
>>
>> Erik
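
For anyone finding this thread later, here is a rough Python sketch of the two tricks discussed above: expanding a word into its suffixes at index time, and pre-processing a wildcard term at query time. It is only an illustration, not Solr or Lucene code. The field names meta_suffix and meta_reversed are made up for the example (metadata is the field name Kumar mentioned), and the min_len cutoff is an assumption chosen so the output matches the "baffoonery" list.

```python
def suffixes(word, min_len=2):
    """Return every suffix of `word` with at least `min_len` characters,
    longest first. These are the tokens to store in the suffix field."""
    return [word[i:] for i in range(len(word) - min_len + 1)]


def rewrite_query(term):
    """Map a single wildcard term to a (field, query) pair.

    Field names are hypothetical:
      meta_suffix   - holds the suffix tokens produced by suffixes()
      meta_reversed - holds each metadata word reversed
      metadata      - the ordinary metadata field
    """
    leading = term.startswith("*")
    trailing = term.endswith("*")
    core = term.strip("*")
    if not core:
        # A bare "*" has nothing to rewrite; leave it alone.
        return ("metadata", term)
    if leading and trailing:
        # *foo* -> prefix query foo* against the suffix field
        return ("meta_suffix", core + "*")
    if leading:
        # *abc -> prefix query cba* against the reversed field
        return ("meta_reversed", core[::-1] + "*")
    # No leading wildcard: query the normal field unchanged
    return ("metadata", term)


if __name__ == "__main__":
    print(suffixes("baffoonery"))
    for t in ("*foo*", "*abc", "abc*"):
        print(t, "->", rewrite_query(t))
```

Both rewritten forms are plain prefix queries, which is the point: the cost moves from query time (a grep-like scan) to index size, so this is best kept to a small metadata field rather than full document content.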