Custom Field Type
Hello,

I have a multivalued field in my schema of type text_ws; its values are of the form #int #int. I need to be able to query on the first number and sort on the second, which does not seem to be supported out of the box.

I looked around for a while, and the recommended approach seems to be to create a custom field type and implement this logic in the getSortField method. But since the field is multivalued, I need to sort by the value I am searching for, so I need access to the query currently being executed.

Also, I can't figure out the correct -classpath to give javac so that it finds the packages needed to compile the class (my Java is a bit rusty, to say the least).

Thanks,
Fouad
Re: Custom Field Type
Hello Yonik,

Thanks for your help, but I am not really sure I follow. It is possible to use the PatternTokenizerFactory with pattern = (\d+) and group = 0 to tokenize the input correctly, but I don't see how to use copyField to achieve sorting. I read the documentation and this does not seem to be possible.

Are there any performance implications of using dynamic fields? Could you please elaborate on your idea?

Thanks again
/Fouad

On Wed, Mar 4, 2009 at 8:12 PM, Yonik Seeley wrote:
> On Wed, Mar 4, 2009 at 12:24 PM, Fouad Mardini wrote:
> > I have a multivalued field in my schema of type text_ws, values are of the
> > form #int #int
> > I need to be able to query on the first and sort on the second, this does
> > not seem to be enabled out of the box
>
> Can you put the two numbers in separate fields for this purpose?
> If you can't do it from the indexer, a schema with copyField in
> conjunction with PatternTokenizerFactory could do it.
>
> -Yonik
> http://www.lucidimagination.com
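Yonik's first suggestion, putting the two numbers in separate fields from the indexer, could be sketched roughly as follows. This is only a sketch under assumptions not stated in the thread: each value is taken to be two whitespace-separated integers (with an optional leading #), and the field names query_key and sort_key_<n> are hypothetical. Writing one single-valued sort field per first number is one possible way to sort by the value being searched for.

```python
import re

# Assumed value format: two whitespace-separated integers, e.g. "12 34"
# (a leading "#" on either number is tolerated).
PAIR = re.compile(r"^\s*#?(\d+)\s+#?(\d+)\s*$")

def split_pairs(values):
    """Turn ["12 34", "7 5"] into a flat dict of fields for one document.

    Field names are hypothetical: "query_key" collects the first numbers,
    and each pair also yields a single-valued "sort_key_<first>" field
    holding the second number.
    """
    doc = {"query_key": []}
    for value in values:
        match = PAIR.match(value)
        if match is None:
            raise ValueError(f"unexpected value: {value!r}")
        first, second = int(match.group(1)), int(match.group(2))
        doc["query_key"].append(first)
        doc[f"sort_key_{first}"] = second
    return doc

fields = split_pairs(["12 34", "7 5"])
```

A search for the first number 12 would then query query_key:12 and sort on sort_key_12. Whether the resulting number of distinct dynamic fields is acceptable depends on the data.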
Dismax request handler and highlighting
Hello,

I am having problems with the dismax request handler and highlighting. The following query works as intended:

http://localhost:8983/solr/select?indent=on&q=myquery&start=0&rows=10&fl=id%2Cscore&qt=standard&wt=standard&hl=true&hl.fl=myfield

whereas this one does not:

http://localhost:8983/solr/select?indent=on&q.alt=myquery&start=0&rows=10&fl=id%2Cscore&qt=dismax&wt=standard&hl=true&hl.fl=myfield

I am using dismax since I need boost functions. Furthermore, using the q parameter with dismax doesn't seem to work for me; debug gives the following output:

myquery +() ()

Is there a setting somewhere that I need to set? I am building Solr straight out of svn.

Thanks,
Fouad
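One thing worth checking, offered as an assumption rather than a diagnosis confirmed in the thread: the dismax parser searches only the fields listed in its qf parameter, and an empty parsed query such as +() () is what one would expect if qf is missing or names no usable field. The sketch below builds the same request with an explicit qf. The host, handler, and myfield come from the URLs above; the helper name dismax_url is hypothetical.

```python
from urllib.parse import urlencode

def dismax_url(query, field, base="http://localhost:8983/solr/select"):
    """Build a dismax request URL with an explicit qf and highlighting on."""
    params = [
        ("indent", "on"),
        ("q", query),
        ("qf", field),       # fields dismax should search (assumed missing above)
        ("start", 0),
        ("rows", 10),
        ("fl", "id,score"),  # urlencode escapes the comma as %2C
        ("qt", "dismax"),
        ("wt", "standard"),
        ("hl", "true"),
        ("hl.fl", field),
    ]
    return base + "?" + urlencode(params)

url = dismax_url("myquery", "myfield")
```

If qf is already set in the handler defaults in solrconfig.xml, passing it on the URL simply overrides that default for the one request, which makes it a cheap way to test the hypothesis.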
Indexing large documents
Hello,

I am using Solr to index text extracted from Word documents, and it is working really well. Recently I started noticing that some documents are not indexed; that is, I know that the word foobar is in a document, but when I search for foobar the id of that document is not returned. I suspect that this has to do with the size of the document, and that documents with a lot of text are not being indexed.

Please advise.

Thanks,
fmardini
Re: Indexing large documents
Well, I am using the Java textmining library to extract text from the documents, then I do a POST to Solr. I do not have an error log; I only have *.request.log files in the logs directory.

Thanks

On 8/20/07, Peter Manis wrote:
> Fouad,
>
> I would check the error log or console for any possible errors first.
> They may not show up, it really depends on how you are processing the
> word document (custom solr, feeding the text to it, etc). We are
> using a custom version of solr with PDF, DOC, XLS, etc text extraction
> and I have successfully indexed 40mb documents. I did have indexing
> problems with a large document or two and simply increasing the heap
> size fixed the problem.
>
> - Pete
>
> On 8/20/07, Fouad Mardini wrote:
> > Hello,
> >
> > I am using solr to index text extracted from word documents, and it is
> > working really well.
> > Recently i started noticing that some documents are not indexed, that is i
> > know that the word foobar is in a document, but when i search for foobar the
> > id of that document is not returned.
> > I suspect that this has to do with the size of the document, and that
> > documents with a lot of text are not being indexed.
> > Please advise.
> >
> > thanks,
> > fmardini
Re: Indexing large documents
Thanks, I reindexed the documents and now it works; it seems there was an issue with text extraction. I also changed maxFieldLength, which must have helped.

Thanks

On 8/20/07, Pieter Berkel wrote:
> You will probably need to increase the value of maxFieldLength in your
> solrconfig.xml. The default value is 10000, which might explain why your
> documents are not being completely indexed.
>
> Piete
>
> On 20/08/07, Peter Manis wrote:
> > That should show some errors if something goes wrong; if not, the
> > console usually will. The errors will look like a java stacktrace
> > output. Did increasing the heap do anything for you? Changing mine
> > to 256mb max worked fine for all of our files.
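The effect of maxFieldLength can be illustrated with a small sketch. It assumes plain whitespace tokenization, which only approximates what a real Solr analyzer does, and the function names and limits are illustrative: a term whose first occurrence lies beyond the limit never reaches the index, so a search for it will not return the document.

```python
def first_token_position(text, term):
    """Return the 0-based position of the first token equal to term, or None."""
    wanted = term.lower()
    for position, token in enumerate(text.split()):
        if token.lower() == wanted:
            return position
    return None

def is_searchable(text, term, max_field_length):
    """True if term occurs within the first max_field_length tokens."""
    position = first_token_position(text, term)
    return position is not None and position < max_field_length

# A long document whose only "foobar" comes after 20000 other tokens.
document = "word " * 20000 + "foobar"
```

With a limit of 10000 tokens, is_searchable(document, "foobar", 10000) is false because the term is cut off; raising the limit above 20000 makes it searchable again.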