lucene document via JSON
Hi,

Is it possible to add, update, and delete documents in JSON format? My need is mostly for updates: I would like to let users update certain fields of an existing result.

Another solution is to let the user save the change in a DB and then have the server convert it and post XML to Solr, but that is not so fancy :)

Thanks
Anton
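For the XML fallback mentioned in the question, a minimal sketch of the server-side conversion might look like the following. It builds a Solr XML <add> message from a map of field values using only the JDK. The class and method names are made up for illustration, and the sample field names (id, title) are assumptions. As far as I know, Solr releases of this period replace the whole document on update, so the message would need to carry every field of the document (including its unique key), not just the edited ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: turns field values into a Solr XML <add> message. Names are hypothetical. */
final class SolrXmlUpdate {

    /** Escapes the five characters that are special in XML text and attribute values. */
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds <add><doc>...</doc></add> with one <field> element per map entry. */
    static String toAddXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
               .append(escapeXml(e.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "42");              // assumed unique key field
        doc.put("title", "Tom & Jerry");  // assumed user-edited field
        System.out.println(toAddXml(doc));
    }
}
```

The resulting string would be posted to the update handler and followed by a commit; the HTTP part is left out here because it depends on the deployment.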
highlight results from pdf search
Hi.

I have some PDF documents indexed through Solr Cell. My highlighting queries work fine on standard XML doc types, e.g. the samples. I would now like to highlight some queries on a PDF document. Currently, for my simple examples, I am just indexing a PDF, providing an id and an arbitrary ext.literal. I would like to be able to get highlighted snippets back from the extracted content of the PDF. Is this possible?

Thanks in advance for your help,
- Ross

--
View this message in context: http://www.nabble.com/highlight-results-from-pdf-search-tp23791905p23791905.html
Sent from the Solr - User mailing list archive at Nabble.com.
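No answer to this question appears in this part of the archive, so the following is only a sketch under assumptions. As far as I know, Solr highlighting needs the highlighted field to be stored, so the text extracted from the PDF would have to be mapped into a stored field. The field name "content" below is a placeholder, not something taken from the thread. The hl=true and hl.fl request parameters are the standard highlighting switches; the helper just assembles them into a query string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch only: assembles a query string that asks Solr for highlighted snippets. */
final class HighlightQuery {

    /** URL-encodes one parameter value as UTF-8. */
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /**
     * userQuery   - the search text
     * storedField - assumed name of the stored field holding the extracted PDF text
     */
    static String build(String userQuery, String storedField) {
        return "q=" + enc(userQuery)
             + "&hl=true"                      // turn highlighting on
             + "&hl.fl=" + enc(storedField);   // field(s) to take snippets from
    }

    public static void main(String[] args) {
        // "content" is a hypothetical field name.
        System.out.println(build("solr cell", "content"));
    }
}
```

If the snippets section comes back empty for PDF documents, checking whether the target field is declared stored="true" in schema.xml would be a reasonable first step.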
Re: java.lang.RuntimeException: after flush: fdx size mismatch
Whoops, here's the patch (I added you directly on the "To:" so that you get the patch; Apache's list manager strips patches).

Yes, if the fdx file is getting deleted out from under Lucene, that'd also explain what's happening. Though the timing would have to be very quick. What's happening is Lucene had opened _X.fdx for writing, written some small # bytes to it, and then closed it but found the file no longer exists.

I'm not familiar with what exactly happens when you create/unload Solr cores and move them around machines; does this involve moving files from one machine to another (i.e., deleting files)? If so, is there some way to log when such migrations take place and try to correlate to this exception?

Mike

On Fri, May 29, 2009 at 8:29 PM, James X wrote:
> Hi Mike,
>
> I don't see a patch file here?
>
> Could another explanation be that the fdx file doesn't exist yet / has been
> deleted from underneath Lucene?
>
> I'm constantly CREATE-ing and UNLOAD-ing Solr cores, and more importantly,
> moving the bundled cores around between machines. I find it much more likely
> that there's something wrong with my core admin code than there is with the
> Lucene internals :) It's possible that I'm occasionally removing files which
> are currently in use by a live core...
>
> I'm using an ext3 filesystem on a large EC2 instance's own hard disk. I'm
> not sure how Amazon implement the local hard disk, but I assume it's a real
> hard disk exposed by the hypervisor.
>
> Thanks,
> James
>
> On Fri, May 29, 2009 at 3:41 AM, Michael McCandless <luc...@mikemccandless.com> wrote:
>
>> Very interesting: FieldsWriter thinks it's written 12 bytes to the fdx
>> file, yet the directory says the file does not exist.
>>
>> Can you re-run with this new patch? I'm suspecting that FieldsWriter
>> wrote to one segment, but somehow we are then looking at the wrong
>> segment. The attached patch prints out which segment FieldsWriter
>> actually wrote to.
>>
>> What filesystem & underlying IO system/device are you using?
>>
>> Mike
>>
>> On Thu, May 28, 2009 at 10:53 PM, James X wrote:
>> > My apologies for the delay in running this patched Lucene build - I was
>> > temporarily pulled onto another piece of work.
>> >
>> > Here is a sample 'fdx size mismatch' exception using the patch Mike
>> > supplied:
>> >
>> > SEVERE: java.lang.RuntimeException: after flush: fdx size mismatch: 1 docs
>> > vs 0 length in bytes of _1i.fdx exists=false didInit=false inc=0 dSO=1
>> > fieldsWriter.doClose=true fieldsWriter.indexFilePointer=12
>> > fieldsWriter.fieldsFilePointer=2395
>> >     at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:96)
>> >     at org.apache.lucene.index.DocFieldConsumers.closeDocStore(DocFieldConsumers.java:83)
>> >     at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:47)
>> >     at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:367)
>> >     at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:567)
>> >     at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3540)
>> >     at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3450)
>> >     at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1638)
>> >     at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1602)
>> >     at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1578)
>> >     at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:153)
>> >
>> > Will now run with assertions enabled and see how that affects the
>> > behaviour!
>> >
>> > Thanks,
>> > James
>> >
>> > -- Forwarded message --
>> > From: James X
>> > Date: Thu, May 21, 2009 at 2:24 PM
>> > Subject: Re: java.lang.RuntimeException: after flush: fdx size mismatch
>> > To: solr-user@lucene.apache.org
>> >
>> > Hi Mike,
>> >
>> > Documents are web pages, about 20 fields, mostly strings, a couple
>> > of integers, booleans and one html field (for document body content).
>> >
>> > I do have a multi-threaded client pushing docs to Solr, so yes, I suppose
>> > that would mean I have several active Solr worker threads.
>> >
>> > The only exceptions I have are the RuntimeException flush errors, followed
>> > by a handful (normally 10-20) of LockObtainFailedExceptions, which i
>> > presumed were being caused by the faulty threads dying and failing to
>> > release locks.
>> >
>> > Oh wait, I am getting WstxUnexpectedCharException exceptions every now and
>> > then:
>> > SEVERE: com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character
>> > ((CTRL-CHAR, code 8))
>> > at [row,col {unknown-source}]: [1,26070]
>> >     at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:675)
>> >     at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4668)
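A side note on the WstxUnexpectedCharException in the quoted text: it reports a control character with code 8 (backspace), which XML 1.0 does not allow in a document. The thread does not propose a fix for it, and it is a separate matter from the fdx size mismatch. One common workaround, sketched here as an assumption about what the posting client could do, is to drop characters outside the XML 1.0 character range before building the update message. The class name is made up for illustration.

```java
/** Sketch only: removes code points that XML 1.0 does not allow in a document. */
final class XmlSanitizer {

    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the input with every invalid code point (e.g. backspace, code 8) removed. */
    static String strip(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // step over surrogate pairs as one unit
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "\b" is the backspace character reported in the exception above.
        System.out.println(strip("page\btitle"));
    }
}
```

Applying the filter to each field value before it is written into the XML keeps tabs and newlines intact while removing the characters the parser rejects.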
Re: lucene document via JSON
I have found this: https://issues.apache.org/jira/browse/SOLR-945

It seems like this might solve the problem. Interestingly, it's also faster!

Question: is there any specific reason this is not in the trunk? Also, does this mean that once the issue is sorted out, the Data Import Handler will also benefit from it? I can't wait to see this happen.

Cheers

--- On Sat, 2009-05-30, Antonio Eggberg wrote:

> From: Antonio Eggberg
> Subject: lucene document via JSON
> To: solr-user@lucene.apache.org
> Date: Saturday, 30 May 2009, 09:41
>
> Hi,
>
> Is adding/updating/deleting in JSON format possible?
> actually my need is mostly update I like to let user update
> certain fields of an existing results?
>
> Another solution is I let user save it in DB and then
> server convert/post XML to Solr.. but not so fancy :)
>
> Thanks
> Anton
When searching for !@#$%^&*() all documents are matched incorrectly
Hi,

I'm running Solr 1.3/Java 1.6.

When I run a query like

(activity_type:NAME) AND title:(\...@#$%\^&\*\(\))

all the documents are returned even though there is not a single match. There is no title that matches the string (which has been escaped).

My document structure is as follows:

NAME
Bathing

The title field is of type text_title, which is described below.

When I run the query against Luke, no results are returned. Any suggestions are appreciated.
how to do exact search with solrj
Hi,

I want to search for "hello the world" in the "title" field using SolrJ. I set the filter and the query like this:

query.addFilterQuery("title");
query.setQuery("hello the world");

but it also returns results that are not exact matches.

I know one way to do it is to set the "title" field to string instead of text. But is there any other way I can do it? If I do the search through the Solr Admin web interface with title:"hello the world", it returns exact matches.

Thanks.

JB
Re: When searching for !@#$%^&*() all documents are matched incorrectly
Two key things to try (for anyone ever wondering why a query matches documents):

1. Add &debugQuery=true and look at the explain text below -- anything that contributed to the score is listed there.
2. Check /admin/analysis.jsp -- this will let you see how analyzers break text up into tokens.

Not sure offhand, but I'm guessing the WordDelimiterFilterFactory has something to do with it...

On Sat, May 30, 2009 at 5:59 PM, Sam Michaels wrote:
> Hi,
>
> I'm running Solr 1.3/Java 1.6.
>
> When I run a query like - (activity_type:NAME) AND title:(\...@#$%\^&\*\(\))
> all the documents are returned even though there is not a single match.
> There is no title that matches the string (which has been escaped).
>
> My document structure is as follows
>
> NAME
> Bathing
>
> The title field is of type text_title which is described below.
>
> positionIncrementGap="100">
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
> ignoreCase="true" expand="true"/>
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>
> When I run the query against Luke, no results are returned. Any suggestions
> are appreciated.
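To try the first suggestion from a client, the query has to be escaped for the query parser and then URL-encoded before debugQuery=true is appended. The sketch below does both with the JDK only. The base URL is an assumption about a local install, and the class name is made up. The escape helper backslash-escapes the characters the Lucene query syntax treats as special; SolrJ ships a similar helper (ClientUtils.escapeQueryChars, if I remember the name correctly), which would be preferable where SolrJ is already on the classpath.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch only: builds a /select URL with debugQuery=true for a raw user string. */
final class DebugQueryUrl {

    // Characters with special meaning in the Lucene query syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    /** Puts a backslash in front of every query-syntax special character. */
    static String escape(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() * 2);
        for (char c : raw.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Appends the URL-encoded query and debugQuery=true to the given Solr base URL. */
    static String selectUrl(String solrBase, String query) {
        try {
            return solrBase + "/select?q=" + URLEncoder.encode(query, "UTF-8")
                 + "&debugQuery=true";
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // The base URL is an assumption; adjust to the actual deployment.
        String q = "title:(" + escape("!@#$%^&*()") + ")";
        System.out.println(selectUrl("http://localhost:8983/solr", q));
    }
}
```

Escaping only protects the string from the query parser. If the field's analyzer then discards every character, as guessed above for WordDelimiterFilterFactory, the title clause may end up empty, and the explain output from debugQuery should show whether that is what happens.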
Re: When searching for !@#$%^&*() all documents are matched incorrectly
I'm really curious. What is the most relevant result for that query?

wunder

On 5/30/09 7:35 PM, "Ryan McKinley" wrote:

> two key things to try (for anyone ever wondering why a query matches
> documents)
>
> 1. add &debugQuery=true and look at the explain text below --
> anything that contributed to the score is listed there
> 2. check /admin/analysis.jsp -- this will let you see how analyzers
> break text up into tokens.
>
> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
> something to do with it...
Re: how to do exact search with solrj
query.setQuery("title:hello the world") is what you need.

Cheers
Avlesh

On Sun, May 31, 2009 at 6:23 AM, Jianbin Dai wrote:
> Hi,
>
> I want to search "hello the world" in the "title" field using solrj. I set
> the query filter
> query.addFilterQuery("title");
> query.setQuery("hello the world");
>
> but it returns not exact match results as well.
>
> I know one way to do it is to set "title" field to string instead of text.
> But is there any way i can do it? If I do the search through web interface
> Solr Admin by title:"hello the world", it returns exact matches.
>
> Thanks.
>
> JB
Re: how to do exact search with solrj
I tried, but it seems it's not working right.

--- On Sat, 5/30/09, Avlesh Singh wrote:

> From: Avlesh Singh
> Subject: Re: how to do exact search with solrj
> To: solr-user@lucene.apache.org
> Date: Saturday, May 30, 2009, 10:56 PM
>
> query.setQuery("title:hello the world") is what you need.
>
> Cheers
> Avlesh
Re: how to do exact search with solrj
Do you need an exact match for all three tokens? If yes, try

query.setQuery("title:\"hello the world\"");

Cheers
Avlesh

On Sun, May 31, 2009 at 12:12 PM, Jianbin Dai wrote:
> I tried, but seems it's not working right.
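The reason the quotes matter: with the standard query syntax, title:hello the world applies the title: prefix only to the first term, and the remaining terms are searched in the default field. Wrapping the words in double quotes makes the whole phrase belong to the title field. A small helper along these lines (the class and method names are made up for illustration) builds such a phrase query string and escapes any quotes or backslashes inside the user's text.

```java
/** Sketch only: builds a field-scoped phrase query string for the Lucene/Solr query syntax. */
final class PhraseQueryBuilder {

    /** Returns field:"text", escaping backslashes and double quotes inside text. */
    static String phrase(String field, String text) {
        String escaped = text.replace("\\", "\\\\")   // escape backslashes first
                             .replace("\"", "\\\"");  // then embedded double quotes
        return field + ":\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // The result can be passed to SolrJ as query.setQuery(...), as in the reply above.
        System.out.println(phrase("title", "hello the world"));
    }
}
```

Note that a phrase query against a text field still goes through that field's analyzer (lowercasing, stop words and so on), so it matches the analyzed phrase, not the raw string. A string field, as mentioned in the original question, remains the option for a strict whole-value match.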
Re: how to do exact search with solrj
That's correct! Thanks Avlesh.

--- On Sat, 5/30/09, Avlesh Singh wrote:

> From: Avlesh Singh
> Subject: Re: how to do exact search with solrj
> To: solr-user@lucene.apache.org
> Date: Saturday, May 30, 2009, 11:45 PM
>
> You need exact match for all the three tokens?
> If yes, try query.setQuery("title:\"hello the world\"");
>
> Cheers
> Avlesh