lucene document via JSON

2009-05-30 Thread Antonio Eggberg

Hi,

Is adding, updating, and deleting documents in JSON format possible? My main 
need is updating: I would like to let users update certain fields of existing 
results.

Another solution is to let the user save the change in a DB and have the server 
convert it and post XML to Solr, but that is not so fancy :)
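
For the DB-then-XML route, here is a minimal sketch of the conversion step. It 
only builds the standard <add><doc><field .../></doc></add> update message as a 
string; the class name, method names, and the sample fields ("id", "title") are 
made up for illustration and do not come from any schema in this thread. As far 
as I know, an add with an existing unique key replaces the whole document, so 
every field would need to be sent, not only the changed ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SolrAddXml {
    // Escape the characters that are special in XML text and attribute values.
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Build an <add> message containing one document with the given fields.
    static String toAddXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<field name=\"").append(escapeXml(e.getKey())).append("\">")
               .append(escapeXml(e.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "42");             // hypothetical unique key
        doc.put("title", "Tom & Jerry"); // hypothetical user-edited field
        System.out.println(toAddXml(doc));
    }
}
```

The resulting string would then be posted to the update handler, followed by a 
commit.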

Thanks
Anton




highlight results from pdf search

2009-05-30 Thread rossputin

Hi.

I have some PDF documents indexed through solr cell.  My highlighting
queries work fine on standard xml doc types, eg the samples.  I would now
like to highlight some queries on a PDF document.  Currently for my simple
examples I am just indexing a PDF, providing an id, and an arbitrary
ext.literal.  I would like to be able to get highlighted snippets back from
the extracted content of the PDF.  Is this possible?

Thanks in advance for your help,

 - Ross



Re: java.lang.RuntimeException: after flush: fdx size mismatch

2009-05-30 Thread Michael McCandless
Whoops, here's the patch (I added you directly on the "To:" line so that you
get the patch; Apache's list manager strips patches).

Yes, if the fdx file is getting deleted out from under Lucene, that'd
also explain what's happening.  Though the timing would have to be
very quick.  What's happening is that Lucene opened _X.fdx for writing,
wrote a small number of bytes to it, and then closed it, only to find
that the file no longer exists.

I'm not familiar with what exactly happens when you create/unload Solr
cores and move them around machines; does this involve moving files
from one machine to another (i.e., deleting files)?  If so, is there
some way to log when such migrations take place and try to correlate
them with this exception?

Mike

On Fri, May 29, 2009 at 8:29 PM, James X wrote:
> Hi Mike, I don't see a patch file here?
>
> Could another explanation be that the fdx file doesn't exist yet / has been
> deleted from underneath Lucene?
>
> I'm constantly CREATE-ing and UNLOAD-ing Solr cores, and more importantly,
> moving the bundled cores around between machines. I find it much more likely
> that there's something wrong with my core admin code than there is with the
> Lucene internals :) It's possible that I'm occasionally removing files which
> are currently in use by a live core...
>
> I'm using an ext3 filesystem on a large EC2 instance's own hard disk. I'm
> not sure how Amazon implement the local hard disk, but I assume it's a real
> hard disk exposed by the hypervisor.
>
> Thanks,
> James
>
> On Fri, May 29, 2009 at 3:41 AM, Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> Very interesting: FieldsWriter thinks it's written 12 bytes to the fdx
>> file, yet the directory says the file does not exist.
>>
>> Can you re-run with this new patch?  I'm suspecting that FieldsWriter
>> wrote to one segment, but somehow we are then looking at the wrong
>> segment.  The attached patch prints out which segment FieldsWriter
>> actually wrote to.
>>
>> What filesystem & underlying IO system/device are you using?
>>
>> Mike
>>
>> On Thu, May 28, 2009 at 10:53 PM, James X wrote:
>> > My apologies for the delay in running this patched Lucene build - I was
>> > temporarily pulled onto another piece of work.
>> >
>> > Here is a sample 'fdx size mismatch' exception using the patch Mike
>> > supplied:
>> >
>> > SEVERE: java.lang.RuntimeException: after flush: fdx size mismatch: 1 docs
>> > vs 0 length in bytes of _1i.fdx exists=false didInit=false inc=0 dSO=1
>> > fieldsWriter.doClose=true fieldsWriter.indexFilePointer=12
>> > fieldsWriter.fieldsFilePointer=2395
>> >        at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:96)
>> >        at org.apache.lucene.index.DocFieldConsumers.closeDocStore(DocFieldConsumers.java:83)
>> >        at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:47)
>> >        at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:367)
>> >        at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:567)
>> >        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3540)
>> >        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3450)
>> >        at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1638)
>> >        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1602)
>> >        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1578)
>> >        at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:153)
>> >
>> >
>> > Will now run with assertions enabled and see how that affects the behaviour!
>> >
>> > Thanks,
>> > James
>> >
>> > -- Forwarded message --
>> > From: James X 
>> > Date: Thu, May 21, 2009 at 2:24 PM
>> > Subject: Re: java.lang.RuntimeException: after flush: fdx size mismatch
>> > To: solr-user@lucene.apache.org
>> >
>> >
>> > Hi Mike, Documents are web pages, about 20 fields, mostly strings, a couple
>> > of integers, booleans and one html field (for document body content).
>> >
>> > I do have a multi-threaded client pushing docs to Solr, so yes, I suppose
>> > that would mean I have several active Solr worker threads.
>> >
>> > The only exceptions I have are the RuntimeException flush errors, followed
>> > by a handful (normally 10-20) of LockObtainFailedExceptions, which I
>> > presumed were being caused by the faulty threads dying and failing to
>> > release locks.
>> >
>> > Oh wait, I am getting WstxUnexpectedCharException exceptions every now and
>> > then:
>> > SEVERE: com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character
>> > ((CTRL-CHAR, code 8))
>> >  at [row,col {unknown-source}]: [1,26070]
>> >        at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:675)
>> >        at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4668)
>>

SV: lucene document via JSON

2009-05-30 Thread antonio_eggberg

I have found this

https://issues.apache.org/jira/browse/SOLR-945

It seems like this might solve the problem. Interesting that it's also faster!!

Question: is there any specific reason this is not in the trunk? Also, does 
this mean that once the issue is resolved, the Data Import Handler will also 
benefit from it?

I can't wait to see this happen

Cheers






When searching for !...@#$%^&*() all documents are matched incorrectly

2009-05-30 Thread Sam Michaels

Hi,

I'm running Solr 1.3/Java 1.6.  

When I run a query like  - (activity_type:NAME) AND title:(\...@#$%\^&\*\(\))
all the documents are returned even though there is not a single match.
There is no title that matches the string (which has been escaped). 

My document structure is as follows


NAME
Bathing




The title field is of type text_title which is described below. 


  





  
  






  


When I run the query against Luke, no results are returned. Any suggestions
are appreciated.





how to do exact search with solrj

2009-05-30 Thread Jianbin Dai

Hi,

I want to search for "hello the world" in the "title" field using solrj. I set 
the query filter:
query.addFilterQuery("title");
query.setQuery("hello the world");

but it also returns results that are not exact matches. 

I know one way to do it is to make the "title" field a string instead of text. 
But is there another way to do it? If I search through the Solr Admin web 
interface with title:"hello the world", it returns exact matches.

Thanks.

JB


  



Re: When searching for !...@#$%^&*() all documents are matched incorrectly

2009-05-30 Thread Ryan McKinley
Two key things to try (for anyone ever wondering why a query matches documents):

1.  Add &debugQuery=true and look at the explain text below --
anything that contributed to the score is listed there.
2.  Check /admin/analysis.jsp -- this will let you see how analyzers
break text up into tokens.

Not sure offhand, but I'm guessing the WordDelimiterFilterFactory has
something to do with it...
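
As a rough illustration of the first tip, the sketch below URL-encodes a query
and appends debugQuery=true to a select request. The base URL and the query
string are placeholders, not values from this thread, and the
URLEncoder.encode(String, Charset) overload assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class DebugQueryUrl {
    // Build a select URL with debugQuery=true so the response carries the
    // parsed query and the per-document score explanations.
    static String debugUrl(String baseUrl, String query) {
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return baseUrl + "/select?q=" + encoded + "&debugQuery=true";
    }

    public static void main(String[] args) {
        // Base URL and query are placeholders, not taken from the thread.
        System.out.println(debugUrl("http://localhost:8983/solr",
                                    "activity_type:NAME AND title:foo"));
    }
}
```

The parsed query in the debug section of the response shows what is left of
each clause after analysis, which is usually the quickest way to see whether
the special characters survived tokenizing.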


On Sat, May 30, 2009 at 5:59 PM, Sam Michaels  wrote:
>
> Hi,
>
> I'm running Solr 1.3/Java 1.6.
>
> When I run a query like  - (activity_type:NAME) AND title:(\...@#$%\^&\*\(\))
> all the documents are returned even though there is not a single match.
> There is no title that matches the string (which has been escaped).
>
> My document structure is as follows
>
> 
> NAME
> Bathing
> 
> 
>
>
> The title field is of type text_title which is described below.
>
>  positionIncrementGap="100">
>      
>        
>        
>         generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>        
>        
>      
>      
>        
>         ignoreCase="true" expand="true"/>
>         generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>        
>        
>
>      
>    
>
> When I run the query against Luke, no results are returned. Any suggestions
> are appreciated.
>
>


Re: When searching for !...@#$%^&*() all documents are matched incorrectly

2009-05-30 Thread Walter Underwood
I'm really curious. What is the most relevant result for that query?

wunder




Re: how to do exact search with solrj

2009-05-30 Thread Avlesh Singh
query.setQuery("title:hello the world") is what you need.

Cheers
Avlesh



Re: how to do exact search with solrj

2009-05-30 Thread Jianbin Dai

I tried that, but it doesn't seem to work correctly.



  



Re: how to do exact search with solrj

2009-05-30 Thread Avlesh Singh
Do you need an exact match on all three tokens?
If yes, try query.setQuery("title:\"hello the world\"");
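
The reason the quotes matter: in the Lucene query syntax, title:hello the world
applies the title: prefix only to "hello", and the remaining words go to the
default field. A filter query is likewise a full query string, so
addFilterQuery("title") does not restrict the search to that field. Below is a
minimal sketch of building the quoted phrase; the helper name phraseQuery is
made up, and the code deliberately avoids SolrJ so it runs on its own.

```java
class PhraseQueryBuilder {
    // Wrap a phrase in double quotes for the Lucene query syntax,
    // escaping any backslashes or double quotes inside the phrase itself.
    static String phraseQuery(String field, String phrase) {
        String escaped = phrase.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        String q = phraseQuery("title", "hello the world");
        System.out.println(q);
        // With SolrJ on the classpath, this string would be passed as:
        //   query.setQuery(q);
    }
}
```

Note that on a text field this is still a phrase match on analyzed tokens, so
stop words or stemming in the analyzer may make it looser than a match on a
string field.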

Cheers
Avlesh



Re: how to do exact search with solrj

2009-05-30 Thread Jianbin Dai

That's correct! Thanks Avlesh.
