Re: error when post xml data to solr

2008-10-20 Thread cricdigs

I sometimes get the same error. I believe it happens when my XML exceeds 4
KB. Is there an upper limit somewhere?
I am using Solr 1.3 and posting with SolrJ.

Any ideas as to how I can increase the limit?

Thanks!


李学健 wrote:
> 
> Hi all,
> 
> When I post an XML file to Solr, I get the errors below:
> ==
> com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog
> at [row,col {unknown-source}]: [1,0]
> at com.ctc.wstx.sr.StreamScanner.throwUnexpectedEOF(StreamScanner.java:686)
> at com.ctc.wstx.sr.BasicStreamReader.handleEOF(BasicStreamReader.java:2134)
> at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2040)
> at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1069)
> at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:148)
> at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123)
> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
> at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
> at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
> at java.lang.Thread.run(Thread.java:619)
> ) that prevented it from fulfilling this request.
> 
> ==
> 
> I am confused: the wstx jar is used for web services, so why does Solr use it?
> 
> Can anyone help me? Thanks.
> 
> 
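
On the question of why Solr uses the wstx jar: Woodstox is a general-purpose StAX (streaming XML) parser, not something specific to web services, and the trace shows XmlUpdateRequestHandler reading the posted document through it. "Unexpected EOF in prolog" at [1,0] means the parser reached the end of input before reading a single character, which suggests the request body arrived empty. The sketch below only illustrates that reading. It uses the JDK's built-in StAX parser instead of Woodstox, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Solr: checks whether a request body
// can be read to the end by a StAX parser.
class StaxBodyCheck {

    // Returns true if the whole body parses, false if the parser rejects it.
    static boolean parses(String body) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(body));
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
            return true;
        } catch (XMLStreamException e) {
            // An empty body ends up here: the input ends inside the prolog.
            return false;
        }
    }

    public static void main(String[] args) {
        String update = "<add><doc><field name=\"id\">1</field></doc></add>";
        System.out.println("update message parses: " + parses(update));
        System.out.println("empty body parses: " + parses(""));
    }
}
```

If the empty-body reading is right, the thing to check is what the client actually sends (for example with a proxy or the container's access log), not the size of the XML itself.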

-- 
View this message in context: 
http://www.nabble.com/error-when-post-xml-data-to-solr-tp19565671p20066616.html
Sent from the Solr - User mailing list archive at Nabble.com.



Faceting Vs using lucene filters ?

2007-09-17 Thread cricdigs

Hi,

I have a collection of blogs. Each Solr document has one blog with 3 fields
- blogger(id), title and blog text.
The search is performed over all 3 fields. When doing the search I need to
show 2 things:

1. Bloggers block with all the matching bloggers (so if a title, blog or
blogger contains the search term, I show the blogger's id)
2. Blogs block that shows the blog titles for the matching blogs.

The first block is my problem, since it shows multiple instances of the same
blogger if that blogger has multiple matching blogs. I can use faceting to
show the bloggers, but is there a better or more efficient way to do so? I
was thinking of creating a Lucene filter to do this. Is that feasible?
Basically, I need the unique bloggers from the index whose blogs match a
given search term.
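
For reference, the faceting route mentioned above could look roughly like the request built below. This is a sketch under assumptions: the field name blogger, the class name, and the localhost URL are placeholders, and facet.mincount is assumed to be available in the Solr version in use.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds a select URL whose facet section lists each
// matching blogger once, with the number of matching blogs.
class UniqueBloggersQuery {

    // Placeholder base URL; adjust for the actual install.
    static final String SELECT_URL = "http://localhost:8983/solr/select/";

    static String build(String term) {
        try {
            return SELECT_URL
                    + "?q=" + URLEncoder.encode(term, "UTF-8")
                    + "&rows=0"               // no documents, facets only
                    + "&facet=true"
                    + "&facet.field=blogger"  // assumed field name
                    + "&facet.limit=-1"       // return every facet value
                    + "&facet.mincount=1";    // skip bloggers with no match
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("solr rocks"));
    }
}
```

Dropping rows=0 would return the matching blogs in the same response, so both blocks could come from one request.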

Thanks!
-- 
View this message in context: 
http://www.nabble.com/Faceting-Vs-using-lucene-filters---tf4469665.html#a12744115



RE: Faceting question

2007-09-21 Thread cricdigs


Thanks Peter. That will be my work-around, but I was hoping to find a more
elegant solution ;)
I am not that knowledgeable about the Solr architecture, but if there is a
more elegant way to do it, I might be willing to put in the extra time to
code it.


Binkley, Peter wrote:
> 
> Faceting works on the terms in an index, so you can't get information
> beyond those terms without doing extra work. You could build an extra
> index used only for faceting that concatenates the information you need
> from other fields, and then parse it out in your application: e.g.
> "Tolkien, J.R.R.|35421".
> 
> If you're doing this so that you can do precise follow-on searches,
> though (where a user clicks on a link in a list of facets), you might
> want to think about whether the author name gives you everything you
> need. You may have two authors with the same name, who would show up as
> a single facet if you don't tack the id on; but even if you do, how is
> the user going to distinguish them? They'll just see two links, maybe
> with opaque id numbers. So maybe the bare author name is good enough. (I
> had a similar situation and found that getting away from a
> relational-database approach and going with what Solr does best was the
> best solution).
> 
> Hope that helps,
> 
> Peter 
> 
> -----Original Message-----
> From: Cric Digs [mailto:[EMAIL PROTECTED] 
> Sent: Friday, September 21, 2007 7:36 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Faceting question
> 
> lol I agree with you Hoss - sorry for that
> 
> Here's the thing:
> I need additional information from the index - such as the id related to
> a facet field. For example, say I am faceting on author names for a book
> store, I would also like to get the author id along with the author name
> to show a link (next to the author name) to say the author's bio page.
> The author id is stored in the index but how do i get that back with the
> facet results?
> 
> 
> On 9/20/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>>
>>
>> : I'm using faceting to get some results. I also want to get another field -
>> : the id field along with it. Is it possible to get that somehow in the facet
>> : results?
>>
>> you're going to have to elaborate on what it is you are trying to do
>> ... i genuinely have no idea what you are asking (and i think i'm
>> usually pretty good at reading between the lines and guessing what
>> people mean).
>>
>>
>>
>> -Hoss
>>
>>
> 
> 
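
The work-around Peter describes (a facet-only field holding values such as "Tolkien, J.R.R.|35421") leaves the parsing to the application. A minimal sketch of that step, assuming the id is always the part after the last "|"; the class name is hypothetical:

```java
// Hypothetical value holder for a facet term of the form "name|id".
class FacetLabel {
    final String name;
    final String id;

    FacetLabel(String name, String id) {
        this.name = name;
        this.id = id;
    }

    // Splits on the last '|' so a '|' inside the name does not break parsing.
    // A value without a separator is treated as a bare name with an empty id.
    static FacetLabel parse(String facetValue) {
        int sep = facetValue.lastIndexOf('|');
        if (sep < 0) {
            return new FacetLabel(facetValue, "");
        }
        return new FacetLabel(facetValue.substring(0, sep),
                              facetValue.substring(sep + 1));
    }

    public static void main(String[] args) {
        FacetLabel label = parse("Tolkien, J.R.R.|35421");
        System.out.println(label.name + " -> author id " + label.id);
    }
}
```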

-- 
View this message in context: 
http://www.nabble.com/Faceting-question-tf4489342.html#a12824623



Re: Alphabetical Facets

2007-11-01 Thread cricdigs


Will there be a way to do reverse alphabetical ordering in addition to
alphabetical? a->z and z->a ?

Thanks in advance.



ryantxu wrote:
> 
> Chris Hostetter wrote:
>> : Has anyone given any thought to alphabetical faceting?
>> 
>> if by alphabetical you mean the natural unicode ordering of terms for
>> facet.field type facets -- that's already supported.
>> 
>> It's the default sort if there is no facet limit (ie:  facet.limit=-1)
>> but even with a limit it can be explicitly turned on with facet.sort=false
>> 
>> http://wiki.apache.org/solr/SimpleFacetParameters#head-569f93fb24ec41b061e37c702203c99d8853d5f1
>> http://localhost:8983/solr/select/?q=*%3A*&facet=true&facet.field=cat&rows=0&facet.limit=5&facet.sort=false
>> 
> 
> perfect!
> 
> I read that, but did not realize "natural index order" is alphabetical 
> in the ascii range.
> 
> thanks
> ryan
> 
> 
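
The thread does not mention a server-side option for z->a ordering. One client-side possibility, sketched here with a hypothetical helper, is to request the values in index order (facet.sort=false) and reverse the list in the application. This assumes all values were fetched (facet.limit=-1); reversing a truncated list would give the first N values backwards, not the last N.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: turns facet values returned in index (a->z) order
// into z->a order without modifying the caller's list.
class ReverseFacets {

    static List<String> reversed(List<String> valuesInIndexOrder) {
        List<String> copy = new ArrayList<>(valuesInIndexOrder);
        Collections.reverse(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> fromSolr = Arrays.asList("apple", "banana", "cherry");
        System.out.println(reversed(fromSolr));
    }
}
```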

-- 
View this message in context: 
http://www.nabble.com/Alphabetical-Facets-tf3728353.html#a13533000



Re: Getting only size of getFacetCounts , to simulate count(group by( a field) ) using facets

2007-11-01 Thread cricdigs

Hi,

I am not sure how this was resolved, but I have the same need and I am using
the 1.2 release of Solr. If there is a workaround that gives me this
functionality, please advise. It would be very helpful.

Thanks.


Laurent Hoss wrote:
> 
> Hi
> 
> We want to (mis)use facet search to get the number of (unique) field 
> values appearing in a document resultset.
> I thought  facet search perfect for this, because it already gives me 
> all the (unique) field values.
> But for us to be used for this special problem, we don't want all the 
> values listed in response as there might be over 1 and we don't need 
> the values at all, just the count of how many!
> 
> I looked at
> http://wiki.apache.org/solr/SimpleFacetParameters
> and hoped to find a parameter like
> facet.sizeOnly = true
> (or facet.showSize=true, combined with facet.limit=1 or other small value)
> 
> Would you accept a patch with such a feature ?
> 
> It should probably be relatively easy, though not sure if fits into the 
> concept of facets..
> 
> I looked at the code, maybe  add an extra Value to returned NamedList of 
> getFacetCounts() in SimpleFacets ?!
> 
> ps: Other user having same request AFAIU :
> http://www.nabble.com/showing--range-facet-example-%3D-by-Range-%28-1-to-1000-%29-t3660704.html#a10229069
> 
> thanks,
> 
> Laurent Hoss   
> 
> 
> 
> 
> 
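
Without a size-only parameter like the one Laurent proposes, one fallback is to fetch all facet values and count them in the client. The sketch below assumes the facet section of the response has already been read into a map of value to count; the class name is hypothetical, and the cost of transferring every value is exactly what the proposed parameter would avoid.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: counts how many distinct field values occur in the
// result set, given the facet counts for that field.
class FacetSize {

    static int distinctValues(Map<String, Integer> facetCounts) {
        int distinct = 0;
        for (int count : facetCounts.values()) {
            if (count > 0) {   // ignore values that match no document
                distinct++;
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("electronics", 14);
        counts.put("memory", 3);
        counts.put("software", 0);
        System.out.println("distinct values: " + distinctValues(counts));
    }
}
```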

-- 
View this message in context: 
http://www.nabble.com/Getting-only-size-of-getFacetCounts-%2C-to-simulate-count%28group-by%28-a-field%29-%29-using-facets-tf4482430.html#a13532933



Unique field and EmbeddedSolr issue..

2007-12-28 Thread cricdigs

Hi, 

I have 2 questions:

1. I am using EmbeddedSolr, following the example here:
http://wiki.apache.org/solr/EmbeddedSolr
I just noticed a note there saying that the page is out of date. Is that
true, and if so, is there an example that uses SolrJ?

2. I am using EmbeddedSolr and I see that my documents are getting added
multiple times even though I have specified the uniqueKey field. I am using
Solr 1.2, and the following are the relevant entries from my schema.xml:


 id

Please let me know what I could be doing wrong. I realize this is holiday
time, but we are very close to our release date and would really
appreciate any immediate help.

Thanks in advance..

-- 
View this message in context: 
http://www.nabble.com/Unique-field-and-EmbeddedSolr-issue..-tp14531010p14531010.html



Re: SOLR 1.2 - Duplicate Documents??

2007-12-28 Thread cricdigs

I am having the same issue. Here are my schema.xml entries:

 
 id

I am following the EmbeddedSolr instructions from the current wiki page and
setting the following for my AddUpdateCommand:

  AddUpdateCommand addcmd = new AddUpdateCommand();
  addcmd.allowDups = false;
  addcmd.overwritePending = true;
  addcmd.overwriteCommitted = true;

Thanks!


ryantxu wrote:
> 
>> 
>> Schema.xml
>>  
> 
> Have you edited schema.xml since building a full index from scratch?  If 
> so, try rebuilding the index.
> 
> People often get the behavior you describe if the 'id' is a 'text' field.
> 
> ryan
> 
> 
> 
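
Ryan's remark about 'text' ids can be checked against the schema. A minimal sketch of the entries in question, modeled on the example schema that ships with Solr; the field name id comes from the message above, and the rest is an assumption about this particular setup:

```xml
<!-- "string" (solr.StrField) is indexed as one untokenized term, so the
     whole id value is what gets matched when a document is overwritten.
     A tokenized "text" type would split the id into several terms. -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>

<uniqueKey>id</uniqueKey>
```

If the field type is changed, the index needs to be rebuilt from scratch, as Ryan suggests.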

-- 
View this message in context: 
http://www.nabble.com/SOLR-1.2---Duplicate-Documents---tp13621332p14531206.html



Stop words and exact phrase

2008-05-14 Thread cricdigs

Hi all,

Is there a config setting that I could use to keep stop words when
doing an exact phrase match? For example, when searching for "the world" (in
quotes) I would like to look for just that and not get results for just
"world". When I look at the analysis, I see that the word "the" is removed by
the StopFilter even if it is in quotes. Is there a work-around for
this?

Thanks!
-- 
View this message in context: 
http://www.nabble.com/Stop-words-and-exact-phrase-tp17233404p17233404.html



Re: Stop words and exact phrase

2008-05-14 Thread cricdigs

Hi wunder,

Thanks for your response. I am still a little confused. Solr's analysis page
shows that the stop word is removed from the query, so it's got nothing to do
with the indexing, imo.

If indexing had removed the stop words, then I should not get any results,
right? But I get the results with the stop word removed.

How do I tell Solr to send phrase queries to a field other than default?
Will I have to code that or is it just a config setting?

Thanks.


Walter Underwood wrote:
> 
> Try creating a separate field that does not remove stopwords,
> populating that with <copyField>, and configuring the phrase
> queries to go against that field instead.
> 
> I do something similar. For both regular and phrase queries,
> we have a stemmed and stopped field and another field with
> neither. The "exact" field has a higher boost. This helps
> with movies like "Saw" and "Ran", which should not show
> "see" or "run" as the top match.
> 
> wunder
> 
> On 5/14/08 8:09 AM, "cricdigs" <[EMAIL PROTECTED]> wrote:
> 
>> 
>> Hi all,
>> 
>> Is there a config setting that I could use to not remove stop words when
>> doing an exact phrase match. For example when searching for "the world" (in
>> quotes) I would like to look for just that and not get results for just
>> "world". When I look at the analysis, I see that word "the" is removed by
>> the StopFilter even if it is in quotes. So is there a work-around to solve
>> this?
>> 
>> Thanks!
> 
> 
> 
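
On routing phrase queries to another field: one way to do it in application code is to prefix quoted input with the name of the unstopped field before sending it to Solr, using the standard field:"phrase" query syntax. The sketch below assumes such a field exists; the field name text_exact and the class name are made up for illustration, and it handles only the simple case where the whole input is one quoted phrase.

```java
// Hypothetical helper: sends fully quoted queries to a field that keeps
// stop words and leaves everything else on the default field.
class PhraseRouter {

    // Assumed name of a field analyzed without a StopFilter.
    static final String EXACT_FIELD = "text_exact";

    static String route(String userQuery) {
        String q = userQuery.trim();
        boolean quoted = q.length() >= 2
                && q.startsWith("\"")
                && q.endsWith("\"");
        return quoted ? EXACT_FIELD + ":" + q : q;
    }

    public static void main(String[] args) {
        System.out.println(route("\"the world\""));
        System.out.println(route("world"));
    }
}
```

The same analysis chain has to be applied at index and query time for that field, which is why the analysis page shows "the" being dropped from the query on the default, stopped field.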

-- 
View this message in context: 
http://www.nabble.com/Stop-words-and-exact-phrase-tp17233404p17237198.html