parsing many documents takes too long

2011-08-11 Thread Tri Nguyen
Hi, My results from Solr return about 982 documents, and I use JAXB to parse them into Java objects, which takes about 469 ms, over my 150-200 ms threshold. Is there a way around this? Can I store the Java objects in the index and return them in the Solr response and then seria

Timeout trying to index from nutch

2011-08-11 Thread Phil Scadden
I am a new user and I have SOLR installed. I can use the admin page and query the example data. However, I was using nutch to load the index with intranet web pages and I got this message: SolrIndexer: starting at 2011-08-12 16:52:44 org.apache.solr.client.solrj.SolrServerException: java.net.ConnectE

Re: Dates off by 1 day?

2011-08-11 Thread Chris Hostetter
: In Solr the date is stored as Zulu time zone and Solrj is returning date in : CDT timezone (jvm is picking system time zone.) Strictly speaking, Solrj is not returning the date "in CDT timezone" ... Date objects in java are absolute moments in time, that know nothing about timezones. Where t
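The point about Date objects being absolute moments can be sketched in plain Java. This is only an illustration, not code from the thread: the class name DateZoneDemo, the helper formatIn, and the sample instant are assumptions chosen for the example. It shows the same Date rendering as two different calendar days depending on the time zone set on the formatter.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical demo class; the names here are illustrative, not from the thread.
class DateZoneDemo {

    // Render one absolute instant as wall-clock text in the given zone.
    static String formatIn(Date instant, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        // The zone belongs to the formatter, not to the Date itself.
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(instant);
    }

    public static void main(String[] args) {
        // Assumed sample value: midnight UTC on 2011-08-11, in epoch milliseconds.
        Date midnightUtc = new Date(1313020800000L);
        System.out.println(formatIn(midnightUtc, "UTC"));             // 2011-08-11 00:00:00
        System.out.println(formatIn(midnightUtc, "America/Chicago")); // 2011-08-10 19:00:00
    }
}
```

If the formatter is left on the JVM default zone, a value stored at midnight Zulu prints as the previous day in CDT, which matches the "off by 1 day" symptom in the subject line.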

Re: SEVERE: org.apache.solr.common.SolrException: Error loading class 'solr.ICUTokenizerFactory'

2011-08-11 Thread Chris Hostetter
: I copied the file apache-solr-analysis-extras-3.3.0.jar into solr's lib : folder. Now the error is different - ... : > > I also added the following files to my apache-solr-3.3.0\example\lib : > folder: Deja-Vu... http://www.lucidimagination.com/search/document/5967b87c6fa56fd1/error_lo

Why is "boost" not always listed in explain when debug is on?

2011-08-11 Thread Jonathan Acheson
Using Solr Specification Version: 4.0.0.2011.08.09.11.02.13. While trying to understand scoring I noticed that "boost" is intermittently displayed in the explain. For example, using edismax with the query string "q=Starbucks&qf=name.search name^2", my first result has the boost explicitly listed in t

Re: Unbuffered entity enclosing request can not be repeated & Invalid chunk header

2011-08-11 Thread Markus Jelsma
Hi, We see these errors too once in a while, but there is no real answer on the mailing list here except one user suspecting Tomcat is responsible (connection timeouts). Another user proposed limiting the number of documents per batch but that, of course, increases the number of connections made

Re: unique terms and multi-valued fields

2011-08-11 Thread Kevin Osborn
That makes sense. There are actually stored fields. I was mostly just trying to figure out how much my index size might grow. These fields I am dealing with are large and repetitive (but mixed). From: Erick Erickson To: solr-user@lucene.apache.org; Kevin Osbor

Re: bug in termfreq? was Re: is it possible to do a sort without query?

2011-08-11 Thread Alexei Martchenko
are you boosting your docs? 2011/8/8 Jason Toy > I am trying to test out and compare different sorts and scoring. > > When I use dismax to search for "indie music" > with: qf=all_lists_text&q="indie+music"&defType=dismax&rows=100 > I see some stuff that seems "irrelevant", meaning in top result

Re: how to ignore case in solr search field?

2011-08-11 Thread Alexei Martchenko
Here's an example. Since I only query this for spelling, I can lowercase at both index and query time. 2011/8/10 nagarjuna > Hi please help me .. >how to ignore case while searching in solr > > > ex:i need same results for the keywords abc, ABC , aBc,AbC and all the

Re: Need help indexing/querying a particular type of hierarchy

2011-08-11 Thread Michael B. Klein
I've been experimenting with that, but that fq wouldn't limit my facet counts adequately. Since the document has both an accessionWF and a digitizationWF, the fq would match (and count) the document no matter what the status for each process. I suppose I could do something like this: acce

Re: strip html from data

2011-08-11 Thread Alexei Martchenko
You can use like here in this example. Check the docs about your specific SOLR version because something has changed in the htmlstrip syntax in 1.4 and 3.x 2011/8/11 Merlin Morgenstern > I am sorry, but I do not really understand the difference of indexed and > returned result set. > > I l

RE: Searching For Term 'OR'

2011-08-11 Thread John Brewer
Thanks for the advice everyone. I am rebuilding the index with a lowercase field instead of string. -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Thursday, August 11, 2011 1:10 PM To: solr-user@lucene.apache.org Subject: Re: Searching For Term 'OR' :

Re: Searching For Term 'OR'

2011-08-11 Thread Chris Hostetter
: I am looking for some advice on how to index and search a field that : contains a two character state name without the query parser dying on : the OR and also not treating it as an 'OR' Boolean operator. fq={!term f=state}OR ...this kind of filter you don't want a query parser that has any
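The fq={!term f=state}OR filter from the reply above can be assembled and URL-encoded with the JDK alone. This is a minimal sketch, not code from the thread: the class TermFilterDemo and its helpers termFilter and asParam are hypothetical names, and only the local-params string itself comes from the message.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class; only the {!term f=...} syntax comes from the thread.
class TermFilterDemo {

    // Build a term-query-parser filter so the value is matched as a raw term
    // and never reaches the Lucene query parser as a Boolean operator.
    static String termFilter(String field, String value) {
        return "{!term f=" + field + "}" + value;
    }

    // URL-encode the filter for use as an fq request parameter.
    static String asParam(String fq) {
        try {
            return "fq=" + URLEncoder.encode(fq, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String fq = termFilter("state", "OR");
        System.out.println(fq);          // {!term f=state}OR
        System.out.println(asParam(fq)); // fq=%7B%21term+f%3Dstate%7DOR
    }
}
```

Because the term parser does no analysis of the value, the indexed form has to match exactly: if the field is lowercased at index time, as other replies suggest, the value passed here would need to be "or" rather than "OR".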

RE: Searching For Term 'OR'

2011-08-11 Thread Rode González
hi, use the filter LowerCaseFilterFactory (it doesn't work with the string type; you must create a new fieldtype of text type) or use escaped forms: \OR \AND I tried it a moment ago and it works. regards --- Rode González > -Original Message- > From: Tomás Fernández Löbbe [mailto:tomasflo...@g

Re: Searching For Term 'OR'

2011-08-11 Thread John Brewer
Thanks for the feedback. I'll give these a try. Tomás Fernández Löbbe wrote: >I guess this is because Lucene QP is interpreting the 'OR' operator. >You can either: > use lowercase > use other query parser, like the term query parser. See >http://lucene.apache.org/solr/api/org/apache/solr

Re: Searching For Term 'OR'

2011-08-11 Thread Tomás Fernández Löbbe
I guess this is because Lucene QP is interpreting the 'OR' operator. You can either: use lowercase use other query parser, like the term query parser. See http://lucene.apache.org/solr/api/org/apache/solr/search/TermQParserPlugin.html Also, if you just removed the "or" term from the stop

Searching For Term 'OR'

2011-08-11 Thread John Brewer
Hello, I am looking for some advice on how to index and search a field that contains a two character state name without the query parser dying on the OR and also not treating it as an 'OR' Boolean operator. For example: The following query with a filter query key/value pair causes an exc

need some guidance about how to configure a specific solr solution.

2011-08-11 Thread Roman, Pablo
Hi There, I work in IT on a project based on Liferay 605 with solr-3.2 as the indexer/search engine. I presently have only one server that is indexing and searching, but the Liferay Support suggestions point to the need for: - 2 to n SOLR read-server for searching from

RE: Hudson build issues

2011-08-11 Thread arian487
I downloaded the official build (4.0) and I've been customizing it for my needs. I'm not really sure how to use these scripts. Is there somewhere in Hudson where I can apply these scripts or something? -- View this message in context: http://lucene.472066.n3.nabble.com/Hudson-build-issues-tp3

RE: copyfields in schema.xml

2011-08-11 Thread Michael Ryan
Nope. The 'text' field will just have the 'titulo' contents. To have both, you would have to do something like this: -Michael

copyfields in schema.xml

2011-08-11 Thread Rode González
Hi all. if in schema.xml we put something like: Can I expect that in 'text' field I have the 'title' and the 'titulo' contents ? thanks ;) Note: in our app, the titles refer to books that can be named in several different ways . --- Rode González _

Re: Solr 3.3: DIH configuration for Oracle

2011-08-11 Thread Shawn Heisey
On 8/10/2011 2:52 PM, Eugeny Balakhonov wrote: java.lang.IllegalArgumentException: deltaQuery has no column to resolve to declared primary key pk='T1_ID_RECORD, T2_ID_RECORD' I have analyzed the source code of DIH. I found that in the DocBuilder class collectDelta() method works with value of en

SolR : Spellchecking & Autocomplete

2011-08-11 Thread vsham
Hello, I posted on the Lucene Forums, and someone told me to e-mail it here. Rather than rewriting my question here, I'll take the liberty of linking my post. It's about SolR, autocompletion, Spellchecking and "case-sensitiveness" (?). http://lucene.472066.n3.nabble.com/SolR-Spellchecking-amp-Autoco

NRT in Master- Slave setup, crazy?

2011-08-11 Thread eks dev
Thinking aloud and grateful for sparing .. I need to support a high commit rate (low update latency) in a master slave setup and I have a bad feeling about it, even with disabling warmup and stripping everything down that slows down refresh. I will try it anyway, but I started thinking about "back

Re: Solr 3.3 crashes after ~18 hours?

2011-08-11 Thread Stephen Duncan Jr
I know it seems like my problem may not be the same as the original poster, but in investigating this, I did find this Jetty issue that may be related: http://jira.codehaus.org/browse/JETTY-1377 Stephen Duncan Jr www.stephenduncanjr.com On Thu, Aug 4, 2011 at 1:54 PM, Stephen Duncan Jr wrote:

Re: Need help indexing/querying a particular type of hierarchy

2011-08-11 Thread Dmitry Kan
Hi, Can you keep your hierarchy flat in SOLR and then use filter queries (fq=wf:accessionWF) inside your facet queries (facet.field=status)? Or is the requirement to have one single facet query producing the hierarchical facet counts? On Thu, Aug 11, 2011 at 10:43 AM, Michael B. Klein wrote: > H

Re: frange not working in query

2011-08-11 Thread Yonik Seeley
On Wed, Aug 10, 2011 at 5:57 AM, Amit Sawhney wrote: > Hi All, > > I am trying to sort the results on a unix timestamp using this query. > > http://url.com:8983/solr/db/select/?indent=on&version=2.1&q={!frange%20l=0.25}query($qq)&qq=nokia&sort=unix-timestamp%20desc&start=0&rows=10&qt=dismax&wt=dis

Re: LockObtainFailedException

2011-08-11 Thread Peter Sturge
Optimizing indexing time is a very different question. I'm guessing your 3mins+ time you refer to is the commit time. There are a whole host of things to take into account regarding indexing, like: number of segments, schema, how many fields, storing fields, omitting norms, caching, autowarming, s

RE: Hudson build issues

2011-08-11 Thread Steven A Rowe
Hi arian487, You apparently are not using the official Ant build? (Maven is officially unsupported.) The scripts used by the Lucene and Solr Jenkins builds at the ASF are available here: http://svn.apache.org/repos/asf/lucene/dev/nightly/ The ASF Jenkins jobs checkout the above direc

RE: Building a facet query in SolrJ

2011-08-11 Thread Simon, Richard T
Thanks! I actually found a page on line that explained this. -Rich -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Wednesday, August 10, 2011 4:01 PM To: solr-user@lucene.apache.org Cc: Simon, Richard T Subject: RE: Building a facet query in SolrJ : que

Re: LockObtainFailedException

2011-08-11 Thread Naveen Gupta
Yes, this was happening because of the JVM heap size. But the real issue is that if our index size is growing (very high) then indexing time is taking very long (using streaming). Earlier, for indexing 15,000 docs at a time (commit after 15000 docs), it was taking 3 mins 20 secs; after deleting t

Re: how to change default response format as json in solr configuration?

2011-08-11 Thread Erik Hatcher
You can set default="true" in solrconfig on the JSON response writer, like this: Or you can add json to any request handler definitions. Erik On Aug 11, 2011, at 07:36 , nagarjuna wrote: > Hi everybody, > > when ever i enter search term in solr i am able to getting response in XML

Re: strip html from data

2011-08-11 Thread Ahmet Arslan
> Is there a way to strip the html tags completly and not > index them? If not, > how to I retrieve the results without html tags? How do you push documents to solr? You need to strip html tags before the analysis chain. For example, if you are using Data Import Handler, you can use HTMLStripTra

Re: LockObtainFailedException

2011-08-11 Thread Peter Sturge
Hi, When you get this exception with no other error or explanation in the logs, this is almost always because the JVM has run out of memory. Have you checked/profiled your mem usage/GC during the stream operation? On Thu, Aug 11, 2011 at 3:18 AM, Naveen Gupta wrote: > Hi, > > We are doing st

Re: strip html from data

2011-08-11 Thread Merlin Morgenstern
I am sorry, but I do not really understand the difference of indexed and returned result set. I look at the "returned" dataset via this command: solr/select/?q=id:533563&terms=true which gives me html tags like these: I also tried to turn on TermsComponent, but it did not change anything: s

Need help indexing/querying a particular type of hierarchy

2011-08-11 Thread Michael B. Klein
Hi all, I have a particular data structure I'm trying to index into a solr document so that I can query and facet it in a particular way, and I can't quite figure out the best way to go about it. One sample object is here: https://gist.github.com/1139065 The part that's tripping me up is the wor