Re: Solr for multiple websites

2010-08-18 Thread Grijesh.singh
Using multicore is the right approach -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-for-multiple-websites-tp1173220p1219772.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: specifying the doc id in clustering component

2010-08-18 Thread Stanislaw Osinski
Hi Tommy, I'm using the clustering component with solr 1.4. > > The response is given by the id field in the doc array like: >"labels":["Devices"], >"docs":["200066", > "195650", > "204850", > Is there a way to change the doc label to be another field? > > i couldn

Re: Date sorting

2010-08-18 Thread kirsty
Grijesh.singh wrote: > > provide schema.xml and solrconfig.xml to dig the problem and by which > version of solr u have indexed the data? > My greatest apologies, I have seen my mistake! ...looks like someone had added a sort into the requestHandler on another date already...which I was not awa

Re: Date sorting

2010-08-18 Thread Grijesh.singh
Lance, is there any bug in Solr 1.4 related to sorting on a date field? -- View this message in context: http://lucene.472066.n3.nabble.com/Date-sorting-tp1219372p1219537.html

Re: Date sorting

2010-08-18 Thread Grijesh.singh
Please provide schema.xml and solrconfig.xml so we can dig into the problem. Which version of Solr did you use to index the data? -- View this message in context: http://lucene.472066.n3.nabble.com/Date-sorting-tp1219372p1219534.html

Re: Solrj ContentStreamUpdateRequest Slow

2010-08-18 Thread Lance Norskog
'stream.url' is just a simple parameter. You should be able to just add it directly. On Wed, Aug 18, 2010 at 5:35 AM, Tod wrote: > On 8/16/2010 6:12 PM, Chris Hostetter wrote: >> >> : > I think your problem may be that StreamingUpdateSolrServer buffers up >> : > commands and sends them in batches

Re: Date sorting

2010-08-18 Thread Lance Norskog
Wow. Can you try upgrading to 1.4.1 and re-indexing? On Wed, Aug 18, 2010 at 10:35 PM, kirsty wrote: > > Sorry forgot to mention that I am using SOLR 1.4 > and using the dismax query type. > > > > kirsty wrote: >> >> Hi I hope someone can point out what I am doing wrong. >> I have a date field in

Re: improving search response time

2010-08-18 Thread Lance Norskog
More on this: you should give Solr enough memory to run comfortably, then stop. Leave as much as you can for the OS to manage its disk cache. The OS is better at this than Solr is. Also, it does not have to do garbage collection. Filter queries are a big help. You should create a set of your basic

Re: Date sorting

2010-08-18 Thread kirsty
Sorry forgot to mention that I am using SOLR 1.4 and using the dismax query type. kirsty wrote: > > Hi I hope someone can point out what I am doing wrong. > I have a date field in my schema > > > and I am trying to do a sort on it > example url: > ...select/?sort=PublishDate > asc&qt=FinCo

Date sorting

2010-08-18 Thread kirsty
Hi I hope someone can point out what I am doing wrong. I have a date field in my schema and I am trying to do a sort on it example url: ...select/?sort=PublishDate asc&qt=FinCompanyCodeSearch&rows=20&fq=CompanyCode:1TM&fl=CompanyCode%20Title%20PublishDate This works for the most part, but if I

Re: Integrating Solr's SynonymFilter in lucene

2010-08-18 Thread Arun Rangarajan
I think the Lucene WhitespaceAnalyzer I am using inside Solr's SynonymFilter is what prevents multi-word synonyms like "New York" from being mapped to a generic synonym name like CONCEPTYcity. It appears to me that an analyzer which recognizes that a white-space is inside a synonym like

Re: Indexing Hanging during GC?

2010-08-18 Thread Rebecca Watson
hi all, in case anyone is having similar issues now / in the future -- here's what I think is at least part of the problem: once I commit the index, the RAM requirement jumps because the .tii files are loaded in at that point, and because I have a very large number of unique terms I use 200MB+ of

Re: Solr data type for date faceting

2010-08-18 Thread Karthik K
Adding facet.query=timestamp:[20100601+TO+201006312359]&facet.query=timestamp:[20100701+TO+201007312359]... to the query should give the desired response without changing the schema or re-indexing.
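The suggestion above can be sketched as a small URL builder: one facet.query parameter per month range, all against the same text field. This is only an illustration. The range bounds are copied from the message; the host, path, month labels, and the q/rows parameters are assumptions.

```python
from urllib.parse import urlencode, parse_qsl

# Range bounds copied verbatim from the message above; the month labels
# and the base URL used below are illustrative assumptions.
MONTH_RANGES = {
    "June": ("20100601", "201006312359"),
    "July": ("20100701", "201007312359"),
}


def build_facet_query_url(base_url, field, ranges):
    """Return a select URL with one facet.query parameter per range."""
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    for low, high in ranges.values():
        params.append(("facet.query", f"{field}:[{low} TO {high}]"))
    return base_url + "?" + urlencode(params)


url = build_facet_query_url(
    "http://localhost:8983/solr/select",  # hypothetical host and path
    "timestamp",
    MONTH_RANGES,
)
print(url)

# Decode the URL again to show the facet queries in readable form.
for key, value in parse_qsl(url.split("?", 1)[1]):
    if key == "facet.query":
        print(value)
```

Each facet.query should come back with its own count in the facet section of the response, which gives a per-month breakdown without touching the schema.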

Re: tii RAM usage on startup

2010-08-18 Thread Koji Sekiguchi
> I'm not sure how Solr exposes this configuration though. this one? Koji -- http://www.rondhuit.com/en/ (10/08/19 3:36), Michael McCandless wrote: I'm not sure why you see 1.5 GB before restart but then 4 GB after. But seeing a 26 MB tii file --> 200 MB RAM is unfortunately expected;

Re: queryResultCache has no hits for date boost function

2010-08-18 Thread Peter Karich
Forgot to say: thanks again! Now the cache gets hits! Regards, Peter. > On Wed, Aug 18, 2010 at 4:34 PM, Peter Karich wrote: > >> Thanks a lot Yonik! Rounding makes sense. >> Is there a date math for the 'LAST_COMMIT'? >> > No - but it's an interesting idea! > > -Yonik > http://www.lucid

Re: queryResultCache has no hits for date boost function

2010-08-18 Thread Peter Karich
Hi Yonik, would you point me to the Java classes where solr handles a commit or an optimize and then the date math definitions? Regards, Peter. > On Wed, Aug 18, 2010 at 4:34 PM, Peter Karich wrote: > >> Thanks a lot Yonik! Rounding makes sense. >> Is there a date math for the 'LAST_COMMIT'?

Re: Missing tokens

2010-08-18 Thread Jan Høydahl / Cominvent
Cannot see anything obvious... Try http://localhost/solr/select?q=contents:OB10* http://localhost/solr/select?q=contents:"OB 10" http://localhost/solr/select?q=contents:"OB10." http://localhost/solr/select?q=contents:ob10 Also, go to the Analysis page in admin, type in your field name, enable

multiple values

2010-08-18 Thread Ma, Xiaohui (NIH/NLM/LHC) [C]
Hello, I can only display one author, which is the last one. It looks like it overwrites the others. In xml, I have more than one name in . In data_config.xml, I put the . In schema.xml, I put . Please let me know if I did something wrong, or how I can display it in jsp. I really appreciate your help!

Re: Solr's Index Live Updates

2010-08-18 Thread Jan Høydahl / Cominvent
Hi, I'm afraid you'll have to post the full document again, then do a commit. But it WILL be lightning fast, as it is only the updated document which is indexed, all the other existing documents will not be re-indexed. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Tr
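A minimal sketch of what "post the full document again, then commit" could look like as Solr XML update messages. The field names DOC_ID, STATUS and TEXT come from the original question in this thread; the values are made up, and the sketch assumes DOC_ID is the schema's uniqueKey so that the re-posted document replaces the old one.

```python
import xml.etree.ElementTree as ET


def build_add_message(fields):
    """Build a Solr XML <add> message containing one complete document."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Every field is sent again, not only the changed TEXT field.
# The values here are invented for illustration.
updated_doc = {"DOC_ID": "42", "STATUS": "published", "TEXT": "new text body"}

add_xml = build_add_message(updated_doc)
commit_xml = "<commit/>"  # sent as a second message to make the change searchable

print(add_xml)
print(commit_xml)
```

Both messages would be posted to the update handler of the Solr instance; the sketch only builds the payloads and does not send anything.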

Re: queryResultCache has no hits for date boost function

2010-08-18 Thread Yonik Seeley
On Wed, Aug 18, 2010 at 4:34 PM, Peter Karich wrote: > Thanks a lot Yonik! Rounding makes sense. > Is there a date math for the 'LAST_COMMIT'? No - but it's an interesting idea! -Yonik http://www.lucidimagination.com

Re: queryResultCache has no hits for date boost function

2010-08-18 Thread Peter Karich
Thanks a lot Yonik! Rounding makes sense. Is there a date math for the 'LAST_COMMIT'? Peter. > On Tue, Aug 17, 2010 at 6:29 PM, Peter Karich wrote: > >> my queryResultCache has no hits. But if I am removing one line from the >> bf section in my dismax handler all is fine. Here is the line: >>

Re: Jetty returning HTTP error code 413

2010-08-18 Thread Yonik Seeley
Yep, or you can submit the query via POST, which has a much bigger limit on the size of the body. -Yonik http://www.lucidimagination.com On Wed, Aug 18, 2010 at 3:58 PM, didier deshommes wrote: > Hi Alexandre, > Have you tried setting a higher headerBufferSize?  Look in > etc/jetty.xml and sear
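As a rough illustration of the POST suggestion, the sketch below builds a request whose query parameters travel in the body instead of the URL, so the header size limit no longer applies to them. The host, path, and the "id" field are assumptions, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post_query(select_url, params):
    """Build (but do not send) a POST request carrying the query in the body."""
    body = urlencode(params).encode("utf-8")
    return Request(
        select_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# A long query standing in for the ~4000-byte one from the thread;
# the field name "id" is a hypothetical example.
long_query = " OR ".join(f"id:{n}" for n in range(500))

request = build_post_query(
    "http://localhost:8983/solr/select",  # hypothetical host and path
    {"q": long_query, "rows": "10"},
)
print(request.get_method(), len(request.data), "bytes in body")
# To actually send it: urllib.request.urlopen(request)
```

The URL itself stays short no matter how large the query grows, which is why this sidesteps the 413 response without changing the Jetty configuration.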

Re: Jetty returning HTTP error code 413

2010-08-18 Thread didier deshommes
Hi Alexandre, Have you tried setting a higher headerBufferSize? Look in etc/jetty.xml and search for 'headerBufferSize'; I think it controls the size of the url. By default it is 8192. didier On Wed, Aug 18, 2010 at 2:43 PM, Alexandre Rocco wrote: > Guys, > > We are facing an issue executing ve

Re: sort order of "missing" items

2010-08-18 Thread Yonik Seeley
On Tue, Aug 17, 2010 at 4:10 PM, Brad Dewar wrote: > When items are sorted, are all the docs with the sort field missing > considered "tied" in terms of their sort order, or are they "indeterminate", > or do they have some arbitrary order imposed on them (e.g. _docid_)? If it's a numeric field,

Jetty returning HTTP error code 413

2010-08-18 Thread Alexandre Rocco
Guys, We are facing an issue executing a very large query (~4000 bytes in the URL) in Solr. When we execute the query, Solr (probably Jetty) returns an HTTP 413 error (FULL HEAD). I guess that this is related to the very big query being executed, and currently we can't make it shorter. Is there any co

Re: queryResultCache has no hits for date boost function

2010-08-18 Thread Yonik Seeley
On Tue, Aug 17, 2010 at 6:29 PM, Peter Karich wrote: > my queryResultCache has no hits. But if I am removing one line from the > bf section in my dismax handler all is fine. Here is the line: > recip(ms(NOW,date),3.16e-11,1,1) NOW has millisecond resolution, so it's actually a different query eac
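To see why millisecond resolution defeats the cache and why rounding helps, here is a simplified model. It treats the cache key as the boost function text plus the resolved value of NOW, which is an assumption made for illustration and not Solr's actual key; the timestamps are made up.

```python
HOUR_MS = 60 * 60 * 1000
BF = "recip(ms(NOW,date),3.16e-11,1,1)"  # the bf line quoted above


def cache_key(bf, now_ms, round_to_hour):
    """Model a cache key as (function text, resolved NOW in epoch milliseconds)."""
    if round_to_hour:
        now_ms -= now_ms % HOUR_MS  # truncate to the start of the hour
    return (bf, now_ms)


# Two requests arriving 1.5 seconds apart (illustrative values).
first, second = 1282150000000, 1282150001500

print(cache_key(BF, first, False) == cache_key(BF, second, False))  # False
print(cache_key(BF, first, True) == cache_key(BF, second, True))    # True
```

In Solr the rounding would be expressed with date math inside the function, for example recip(ms(NOW/HOUR,date),3.16e-11,1,1), so that all requests within the same hour resolve to the same query.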

Re: How to use synonyms on a faceted field with multiple words

2010-08-18 Thread Scott Zientara
A quick and dirty workaround using Solr 1.4 is to replace spaces in the synonym file with some other character/pattern. I used ## (i.e. video => digital##media). Then add the solr.PatternReplaceFilterFactory after the synonym filter to replace the pattern with a space. This works, but I'd love to kn
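The two-step workaround can be pictured with a small stand-in for the analysis chain. This is not Solr code: the functions below only imitate what the synonym filter and the pattern replace filter would do to a single token, using the video => digital##media rule from the message.

```python
import re

# Synonym rule from the message above, with the space written as "##".
SYNONYMS = {"video": "digital##media"}


def synonym_filter(token):
    """Stand-in for the synonym filter: map a whole token to its synonym."""
    return SYNONYMS.get(token, token)


def pattern_replace_filter(token, pattern="##", replacement=" "):
    """Stand-in for the pattern replace filter: swap the placeholder for a space."""
    return re.sub(pattern, replacement, token)


def analyze(token):
    """Apply the two filters in the order described in the message."""
    return pattern_replace_filter(synonym_filter(token))


print(analyze("video"))
print(analyze("audio"))
```

The placeholder keeps the multi-word synonym together as one token through the synonym step, and the space is restored only afterwards, so the facet value shows up as a single two-word term.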

Re: tii RAM usage on startup

2010-08-18 Thread Michael McCandless
I'm not sure why you see 1.5 GB before restart but then 4 GB after. But seeing a 26 MB tii file --> 200 MB RAM is unfortunately expected; in 3.x Lucene's in-RAM representation of the terms index is very inefficient (three separate object instances (TermInfo, Term, String) per indexed term, with ea

tii RAM usage on startup

2010-08-18 Thread Rebecca Watson
hi, I am running solr 1.4.1 and java 1.6 with 6GB heap and the following GC settings: gc_args="-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:NewSize=2g -XX:MaxNewSize=2g -XX:CMSInitiatingOccupancyFraction=60" So 6GB total heap and 2GB allocated to eden space. I have caching, autoco

How to use synonyms on a faceted field with multiple words

2010-08-18 Thread Scott Zientara
I am trying to use solr.SynonymFilterFactory on a faceted field in Solr 1.3. I am using Solr to index resources from a media library. The data is coming from various sources, some of which I do not have control over. I need to be able to map resource types in the data to common terms for facet

Re: improving search response time

2010-08-18 Thread Shawn Heisey
Most of your time is spent doing the query itself, which in the light of other information provided, does not surprise me. With 12GB of RAM and 9GB dedicated to the java heap, the available RAM for disk caching is pretty low, especially if Solr is actually using all 9GB. Since your index is

Re: improving search response time

2010-08-18 Thread Muneeb Ali
First, thanks very much for a prompt reply. Here is more info: === a) What operating system? Debian GNU/Linux 5.0 b) What Java container (Tomcat/Jetty) Jetty c) What JAVA_OPTIONS? I.e. memory, garbage collection etc. -Xmx9000m -DDEBUG -Djava.awt.headless=true -Dorg.mortbay.

Re: improving search response time

2010-08-18 Thread Gora Mohanty
On Wed, 18 Aug 2010 05:18:34 -0700 (PDT) Muneeb Ali wrote: > > Hi All, > > I need some guidance over improving search response time for our > catalog search. [...] > I would appreciate if anyone with similar background could shed > some light on upgrading hardware in our situation. Or if any >

Re: Missing tokens

2010-08-18 Thread paul . moran
Here's my field description. I mentioned 'contents' field in my original post. I've changed it to a different field, 'summary'. It's using the 'text' fieldType as you can see below.

Re: Help Debugging Delta Query

2010-08-18 Thread Frank A
Uhg... my mistake. Thanks! On Wed, Aug 18, 2010 at 10:22 AM, Ahmet Arslan wrote: >> I'm trying to use a delta query to update a specific entity >> by its >> primary key.  The URL I'm using is: >> >> http://localhost:8080/solr/core2/dataimport?command=delta-import&did=5&commit=true&debug=true >>

Re: Help Debugging Delta Query

2010-08-18 Thread Ahmet Arslan
> I'm trying to use a delta query to update a specific entity > by its > primary key.  The URL I'm using is: > > http://localhost:8080/solr/core2/dataimport?command=delta-import&did=5&commit=true&debug=true > > Where 5 is the PK. > > In my db config I have: > >             name="place" >      

Help Debugging Delta Query

2010-08-18 Thread Frank A
Hi, I'm trying to use a delta query to update a specific entity by its primary key. The URL I'm using is: http://localhost:8080/solr/core2/dataimport?command=delta-import&did=5&commit=true&debug=true Where 5 is the PK. In my db config I have: When I run the URL above I see the f

Re: autocomplete: case-insensitive and middle word

2010-08-18 Thread Paul
Here's my solution. I'm posting it in case it is radically wrong; I hope someone can help straighten me out. It seems to work fine, and seems fast enough. In schema.xml:

Solr's Index Live Updates

2010-08-18 Thread Gonzalo Payo Navarro
Hi everyone! I've a question: Is there a way to update a document in Solr (v. 1.4) and that document is ready for searches without a reindex? Let me put it this way: My index is filled with documents like, say, DOC_ID, STATUS and TEXT fields. What if I want to update the TEXT field and see that c

Re: World order sensitivity query id Solr

2010-08-18 Thread Ahmet Arslan
> First I very sorry form my bad English :/ > > I am new user of Apache Solr. I have read documentation and > check > Google, but I can not find solution of my problem: > I need "words order sensitivity" query, for example I have > two > documents in Solr index: > 1. one something two something th

World order sensitivity query id Solr

2010-08-18 Thread Krzysztof Szalast
Hi, First I very sorry form my bad English :/ I am new user of Apache Solr. I have read documentation and check Google, but I can not find solution of my problem: I need "words order sensitivity" query, for example I have two documents in Solr index: 1. one something two something three something

Re: improving search response time

2010-08-18 Thread Jan Høydahl / Cominvent
Some questions: a) What operating system? b) What Java container (Tomcat/Jetty) c) What JAVA_OPTIONS? I.e. memory, garbage collection etc. d) Example queries? I.e. what features, how many facets, sort fields etc e) How do you load balance queries between the slaves? f) What is your search latency

Re: Solrj ContentStreamUpdateRequest Slow

2010-08-18 Thread Tod
On 8/16/2010 6:12 PM, Chris Hostetter wrote: : > I think your problem may be that StreamingUpdateSolrServer buffers up : > commands and sends them in batches in a background thread. if you want to : > send individual updates in real time (and time them) you should just use : > CommonsHttpSolrSer

improving search response time

2010-08-18 Thread Muneeb Ali
Hi All, I need some guidance over improving search response time for our catalog search. we are using solr 1.4.0 version and have master/slave setup (3 dedicated servers, one being the master and other two slaves). The server specs are as follows: Quad Core 2.5Ghz 1333mhz 12GB Ram 2x250GB disks

Re: Phrase Highlighting with special characters

2010-08-18 Thread Kranti K K Parisa
Seems the following is working query.setHighlight(true).setHighlightSnippets(1); query.setHighlightSimplePre(""); query.setHighlightSimplePost(""); query.setHighlightFragsize(1000); query.setParam("hl.fl", ""); also I was reading something about (I

Re: Missing tokens

2010-08-18 Thread Jan Høydahl / Cominvent
Hi, Can you share with us how your schema looks for this field? What FieldType? What tokenizer and analyser? How do you parse the PDF document? Before submitting to Solr? With what tool? How do you do the query? Do you get the same results when doing the query from a browser, not SolrJ? -- Jan

Re: Solr data type for date faceting

2010-08-18 Thread Jan Høydahl / Cominvent
If you want to change the schema on the live index, make sure you do a compatible change, as Solr does not do any type checking or schema change validation. I would ADD a field with another name for the tint field. Unfortunately you have to re-index to have an index built on this field. May I su

Re: Function query to boost scores by a constant if all terms are present

2010-08-18 Thread Jan Høydahl / Cominvent
You can use the map() function for this, see http://wiki.apache.org/solr/FunctionQuery#map q=a fox&defType=dismax&qf=allfields&bf=map(query($qq),0,0,0,100.0)&qq=allfields:(quick AND brown AND fence) This adds a constant boost of 100.0 if the $qq query returns a non-zero score, which it does w
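The behaviour of the five-argument map() used above can be emulated in a few lines. This is a sketch of the semantics only (values inside [min, max] become the target, everything else becomes the default), not Solr's implementation; the score 0.37 is an arbitrary example.

```python
def solr_map(x, min_value, max_value, target, default=None):
    """Emulate map(x,min,max,target[,default]).

    Values inside [min_value, max_value] become target; other values become
    default, or are returned unchanged when no default is given.
    """
    if min_value <= x <= max_value:
        return target
    return x if default is None else default


def boost(qq_score):
    """Mirror bf=map(query($qq),0,0,0,100.0) from the message above."""
    return solr_map(qq_score, 0, 0, 0, 100.0)


print(boost(0.0))   # document does not match $qq, so no boost
print(boost(0.37))  # document matches $qq, so the constant boost applies
```

Because any non-zero score is replaced by the same constant, the size of the boost does not depend on how well the document matches $qq, only on whether it matches at all.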

Missing tokens

2010-08-18 Thread paul . moran
Hi, I'm having a problem with certain search terms not being found when I do a query. I'm using Solrj to index a pdf document, and add the contents to the 'contents' field. If I query the 'contents' field on the SolrInputDocument doc object as below, I get 50k tokens. StringTokenizer to = new Str

Re: Solr data type for date faceting

2010-08-18 Thread Karthik K
Thanks Mark. Yeah, storing it as 'tint' would be quite efficient. As I cannot re-index the massive data, please let me know whether the changes I make in the schema apply to the already indexed data. I am not sure how type checking happens in Solr. You can then do a facet query, specifying your desired r

Re: Solr data type for date faceting

2010-08-18 Thread Mark Allan
If you're storing the timestamp as MMDDHHMM, why don't you make it a trie-coded integer field (type 'tint') rather than text? That way, I believe range queries would be more efficient. You can then do a facet query, specifying your desired ranges as one facet query for each range. N

Phrase Highlighting with special characters

2010-08-18 Thread Kranti K K Parisa
Hi All, I am trying out Solr highlighting. I have a problem highlighting phrases that contain special characters. For example, if I search for a phrase like "united. states. usa" then the results displayed match the exact phrase and also the phrase without special characters, i.e. "united states us

Solr data type for date faceting

2010-08-18 Thread Karthik K
I have a field storing timestamp as text (MMDDHHMM). Can I get the results as I get with date faceting? (July(30), August(54) etc.) As far as I know, Solr currently doesn't support range faceting, and even if it does in the future, text will not be recognized as integer/long. Tried for a workarou

solr working...

2010-08-18 Thread satya swaroop
hi all, I am very interested in how Solr works. Can anyone tell me which modules or classes get invoked when we start a servlet container like Tomcat, or when we send requests to Solr such as sending PDF files, or which files get invoked at the start of Solr? regards, saty