Re: Missing tokens

2010-08-19 Thread paul . moran
Great! Now I'm getting somewhere, this worked! The others didn't. http://localhost/solr/select?q=contents:"OB10."; Hope this makes sense to you. I'm still somewhat confused with the output here. I had 'highlight matches' checked, and from what I can tell, 'OB10' wasn't found. When I enter 'OB10.' i

Re: Indexing fieldvalues with dashes and spaces

2010-08-19 Thread PeterKerk
Sorry for late reply, just back from holiday :) I did what you mentioned: and then in url facet.field=services_raw It works...awesome, thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-fieldvalues-with-dashes-and-spaces-tp1023699p1222961.html Sent from th

field collapsing on multiple fields

2010-08-19 Thread Bharat Jain
Hello, I was just wondering if field collapsing is available for multiple fields, basically grouping in different ways, like language, country, etc. Does anybody have any performance data that they would like to share? Thanks Bharat Jain

Showing results based on facet selection

2010-08-19 Thread PeterKerk
I have indexed all data (as can be seen below). But now I want to be able to simulate a user clicking on a facet value, for example on the value "Gemeentehuis" of facet "themes_raw", while ALSO having the value "Strand" selected on the features facet. I've been playing with the facet.query function: fac

Re: edismax pf2 and ps

2010-08-19 Thread Ron Mayer
Chris Hostetter wrote: > : Perhaps fold it into the pf/pf2 syntax? > : > : pf=text^2// current syntax... makes phrases with a boost of 2 > : pf=text~1^2 // proposed syntax... makes phrases with a slop of 1 and > : a boost of 2 > : > : That actually seems pretty natural given the lucene query

RE: Solr for multiple websites

2010-08-19 Thread Hitendra Molleti
Thanks Grijesh. Can you please let me know what the pros and cons of this approach are? Also, how can we set up load balancing between multiple Solr instances? Thanks Hitendra -Original Message- From: Grijesh.singh [mailto:pintu.grij...@gmail.com] Sent: Thursday, August 19, 2010 10:25 AM To: s

Re: Jetty returning HTTP error code 413

2010-08-19 Thread Alexandre Rocco
Hi diddier, I have updated my etc/jetty.xml and updated my headerBufferSize to 2x as: 16384 But the error persists. Do you know if there is any other config that should be updated so this setting works? Also, is there any way to check if jetty is using this config inside Solr admin pages? I know th

Faceting by fields that contain special characters

2010-08-19 Thread Christos Constantinou
Hi all, I am doing a faceted search on a solr field that contains URLs, for the sole purpose of trying to locate duplicate URLs in my documents. However, the solr response I get looks like this: public 'com' => int 492198 public 'flickr' => int 492198 public 'http' =>

RE: Faceting by fields that contain special characters

2010-08-19 Thread Markus Jelsma
A very common issue, you need to facet on a non-analyzed field. http://lucene.472066.n3.nabble.com/Indexing-fieldvalues-with-dashes-and-spaces-td1023699.html#a1222961   -Original message- From: Christos Constantinou Sent: Thu 19-08-2010 15:08 To: solr-user@lucene.apache.org; Subject: Fa
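To illustrate why faceting on an analyzed field returns values like 'com', 'flickr' and 'http', here is a rough Python sketch. It is not Solr code: the regex tokenizer only approximates an analyzed text field, and the sample URLs are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical sample data; not taken from the thread.
urls = [
    "http://flickr.com/photos/1",
    "http://flickr.com/photos/1",
    "http://flickr.com/photos/2",
]

def analyzed_terms(value):
    # Crude stand-in for an analyzed text field: split on non-alphanumerics.
    return [t for t in re.split(r"[^A-Za-z0-9]+", value.lower()) if t]

# Faceting on an analyzed field counts documents per term,
# so every document containing "http" adds one to the "http" bucket.
analyzed_facets = Counter(t for u in urls for t in set(analyzed_terms(u)))

# Faceting on a non-analyzed (string) field counts whole values instead.
raw_facets = Counter(urls)

# Duplicate URLs are simply the whole values seen more than once.
duplicates = [u for u, n in raw_facets.items() if n > 1]
```

With the whole-value counts, any bucket with a count above 1 is a duplicate URL, which is what the original question was after.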

RE: Showing results based on facet selection

2010-08-19 Thread Markus Jelsma
Hi,   A facet query serves a different purpose [1]. You need to filter your result set [2]. And don't forget to follow the links on caching and such.   [1]: http://wiki.apache.org/solr/SimpleFacetParameters#facet.query_:_Arbitrary_Query_Faceting [2]: http://wiki.apache.org/solr/CommonQueryPa
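As a sketch of the filtering approach suggested above, the following Python snippet builds a select URL with one fq parameter per selected facet value. The host, port and core path are borrowed from another message in this listing and may not match your setup, and the field name features_raw is an assumption (the thread only names themes_raw).

```python
from urllib.parse import urlencode

# Assumed base URL; adjust to your own Solr instance.
base = "http://localhost:8983/solr/db/select"

params = [
    ("q", "*:*"),
    ("facet", "true"),
    ("facet.field", "themes_raw"),
    ("facet.field", "features_raw"),           # hypothetical field name
    # One fq per selected facet value; multiple fq's are intersected.
    ("fq", 'themes_raw:"Gemeentehuis"'),
    ("fq", 'features_raw:"Strand"'),           # hypothetical field name
]

# urlencode percent-encodes the quotes and colons for us.
url = base + "?" + urlencode(params)
```

Each fq narrows the result set, and the facet counts returned are then computed over the filtered documents only.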

RE: Solr for multiple websites

2010-08-19 Thread Markus Jelsma
http://osdir.com/ml/solr-user.lucene.apache.org/2009-09/msg00630.html http://osdir.com/ml/solr-user.lucene.apache.org/2009-03/msg00309.html   Load balancing is a bit out of scope here but all you need is a simple HTTP load balancer and a replication mechanism, depending on your set up.   -Ori

/update/extract

2010-08-19 Thread satya swaroop
Hi all, when the extract request handler processes a request, what class gets invoked? I need to know the flow through the classes when we send files to Solr. Can anybody tell me the classes, or any sources where I can find the answer? Or can anyone tell me what classes get invoked when we start Solr?

Re: specifying the doc id in clustering component

2010-08-19 Thread Tommy Chheng
The solr schema has the fields, id, name and desc. I would like to get docs:["name Field here" ] instead of the doc Id field as in "docs":["200066",         "195650", On Wednesday, August 18, 2010, Stanislaw Osinski wrote: > Hi Tommy, > >  I'm using the clustering component with solr 1.4. >>

RE: Showing results based on facet selection

2010-08-19 Thread PeterKerk
Hi Markus, Thanks for the quick reply. It works now! :)

Using postCommit event to swap cores

2010-08-19 Thread simon
Hi there, I have solr configured with 2 cores, "live" and "standby". "Live" is used to service search requests from our users. "Standby" is used to rebuild the index from scratch each night. Currently I have the postCommit hook setup to swap the two cores over as soon as the indexing on "standb
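For reference, a core swap is a single CoreAdmin request. The sketch below only builds that request URL in Python; the host and port are assumptions, and the core names are written as "live" and "standby" on the assumption that they are registered in lowercase.

```python
from urllib.parse import urlencode

# Assumed host and port; the CoreAdmin handler path is the usual default.
admin_url = "http://localhost:8983/solr/admin/cores"

# SWAP exchanges the names of the two cores, so searches sent to
# "live" are served by the freshly rebuilt index afterwards.
params = {"action": "SWAP", "core": "live", "other": "standby"}

swap_url = admin_url + "?" + urlencode(params)
```

Whatever triggers the swap (a postCommit hook or an external script) would issue an HTTP GET against this URL once indexing has finished.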

Re: Jetty returning HTTP error code 413

2010-08-19 Thread Alexandre Rocco
Hi diddier, Never mind. I figured it out. There was some miscommunication between me and our IT guy. Thanks for helping. It's fixed now. Alexandre On Thu, Aug 19, 2010 at 9:59 AM, Alexandre Rocco wrote: > Hi diddier, > > I have updated my etc/jetty.xml and updated my headerBufferSize to 2x as:

Autosuggest on PART of cityname

2010-08-19 Thread PeterKerk
I want to have a Google-like autosuggest function on citynames. So when the user types some characters I want to show cities that match those characters but ALSO the number of locations that are in that city. With Solr I currently have the parameter: "&fq=title:Bost" But the result doesn't show the cit

Re: Solrj ContentStreamUpdateRequest Slow

2010-08-19 Thread Tod
On 8/19/2010 1:45 AM, Lance Norskog wrote: 'stream.url' is just a simple parameter. You should be able to just add it directly. I agree (code excluding imports): public class CommonTest { public static void main(String[] args) { System.out.println("main..."); try { String fileNam

RE: Autosuggest on PART of cityname

2010-08-19 Thread Markus Jelsma
You need a new analyzed field with the EdgeNGramTokenizer or you can try facet.prefix for this to work. To retrieve the number of locations for that city, just use the results from the faceting engine as usual.   I'm unsure which approach is actually faster but i'd guess using the EdgeNGramTok
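To make the edge n-gram idea concrete, here is a small Python sketch of what such a tokenizer emits for one city name. It only imitates the concept; the function name and the min/max gram sizes are illustrative assumptions, not Solr's configuration or API.

```python
def edge_ngrams(text, min_gram=1, max_gram=10):
    """Return the prefixes of `text` between min_gram and max_gram characters long."""
    upper = min(max_gram, len(text))
    return [text[:n] for n in range(min_gram, upper + 1)]

# Each prefix is indexed as its own term, so a partial input such as
# "Bost" matches the indexed term directly, without wildcard queries.
terms = edge_ngrams("Boston", min_gram=2, max_gram=5)
```

The trade-off is a larger index (one term per prefix) in exchange for cheap prefix matching at query time.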

Re: improving search response time

2010-08-19 Thread Muneeb Ali
Thanks for your input guys. I will surely try these suggestions, in particular, reducing heap size JAVA_OPTION and adjusting cache sizes to see if that makes a difference. I am also considering upgrading RAM for slave nodes, and also looking into moving from SATA enterprise HDD to SSD flash/DRAM

Re: Solr data type for date faceting

2010-08-19 Thread Jan Høydahl / Cominvent
Yes, I forgot that strings support alphanumeric ranges. However, they will potentially be very memory intensive since you don't get the trie-optimization and since strings take up more space than ints. The only way is to try it out. -- Jan Høydahl, search solution architect Cominvent AS - www.cominve

RE: Autosuggest on PART of cityname

2010-08-19 Thread PeterKerk
Ok, I now tried this: http://localhost:8983/solr/db/select/?wt=json&indent=on&q=*:*&fl=city&facet.field=city&facet.prefix=Bost Then I get: { "responseHeader":{ "status":0, "QTime":0, "params":{ "fl":"city", "indent":"on", "q":"*:*", "facet.prefix":"Bost",

Re: Missing tokens

2010-08-19 Thread Jan Høydahl / Cominvent
Hi, Your bug is right there in the WhitespaceTokenizer, where you see that it does NOT strip away the "." as whitespace. Try with StandardTokenizerFactory instead, as it removes punctuation. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Training in Europe - www.solrt
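The difference can be illustrated with a rough Python approximation. This is not the Lucene implementation: the second function merely strips punctuation with a regex, whereas the real StandardTokenizer applies more rules, and the sample sentence is made up.

```python
import re

def whitespace_tokenize(text):
    # Splits on whitespace only, so trailing punctuation stays on the token.
    return text.split()

def punctuation_stripping_tokenize(text):
    # Rough approximation only: keep runs of letters and digits.
    return re.findall(r"[A-Za-z0-9]+", text)

sample = "invoice OB10. received"  # hypothetical indexed text

ws_tokens = whitespace_tokenize(sample)
std_tokens = punctuation_stripping_tokenize(sample)
```

With whitespace-only splitting the indexed term is "OB10." (including the dot), which would explain why a query for "OB10" finds nothing while "OB10." matches.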

RE: Autosuggest on PART of cityname

2010-08-19 Thread Markus Jelsma
Hmm, you have only four documents in your index, I guess? That would make sense because you query for *:*. This technique doesn't rely on the found documents but on the faceting engine, so you should include rows=0 in your query, and the fl parameter is not required anymore. Also, add facet=true to en
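Putting those parameters together, a Python sketch of the autosuggest request could look like the following. The base URL is copied from the earlier message in this thread; the sample facet values are invented, and the flat name/count list is what I understand the JSON writer to return for facet fields by default.

```python
from urllib.parse import urlencode

base = "http://localhost:8983/solr/db/select/"

params = {
    "q": "*:*",
    "rows": 0,               # no documents needed, only facet counts
    "facet": "true",         # enable the faceting engine
    "facet.field": "city",
    "facet.prefix": "Bost",  # what the user has typed so far
    "wt": "json",
}
url = base + "?" + urlencode(params)

def pair_facet_counts(flat):
    # Turn ["Boston", 3, "Bostwick", 1] into [("Boston", 3), ("Bostwick", 1)].
    return list(zip(flat[0::2], flat[1::2]))

# Hypothetical facet_fields["city"] payload, for illustration only.
suggestions = pair_facet_counts(["Boston", 3, "Bostwick", 1])
```

Each pair gives a matching city name together with the number of locations in that city, which covers both parts of the original requirement.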

SolrIndex / LuceneIndex

2010-08-19 Thread stockii
Hello. The page http://lucene.apache.org/solr/api/index.html?org/apache/solr/common/SolrDocument.html talks about a "Solr index": "A concrete representation of a document within a Solr index". Does Solr create a special Solr index of its own, or does this mean a Lucene index? Thanks ;)

Re: improving search response time

2010-08-19 Thread Jan Høydahl / Cominvent
It is crucial to MEASURE your system to confirm your bottleneck. I agree that you are very likely to be disk I/O bound with so little memory left for the OS, a large index and many terms in each query. Have your IT guys do some monitoring on your disks and log this while under load. Then you sho

Proper Escaping of Ampersands

2010-08-19 Thread Nikolas Tautenhahn
Hi, I have a problem with, for example, company names like "AT&S". A Job is sending data to the solr 1.4 (also tested it with 1.4.1) index via python in XML, everything is escaped properly ("&" becomes "&amp;"). When I search for "at s"(q=%22at%20s%22), using the dismax handler, I find the dataset to
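Two separate escaping layers are involved here: XML escaping when posting documents, and URL encoding when sending queries. A minimal Python sketch of both, using only the standard library (how the analyzer then treats the "&" inside the field is a separate matter that this does not address):

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

name = "AT&S"

# Layer 1: XML escaping for the update message body.
xml_value = escape(name)

# Layer 2: percent-encoding for the q parameter, so the "&" is not
# mistaken for a parameter separator in the query string.
query_value = quote('"' + name + '"', safe="")
```

If the ampersand in a query is left unencoded, everything after it is parsed as a new request parameter and never reaches the query parser.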

Re: Missing tokens

2010-08-19 Thread paul . moran
I did that and it worked. Thanks very much for your expert assistance, Jan! Paul From: Jan Høydahl / Cominvent To: solr-user@lucene.apache.org Date: 19/08/2010 16:15 Subject:Re: Missing tokens Hi, Your bug is right there in the WhitespaceTokenizer, where you see that it d

Re: trie fields and sortMissingLast

2010-08-19 Thread harish.agarwal
Just curious if there has been any progress on implementing sortMissingLast on TrieFields?

Re: trie fields and sortMissingLast

2010-08-19 Thread Yonik Seeley
On Thu, Aug 19, 2010 at 12:28 PM, harish.agarwal wrote: > Just curious if there has been any progress on implementing sortMissingLast > on TrieFields? Not yet - that info is not available from the lucene FieldCache. -Yonik http://www.lucidimagination.com

Re: trie fields and sortMissingLast

2010-08-19 Thread harish.agarwal
Is there a good opportunity to work on this issue right now? I'd be happy to do it, if you could provide some initial advice on how to attack the problem. Moving forward, I'd like to use Trie fields, but the lack of this option is really holding me back...

SpellCheckComponent question

2010-08-19 Thread fabritw
Hi, I am having some trouble with SpellCheckComponent when using queries such as "2galwy city". The spellchecker seems to ignore the number and suggest "galway". This is fine but in the collation it adds the number back onto the suggestion "2galway". This causes problems for me as I'm using it f

RE: SpellCheckComponent question

2010-08-19 Thread Dyer, James
This might be a bug. See http://lucene.472066.n3.nabble.com/Spellcheck-help-td951059.html#a990476 James Dyer E-Commerce Systems Ingram Book Company (615) 213-4311 -Original Message- From: fabritw [mailto:fabr...@gmail.com] Sent: Thursday, August 19, 2010 12:51 PM To: solr-user@

Re: Problems to clustering on tomcat

2010-08-19 Thread Claudio Devecchi
Thanks so much Otis, it was very helpful. On Tue, Aug 10, 2010 at 3:37 AM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > Claudio, > > It sounds like the word "Cluster" there is adding confusion. > ClusteringComponent has to do with search results clustering. What you > seem to > be after

facets - id and display value

2010-08-19 Thread Satish Kumar
Hi, Is it possible to associate properties to a facet? For example, facet on categoryId (1, 2, 3 etc. ) and get properties display name, image, etc? Thanks, Satish

Basic conceptual questions about solr

2010-08-19 Thread Shaun McArthur
I'm looking for a Google search appliance look-a-like. We have a file share with 1000's of documents in a hierarchy that makes it ridiculously difficult to locate documents. Here are some basic questions: Is the idea to install Solr on separate hardware and have it crawl the file system? Can c

Re: multiple values

2010-08-19 Thread Erick Erickson
The first thing I'd do is look at the document in the admin pages and determine what you actually have in the index. If that's OK, have you dumped your responses to see if the returned document has multiple entries but your parsing is off? Best Erick On Wed, Aug 18, 2010 at 5:00 PM, Ma, Xiaohui

Re: Date sorting

2010-08-19 Thread Erick Erickson
Whew! Thanks for bringing closure to that one, it looked ugly at the start! Best Erick On Thu, Aug 19, 2010 at 2:03 AM, kirsty wrote: > > > Grijesh.singh wrote: > > > > provide schema.xml and solrconfig.xml to dig the problem and by which > > version of solr u have indexed the data? > > > My gr

Re: Basic conceptual questions about solr

2010-08-19 Thread Jan Høydahl / Cominvent
Hi, You can place Solr wherever you want, but if your data is very large, you'd want a dedicated box. Have a look at DIH (http://wiki.apache.org/solr/DataImportHandler). It can both crawl a file share periodically, indexing only files changed since a timestamp (can be e.g. NOW-1HOUR) and extrac

Re: specifying the doc id in clustering component

2010-08-19 Thread Stanislaw Osinski
> The solr schema has the fields, id, name and desc. > > I would like to get docs:["name Field here" ] instead of the doc Id > field as in > "docs":["200066", "195650", > The idea behind using the document ids was that based on them you could access the individual documents' content, inc