Boost dependency

2012-03-05 Thread William Bell
What I would like to do is ONLY boost if there is a match on terms in SOLR 3.5. For example: 1. q=smith&defType=dismax&qf=user_query&sort=score desc 2. I want to add a boost by distance (closest = highest score), ONLY if there is a hit on #1. This one only multiplies by the "smith" * recip(geodis

Re: Couple issues with edismax in 3.5

2012-03-05 Thread William Bell
I also get an issue with "." with edismax. For example: Dr. Smith gives me different results than "dr Smith" On Thu, Mar 1, 2012 at 10:18 PM, Way Cool wrote: > Thanks Ahmet! That's good to know someone else also tried to make phrase > queries to fix multi-word synonym issue. :-) > > > On Thu, M

XSLT Response Writer and content transformation

2012-03-05 Thread darul
Hello, Using native XSLT Response Writer, we may need to alter content before processing xml solr output as a RSS Feed. Example (trivial one...): bla bla bla After processing content: bla bla bla bla bla bla bla bla bla bla bla bla Have you any ideas on how to implement a custom func

Re: [SoldCloud] Slow indexing

2012-03-05 Thread Markus Jelsma
On Sun, 4 Mar 2012 21:09:30 -0500, Mark Miller wrote: On Mar 4, 2012, at 5:43 PM, Markus Jelsma wrote: everything stalls after it lists all segment files and that a ZK state change has occurred. Can you get a stack trace here? I'll try to respond to more tomorrow. What version of trunk are yo

Re: A sorting question.

2012-03-05 Thread Luis Cappa Banda
Sometimes the solution is so easy that I can't see it in front of me. Thanks, Mikhail! 2012/3/3 Mikhail Khludnev > Hi Luis, > > Do you mean > > q=id:(A^10+OR+B^9+OR+C^8+OR...) > I'm not sure whether it works but > > q=id:A^10+OR+id:B^9+OR+id:C^8+OR...) > > definitely does > > On Fri, Mar 2, 201
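
The per-id boosting form quoted above can be generated programmatically. The following is a minimal sketch only: the helper name boosted_id_query and its parameters are hypothetical and not part of Solr, and the "+" signs in the mail are taken to be URL-encoded spaces.

```python
def boosted_id_query(ids, field="id", top_boost=10):
    """Build 'id:A^10 OR id:B^9 ...' so that earlier ids get larger boosts.

    Hypothetical helper for illustration; the boost never drops below 1.
    """
    clauses = []
    for position, doc_id in enumerate(ids):
        boost = max(top_boost - position, 1)
        clauses.append(f"{field}:{doc_id}^{boost}")
    return " OR ".join(clauses)


if __name__ == "__main__":
    # The result still has to be URL-encoded before being sent as the q parameter.
    print(boosted_id_query(["A", "B", "C"]))
```

The sketch repeats the field name in every clause because that is the form the thread reports as definitely working.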

Polish language in Solr

2012-03-05 Thread Agnieszka Kukałowicz
Hi, I have a question about the Polish language in Solr. There are 2 options: StempelPolishStemFilterFactory or HunspellStemFilterFactory with a Polish dictionary. I've made some tests but the results do not satisfy me. StempelPolishStemFilterFactory is very fast during indexing but the quality of se

Re: Date search by specific month and day

2012-03-05 Thread Jan Høydahl
I've seen this question several times on the list. Perhaps it could be beneficial to create a new Date field that also supports year-only, year-month, year-month-day etc queries? It could be called ExtendedDateField or something, and when indexing a date "-MM-DDTHH:mm:ssZ" it would individu

Re: How to define a multivalued string type "langid.langsField" in solrconfig.xml

2012-03-05 Thread Jan Høydahl
Hi, The documentation for this feature says: > langid.langsField > > Specifies the field to output a list of detected languages into. This must be > a multiValued String field. If you use langid.map.individual, each detected > language will be added to this field. > Your langid.langsField fie

Re: errata for solr tutorial

2012-03-05 Thread Jan Høydahl
Hi, Thanks for reporting. This is fixed now on the staging site, will be set live soon. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com On 1. mars 2012, at 16:50, Nicolai Scheer wrote: > Hi! > > Having just worked through the sol

Re: Remove underscore char when indexing and query problem

2012-03-05 Thread Erick Erickson
Look at the admin/analysis page and be sure to check the "verbose" checkboxes. That'll show you what each filter does to the input. My guess is that WordDelimiterFilterFactory has different parameters and that's what you're seeing. WDFF can be tricky to understand... If that's not helpful, you nee

JoinQuery and document score problem

2012-03-05 Thread Stefan Moises
Hi list, we are using the kinda new JoinQuery feature in Solr 4.x Trunk and are facing a problem (and also Solr 3.5 with the JoinQuery patch applied) ... We have documents with a parent - child relationship where a parent can have any number of children, parents being identified by the field "pa

Re: Retrieving multiple levels with hierarchical faceting in Solr

2012-03-05 Thread Erick Erickson
I should have read more carefully. Why not just use facet.query? They are treated completely independently, so you can specify something like: facet.query=field:0* facet.query=field:1_foovalue* and you can even specify facet.field as well, they all just come back as separate sections in the facets
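
As a rough illustration of the suggestion above, several facet.query parameters can be sent in one request by repeating the parameter name. This is a sketch under assumptions: the helper name facet_query_string and the q=*:* default are hypothetical, and only Python's standard urllib is used.

```python
from urllib.parse import urlencode, parse_qs


def facet_query_string(facet_queries, facet_fields=(), q="*:*"):
    """Return a URL-encoded query string with one facet.query per entry.

    Hypothetical helper; a list of pairs is used so the same key can repeat.
    """
    params = [("q", q), ("facet", "true")]
    params += [("facet.query", fq) for fq in facet_queries]
    params += [("facet.field", ff) for ff in facet_fields]
    return urlencode(params)


if __name__ == "__main__":
    qs = facet_query_string(["field:0*", "field:1_foovalue*"], ["field"])
    print(qs)
    # Decoding shows each facet.query survives as a separate value.
    print(parse_qs(qs)["facet.query"])
```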

Re: [SoldCloud] Slow indexing

2012-03-05 Thread darren
A question relating to this. If you are running a single ZK node, but say 10 other nodes and then parallel index on each of those nodes, will the ZK be hit by all 10 indexing nodes constantly? i.e. very chatty? If one of those 10 indexing nodes goes down or falls out of sync and comes back, does

Re: Help with Synonyms

2012-03-05 Thread Donald Organ
> > >> > Hi Donald, > > Try to remove tokenizerFactory="**KeywordTokenizerFactory" in your > synonym filter > definition because I think you would want to tokenize the synonym settings > in > synonyms.txt as "floor" / "locker" => "storage" / "locker". But if you set > it > to KeywordTokenizer, it w

Re: [SoldCloud] Slow indexing

2012-03-05 Thread Mark Miller
On Mar 5, 2012, at 10:01 AM, dar...@ontrenet.com wrote: > If one of those 10 indexing nodes goes down or falls out of sync and comes > back, does ZK block the state of indexing until that single node catches > back up? No - if a node falls out of sync or comes back, the rest of the cluster cont

Re: Couple issues with edismax in 3.5

2012-03-05 Thread Ahmet Arslan
> I also get an issue with "." with > edismax. > > For example: Dr. Smith gives me different results than "dr > Smith" I believe this is related to analysis (rather than the query parser). You can inspect the output of admin/analysis.jsp. What happens when you switch to &defType=lucene? Dr. Smith yield

ngram synonyms & dismax together

2012-03-05 Thread Husain, Yavar
I have ngram-indexed 2 fields (columns in the database) and the third one is my full text field. Now my default text field is the full text field and while querying I use dismax handler and specify in it both the ngrammed field with certain boost values and also full text field with a certain

need input - lessons learned or best practices for data imports

2012-03-05 Thread geeky2
hello all, we are approaching the time when we will move our first solr core in to a more "production like" environment. as a precursor to this, i am attempting to write some documents on impact assessment and batch load / data import strategies. does anyone have processes or lessons learned - t

Re: [SoldCloud] Slow indexing

2012-03-05 Thread Markus Jelsma
On Mon, 5 Mar 2012 11:26:20 -0500, Mark Miller wrote: On Mar 5, 2012, at 10:01 AM, dar...@ontrenet.com wrote: If one of those 10 indexing nodes goes down or falls out of sync and comes back, does ZK block the state of indexing until that single node catches back up? No - if a node falls ou

How to limit the number of open searchers?

2012-03-05 Thread Michael Ryan
Is there a way to limit the number of searchers that can be open at a given time? I know there is a maxWarmingSearchers configuration that limits the number of warming searchers, but that's not quite what I'm looking for... Ideally, when I commit, I want there to only be one searcher open befor

Re: Solr Design question on spatial search

2012-03-05 Thread Lance Norskog
The Lucene geo searching code is very fast. Geosearch queries calculate the distance from the city to all 20k stores and sort on this. If this is not fast enough, you can pre-calculate the city/store lists by doing all of this searching in advance. You can store these in a DB and do incremental up

Re: Java6 End of Life, upgrading to 7

2012-03-05 Thread Shawn Heisey
On 2/28/2012 8:16 AM, Shawn Heisey wrote: Due to the End of Life announcement for Java6, I am going to need to upgrade to Java 7 in the very near future. I'm running Solr 3.5.0 modified with a couple of JIRA patches. https://blogs.oracle.com/henrik/entry/updated_java_6_eol_date I saw the ann

Re: How can Solr do parallel query warming with and ?

2012-03-05 Thread Mikhail Khludnev
Neil, It is still not clear whether it is multi- or single-valued fields that determine the use of FieldCache or UnInvertedField, and per-segment reader vs top-level reader. The only concern I have about your approach is the CPU wasted calculating facets for huge *:* docsets. I guess you can try to find

disabling QueryElevationComponent

2012-03-05 Thread Welty, Richard
i googled and found numerous references to this, but no answers that went to my specific issues. i have a solr 3.5.0 server set up that needs to index several different document types, there is no common unique key field. so i can't use the uniqueKey declaration and need to disable the QueryEle

Re: disabling QueryElevationComponent

2012-03-05 Thread Walter Underwood
You may be able to have unique keys. At Netflix, I found that there were collisions between the movie IDs and the person IDs. So, I put an 'm' at the beginning of each movie ID and a 'p' at the beginning of each person ID. Like magic, I had unique IDs. You should be able to disable the query el
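
The prefixing trick described above can be sketched as follows. The function name, the type labels and the prefix table are assumptions for illustration only; the mail only states that 'm' was used for movie IDs and 'p' for person IDs.

```python
# Hypothetical prefix table: one short prefix per document type.
TYPE_PREFIXES = {"movie": "m", "person": "p"}


def make_unique_key(doc_type, raw_id):
    """Prefix a per-type ID so that IDs from different types cannot collide."""
    return f"{TYPE_PREFIXES[doc_type]}{raw_id}"


if __name__ == "__main__":
    # The same raw ID yields different keys for different document types.
    print(make_unique_key("movie", 42), make_unique_key("person", 42))
```

The resulting string could then be indexed into whatever field is declared as the uniqueKey.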

RE: disabling QueryElevationComponent

2012-03-05 Thread Welty, Richard
Walter Underwood [mailto:wun...@wunderwood.org] writes: >You may be able to have unique keys. At Netflix, I found that there were >collisions between >the movie IDs and the person IDs. So, I put an 'm' at the >beginning of each movie ID and a >'p' at the beginning of each person ID. Like >ma

wildcard queries with edismax and lucene query parsers

2012-03-05 Thread Robert Stewart
How is scoring affected by wildcard queries? Seems when I use a wildcard query I get all constant scores in response (all scores = 1.0). That occurs with both edismax and the lucene query parser. I am trying to implement an auto-suggest feature so I need to use wildcards to return all results tha

RE: How can Solr do parallel query warming with and ?

2012-03-05 Thread Michael Ryan
https://issues.apache.org/jira/browse/SOLR-2548 may be of interest to you. -Michael

Re: disabling QueryElevationComponent

2012-03-05 Thread Walter Underwood
On Mar 5, 2012, at 1:16 PM, Welty, Richard wrote: > Walter Underwood [mailto:wun...@wunderwood.org] writes: > >> You may be able to have unique keys. At Netflix, I found that there were >> collisions between >the movie IDs and the person IDs. So, I put an 'm' at >> the beginning of each movie I

RE: disabling QueryElevationComponent

2012-03-05 Thread Welty, Richard
Walter Underwood [mailto:wun...@wunderwood.org] writes: >On Mar 5, 2012, at 1:16 PM, Welty, Richard wrote: >> Walter Underwood [mailto:wun...@wunderwood.org] writes: >>> You may be able to have unique keys. At Netflix, I found that there were >>> collisions between the movie IDs and the person

Highlighting Multivalued Field question

2012-03-05 Thread Jamie Johnson
If I have a multivalued field with values as follows: "black pants", "white shirt" and I do a query against that field with highlighting enabled as follows /select?hl.fl=clothing&rows=5&q=clothing:black clothing:shirt&hl=on&indent=true I thought I would see the following in the highlights black pants

Re: Help with Synonyms

2012-03-05 Thread Koji Sekiguchi
(12/03/06 0:11), Donald Organ wrote: Try to remove tokenizerFactory="**KeywordTokenizerFactory" in your synonym filter definition because I think you would want to tokenize the synonym settings in synonyms.txt as "floor" / "locker" => "storage" / "locker". But if you set it to KeywordTokenizer,

Re: Help with Synonyms

2012-03-05 Thread Donald Organ
No I do synonyms at index time. On Monday, March 5, 2012, Koji Sekiguchi wrote: > (12/03/06 0:11), Donald Organ wrote: >>> >>> Try to remove tokenizerFactory="**KeywordTokenizerFactory" in your >>> synonym filter >>> definition because I think you would want to tokenize the synonym settings >>> i

Re: XSLT Response Writer and content transformation

2012-03-05 Thread Matthew Parker
You can embed custom Java functions in XSLT: http://cafeconleche.org/books/xmljava/chapters/ch17s03.html On Mon, Mar 5, 2012 at 4:27 AM, darul wrote: > Hello, > > Using native XSLT Response Writer, we may need to alter content before > processing xml solr output as a RSS Feed. > > Example (tri

Re: Help with Synonyms

2012-03-05 Thread Koji Sekiguchi
(12/03/06 11:07), Donald Organ wrote: No I do synonyms at index time. : I am still getting results for storage locker and no results for floor locker synonyms.txt still looks like this: floor locker=>storage locker So that's the cause of the problem. Due to the definition "floor locker=>s

Re: Help with Synonyms

2012-03-05 Thread Donald Organ
Ok so do I need to use a different format in my synonyms.txt file in order to do this at index time? On Monday, March 5, 2012, Koji Sekiguchi wrote: > (12/03/06 11:07), Donald Organ wrote: >> >> No I do synonyms at index time. >> > : I am still getting results for storage locker and no

Re: Help with Synonyms

2012-03-05 Thread Koji Sekiguchi
(12/03/06 11:23), Donald Organ wrote: Ok so do I need to use a different format in my synonyms.txt file in order to do this at index time? Right, if you want to apply synonym rules to only index time. Use "," like this: floor locker, storage locker And don't forget to set expand="true" in yo
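
The difference between the two synonyms.txt formats discussed in this thread can be modelled with a toy parser. This is a simplified sketch, not Solr's implementation: parse_rule is a hypothetical name, and the behaviour it mimics is the commonly documented one, where "a, b" with expand="true" treats both phrases as equivalent while "a => b" replaces a with b.

```python
def parse_rule(line, expand=True):
    """Return {input phrase: [output phrases]} for one synonyms.txt line.

    Toy model for illustration; real analysis also tokenizes the phrases.
    """
    if "=>" in line:
        # Explicit mapping: everything on the left is replaced by the right.
        left, right = line.split("=>", 1)
        inputs = [p.strip() for p in left.split(",")]
        outputs = [p.strip() for p in right.split(",")]
    else:
        # Equivalence list: expand to all phrases, or reduce to the first.
        inputs = [p.strip() for p in line.split(",")]
        outputs = inputs if expand else inputs[:1]
    return {phrase: list(outputs) for phrase in inputs}


if __name__ == "__main__":
    print(parse_rule("floor locker=>storage locker"))
    print(parse_rule("floor locker, storage locker"))
```

In this model the "=>" rule leaves only "storage locker" in the index, which is consistent with the symptom reported earlier in the thread, while the comma form keeps both phrases.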

Re: Help with Synonyms

2012-03-05 Thread Donald Organ
Excellent thank you, it is now working! On Mon, Mar 5, 2012 at 9:37 PM, Koji Sekiguchi wrote: > (12/03/06 11:23), Donald Organ wrote: > >> Ok so do I need to use a different format in my synonyms.txt file in order >> to do this at index time? >> >> > Right, if you want to apply synonym rules to

Re: Building a resilient cluster

2012-03-05 Thread Ranjan Bagchi
Hi Mark, So I tried this: started up one instance w/ zookeeper, and started a second instance defining a shard name in solr.xml -- it worked, searching would search both indices, and looking at the zookeeper ui, I'd see the second shard. However, when I brought the second server down -- the first

Creating a query-able dictionary using Solr

2012-03-05 Thread Beach, Joel
Hi there, Am looking at using Solr to perform the following tasks: 1. Push a lot of PDF documents into SOLR. 2. Build a database of all the words encountered in those documents. 3. Be able to query for a list of words matching a string like "a*" For example, if the collection contains the words

How to rank an exact match higher?

2012-03-05 Thread Tommy Chheng
I'm using Solr 3.5 for a type-ahead search system. I want to rank exact matches (lowercased) higher than non-exact matches. For example, if I have two docs: Doc One: title="New York" Doc Two: title="New York City" I would expect a query of "new york" to rank "New York" over "New York City". It loo

Re: Couple issues with edismax in 3.5

2012-03-05 Thread William Bell
Actually the results are great with lucene. The issue is with edismax. I did figure out the issue... The scoring was putting different results based on distance, when I really need the scoring to be: score=tf(user_query,"smith") and add geodist() only if tf > 0. This is pretty difficult to do in
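
The scoring rule wished for above (add a distance component only when the term matched) can be written out as plain arithmetic. This is only a sketch of the intended logic, not a Solr query: conditional_score and its default constants are hypothetical, and recip here is assumed to follow the usual a / (m*x + b) shape of Solr's recip function.

```python
def recip(x, m, a, b):
    """Reciprocal function of the form a / (m*x + b)."""
    return a / (m * x + b)


def conditional_score(term_freq, distance, m=1.0, a=1.0, b=1.0):
    """Score by term frequency, adding a closeness bonus only on a match.

    Hypothetical model: a document with no term match scores 0 regardless
    of how close it is.
    """
    if term_freq <= 0:
        return 0.0
    return term_freq + recip(distance, m, a, b)


if __name__ == "__main__":
    # A matching document nearby outranks a matching document far away,
    # and a non-matching document gets nothing from its distance.
    print(conditional_score(2, 0.0), conditional_score(2, 9.0), conditional_score(0, 0.0))
```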