What I would like to do is ONLY boost if there is a match on terms in
SOLR 3.5. For example:
1. q=smith&defType=dismax&qf=user_query&sort=score desc
2. I want to add a boost by distance (closest = highest score), ONLY
if there is a hit on #1.
This one only multiplies by the "smith" * recip(geodis
I also get an issue with "." with edismax.
For example: Dr. Smith gives me different results than "dr Smith"
On Thu, Mar 1, 2012 at 10:18 PM, Way Cool wrote:
> Thanks Ahmet! That's good to know someone else also tried to make phrase
> queries to fix multi-word synonym issue. :-)
>
>
> On Thu, M
Hello,
Using native XSLT Response Writer, we may need to alter content before
processing xml solr output as a RSS Feed.
Example (trivial one...):
bla bla bla
After processing content:
bla bla bla bla bla bla bla bla bla bla bla bla
Have you any ideas on how to implement a custom func
On Sun, 4 Mar 2012 21:09:30 -0500, Mark Miller
wrote:
On Mar 4, 2012, at 5:43 PM, Markus Jelsma wrote:
everything stalls after it lists all segment files and that a ZK
state change has occurred.
Can you get a stack trace here? I'll try to respond to more tomorrow.
What version of trunk are yo
Sometimes the solution is so easy that I can't see it in front of me.
Thanks, Mikhail!
2012/3/3 Mikhail Khludnev
> Hi Luis,
>
> Do you mean
>
> q=id:(A^10+OR+B^9+OR+C^8+OR...)
> I'm not sure whether it works but
>
> q=id:A^10+OR+id:B^9+OR+id:C^8+OR...
>
> definitely does
>
> On Fri, Mar 2, 201
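A minimal sketch of how the per-ID boosted query suggested above could be assembled on the client side. The helper name `boosted_id_query` and the descending-boost rule (start at 10, subtract one per position) are assumptions for illustration only; the thread itself only shows the resulting query shape.

```python
from urllib.parse import urlencode


def boosted_id_query(ids, top_boost=10):
    """Build 'id:A^10 OR id:B^9 ...' with one explicit clause per ID.

    Hypothetical helper: the boost starts at top_boost and decreases by one
    per position, never dropping below 1.
    """
    clauses = []
    for position, doc_id in enumerate(ids):
        boost = max(top_boost - position, 1)
        clauses.append(f"id:{doc_id}^{boost}")
    return " OR ".join(clauses)


query = boosted_id_query(["A", "B", "C"])
# urlencode turns the spaces into '+' and escapes ':' and '^' for the URL.
params = urlencode({"q": query})
print(query)
print(params)
```

Repeating the field name in every clause mirrors the form that the reply says "definitely does" work.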
Hi,
I have question about Polish language in Solr.
There are 2 options: StempelPolishStemFilterFactory or
HunspellStemFilterFactory with polish dictionary. I've made some tests but
the results are not satisfying me. StempelPolishStemFilterFactory is very
fast during indexing but the quality of se
I've seen this question several times on the list.
Perhaps it could be beneficial to create a new Date field that also supports
year-only, year-month, year-month-day etc queries? It could be called
ExtendedDateField or something, and when indexing a date "YYYY-MM-DDTHH:mm:ssZ"
it would individu
Hi,
The documentation for this feature says:
> langid.langsField
>
> Specifies the field to output a list of detected languages into. This must be
> a multiValued String field. If you use langid.map.individual, each detected
> language will be added to this field.
>
Your langid.langsField fie
Hi,
Thanks for reporting. This is fixed now on the staging site, will be set live
soon.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com
On 1. mars 2012, at 16:50, Nicolai Scheer wrote:
> Hi!
>
> Having just worked through the sol
Look at the admin/analysis page and be sure to check the "verbose"
checkboxes. That'll show you what each filter does to the input. My
guess is that WordDelimiterFilterFactory has different parameters
and that's what you're seeing. WDFF can be tricky to understand...
If that's not helpful, you nee
Hi list,
we are using the kinda new JoinQuery feature in Solr 4.x Trunk and are
facing a problem (and also Solr 3.5. with the JoinQuery patch applied) ...
We have documents with a parent - child relationship where a parent can
have any number of children, parents being identified by the field "pa
I should have read more carefully. Why not just use facet.query? They are
treated completely independently, so you can specify something like:
facet.query=field:0*
facet.query=field:1_foovalue*
and you can even specify facet.field as well, they all just come back
as separate sections in the facets
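A rough sketch of sending several independent facet.query parameters in one request, as described above. The `q=*:*`, `facet=true`, and `facet.field=field` values are assumptions added to make the request complete; only the two facet.query values come from the message.

```python
from urllib.parse import urlencode, parse_qs

# A list of pairs (not a dict) so that facet.query can be repeated.
params = [
    ("q", "*:*"),                         # assumed main query
    ("facet", "true"),                    # assumed: faceting switched on
    ("facet.query", "field:0*"),
    ("facet.query", "field:1_foovalue*"),
    ("facet.field", "field"),             # assumed field name
]

query_string = urlencode(params)
print(query_string)

# Round-trip check: both facet.query values survive the encoding.
print(parse_qs(query_string)["facet.query"])
```

Each facet.query is counted on its own, so the two prefixes do not interfere with each other or with facet.field.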
A question relating to this.
If you are running a single ZK node, but say 10 other nodes and then
parallel index on each of those nodes, will the ZK be hit by all 10
indexing nodes constantly? i.e. very chatty?
If one of those 10 indexing nodes goes down or falls out of sync and comes
back, does
>
>
>>
> Hi Donald,
>
> Try to remove tokenizerFactory="KeywordTokenizerFactory" in your
> synonym filter
> definition because I think you would want to tokenize the synonym settings
> in
> synonyms.txt as "floor" / "locker" => "storage" / "locker". But if you set
> it
> to KeywordTokenizer, it w
On Mar 5, 2012, at 10:01 AM, dar...@ontrenet.com wrote:
> If one of those 10 indexing nodes goes down or falls out of sync and comes
> back, does ZK block the state of indexing until that single node catches
> back up?
No - if a node falls out of sync or comes back, the rest of the cluster
cont
> I also get an issue with "." with
> edismax.
>
> For example: Dr. Smith gives me different results than "dr
> Smith"
I believe this is related to analysis ( rather than query parser). You can
inspect output admin/analysis.jsp.
What happens when you switch to &defType=lucene ? Dr. Smith yield
I have ngram-indexed 2 fields (columns in the database) and the third one is my
full text field. Now my default text field is the full text field and while
querying I use dismax handler and specify in it both the ngrammed field with
certain boost values and also full text field with a certain
hello all,
we are approaching the time when we will move our first solr core in to a
more "production like" environment. as a precursor to this, i am attempting
to write some documents on impact assessment and batch load / data import
strategies.
does anyone have processes or lessons learned - t
On Mon, 5 Mar 2012 11:26:20 -0500, Mark Miller
wrote:
On Mar 5, 2012, at 10:01 AM, dar...@ontrenet.com wrote:
If one of those 10 indexing nodes goes down or falls out of sync and
comes
back, does ZK block the state of indexing until that single node
catches
back up?
No - if a node falls ou
Is there a way to limit the number of searchers that can be open at a given
time? I know there is a maxWarmingSearchers configuration that limits the
number of warming searchers, but that's not quite what I'm looking for...
Ideally, when I commit, I want there to only be one searcher open befor
The Lucene geo searching code is very fast. Geosearch queries
calculate the distance from the city to all 20k stores and sort on
this.
If this is not fast enough, you can pre-calculate the city/store lists
by doing all of this searching in advance. You can store these in a DB
and do incremental up
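A small sketch of the pre-calculation idea above: compute the distance from a city to every store once, and keep the ranked list for later lookups. This uses a plain haversine formula rather than the Lucene/Solr geo code, and the function names and the store record layout (`id`, `lat`, `lon`) are assumptions for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_stores(city, stores, limit=3):
    """Return the IDs of the closest stores to city, nearest first.

    city is a (lat, lon) pair; stores is a list of dicts with the
    hypothetical keys 'id', 'lat' and 'lon'.
    """
    ranked = sorted(
        stores,
        key=lambda s: haversine_km(city[0], city[1], s["lat"], s["lon"]),
    )
    return [s["id"] for s in ranked[:limit]]


stores = [
    {"id": "far", "lat": 10.0, "lon": 10.0},
    {"id": "near", "lat": 0.1, "lon": 0.1},
]
print(nearest_stores((0.0, 0.0), stores))
```

The ranked lists could then be stored per city and refreshed only when a store is added or moved.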
On 2/28/2012 8:16 AM, Shawn Heisey wrote:
Due to the End of Life announcement for Java6, I am going to need to
upgrade to Java 7 in the very near future. I'm running Solr 3.5.0
modified with a couple of JIRA patches.
https://blogs.oracle.com/henrik/entry/updated_java_6_eol_date
I saw the ann
Neil,
It is still not clear whether it is multi- or single-valued fields that
determine the use of FieldCache or UnInvertedField, and per-segment reader vs
top-level reader.
The only concern I have about your approach is the CPU wasted
calculating facets for huge *:* docsets. I guess you can try to find
i googled and found numerous references to this, but no answers that went to my
specific issues.
i have a solr 3.5.0 server set up that needs to index several different
document types, there is no common unique key field. so i can't use the
uniqueKey declaration and need to disable the QueryEle
You may be able to have unique keys. At Netflix, I found that there were
collisions between the movie IDs and the person IDs. So, I put an 'm' at the
beginning of each movie ID and a 'p' at the beginning of each person ID. Like
magic, I had unique IDs.
You should be able to disable the query el
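A minimal sketch of the prefixing trick described above, assuming each document carries a type and a raw ID. The function name and the `doc_type` labels are hypothetical; only the 'm' and 'p' prefixes come from the message.

```python
# Prefix per document type, as in the message: 'm' for movies, 'p' for people.
PREFIXES = {"movie": "m", "person": "p"}


def make_unique_key(doc_type, raw_id):
    """Combine a type prefix with the raw ID so IDs cannot collide across types."""
    return f"{PREFIXES[doc_type]}{raw_id}"


# The same raw ID no longer collides once the type prefix is added.
print(make_unique_key("movie", 42))
print(make_unique_key("person", 42))
```

The combined value can then be indexed into a single field declared as the uniqueKey.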
Walter Underwood [mailto:wun...@wunderwood.org] writes:
>You may be able to have unique keys. At Netflix, I found that there were
>collisions between the movie IDs and the person IDs. So, I put an 'm' at the
>beginning of each movie ID and a 'p' at the beginning of each person ID. Like
>ma
How is scoring affected by wildcard queries? Seems when I use a
wildcard query I get all constant scores in response (all scores =
1.0). That occurs with both edismax as well as lucene query parser.
I am trying to implement auto-suggest feature so I need to use wild
card to return all results tha
https://issues.apache.org/jira/browse/SOLR-2548 may be of interest to you.
-Michael
On Mar 5, 2012, at 1:16 PM, Welty, Richard wrote:
> Walter Underwood [mailto:wun...@wunderwood.org] writes:
>
>> You may be able to have unique keys. At Netflix, I found that there were
>> collisions between the movie IDs and the person IDs. So, I put an 'm' at
>> the beginning of each movie I
Walter Underwood [mailto:wun...@wunderwood.org] writes:
>On Mar 5, 2012, at 1:16 PM, Welty, Richard wrote:
>> Walter Underwood [mailto:wun...@wunderwood.org] writes:
>>> You may be able to have unique keys. At Netflix, I found that there were
>>> collisions between the movie IDs and the person
If I have a multivalued field with values as follows
black pants
white shirt
and I do a query against that field with highlighting enabled as follows
/select?hl.fl=clothing&rows=5&q=clothing:black clothing:shirt&hl=on&indent=true
I thought I would see the following in the highlights
black pants
(12/03/06 0:11), Donald Organ wrote:
Try to remove tokenizerFactory="KeywordTokenizerFactory" in your
synonym filter
definition because I think you would want to tokenize the synonym settings
in
synonyms.txt as "floor" / "locker" => "storage" / "locker". But if you set
it
to KeywordTokenizer,
No I do synonyms at index time.
On Monday, March 5, 2012, Koji Sekiguchi wrote:
> (12/03/06 0:11), Donald Organ wrote:
>>>
>>> Try to remove tokenizerFactory="KeywordTokenizerFactory" in your
>>> synonym filter
>>> definition because I think you would want to tokenize the synonym
settings
>>> i
You can embed custom Java functions in XSLT:
http://cafeconleche.org/books/xmljava/chapters/ch17s03.html
On Mon, Mar 5, 2012 at 4:27 AM, darul wrote:
> Hello,
>
> Using native XSLT Response Writer, we may need to alter content before
> processing xml solr output as a RSS Feed.
>
> Example (tri
(12/03/06 11:07), Donald Organ wrote:
No I do synonyms at index time.
:
I am still getting results for storage locker and no results for floor
locker
synonyms.txt still looks like this:
floor locker=>storage locker
So that's the cause of the problem. Due to the definition "floor locker=>s
Ok so do I need to use a different format in my synonyms.txt file in order
to do this at index time?
On Monday, March 5, 2012, Koji Sekiguchi wrote:
> (12/03/06 11:07), Donald Organ wrote:
>>
>> No I do synonyms at index time.
>>
> :
I am still getting results for storage locker and no
(12/03/06 11:23), Donald Organ wrote:
Ok so do I need to use a different format in my synonyms.txt file in order
to do this at index time?
Right, if you want to apply synonym rules to only index time.
Use "," like this:
floor locker, storage locker
And don't forget to set expand="true" in yo
Excellent thank you, it is now working!
On Mon, Mar 5, 2012 at 9:37 PM, Koji Sekiguchi wrote:
> (12/03/06 11:23), Donald Organ wrote:
>
>> Ok so do I need to use a different format in my synonyms.txt file in order
>> to do this at index time?
>>
>>
> Right, if you want to apply synonym rules to
Hi Mark,
So I tried this: started up one instance w/ zookeeper, and started a second
instance defining a shard name in solr.xml -- it worked, searching would
search both indices, and looking at the zookeeper ui, I'd see the second
shard. However, when I brought the second server down -- the first
Hi there,
Am looking at using Solr to perform the following tasks:
1. Push a lot of PDF documents into SOLR.
2. Build a database of all the words encountered in those documents.
3. Be able to query for a list of words matching a string like "a*"
For example, if the collection contains the words
I'm using solr 3.5 for a type ahead search system. I want to rank
exact matches (lowercased) higher than non-exact matches.
For example, if i have two docs:
Doc One: title="New York"
Doc Two: title="New York City"
I would expect a query of "new york" to rank "New York" over "New York City"
It loo
Actually the results are great with lucene. The issue is with edismax.
I did figure out the issue...
The scoring was putting different results based on distance, when I
really need the scoring to be:
score=tf(user_query,"smith") and add geodist() only if tf > 0. This is
pretty difficult to do in
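The intended rule can be written out in plain Python to make it concrete. This is only a sketch of the scoring logic described above, not a Solr function query: the function name is hypothetical, and the distance term uses the shape of Solr's recip(x,m,a,b) = a / (m*x + b) with m = a = b = 1, which is an assumed choice of constants.

```python
def combined_score(term_frequency, distance_km):
    """Add a distance boost only when the term actually matched.

    term_frequency plays the role of tf(user_query, "smith");
    distance_km plays the role of geodist().
    """
    if term_frequency <= 0:
        # No hit on the term: distance must not contribute anything.
        return 0.0
    # recip-style boost: 1.0 at distance 0, shrinking as distance grows.
    proximity = 1.0 / (distance_km + 1.0)
    return term_frequency + proximity


print(combined_score(0, 0.5))   # no match, so no distance boost
print(combined_score(2, 0.0))   # match at distance 0
print(combined_score(2, 9.0))   # same match, farther away
```

With this shape a closer document outranks a farther one only among documents that matched the term in the first place.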