I'm also unable to configure that type of search through schema.xml. As I use
Solr in Drupal, I've implemented it in hook_search_api_solr_query_alter by
exploding my search string into two (or more) chunks, and now search works
well.
Strangely, I couldn't do it through Solr configuration.
Hi,
Recently we upgraded from Solr 1.4 to Solr 4.0. After upgrading
we are experiencing unusual behavior in Solr 4.0.
The same query works properly in Solr 1.4, but in Solr 4.0 it throws a SEVERE:
null:java.lang.IllegalStateException: Form too large 1611387>20 error.
I have in
Is there some way to supplement the DirectSolrSpellChecker with a dictionary?
(In some cases terms are not used because of the threshold, but should be
offered as spellcheck suggestions.)
We are using Solr 3.6.
We have a field named Description. We want search with stemming and also
without stemming (exact word/phrase search), with highlighting in both.
For that, we did a lot of research and came to the conclusion to use a
copy field wi
Some time back I used DreamHost for a Solr-based project. It looks as though
all their offerings, including shared hosting, have Java support; see
http://wiki.dreamhost.com/What_We_Support. I was very happy with their
service and support.
-Simon
On Tue, Oct 9, 2012 at 10:44 AM, Michael Della Bitta
Hi - The WordDelimiterFilter can help you get *-BAAN-* for A100-BAAN-C20, but
only because BAAN is surrounded by characters the filter splits and combines
on.
-Original message-
> From:Kissue Kissue
> Sent: Wed 10-Oct-2012 14:20
> To: solr-user@lucene.apache.org
> Subject: Wild card
Hi,
Check the Jetty configs; this looks like an error from the container.
Otis
--
Performance Monitoring - http://sematext.com/spm
On Oct 10, 2012 4:50 AM, "ravicv" wrote:
> Hi,
>
> Recently we have upgraded solr 1.4 version to 4.0 version. After upgrading
> we are experiencing unusual behavior in
1611387 is 1,611,387, which is clearly greater than your revised limit of
500,000.
Try setting the limit to 2,000,000, or maybe even 5,000,000.
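If you're on the Jetty that ships with Solr, one way to raise the limit is a
system property set before the server starts. A rough, untested sketch (verify
the property name against your Jetty version; Tomcat uses maxPostSize on the
Connector instead):

// Sketch: raise Jetty's form size limit before starting the server.
public class StartWithLargerFormLimit {
    public static void main(String[] args) throws Exception {
        System.setProperty(
            "org.eclipse.jetty.server.Request.maxFormContentSize", "2000000");
        // ... then start Jetty/Solr as usual ...
    }
}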
-- Jack Krupansky
-Original Message-
From: ravicv
Sent: Wednesday, October 10, 2012 4:49 AM
To: solr-user@luce
OK, so I solved the question about the query that returns no results and
still takes time: I needed to add the facet.mincount=1 parameter, and this
reduced the time to 200-300 ms instead of seconds.
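For reference, the equivalent in SolrJ looks something like this (a minimal
sketch; the field name is made up):

import org.apache.solr.client.solrj.SolrQuery;

public class FacetMinCountExample {
    public static SolrQuery build() {
        SolrQuery q = new SolrQuery("*:*");
        q.setFacet(true);
        q.addFacetField("category"); // field name assumed
        q.setFacetMinCount(1);       // skip zero-count facet values
        return q;
    }
}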
I still couldn't figure out why a query that returns very few results (like
query number 2) still tak
1. What is your specific motivation for wanting to do this? (Sounds like yet
another "XY problem"!)
2. What specific rules are you expecting to use for synthesis of patterns
from the raw data?
For the latter, do you expect to index hand-coded specific patterns to be
returned or do you have som
There's nothing really built in to Solr to allow this. Are you
absolutely sure you can't just use the copyfield? Have you
actually tried it?
But I don't think you need to store the contents twice. Just
store it once and always highlight on that field whether you
search it or not. Since it's the ra
Guys,
thanks for all the inputs. I was continuing my research to learn more about
segments in Lucene. Below are my conclusions; please correct me if I am wrong.
1. Segments are independent sub-indexes in separate files; while indexing,
it's better to create a new segment as it doesn't have to modify a
I don't want to tweak the threshold. For the majority of cases it works fine.
It's for cases where a term has low frequency but is spelled correctly.
If you lowered the threshold you would also get incorrectly spelled terms as
suggestions.
Robert Muir wrote
> These thresholds are adjustable: read the jav
On Wed, 2012-10-10 at 14:15 +0200, Kissue Kissue wrote:
> I have added the string: *-BAAN-* to the index to a field called pattern
> which is a string type. Now i want to be able to search for A100-BAAN-C20
> or ZA20-BAAN-300 and have Solr return *-BAAN-*.
That sounds a lot like the problem presen
The synonym filter does set the "type" attribute to TYPE_SYNONYM for
synonyms, so you could write your own token filter that "keeps" only tokens
with that type.
Try the Solr Admin "analysis" page to see how various terms are analyzed by
the synonym filter. It will show TYPE_SYNONYM.
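A rough sketch of such a filter, assuming the Lucene 4.x analysis APIs
(untested):

import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.synonym.SynonymFilter;
import org.apache.lucene.analysis.tokenattributes.TypeAttribute;

// Keeps only tokens whose type the SynonymFilter set to SYNONYM.
public final class KeepSynonymsOnlyFilter extends TokenFilter {
    private final TypeAttribute typeAtt = addAttribute(TypeAttribute.class);

    public KeepSynonymsOnlyFilter(TokenStream input) {
        super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
        while (input.incrementToken()) {
            if (SynonymFilter.TYPE_SYNONYM.equals(typeAtt.type())) {
                return true;
            }
        }
        return false;
    }
}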
-- Jack
On Wed, Oct 10, 2012 at 12:02 AM, Briggs Thompson
wrote:
> *Sami*
> The client IS
> instantiated only once and not for every request. I was curious if this was
> part of the problem. Do I need to re-instantiate the object for each
> request made?
No, it is expensive if you instantiate the client
> Token_Input:
> the fox jumped over the lazy dog
>
> Synonym_Map:
> fox => vulpes
> dog => canine
>
> Token_Output:
> vulpes canine
>
> So remove all tokens, but retain those matched against the
> synonym map
Maybe you can make use of
http://lucene.apache.org/solr/api-4_0_0-ALPHA/org/apach
It is really not fixed. It could also be *-*-BAAN or BAAN-CAN20-*. In each
case I just want only the fixed character(s) to match; the * can match any
characters.
On Wed, Oct 10, 2012 at 2:05 PM, Toke Eskildsen wrote:
> On Wed, 2012-10-10 at 14:15 +0200, Kissue Kissue wrote:
> > I have added the st
Ah ha .. good thinking ... thanks!
Dan
On Wed, Oct 10, 2012 at 2:39 PM, Ahmet Arslan wrote:
>
> > Token_Input:
> > the fox jumped over the lazy dog
> >
> > Synonym_Map:
> > fox => vulpes
> > dog => canine
> >
> > Token_Output:
> > vulpes canine
> >
> > So remove all tokens, but retain those mat
There are other updates that happen on the server that do not fail, so the
answer to your question is yes.
On Wed, Oct 10, 2012 at 8:12 AM, Sami Siren wrote:
> On Wed, Oct 10, 2012 at 12:02 AM, Briggs Thompson
> wrote:
> > *Sami*
> > The client IS
> > instantiated only once and not for every re
On Wed, Oct 10, 2012 at 5:36 PM, Briggs Thompson
wrote:
> There are other updates that happen on the server that do not fail, so the
> answer to your question is yes.
The other updates are using solrj or something else?
It would be helpful if you could prepare a simple java program that
uses sol
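Something along these lines would do (a bare-bones sketch; the URL and field
values are placeholders):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class UpdateRepro {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "test-1");
        server.add(doc);    // index one document
        server.commit();    // make it visible
        server.shutdown();  // release the client's resources
    }
}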
They are both SolrJ.
What is happening is I have a "batch" indexer application that does a full
re-index once per day. I also have an "incremental" indexer that takes
items off a queue when they are updated.
The problem only happens when both are running at the same time - they also
run from the
Hi,
I know that you can use a facet query to get the unique terms for a field,
taking into account any q or fq parameters, but for our use case the counts
are not needed. So is there a more efficient way of finding just the unique
terms for a field?
Phil
The Solr TermsComponent:
http://wiki.apache.org/solr/TermsComponent
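From SolrJ it can be queried along these lines (a sketch; the handler path and
field name are assumptions, and parameter names should be checked against the
wiki page above):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.common.params.TermsParams;

public class TermsExample {
    public static SolrQuery build() {
        SolrQuery q = new SolrQuery();
        q.setRequestHandler("/terms");          // handler path assumed
        q.set(TermsParams.TERMS, true);
        q.set(TermsParams.TERMS_FIELD, "name"); // field name assumed
        q.set(TermsParams.TERMS_LIMIT, -1);     // return all terms
        return q;
    }
}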
-- Jack Krupansky
-Original Message-
From: Phil Hoy
Sent: Wednesday, October 10, 2012 11:45 AM
To: solr-user@lucene.apache.org
Subject: Unique terms without faceting
Hi,
I know that you can use a facet query to get
Here is another simpler example of what I am trying to achieve:
Multi-Valued Field 1:
Data 1
Data 2
Data 3
Data 4
Multi-Valued Field 2:
Data 11
Data 12
Data 13
Data 14
Multi-Valued Field 3:
Data 21
Data 22
Data 23
Data 24
How can I specify that Data 1, Data 11 and Data 21 are all related? And i
I cannot seem to get delete by query working in my simple setup in Solr 4.0
beta.
I have a single collection and I want to delete old documents from it. There
is a single solr node in the config (no replication, not distributed). This is
something that I previously did in Solr 3.x
My collecti
Hey There,
We have the following data structure:
- Person
-- Interest 1
--- Subinterest 1
--- Subinterest 1 Description
--- Subinterest 1 ID
-- Interest 2
--- Subinterest 2
--- Subinterest 2 Description
--- Subinterest 2 ID
.
-- Interest 99
--- Subinterest 99
--- Subinterest 99
Hi,
I don't think you can use that component whilst taking into account any fq or q
parameters.
Phil
-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: 10 October 2012 16:51
To: solr-user@lucene.apache.org
Subject: Re: Unique terms without faceting
The Solr
Hello,
I have a weird problem: whenever I read a doc from Solr and
then index the same doc that already exists in the index (aka
reindexing), I get the following error. Can somebody tell me what I am
doing wrong? I use Solr 3.6, and the definition of the field is given
below.
Exception i
Do you have a "_version_" field in your schema? I believe Solr 4.0
Beta requires that field.
Ravi Kiran Bhaskar
On Wed, Oct 10, 2012 at 11:45 AM, Andrew Groh wrote:
> I cannot seem to get delete by query working in my simple setup in Solr 4.0
> beta.
>
> I have a single collection and I want to
Hi Mikhail, Thank you for the reply. I will try it.
Thanks
Rajani
On Sun, Oct 7, 2012 at 5:10 PM, Mikhail Khludnev wrote:
> Rajani,
>
> IIRC solrmeter can grab search phrases from log. There is a special command
> for doing it there. Right - Tool/Extract Queries.
>
> Regards
>
> On Sun,
You need to remove the field after reading the Solr doc. When you add a new
field it will be appended to the list, so when you try to commit the updated
field it will be multi-valued, while in your schema it is single-valued.
On Oct 10, 2012 9:26 AM, "Ravi Solr" wrote:
> Hello,
> I have a weird problem, Whenever I read the
Does anyone have a clear understanding of how group.caching achieves its
performance improvements, memory-wise? Percent means percent of maxDoc, so
it's a function of that, but is it a function of that *per* item in the
cache (like filterCache) or altogether? The speed improvement looks pretty
dra
I'm using Solr function queries to generate my own custom score. I achieve
this using something along these lines:
q=_val_:"my_custom_function()"
This populates the score field as expected, but it also includes documents
that score 0. I need a way to filter the results so that scores below zero
ar
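For what it's worth, one possible approach is a {!frange} filter over the same
function; an untested sketch (my_custom_function as above):

import org.apache.solr.client.solrj.SolrQuery;

public class PositiveScoresOnly {
    public static SolrQuery build() {
        SolrQuery q = new SolrQuery("_val_:\"my_custom_function()\"");
        // l= is the lower bound; incl=false excludes the bound itself,
        // so zero-valued documents are dropped along with negative ones.
        q.addFilterQuery("{!frange l=0 incl=false}my_custom_function()");
        return q;
    }
}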
Gopal, I did in fact test the same, and it worked when I deleted the
geolocation_0_coordinate and geolocation_1_coordinate. But that seems
weird, so I was wondering if there is something else I need to do to
avoid this awkward workaround.
Ravi Kiran Bhaskar
On Wed, Oct 10, 2012 at 12:36 PM,
Instead of the addField method, use setField.
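For example (a tiny sketch):

import org.apache.solr.common.SolrInputDocument;

public class SetVsAdd {
    public static void main(String[] args) {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("category", "books");
        doc.addField("category", "fiction"); // addField appends a second value
        doc.setField("category", "books");   // setField replaces all values
        System.out.println(doc.getFieldValues("category")); // prints [books]
    }
}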
On Oct 10, 2012 9:54 AM, "Ravi Solr" wrote:
> Gopal I did in fact test the same and it worked when I delete ted the
> geolocation_0_coordinate and geolocation_1_coordinate. But that seems
> weird, so I was thinking if there is something else I need to do to
>
I am using DirectXmlRequest to index XML. This is just a test case, as
my client would be sending me Solr-compliant XML, so I was trying to
simulate it by reading a doc from an existing core and reindexing it.
HttpSolrServer server = new
HttpSolrServer("http://testsolr:8080/solr/my
> Do you have a "_version_" field in
> your schema. I believe SOLR 4.0
> Beta requires that field.
Probably he is hitting this https://issues.apache.org/jira/browse/SOLR-3432
Hi,
what is the best way to create a new Collection through the API so that I
get my own config folder with schema.xml and solrconfig.xml inside the
created Core?
When I just create a Collection, only the data folder is created,
but the config folder with schema.xml and solrconfig.xml will be
Tomcat localhost log (not the catalina log) for my solr 3.6.1 (master)
instance contains lots of these exceptions but solr itself seems to be doing
fine... any ideas? I'm not seeing these exceptions being logged on my slave
servers btw, just the master where we do our indexing only.
Oct 9,
Something timed out and the other end closed the connection. This end
tried to write to a closed pipe and died; something tried to catch that
exception and write its own and died even worse? Just making it up
really, but it sounds good (plus a 3-year Java tech-support hunch).
If it happens often enough, s
Hi Stijn,
I have occasionally been seeing similar behavior when profiling one of
our Solr 3.6.1 servers using the similar AppDynamics product. Did you
ever hunt down what was causing this for you or get more info? (I
haven't been able to rule out truncated or filtered call-graphs that
don't show
On 10/9/2012 3:02 PM, Briggs Thompson wrote:
*Otis* - jstack is a great suggestion, thanks! The problem didn't happen
this morning but next time it does I will certainly get the dump to see
exactly where the app is swimming around. I haven't used
StreamingUpdateSolrServer
but I will see if that m
Hi Mikhail,
On Fri, Oct 5, 2012 at 7:15 AM, Mikhail Khludnev
wrote:
> okay. huge rows value is no.1 way to kill Lucene. It's not possible,
> absolutely. You need to rethink logic of your component. Check Solr's
> FieldCollapsing code, IIRC it makes second search to achieve similar goal.
> Also ch
: I have a weird problem, Whenever I read the doc from solr and
: then index the same doc that already exists in the index (aka
: reindexing) I get the following error. Can somebody tell me what I am
: doing wrong. I use solr 3.6 and the definition of the field is given
: below
When you us
You could be right. Going back in the logs, I noticed it used to happen less
frequently and always towards the end of an optimize operation. It is probably
my indexer timing out waiting for updates to occur during optimizes. The
errors grew recently due to my upping the indexer threadcount to
Thanks for the heads up. I just tested this and you are right. I am making
a call to "addBeans" and it succeeds without any issue even when the server
is down. That sucks.
A big part of this process is reliant on knowing exactly what has made it
into the index and what has not, so this a difficult
> When I look at the distribution of the Response-time I notice
> 'SolrDispatchFilter.doFilter()' is taking up 90% of the time.
That's pretty much the top-level entry point to Solr (from the servlet
container), so it's normal.
-Yonik
http://lucidworks.com
What do you want the results to be, persons? And the facets should be
interests or subinterests? Why are there two layers of interests anyway? Can
there my many subinterests under one interest? Is one of those two a name of
the interest which would look nice as a facet?
Anyway, have you rea
Hi,
We are using google translate to do something like what you (onlinespending)
want to do, so maybe it will help.
During indexing, we store the searchable fields from documents into fields
named _en, _fr, _es, etc. So assuming we capture title and body from each
document, the fields are (t
Hi everyone, I'm pleased to announce that SolrMeter 0.3.0 was released
today.
To see the issues resolved for this version go to:
http://code.google.com/p/solrmeter/issues/list?can=1&q=Milestone%3DRelease-0.3.0+status%3AResolved
To download the last version:
http://code.google.com/p/solrmeter/down
That is what is being discussed already. The thing is, at present, Solr
requires an even distribution of documents across shards, so you can't
just add another shard, assign it to a hash range, and be done with it.
The reason is down to the scoring mechanism used - TF/IDF (term
frequency/inverse d
Have you looked at WordDelimiterFilterFactory that was mentioned
earlier? Try a fieldType in the admin/analysis page that has
WDFF as part of the analysis chain. It would do exactly what you've
described so far.
WDFF splits the input up as tokens on non-alphanum characters,
alpha/num transitions a
On Wed, Oct 10, 2012 at 9:02 AM, O. Klein wrote:
> I don't want to tweak the threshold. For majority of cases it works fine.
>
> It's for cases where term has low frequency but is spelled correctly.
>
> If you lower the threshold you would also get incorrect spelled terms as
> suggestions.
>
Yeah
Cool!
Who made the logo? It's nice.
- Original Message -
| From: "Tomás Fernández Löbbe"
| To: solr-user@lucene.apache.org
| Sent: Wednesday, October 10, 2012 3:57:32 PM
| Subject: [ANN] new SolrMeter release
|
| Hi everyone, I'm pleased to announce that SolrMeter 0.3.0 was
| released
I have another question: does the number of segments affect the speed of
index updates?
2012/10/10 jame vaalet
> Guys,
> thanks for all the inputs, I was continuing my research to know more about
> segments in Lucene. Below are my conclusion, please correct me if am wrong.
>
>1. Segments are ind
I want an update processor that runs Translation Party.
http://translationparty.com/
http://downloadsquad.switched.com/2009/08/14/translation-party-achieves-hilarious-results-using-google-transl/
- Original Message -
| From: "SUJIT PAL"
| To: solr-user@lucene.apache.org
| Sent: Wednesda
So other than commercial solutions, it seems like I need to have a plugin,
right? I couldn't find any open source solutions yet...
-
Smart but doesn't work... If he worked, he would do it...
Hapax legomena (terms with DF of 1) are very often typos. You can automatically
build a stopword file from these. If you want to be picky, you can use only
words with a very small distance from words with much larger DF.
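A rough sketch of collecting DF-1 terms with the Lucene 4.x API (index path
and field name are placeholders):

import java.io.File;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.BytesRef;

public class HapaxDumper {
    public static void main(String[] args) throws Exception {
        DirectoryReader reader = DirectoryReader.open(
            FSDirectory.open(new File("/path/to/index")));
        Terms terms = MultiFields.getTerms(reader, "body"); // field name assumed
        if (terms != null) {
            TermsEnum te = terms.iterator(null);
            BytesRef term;
            while ((term = te.next()) != null) {
                if (te.docFreq() == 1) {
                    System.out.println(term.utf8ToString()); // candidate stopword
                }
            }
        }
        reader.close();
    }
}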
- Original Message -
| From: "Robert Muir"
| To: solr-user@lucene.
Study index merging. This is awesome.
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
Jame- opening lots of segments is not a problem. A major performance problem
you will find is 'Large Pages'. This is an operating-system strategy for
managing servers with 10s of
Hi Erickson,
Thanks for your valuable reply.
We actually tried just storing one field and highlighting on that field
all the time, whether we search on it or not.
It sometimes causes an issue; for example, if I search with the term
'hospitality' and use the field for highlighting, which havi
Hi,
Are you using a correct stopword file for the French language? It is
very important in order for the MLT component to work well.
You should also take a look at this document:
http://cephas.net/blog/2008/03/30/how-morelikethis-works-in-lucene/
MLT support in SolrJ is an old story. Maybe