Hi all,
I wanted to share the issues we're having with Solr 1.4 and get some ideas
for short-term measures that will buy us enough time to validate Solr 4
before upgrading, without having 1.4 burn to the ground before we get there.
We've been running Solr 1.4 in production for over 3 ye
Sorry, was away a bit & hence the delay.
I am inserting Java strings into a Java bean class, and then calling the
addBean() method to insert the POJO into Solr.
When I query using either Tomcat or Jetty, I get these special characters. But
I have noted that if I change the output encoding to "Shift-JIS" then
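For comparison, here is a minimal SolrJ sketch of the addBean path described above, assuming SolrJ 4.x's HttpSolrServer and hypothetical field names; SolrJ itself sends UTF-8, so forcing Shift-JIS on the output side usually just masks where the bytes were first mangled:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.beans.Field;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class BeanIndexer {
    // Hypothetical bean; field names must match the schema.
    public static class Doc {
        @Field public String id;
        @Field public String title;
    }

    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        Doc d = new Doc();
        d.id = "1";
        d.title = "日本語のタイトル"; // non-ASCII sample text
        server.addBean(d);            // SolrJ sends the request as UTF-8
        server.commit();
    }
}

If this round-trips cleanly, the garbling is being introduced before the bean is built (for example, reading the source data with the platform default charset) or by whatever client renders the response.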
Hello,
I know this has been discussed extensively in past posts. I have tried a
bunch of suggestions and I still have a few questions.
I am using Solr 4.4 on Tomcat 7, with OpenJDK 1.7 and a single Solr core.
I am trying to index a bunch of CSV files (13 GB in total). Each CSV file
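For indexing CSV from SolrJ, one common approach is to stream each file through a ContentStreamUpdateRequest. This is only a sketch, assuming the stock /update/csv handler from the example solrconfig and a hypothetical file path:

import java.io.File;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

public class CsvIndexer {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8080/solr");
        // Stream one CSV file through the CSV update handler.
        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/csv");
        req.addFile(new File("data/part-000.csv"), "text/csv");
        req.setParam("commit", "true"); // with many large files, commit only after the last one
        server.request(req);
    }
}

Streaming keeps the client's memory flat, and committing once at the end rather than per file usually keeps a bulk load of this size reasonably fast.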
Hi,
I am using Solr 4.3.1.
When I search for something, the first search is too slow. How can I improve this?
[screenshot: the first-time search response]
[screenshot: the second-time search response]
Best Regards,
Boole Guo
Software Engineer, NESC-SH.MIS
+86-021-5153
On Tue, Nov 5, 2013 at 6:09 AM, Susheel Kumar <
susheel.ku...@thedigitalgroup.net> wrote:
> Hello,
>
> We have a scenario where we present results to users one from solr and
> other from real time web site search. The solr data we have locally
> available that we are able to index but other websit
Hello,
We have a scenario where we present results to users from two sources: one from Solr
and the other from a real-time web site search. The Solr data is available locally and
we are able to index it, but for the other website search we don't host the data and it
is real time.
We are wondering if we can use some federated
Hello,
We are running our search system on Apache Solr 4.2.1 using the
master/slave model.
Our index has ~100M documents. The index size is ~20 GB.
The machine has 24 CPUs and 48 GB of RAM.
Our response time is pretty bad: the median is ~4 seconds at 25
queries/second.
We noticed a couple of things
Erick,
It could have more than 4M distinct values. The purpose of this facet is
to display the most frequent, say the top 500, URLs to users.
Sascha,
Thanks for the info. I will look into the threaded faceting option.
Mingfeng
On Mon, Nov 4, 2013 at 4:47 AM, Erick Erickson wrote:
> How many unique URLs do
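For the "top 500 URLs" use case, a hedged SolrJ sketch (the field name "url" is hypothetical; the facet parameters shown are standard) would be:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TopUrlFacets {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        SolrQuery q = new SolrQuery("some query");
        q.setRows(0);              // only the facet counts are needed, not the hits
        q.setFacet(true);
        q.addFacetField("url");    // hypothetical field holding the URL
        q.setFacetLimit(500);      // return only the 500 most frequent values
        q.setFacetMinCount(1);
        QueryResponse rsp = server.query(q);
        System.out.println(rsp.getFacetField("url").getValues());
    }
}

Note that with millions of distinct values the counting itself is still expensive; facet.limit only trims what is returned, not what has to be counted.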
Hi - we've seen that issue as well (SOLR-4260) and it happened many times with
older versions. The good thing is that we haven't seen it for a very long time
now, so I silently assumed other fixes had already solved the problem.
We don't know how to reproduce the problem, but in older versions it seeme
Hi,
I have 2 replicas with different numbers of documents. Is that possible?
I'm using Solr 4.5.1
Replica 1:
version:77847
numDocs:5951879
maxDoc:5951978
deletedDocs:99
Replica 2:
version:76011
numDocs:5951793
maxDoc:5951965
deletedDocs:172
Isn't the tlog supposed to ensure data consistency?
Thanks, Aloke.
Prefix faceting solves this problem partially, but I wanted to see if we have a solution
which works all the time. For example, if we search for "Ronald Wagner" in
multivalued fields, we will get results like the ones below, and I really only want the
facet values "Wagner, Ronald S MD ", "Wag
Is it possible that you added stored="true" later, after some of the
documents were already indexed? Then the older documents would not have the
stored values. If so, you need to reindex the older documents.
-- Jack Krupansky
All fields are set to stored="true" in my schema.xml, and fl=* doesn't change
the output of the response. I even checked the logs, no errors on any
fields.
Also, adding fl=* still doesn't solve the problem; still only 19 fields are
returned. And the missing fields definitely have values, because I can do a
specific Solr query on a missing field and its value, and the entry shows up
(with only 19 fields again, though).
On Mon, Nov 4, 2013 at 2:19 PM, gohome190 wrote:
> I have a database that has about 25 fields for each entry. However, when I
> do a solr *:* query, I can only see the first 19 fields for each entry.
> However, I can successfully use the fields that don't show up as queries.
> So weird! Because t
Also, there are no errors in the logging, and all fields are in the schema.xml.
Hi,
I have a database that has about 25 fields for each entry. However, when I
do a Solr *:* query, I can only see the first 19 fields for each entry,
even though I can successfully use the fields that don't show up in queries.
So weird! Because that means that Solr has them, but isn't sending the
I've got a 4.4 SolrCloud cluster running, and have an external process that
rebuilds the currency.xml file and uploads the latest version to ZooKeeper
every X minutes.
It looks like the OpenExchangeRatesOrgProvider provider for CurrencyField has
a refreshInterval setting, but the documentation
Hi Antoine,
I'll permit myself to respond in English, because my written French is
slower ;-)
Your problem is well known amongst Solr users: the query parser splits
tokens on whitespace, so the analyzer never sees the input 'la redoute' but
instead receives 'la' and 'redoute'. You can of course enclose your se
Hello Antoine,
I see only 2 solutions to your problem.
1) Use synonyms, but you will be limited to cases known in advance,
so it is a solution that does not scale in the long term.
2) Otherwise, you should consider having a second field (probably populated
via copyField) which does not use
bq: start=0&rows=30
Let's see the start and rows parameters for a few of
your queries, because on the surface this makes
no sense. If you're always starting at 0, this
shouldn't be happening
And you say "the second query is visibly slower". You're
talking about the "deep paging" problem, whic
Well, I do have to question why you need to do anything.
Just don't send updates to the remote machines..
But do remember that all nodes in SolrCloud can be equal,
which is one of the points.
FWIW,
Erick
On Mon, Nov 4, 2013 at 10:34 AM, Uwe Reh wrote:
> F***, this is the answer, I was
Hello,
I would like searches in a text field to return results even if the spaces
are typed incorrectly
(for example: "la redoute" = "laredoute").
Today my text field is defined as follows:
Thanks in advance
The query time increases because in order to calculate the set of documents
that belong on page N, you must first calculate all the pages prior to
page N, and this information is not stored between requests.
Two ways of speeding this stuff up are to request bigger pages, and/or use
filter quer
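A hedged SolrJ sketch of those two ideas, with a hypothetical filter query and page size, just to show the shape of the requests:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class PageWalker {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        SolrQuery q = new SolrQuery("*:*");
        q.addFilterQuery("type:product"); // hypothetical filter to shrink the result set
        q.setRows(1000);                  // bigger pages: fewer requests to recompute
        for (int start = 0; start < 10000; start += 1000) {
            q.setStart(start);
            System.out.println(start + " -> "
                    + server.query(q).getResults().getNumFound() + " total hits");
        }
    }
}

Each page still has to recompute everything before it, so the win is only in making fewer, larger requests and keeping the candidate set small with filter queries.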
F***, this is the answer I was afraid of. ;-)
I had hoped there could be something similar to
http://zookeeper.apache.org/doc/trunk/zookeeperObservers.html.
Nevertheless, thank you.
Uwe
On 04.11.2013 14:14, Erick Erickson wrote:
In this situation, I'd consider going with the older master/slav
Do you want to look through them all? Have you considered the Lucene API? Not sure if
that is better, but it might be.
Bill Bell
Sent from mobile
> On Nov 4, 2013, at 6:43 AM, "michael.boom" wrote:
>
> I saw that some time ago there was a JIRA ticket dicussing this, but still i
> found no relevant i
You could pre-create a bunch of directories and base configs, and create cores as needed.
Then use the schemaless API to set it up ... Or make changes in a script and
reload the core.
Bill Bell
Sent from mobile
> On Nov 4, 2013, at 6:06 AM, Erick Erickson wrote:
>
> Right, this has been an issue for a w
Thank you, Erick!
-
Thanks,
Michael
I saw that some time ago there was a JIRA ticket discussing this, but I still
found no relevant information on how to deal with it.
When working with a big number of docs (e.g. 70M in my case), I'm using
start=0&rows=30 in my requests.
For the first request the query time is OK, but the next one is visibly slow
In this situation, I'd consider going with the older master/slave
setup. The problem is that in SolrCloud, you have a lot of chatter
back and forth. Presumably the connection to your local instances
is rather slow, so if you're adding data to your index, each and
every add has to be communicated in
"It Depends"(tm). As long as you're getting adequate
throughput on the smaller machines, adding bigger
machines won't make it any _slower_. But sometime
as you add documents, the smaller machines will start
having memory issues etc. and you will see an impact.
Fortunately, the migrating path to la
Right, this has been an issue for a while, there's no current
way to do this.
Someday, I'll be able to work on SOLR-4779, which should
go some way toward making this work more easily. It's still not
exactly what you're looking for, but it might work.
Of course with SolrCloud you can specify a configur
Thanks for closing this off!
Erick
On Sun, Nov 3, 2013 at 8:24 PM, Jack Park wrote:
> Issue resolved, with great thanks to Tim Casey.
> The issue was based on my own poor understanding of the mechanics of
> ZooKeeper. The "host" setting in solr.xml must find the correct value
> and not default
What is your commit strategy? A hard commit
(openSearcher=true or false doesn't matter)
should close the current tlog file, open
a new one and delete old ones. That said, there
will be enough tlog files kept around to hold at
least 100 documents. So if you're committing
too often (say after every d
The problem is that there are about a dozen places where the character
encoding can be misconfigured. The problem you're seeing above
actually looks like an issue with the character set configured in
your browser; it may have nothing to do with what's actually in Solr.
You might write a small SolrJ pro
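Along those lines, a minimal sketch of such a SolrJ check (hypothetical id and field names; it assumes a stored text field in the schema) could look like this:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class EncodingCheck {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "encoding-check");
        doc.addField("title", "h\u00e9llo w\u00f6rld");   // non-ASCII sample, written as escapes
        server.add(doc);
        server.commit();

        Object title = server.query(new SolrQuery("id:encoding-check"))
                .getResults().get(0).getFieldValue("title");
        // If this prints garbage, check the console encoding; if it prints fine,
        // Solr stores the text correctly and the browser/container config is suspect.
        System.out.println(title);
    }
}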
If the bitset is something you control, you can use the binary
field type, although it's not a terribly efficient way to store binary
data.
If the bitset is bounded, you could do something with indexing
N long values that will contain the set and write a custom
similarity class to work with it.
Be
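As an illustration of the first option, here is a hedged sketch; it assumes a field named "bits" declared as solr.BinaryField in the schema, which is not shown in this thread:

import java.util.BitSet;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BitsetStore {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr");

        BitSet bits = new BitSet();
        bits.set(3);
        bits.set(1000);

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "bitset-1");
        // SolrJ base64-encodes byte[] values for transport into a binary field.
        doc.addField("bits", bits.toByteArray());
        server.add(doc);
        server.commit();
    }
}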
How many unique URLs do you have in your 9M
docs? If your 9M hits have 4M distinct URLs, then
this is not very valuable to the user.
Sascha:
Was that speedup on a single field or were you faceting over
multiple fields? Because as I remember that code spins off
threads on a per-field basis, and if
Well, the easiest thing to do is cheat. Fire up the admin UI, should be
something like
http://localhost:8983/solr
See if anything drops down in the "core selector" box and select it. Then
select a core,
the default is "collection1". Now you should see a "query" section, go
there and
scroll down to
Hi,
as a service provider for libraries we run a small cloud (1 collection, 1
shard, 3 replicas). To improve local reliability we want to offer the
libraries the possibility of setting up their own local replicas.
As far as I know, this can easily be done just by adding a new node to
the cloud. But the external no
The whole point of SolrCloud is to automatically take care of all
the ugly details of synching etc. You should be able to add a node
and, assuming it has been assigned to a shard, do nothing.
The node will start up, synch with the leader, get registered and
start handling queries without you having
I've set up my SolrCloud on AWS and I'm currently using 2 average machines.
I'm planning to add one more, bigger machine (by bigger I mean double the
RAM).
If they all work in a cluster and the search being distributed, will the
smaller machines limit the performance the bigger machine could offer
You cannot disable the coordination factor at query time at this moment, so you need
to change your Similarity in the schema. The easiest way to do this is to set the
SchemaSimilarityFactory, which defaults to TF-IDF but without queryNorm and coord,
or to use another similarity implementation.
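For the "another similarity implementation" route, a minimal sketch of a hypothetical class you would compile, drop into Solr's lib directory, and reference from schema.xml:

import org.apache.lucene.search.similarities.DefaultSimilarity;

// Neutralizes the coord factor so the number of overlapping query terms
// no longer scales the score; everything else stays standard TF-IDF.
public class NoCoordSimilarity extends DefaultSimilarity {
    @Override
    public float coord(int overlap, int maxOverlap) {
        return 1.0f;
    }
}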
I didn't understand what I need to do.
Should I make any changes in the CategorizeDocumentFactory, or change the
version of the Solr core jars?
Thanks,
On Thu, Oct 31, 2013 at 2:35 PM, Koji Sekiguchi wrote:
> Caused by: java.lang.ClassCastException: class com.mahout.solr.classifier.
>> Categoriz
The core admin CREATE function requires that the new instance dir and
schema/config exist already. Is there a particular reason for this? It
would be incredibly convenient if I could create a core with a new
schema and new config simply by calling CREATE (maybe providing the
contents of config.
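For reference, this is the shape of the call as it works today via SolrJ (a sketch with hypothetical names; the instance directory and its conf/ must already exist, which is exactly the limitation described above):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class CreateCore {
    public static void main(String[] args) throws Exception {
        // Point at the Solr root, not at an individual core.
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr");

        CoreAdminRequest.Create create = new CoreAdminRequest.Create();
        create.setCoreName("newcore");
        // This directory, with conf/solrconfig.xml and conf/schema.xml inside,
        // must exist before CREATE is called.
        create.setInstanceDir("/var/solr/cores/newcore");
        create.process(server);
    }
}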