Wow! That was the most pointed, concise discussion of hardware requirements
I've seen to date, and it's fabulously helpful, thank you Shawn! We
currently have 2 servers that I can dedicate about 12GB of RAM to Solr on
(we're moving to these 2 servers now). I can upgrade further if it's needed
& ju
On 4/18/2013 11:02 PM, sawanverma wrote:
> Giving content:[* TO *] gives the same error but when I give content:[a TO z]
> it works fine. Can you please explain what it means when I give
> content:[a TO z]? Can I use this as a workaround? The datatype of the content
> field is text_en.
That synta
Shawn,
Giving content:[* TO *] gives the same error but when I give content:[a TO z]
it works fine. Can you please explain what it means when I give content:[a
TO z]? Can I use this as a workaround? The datatype of the content field is
text_en.
Thanks again for your replies and suggestions.
Regar
On 4/18/2013 8:12 PM, David Parks wrote:
> I think I still don't understand something here.
>
> My concern right now is that query times are very slow for 120GB index (14s
> on avg), I've seen a lot of disk activity when running queries.
>
> I'm hoping that distributing that query across 2 serve
Hi!
I am using SOLR 4.2.1.
My solrconfig.xml contains the following:
text_spell
MySpellchecker
spell
solr.DirectSolrSpellChecker
internal
0.5
2
1
5
3
0.01
10
id
MySpell
I think I still don't understand something here.
My concern right now is that query times are very slow for 120GB index (14s
on avg), I've seen a lot of disk activity when running queries.
I'm hoping that distributing that query across 2 servers is going to improve
the query time, specifically I
You just change the date field type to string.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-indexing-tp4057017p4057136.html
Sent from the Solr - User mailing list archive at Nabble.com.
Do you mean a range (e.g. [4 TO 17]) or a prefix (e.g. 10*)? For range
you need to index it as a number. For prefix, string is probably
better. Then, just use standard query parameters.
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrera
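The two cases above (range versus prefix) can be sketched as query strings.
This is a minimal illustration in Python, not taken from the thread: the
field names "num" and "num_s" are hypothetical, standing for a numeric field
and a string copy of the same value.

```python
from urllib.parse import urlencode


def range_query(field, low, high):
    """Inclusive range, e.g. num:[4 TO 17]; assumes a numeric field type."""
    return f"{field}:[{low} TO {high}]"


def prefix_query(field, prefix):
    """Prefix match, e.g. num_s:10*; assumes a string field type."""
    return f"{field}:{prefix}*"


def select_params(query, rows=10):
    """URL-encode the standard q and rows parameters for a /select request."""
    return urlencode({"q": query, "rows": rows})


# Hypothetical field names: "num" (numeric) and "num_s" (string).
print(select_params(range_query("num", 4, 17)))
print(select_params(prefix_query("num_s", "10")))
```

Which form fits depends on whether "subsets of a number" means a span of
values or a leading run of digits.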
If I want to search on subsets of a number, what can I do?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-system-and-numbers-tp482519p4057134.html
On the query side, another downside I see would be that for a given memory
pool, you'd have to share it with more cores because every replica uses
its own cache.
True for the inner solr caching (JVM's heap) and OS caching as well.
Adding a replicated core creates a new data set (index) that will
Hello,
After creating a distributed collection on several different servers I
sometimes get to deal with failing servers (cores appear "not available" =
grey) or failing cores ("Down / unable to recover" = brown / red).
In case I wish to delete this erroneous collection (through the collection
API) on
re: more replicas -
pro: you can scale your query processing workload because you have more
nodes available to service queries, eg 1,000 QPS sent to Solr with 5
replicas, then each is only processing roughly 200 QPS. If you need to
scale up to 10K QPS, then add more replicas to distribute the incr
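
The arithmetic in the replica example above can be written out as a small
sketch. It assumes queries are spread evenly across replicas, which a real
load balancer only approximates; the per-replica capacity of 200 QPS in the
second call is an assumption carried over from the 1,000 QPS / 5 replicas
figure, not a measured value.

```python
import math


def qps_per_replica(total_qps, replicas):
    """Average load per replica, assuming perfectly even distribution."""
    return total_qps / replicas


def replicas_needed(total_qps, per_replica_capacity):
    """Smallest replica count that keeps each replica at or under capacity."""
    return math.ceil(total_qps / per_replica_capacity)


print(qps_per_replica(1000, 5))     # load per replica in the example above
print(replicas_needed(10000, 200))  # assumed capacity of 200 QPS per replica
```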
I've been playing around with the PositionLengthAttribute for a few days,
and it doesn't seem to have any effect at all.
I'm aware that position length is not stored in the index, as explained in
this blog post.
http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
Howeve
Thanks.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Paging-and-sorting-in-Solr-tp4057000p4057098.html
On 4/18/2013 1:59 PM, hassancrowdc wrote:
Is there any way I can change the response XML from the delta-import query:
localhost:8080/solr/devices/dataimport?command=delta-import&commit=true
I want to change the response.
The response is created by the dataimporthandler source code. It's a
contri
Is there any way I can change the response XML from the delta-import query:
localhost:8080/solr/devices/dataimport?command=delta-import&commit=true
I want to change the response.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Change-the-response-of-delta-import-tp4057093.html
Hmm... Just found this JIRA: https://issues.apache.org/jira/browse/SOLR-3191
I think I have answered my question.
-Original Message-
From: Andrew Lundgren [mailto:lundg...@familysearch.org]
Sent: Thursday, April 18, 2013 1:21 PM
To: solr-user@lucene.apache.org
Subject: Making fields un
We have a few internal fields that we would like to restrict from being
returned in result sets.
I have seen how fl is used to specify the fields that you do want returned; I
am looking for the opposite. There are just a few fields that don't make
sense to return to our clients.
Is there
Hi
I am using solr 4.2, and have set up spatial search config as below
http://wiki.apache.org/solr/SpatialSearch#Schema_Configuration
But every time I make an update to a document,
http://wiki.apache.org/solr/UpdateJSON#Updating_a_Solr_Index_with_JSON
more values of the *_coordinates fields get
On 4/18/2013 11:53 AM, sawanverma wrote:
Shawn,
Thanks a lot for your reply. But I am confused again if the following query is
complex.
http://localhost:8983/solr/test/select/?q=content:*&fl=content&hl=true&hl.fl=content&hl.maxAnalyzedChars=31375&start=64&rows=1&sort=obs_date%20desc
I hardly
I want to elevate certain documents differently depending on a certain fq
parameter in the request. I've read of somebody coding Solr to do this but
no code was shared. Where would I start looking to implement this feature
myself?
--
View this message in context:
http://lucene.472066.n3.nabble.c
Shawn,
Thanks a lot for your reply. But I am confused again if the following query is
complex.
http://localhost:8983/solr/test/select/?q=content:*&fl=content&hl=true&hl.fl=content&hl.maxAnalyzedChars=31375&start=64&rows=1&sort=obs_date%20desc
Is that because of content : *? The only unusual thin
Run checksums on all files in both master and slave, and verify that
they are the same.
TCP/IP has a checksum algorithm that was state-of-the-art in 1969.
On 04/18/2013 02:10 AM, Victor Ruiz wrote:
Also, I forgot to say... the same error started to happen again.. the index
is again corrupted :(
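One way to carry out the checksum comparison suggested above, as a rough
sketch in Python. It assumes the index directory of each machine can be read
(or that the resulting dictionaries are copied to one place for comparison);
the directory paths are up to the reader, and SHA-256 is simply one
reasonable choice of hash.

```python
import hashlib
from pathlib import Path


def file_checksums(index_dir):
    """Map each file's path (relative to index_dir) to its SHA-256 digest."""
    root = Path(index_dir)
    sums = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large segment files fit in memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            sums[str(path.relative_to(root))] = digest.hexdigest()
    return sums


def differing_files(master_sums, slave_sums):
    """Names that are missing on one side or whose digests differ."""
    names = set(master_sums) | set(slave_sums)
    return sorted(n for n in names if master_sums.get(n) != slave_sums.get(n))
```

An empty list from differing_files means every file matched byte for byte;
any name it returns is a candidate for a transfer or disk problem.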
Hi all,
I am trying to sort results based on multiple fields aliased as one. Is that
possible? While solr does not complain (no error, results OK, etc etc etc) it
fails to sort the hits appropriately. I've attached the query, relevant schema
part and result.
I am very curious to know if that
Hi -
when I execute a shard query like:
[myhost]:8080/solr/mycore/select?q=type:message&rows=14&...&qt=standard&wt=standard&explainOther=&hl.fl=&shards=solrserver1:8080/solr/mycore,solrserver2:8080/solr/mycore,solrserver3:8080/solr/mycore
everything works fine until I query against a large
yeah I realize using ${solr.core.name} for dataDir must be the cause for the
issue we see... it is fair to say the SWAP and RENAME just create an alias
that still points to the old datadir.
if they can not fix it then it is not a bug :-) at least we understand
exactly what is going on there.
than
On 4/18/2013 6:42 AM, J Mohamed Zahoor wrote:
I don't yet know if this is the reason...
I am checking whether Jetty has some limit on accepting connections.
Are you using the Jetty included with Solr, or a Jetty installed
separately? The Jetty included with Solr has a maxThreads value of
1 in
On 4/18/2013 6:02 AM, sawanverma wrote:
Hi Yonik,
Thanks for your reply.
I tried increasing the maxClauseCount to a bigger value. But what would the
ideal value be, and won't that hurt performance? What are the chances that,
if we increase the value, we will not face this issue again?
Ch
Solr dates are always "Z", GMT.
-- Jack Krupansky
-Original Message-
From: hassancrowdc
Sent: Thursday, April 18, 2013 11:49 AM
To: solr-user@lucene.apache.org
Subject: Solr indexing
Solr is not showing the dates I have in my database. Any help? Is Solr
following
any specific timezone?
20G is allocated to Solr already.
Ming
On Wed, Apr 17, 2013 at 11:56 PM, Toke Eskildsen
wrote:
> On Wed, 2013-04-17 at 20:06 +0200, Mingfeng Yang wrote:
> > I am doing faceting on an index of 120M documents,
> > on the field of url[...]
>
> I would guess that you would need 3-4GB for that.
> H
Maybe you have your name field as "text" rather than "string". Don't try
sorting "text" fields - make a copy (copyField) to a string field and sort
the string field. So, for example, have "name" as "text" for keyword search,
and "name_s" as "string" for sorting (and faceting.)
-- Jack Krupansk
On Apr 18, 2013, at 10:49 AM, hassancrowdc wrote:
> Solr is not showing the dates I have in my database. Any help? Is Solr following
> any specific timezone? In my database the date is 2013-04-18 11:29:33, but
> Solr shows me "2013-04-18T15:29:33Z". Any help
Solr knows nothing of timezones. Solr
Solr is not showing the dates I have in my database. Any help? Is Solr following
any specific timezone? In my database the date is 2013-04-18 11:29:33, but
Solr shows me "2013-04-18T15:29:33Z". Any help
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-indexing-tp4057017.htm
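The two timestamps in this question differ by exactly four hours, which is
what one would expect if the database holds local time and Solr reports UTC
("Z"). A small Python sketch of the conversion follows; the UTC-4 offset is
an assumption inferred from that four-hour gap, not something stated in the
thread.

```python
from datetime import datetime, timedelta, timezone

SOLR_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def solr_utc_to_local(solr_date, utc_offset_hours):
    """Parse a Solr 'Z' timestamp and shift it to a fixed local offset."""
    utc = datetime.strptime(solr_date, SOLR_FORMAT).replace(tzinfo=timezone.utc)
    return utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))


def local_to_solr_utc(local_dt):
    """Format a timezone-aware datetime the way Solr displays dates."""
    return local_dt.astimezone(timezone.utc).strftime(SOLR_FORMAT)


# Assumed offset: UTC-4, inferred from the two values quoted above.
local = solr_utc_to_local("2013-04-18T15:29:33Z", -4)
print(local.strftime("%Y-%m-%d %H:%M:%S"))
print(local_to_solr_utc(local))
```

A fixed offset is used here only to keep the sketch self-contained; a real
application would normally convert with a named time zone at display time.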
Hi,
I double-checked. It is the field. If I sort by the manufacturer field it
sorts, but if I sort by name it does not sort. Both fields are defined
identically. Is there any difference in sorting alphabetically or by size
of the word?
--
View this message in context:
http://lucene.472066.n3
When using a field name that doesn't follow conventions (basically like
Java identifiers), try this:
fl=field(098765-765-788558-7654_userid)
Or enclose it in quotes if it's really a whacky field name:
fl=field("098765-765-788558-7654_userid")
-Yonik
http://lucidworks.com
On Thu, Apr 18, 2013 a
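For anyone building this request programmatically, here is a hedged Python
sketch of the fl=field(...) form suggested above. The field name is the one
from the thread; the q=*:* query is only a placeholder, and the helper name
fl_for_field is made up for this illustration.

```python
from urllib.parse import urlencode


def fl_for_field(name, quoted=False):
    """Wrap a field name in field(...), optionally with double quotes."""
    inner = f'"{name}"' if quoted else name
    return f"field({inner})"


# Placeholder query; only the fl parameter is the point of this sketch.
params = urlencode({
    "q": "*:*",
    "fl": fl_for_field("098765-765-788558-7654_userid"),
})
print(params)
```

URL-encoding the parentheses and quotes is left to urlencode so the value
reaches Solr intact.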
I am sure it does the sorting first (since I have always done that).
On 04/18/2013 02:49 PM, hassancrowdc wrote:
I have implemented paging using Solr's rows and start query parameters.
But now it shows me results that are sorted page-wise.
I meant if i have the following scenario:
rows=25&start=0&sort=ma
I have implemented paging using Solr's rows and start query parameters.
But now it shows me results that are sorted page-wise.
I mean, if I have the following scenario:
rows=25&start=0&sort=manufacturer asc
It will give me the first 25 matching results and then sort only those.
I want it to sort all
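
A small Python sketch of the behaviour described in the reply to this
question: sort is applied to the whole result set, and start/rows then pick a
window out of that ordering. The sorted_page function only mimics that order
of operations on an in-memory list; it is an illustration, not Solr's
implementation.

```python
def page_params(page, rows=25, sort="manufacturer asc"):
    """Request parameters for a zero-based page number."""
    return {"start": page * rows, "rows": rows, "sort": sort}


def sorted_page(docs, page, rows, key):
    """Sort everything first, then slice out the requested page."""
    ordered = sorted(docs, key=key)
    start = page * rows
    return ordered[start:start + rows]


print(page_params(0))
print(page_params(2))
print(sorted_page(["d", "a", "c", "b"], page=1, rows=2, key=str))
```

If a page looks sorted only within itself, the sort field's type is worth
checking before suspecting the paging parameters.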
Hi,
If I disable (comment out) the updateLog block, will this affect the indexing result:
--
View this message in context:
http://lucene.472066.n3.nabble.com/solr4-disable-updateLog-tp4056998.html
Hi,
I have a field which has data like this:
Where can have from 1 to 10 letters strings and can have up
to 4 digits.
It is defined like this:
When the user enters foo, i search for foo directly or something that starts
with "foo ".
I don't wa
Hi Dave,
This sounds more like a budget / deployment issue vs. anything
architectural. You want 2 shards with replication so you either need
sufficient capacity on each of your 2 servers to host 2 Solr instances or
you need 4 servers. You need to avoid starving Solr of necessary RAM, disk
performa
But my concern is this, when we have just 2 servers:
- I want 1 to be able to take over in case the other fails, as you point
out.
- But when *both* servers are up I don't want the SolrCloud load balancer
to have Shard1 and Replica2 do the work (as they would both reside on the
same physical serv
Thanks, Jack. Sorry, took me a while to reply :)
It sounds like sentence/paragraph level searches won't be easy.
Warm regards,
Alex
-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: 15 April 2013 5:09 PM
To: solr-user@lucene.apache.org
Subject: Re: Tokenize
If you understand the underlying lucene searcher it will be easy to
understand what's happening at solr level.
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Apr 18, 2013 3:22 AM, "Furkan KAMACI" wrote:
> Thanks for the explanations. I should read in depth about the lifecycle of Searcher
> ob
Correct. This is what you want if server 2 goes down.
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Apr 18, 2013 3:11 AM, "David Parks" wrote:
> Step 1: distribute processing
>
> We have 2 servers on which we'll run 2 SolrCloud instances.
>
> We'll define 2 shards so that both ser
Hi,
What is the issue though? :)
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Apr 18, 2013 2:53 AM, "William Bell" wrote:
> We are getting an issue when using a GUID for a field in Solr 4.2. Solr 3.6
> is fine. Something like:
>
> fl=098765-765-788558-7654_userid as a string stored
On Apr 18, 2013, at 8:40 AM, jmozah wrote:
>
>
> On 16-Apr-2013, at 11:16 PM, Mark Miller wrote:
>
>> Are you using the concurrent low pause garbage collector or perhaps G1?
>
>
> I use the default one which comes in jdk 1.7.
It varies by platform, but 99% that means you are using the
I don't yet know if this is the reason...
I am checking whether Jetty has some limit on accepting connections.
./zahoor
On 18-Apr-2013, at 12:52 PM, J Mohamed Zahoor wrote:
>
> Thanks for this.
> The reason I asked was: when I fire 30 queries simultaneously from 30
> threads using the same
On 16-Apr-2013, at 11:16 PM, Mark Miller wrote:
> Are you using the concurrent low pause garbage collector or perhaps G1?
I use the default one which comes in jdk 1.7.
>
> Are you able to use something like visualvm to pinpoint what the bottleneck
> might be?
Unfortunately.. it is pro
Hi
I am using Solr 4.1 with 6 shards.
I want to find out some "price" stats for all the days in my index.
I ended up using the stats component like
"stats=true&stats.field=price&stats.facet=timestamp".
But it throws an error like
Invalid Date String:' #1;#0;#0;#0;'[my(#0;'
My Question is :
> You are missing an essential part: Both the facet and the sort
> structures need to hold one reference for each document
> _in_the_full_index_, even when the document does not have any values in
> the fields.
>
Wow, thank you for this awesome explanation! This is where the penny
dropped for me.
Yonik,
When I remove the sort part from the query below it works fine. But with
sort it throws the exception.
http://localhost:8983/solr/test/select/?q=content:*&fl=content&hl=true&hl.fl=content&hl.maxAnalyzedChars=31375&start=64&rows=1&sort=obs_date%20desc
-- > Throws Exception
http://localhos
Hi Yonik,
Thanks for your reply.
I tried increasing the maxClauseCount to a bigger value. But what would the
ideal value be, and won't that hurt performance? What are the chances that,
if we increase the value, we will not face this issue again?
As you asked pasting below the full trace of
Update:
Also remove your range queries from the main query and specify it as a
filter query.
Best
Pravesh
--
View this message in context:
http://lucene.472066.n3.nabble.com/TooManyClauses-maxClauseCount-is-set-to-1024-tp4056965p4056969.html
Sent from the Solr - User mailing list archive at
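The suggestion above (keep range clauses out of the main query and pass them
as filter queries) can be sketched in Python as follows. The obs_date and
content field names appear elsewhere in this thread; the date bounds and the
"content:solr" query are made-up examples, and whether this avoids the
TooManyClauses error depends on which part of the query expands into too
many clauses.

```python
from urllib.parse import urlencode


def move_ranges_to_fq(main_query, range_clauses):
    """Build a query string with q for relevance and one fq per range."""
    params = [("q", main_query)]
    params.extend(("fq", clause) for clause in range_clauses)
    return urlencode(params)


# Hypothetical values for illustration only.
query_string = move_ranges_to_fq(
    "content:solr",
    ["obs_date:[2013-01-01T00:00:00Z TO NOW]"],
)
print(query_string)
```

Each fq is cached independently of q in Solr's filter cache, so a range that
repeats across requests can be reused rather than re-evaluated.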
Thanks Pravesh.
But won't that hurt query performance? And what would be the ideal value
to increase it to? Could this error still occur even if we increase the
value from 1024 to, say, 5120?
I have tried increasing the value, and it did hurt performance.
Regards,
Sawan
From: pravesh [via Lucene] [mai
Can you provide a full stack trace of the exception?
There's a maxClauseCount in solrconfig.xml that you can increase to
work around the issue.
-Yonik
http://lucidworks.com
On Thu, Apr 18, 2013 at 7:31 AM, sawanverma wrote:
> This error is quite confusing.
>
> I had a situation where i
Just increase the value of /maxClauseCount/ in your solrconfig.xml. Keep it
large enough.
Best
Pravesh
--
View this message in context:
http://lucene.472066.n3.nabble.com/TooManyClauses-maxClauseCount-is-set-to-1024-tp4056965p4056966.html
Sent from the Solr - User mailing list archive at Nabbl
This error is quite confusing.
I had a situation where I had to turn on highlighting. In some cases,
though the number of docs found for a particular query was, say, 2,
highlighting was coming back for only 1. I did some checks and found that
that particular text searched was i
On Thu, 2013-04-18 at 11:59 +0200, John Nielsen wrote:
> Yes, that's right. No search from any given client ever returns
> anything from another client.
Great. That makes the 1 core/client solution feasible.
[No sort & facet warmup is performed]
[Suggestion 1: Reduce the number of sort fields by
>
> > http://172.22.51.111:8000/solr/default1_Danish/search
>
> [...]
>
> > &fq=site_guid%3a(10217)
>
> This constrains the hits to a specific customer, right? Any search will
> only be in a single customer's data?
>
Yes, that's right. No search from any given client ever returns anything
from anot
Also, I forgot to say... the same error started to happen again.. the index
is again corrupted :(
--
View this message in context:
http://lucene.472066.n3.nabble.com/SolrCloud-vs-Solr-master-slave-replication-tp4055541p4056926.html
Thank you again for your answer Shawn.
Network card seems to work fine, but we've found segmentation faults, so now
our hosting provider is going to run a full hw check. Hopefully they'll
replace the server and the problem will be solved.
Regards,
Victor
--
View this message in context:
http://l
On Thu, 2013-04-18 at 08:34 +0200, John Nielsen wrote:
>
[Toke: Can you find the facet fields in any of the other caches?]
> Yes, here it is, in the field cache:
> http://screencast.com/t/mAwEnA21yL
>
Ah yes, mystery solved, my mistake.
> http://172.22.51.111:8000/solr/default1_Danish/search
Thanks for this.
The reason I asked was: when I fire 30 queries simultaneously from 30
threads using the same CloudSolrServer instance,
some queries get fired after a delay. Sometimes the delay is 30-50 seconds.
In the Solr logs I can see 20+ queries get fired almost immediately... but s
Thanks for the explanations. I should read in depth about the lifecycle of
Searcher objects. Should I read about them in a Lucene book, or is there any
Solr documentation or book that covers it?
2013/4/18 Jack Krupansky
> "merging indexes"
>
> The proper terminology is "merging segments".
>
> Until the new, merg
Step 1: distribute processing
We have 2 servers on which we'll run 2 SolrCloud instances.
We'll define 2 shards so that both servers are busy for each request
(improving response time of the request).
Step 2: Failover
We would now like to ensure that if either of the servers goes down (we