Thanks for your answers. Currently I have one machine (6 cores, 148 GB RAM, 2.5
TB HDD) and I index around 60 million documents per day - the index size is
around 26 GB. I do have a customer-ID today and I use it in the queries. I don't
split the customers, but I get bad performance.
If I will make
Hi, just a general question as I was unable to find any old posts relating
to stats/percentile/facets performance/cache settings.
I have been using Solr since version 4.0, and am now using the latest, v5.2.1.
What I have done:
- Increased heap memory to 30 GB
- Experimented with the cache settings (sketched below)
-
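For reference, this is the sort of solrconfig.xml block being tuned; the cache
classes are standard, but the sizes below are only illustrative placeholders,
not my actual settings:

  <filterCache class="solr.FastLRUCache"
               size="512" initialSize="512" autowarmCount="128"/>
  <queryResultCache class="solr.LRUCache"
               size="512" initialSize="512" autowarmCount="128"/>
  <documentCache class="solr.LRUCache"
               size="512" initialSize="512"/>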
yura last wrote:
> I have one machine (6 cores, 148 GB RAM, 2.5 TB HDD) and I index
> around 60 million documents for a day - the index size is around 26GB.
So 1 billion documents would be approximately 500GB.
...and 10 billion/day in 90 days would be 450TB.
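Spelling the arithmetic out: 26 GB / 60 M docs is roughly 0.43 GB per million
documents, so 1 billion documents lands in the 430-500 GB range, and 10
billion/day for 90 days is 900 billion documents, i.e. roughly 400-450 TB.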
> I do have customer-ID today and I
I expect that the number of concurrent customers will be low. Today I have 1
machine, so I don't have the capacity for all the data. Because of that I am
thinking about a new "cluster" solution. That is 1 billion documents each day
for 90 days = 90 billion (around 45 TB of data).
I should prefer a lot of machines
yura last wrote:
> I expect that the amount of concurrent customers will be low.
> Today I have 1 machine so I don't have the capacity for all
> the data.
You aim for 90 billion documents in the first go and want to prepare for 10
times that. Your current test setup is 60M documents, which means
Erick,
After Walter's reply I started thinking along the lines you mentioned and
realized the folly of doing that!
Scott
On 8/15/2015 9:57 PM, Erick Erickson wrote:
Scott:
You better not even let them access Solr directly.
http://server:port/solr/admin/collections?ACTION=delete&name=collect
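To make the risk concrete: an unsecured node will happily execute a destructive
Collections API call like the one below (the collection name is just a
placeholder):

  curl "http://server:port/solr/admin/collections?action=DELETE&name=somecollection"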
Thanks, I didn't know you could do this; I'll check it out.
On Aug 15, 2015 12:54 PM, "Alexandre Rafalovitch"
wrote:
> From the "teaching to fish" category of advice (since I don't know the
> actual answer).
>
> Did you try the "Analysis" screen in the Admin UI? If you check the "Verbose
> output" mark
I have exactly the same requirement.
> On 13-Aug-2015, at 2:12 pm, Kiran Sai Veerubhotla wrote:
>
> Does Solr support joins?
>
> We have a use case where two collections have to be joined, and the join has
> to be on the faceted results of the two collections. Is this possible?
Is there a way to get the list of terms that matched in a query response?
I realize the q parameter is returned, but I'm looking for just the list
of terms and not the operators.
Scott
--
To those leaning on the sustaining infinite, to-day is big with blessings.
Mary Baker Eddy
Scott Derrick wrote:
> Is there a way to get the list of terms that matched in a query response?
Add debug=query to your request:
https://wiki.apache.org/solr/CommonQueryParameters#debug
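For example (host, port and collection name are placeholders):

  http://localhost:8983/solr/collection1/select?q=mar*&wt=json&debug=query

The debug section of the response then shows rawquerystring, querystring and
the parsed query.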
You might also want to try
http://splainer.io/
- Toke Eskildsen
with a query like
q=mar*
I tried debugQuery=true, but it just said:

"rawquerystring": "mar*",
"querystring": "mar*",
"parsedquery": "_text_:mar*",
"parsedquery_toString": "_text_:mar*",

I already know that!
One document matches Mary, another matches Mary and martyr.
I will look at splainer.io
I have a SolrCloud setup with 3 nodes. I've added password protection following the
steps here:
http://stackoverflow.com/questions/28043957/how-to-set-apache-solr-admin-password
Now only one node is able to load the collections. The others are getting 401
Unauthorized error when loading the collecti
I did a dataimport with 'clean' set to false.
The DIH status upon completion was:
idle
1
6843427
6843427
0
2015-08-16 16:50:54
Indexing completed. Added/Updated: 6843427 documents. Deleted 0 documents.
Whereas when I query using 'query?q=*:*&rows=0', I get the following count
{
"responseHead
You can do what are called "pseudo joins", which are equivalent to a
nested query in SQL. You get back data from one core based upon
criteria in the other. You cannot (yet) merge the results to create a
composite document.
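A minimal sketch of the join query parser syntax, with made-up core and field
names:

  q={!join from=customer_id to=id fromIndex=customers}city:Boston

This runs city:Boston against the customers core, collects the customer_id
values from the matching documents, and returns documents in the current core
whose id field holds one of those values.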
Upayavira
On Sun, Aug 16, 2015, at 06:02 PM, Nagasharath wrote:
> I exactl
You almost certainly have a non-unique ID field. Some documents are
overwritten during indexing. Try it with a clean index, and then review
the number of deleted documents (updates are a delete-then-insert
action). Deletes are calculated as maxDocs minus numDocs.
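To illustrate with hypothetical numbers: if maxDocs ends up at 6,843,427 but
numDocs comes back at roughly 1.1 million, then about 5.7 million of the adds
overwrote documents that reused an existing ID (segment merges can purge some
deletes along the way, so the exact gap varies).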
Upayavira
On Sun, Aug 16, 2015,
This isn't going to be easy. Why do you need to know? Especially
with wildcards this'll be "challenging".
For the specific docs that are returned, highlighting will tell you _some_
of them. Why only some? Because usually only the best N snippets are
returned, say 3 (it's configurable). And it's st
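For reference, the knobs involved are the standard highlighting parameters (the
field name below is a placeholder):

  hl=true&hl.fl=text&hl.snippets=10&hl.fragsize=100

Raising hl.snippets increases how many matching fragments, and therefore how
many of the matched terms, come back per document.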
Is there any chance of this feature (merging the results to create a composite
document) coming out in the next release, 5.3?
On Sun, Aug 16, 2015 at 2:08 PM, Upayavira wrote:
> You can do what are called "pseudo joins", which are equivalent to a
> nested query in SQL. You get back data from one cor
bq: Is there any chance of this feature(merge the results to create a composite
document) coming out in the next release 5.3
In a word "no". And there aren't really any long-range plans either that I'm
aware of.
You could also explore streaming aggregation, if the need here is more
batch-oriented
https://issues.apache.org/jira/browse/SOLR-7090
I see this JIRA open in support of joins, which might solve the problem.
On Sun, Aug 16, 2015 at 2:51 PM, Erick Erickson
wrote:
> bq: Is there any chance of this feature(merge the results to create a
> composite
> document) coming out in the next r
I'm searching a collection of documents.
When I build my results page I provide a link to each document. If the
user clicks the link I display the document with all the matched terms
highlighted. I need to supply my highlighter with a list of words to highlight
in the doc.
I thought the highlight
splainer doesn't return anything beyond what the debug parameter does.
On 8/16/2015 11:39 AM, Toke Eskildsen wrote:
Scott Derrick wrote:
Is there a way to get the list of terms that matched in a query response?
Add debug=query to your request:
https://wiki.apache.org/solr/CommonQueryParameters#debug
You
" You almost certainly have a non-unique ID field."
Yes, it is not absolutely unique, but I do not think that accounts for this 1-to-6 ratio.
"Try it with a clean index, and then review the number of deleted documents
(updates are a delete then insert action) "
I tried on a new instance - same effect. I do
I'm using a DataImportHandler:
class="org.apache.solr.handler.dataimport.DataImportHandler">
html-config.xml
I'm using the xsl attribute on all the entities, but this one is
throwing an exception. This XSL is used in a production document
conversion process with no problems.
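For context, this is roughly how such an entity is wired up in the data-config
(the entity name, file path and XSL path below are placeholders, and it assumes
a FileDataSource feeding an XPathEntityProcessor):

  <entity name="page"
          processor="XPathEntityProcessor"
          url="/data/docs/sample.html"
          xsl="xslt/html-to-add-doc.xsl"
          useSolrAddSchema="true"
          stream="true"/>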
On 8/16/2015 12:09 PM, Tarala, Magesh wrote:
> I have a solr cloud with 3 nodes. I've added password protection following
> the steps here:
> http://stackoverflow.com/questions/28043957/how-to-set-apache-solr-admin-password
>
> Now only one node is able to load the collections. The others are ge
Thanks Shawn!
We are on 4.10.4. Will consider 5.x upgrade shortly.
-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Sunday, August 16, 2015 9:05 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Cloud Security Question
On 8/16/2015 12:09 PM, Tarala, Magesh w
Hi,
You should check whether there were deletions by navigating to the Solr admin
Core Admin page, for example
http://localhost:8983/solr/#/~cores/test_shard1_replica1, and check
numDocs, maxDocs and deletedDocs. If numDocs remains equal to maxDocs, then
you can confirm that there were no updates (as re
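If poking around the UI is awkward, the same counters are exposed by the Luke
request handler (host and core name below are placeholders):

  curl "http://localhost:8983/solr/test_shard1_replica1/admin/luke?numTerms=0&wt=json"

The "index" section of the response includes numDocs, maxDoc and deletedDocs.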