RAM Usage Debugging

2013-07-29 Thread Furkan KAMACI
When I look at my dashboard I see that 27.30 GB is available for the JVM, 24.77 GB is gray and 16.50 GB is black. I am not doing anything on my machine right now. Did it cache documents, or is there a problem? How can I find out?

RE: new field type - enum field

2013-07-29 Thread Elran Dvir
Thanks, Erick. I have tried it four times. It keeps failing. The problem reoccurred today. Thanks. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Monday, July 29, 2013 2:44 AM To: solr-user@lucene.apache.org Subject: Re: new field type - enum field You

Re: Two-steps queries with different sorting criteria

2013-07-29 Thread Otis Gospodnetic
Hi, Not sure if this was already answered, but... If the source of the problem is overly general queries, I would try to eliminate or minimize that. For example: * offering query autocomplete functionality can have an effect on query length and precision * showing related searches (derived from

.lock file not created when making a backup snapshot

2013-07-29 Thread Artem Karpenko
Hi, when making a backup snapshot using a "/replication?command=backup" call, a snapshot directory is created and starts to be filled, but the appropriate .lock file is not created, so it's impossible to check when the backup is finished. I've taken a look at the code and it seems to me that lock.obtain() c

AND Queries

2013-07-29 Thread Furkan KAMACI
I am searching with a query like this: lang:en AND url:book pencil cat It returns results, but none of them includes all of the book, pencil and cat keywords. How should I rewrite my query? I tried this: lang:en AND url:(book AND pencil AND cat) and it looks OK. However, this does not: lang:

Re: AND Queries

2013-07-29 Thread Rafał Kuć
Hello! Try turning on debugQuery and see what is happening. From what I see you are searching the en term in lang field, the book term in url field and the pencil and cat terms in the default search field, but from your second query I see that you would like to find the last two terms in the url.

Re: AND Queries

2013-07-29 Thread fbrisbart
It's because when you don't specify any field, it's the default field which is used. So, lang:en AND url:book AND pencil AND cat is interpreted as: lang:en AND url:book AND <default field>:pencil AND <default field>:cat The default search field is defined in your schema.xml file (defaultSearchField). Franck Brisbart Le l
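The rewriting described in this reply can be sketched in Python. This is a toy illustration only, not Solr's query parser: the function name apply_default_field and the default field name "text" are assumptions made for the example, and the sketch ignores parentheses, quotes and other syntax.

```python
# Toy model of how unfielded terms fall back to the default search field.
# This is NOT Solr's parser; it only handles whitespace-separated tokens.
OPERATORS = {"AND", "OR", "NOT"}


def apply_default_field(query, default_field):
    """Prefix every term that has no explicit field with default_field."""
    rewritten = []
    for token in query.split():
        if token in OPERATORS or ":" in token:
            # Boolean operators and already-fielded terms are kept as-is.
            rewritten.append(token)
        else:
            rewritten.append(f"{default_field}:{token}")
    return " ".join(rewritten)


print(apply_default_field("lang:en AND url:book AND pencil AND cat", "text"))
```

Only "book" is tied to the url field here; "pencil" and "cat" end up in the (assumed) default field, which is why grouping the terms as url:(book AND pencil AND cat) behaves differently.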

Re: .lock file not created when making a backup snapshot

2013-07-29 Thread Mark Triggs
Hi Artem, I noticed this recently too. I created a JIRA issue here: https://issues.apache.org/jira/browse/SOLR-5040 Cheers, Mark Artem Karpenko writes: > Hi, > > when making a backup snapshot using "/replication?command=backup" > call, a snapshot directory is created and starts to be fil

swap and GC

2013-07-29 Thread Bernd Fehling
Something interesting I noticed today: after running my huge single index (49 million records / 137 GB index) for about a week and replicating today, I saw that the heap usage after replication did not go down as expected. Expected means that when Solr is started I have a heap size between 4 to 5

Re: AND Queries

2013-07-29 Thread Furkan KAMACI
When I send that query: select?pf=url^10+title^8&fl=url,content,title&start=0&q=lang:en+AND+(cat+AND+dog+AND+pencil)&qf=content^5+url^8.0+title^6&wt=xml&debugQuery=on It is debugged as: +(+lang:en +(+(content:cat^5.0 | title:cat^6.0 | url:cat^8.0) +(content:dog^5.0 | title:dog^6.0 | url:dog^8.0)

Re: AND Queries

2013-07-29 Thread fbrisbart
Because you specified the search fields to use with 'qf', which overrides the default search field. Franck Brisbart On Monday 29 July 2013 at 13:01 +0300, Furkan KAMACI wrote: > When I send that query: > > select?pf=url^10+title^8&fl=url,content,title&start=0&q=lang:en+AND+(cat+AND+dog+AND+p

solr query range upper exclusive

2013-07-29 Thread alin1918
q=price_1_1:[197 TO 249] and q=*:*&fq=price_1_1:[197 TO 249] returns 2 records but I have two records with the price_1_1 = 249, it seems that the upper range is exclusive and I can't figure out why, can you help me? -- View this message in context: http://lucene.472066.n3.nabble.com/sol

Re: processing documents in solr

2013-07-29 Thread Erick Erickson
No, SolrJ doesn't provide this automatically. You'd be providing the counter by inserting it into the document as you create new docs. You could do this with any kind of document creation you are using. Best Erick On Mon, Jul 29, 2013 at 2:51 AM, Aditya wrote: > Hi, > > The easiest solution wou

Re: new field type - enum field

2013-07-29 Thread Erick Erickson
OK, if you can attach it to an e-mail, I'll attach it. Just to check, though, make sure you're logged in. I've been fooled once or twice by being automatically signed out... Erick On Mon, Jul 29, 2013 at 3:17 AM, Elran Dvir wrote: > Thanks, Erick. > > I have tried it four times. It keeps failin

Re: Performance vs. maxBufferedAddsPerServer=10

2013-07-29 Thread Mark Miller
SOLR-4816 won't address this - it will just speed up *different* parts. There are other things that will need to be done to speed up that part. - Mark On Jul 26, 2013, at 3:53 PM, Erick Erickson wrote: > This is currently a hard-coded limit from what I've understood. From what > I remember, Mark

DIH to index the data - 250 millions - Need a best architecture

2013-07-29 Thread Santanu8939967892
Hi, I have a huge volume of DB records, close to 250 million. I am going to use DIH to index the data into Solr. I need the best architecture to index and query the data in an efficient manner. I am using Windows Server 2008 with 16 GB RAM, a Xeon processor and Solr 4.4. With Regards, Sa

Re: DIH to index the data - 250 millions - Need a best architecture

2013-07-29 Thread Gora Mohanty
On 29 July 2013 17:30, Santanu8939967892 wrote: > Hi, >I have a huge volume of DB records, which is close to 250 millions. > I am going to use DIH to index the data into Solr. > I need a best architecture to index and query the data in an efficient > manner. [...] This is difficult to answer

Re: AND Queries

2013-07-29 Thread Jack Krupansky
qf with multiple fields triggers what is known as a dismax or DisjunctionMax query. It matches a document if ANY of the listed fields contains the query term - checking each term one at a time. So, if you search for three terms with three different fields in qf, the terms may occur in different
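The matching rule described above can be sketched as follows. This is a simplified model of the idea only, assuming plain whitespace tokenization and ignoring scoring, boosts and analysis; the function names and the sample documents are hypothetical.

```python
# Simplified model of dismax-style matching: a term matches a document if
# ANY of the qf fields contains it, and each term is checked on its own.
def term_matches(doc, term, qf):
    """True if at least one of the qf fields contains the term."""
    return any(term in doc.get(field, "").lower().split() for field in qf)


def all_terms_match(doc, terms, qf):
    """AND semantics: every term must match, but in any of the fields."""
    return all(term_matches(doc, term, qf) for term in terms)


qf = ["content", "url", "title"]
spread_out = {"url": "cat", "title": "dog toys", "content": "a red pencil"}
incomplete = {"url": "cat", "title": "dog toys"}

print(all_terms_match(spread_out, ["cat", "dog", "pencil"], qf))
print(all_terms_match(incomplete, ["cat", "dog", "pencil"], qf))
```

The first document matches even though each term sits in a different field, which is the behaviour the reply describes.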

Re: DIH to index the data - 250 millions - Need a best architecture

2013-07-29 Thread Santanu8939967892
Hi Gora, I wanted to highlight one point here. As my content volume is large what should be my index architecture single core, cloud... ? With Regards, Santanu On Mon, Jul 29, 2013 at 5:37 PM, Gora Mohanty wrote: > On 29 July 2013 17:30, Santanu8939967892 wrote: > > Hi, > >I have a h

Re: DIH to index the data - 250 millions - Need a best architecture

2013-07-29 Thread Jack Krupansky
The initial question is not how to index the data, but how you want to use or query the data. Use cases for query and data access should drive the data model that you will use to index the data. So, what are some sample queries? How will users want to search and access the data? What data will

Re: .lock file not created when making a backup snapshot

2013-07-29 Thread Artem Karpenko
Thanks Mark! 29.07.2013 12:32, Mark Triggs wrote: Hi Artem, I noticed this recently too. I created a JIRA issue here: https://issues.apache.org/jira/browse/SOLR-5040 Cheers, Mark Artem Karpenko writes: Hi, when making a backup snapshot using "/replication?command=backup" call, a sn

Re: solr query range upper exclusive

2013-07-29 Thread Jack Krupansky
Square brackets are inclusive and curly braces are exclusive for range queries. I tried a similar example with the standard Solr example and it works fine: curl "http://localhost:8983/solr/update?commit=true" \ -H 'Content-type:application/json' -d ' [{"id": "doc-1", "price_f": 249}]' cur
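The bracket rule stated at the start of this reply can be checked with a small Python model. It is a sketch of the semantics only, assuming numeric bounds; parse_range and in_range are hypothetical helper names, not Solr APIs.

```python
import re

# Matches expressions such as "[197 TO 249]" or "{197 TO 249}".
RANGE = re.compile(r"^([\[{])\s*(\S+)\s+TO\s+(\S+)\s*([\]}])$")


def parse_range(expr):
    """Return (low, high, include_low, include_high) for a range expression."""
    match = RANGE.match(expr.strip())
    if match is None:
        raise ValueError(f"not a range expression: {expr!r}")
    open_bracket, low, high, close_bracket = match.groups()
    # Square brackets mean inclusive, curly braces mean exclusive.
    return float(low), float(high), open_bracket == "[", close_bracket == "]"


def in_range(value, expr):
    """True if value falls inside the range described by expr."""
    low, high, include_low, include_high = parse_range(expr)
    above = value >= low if include_low else value > low
    below = value <= high if include_high else value < high
    return above and below


print(in_range(249, "[197 TO 249]"))
print(in_range(249, "{197 TO 249}"))
```

Under this rule a value of 249 is inside [197 TO 249], so a missing document at the upper bound points to something other than the bracket syntax.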

RE: swap and GC

2013-07-29 Thread Michael Ryan
This is interesting... How are you measuring the heap size? -Michael -Original Message- From: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de] Sent: Monday, July 29, 2013 5:34 AM To: solr-user@lucene.apache.org Subject: swap and GC Something interesting I have noticed today, after

Re: swap and GC

2013-07-29 Thread Bernd Fehling
On 29.07.2013 14:46, Michael Ryan wrote: > This is interesting... How are you measuring the heap size? This is displayed in jvisualvm and also logged with munin via JMX. Bernd > > -Michael > > -Original Message- > From: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de] > Sent:

Re: DIH to index the data - 250 millions - Need a best architecture

2013-07-29 Thread Santanu8939967892
Hi Jack, My sample query will be with a keyword (text) and probably 2 to 3 filters. There is a java interface for display of data, which will consume a class, and the class returns a data set object using SolrJ. So for display we will use a list for binding. we may display 20 or 30 meta data in

Improper shutdown of Solr in Jetty 9

2013-07-29 Thread Artem Karpenko
Hi, I can't make Solr shut down properly when using Jetty 9. Tested this with a simple plugin that only extends DirectUpdateHandler2, creates a file in constructor and deletes it in close(). While it's working fine in the example installation (the one that can be downloaded from Solr site) an

Re: DIH to index the data - 250 millions - Need a best architecture

2013-07-29 Thread Jack Krupansky
You neglected to provide information about the filters or the "20 or 30 meta data information". Did you mean to imply that you will not be querying against the metadata (only returning it)? -- Jack Krupansky -Original Message- From: Santanu8939967892 Sent: Monday, July 29, 2013 9:4

Re: solr query range upper exclusive

2013-07-29 Thread alin1918
what query parser should I use? http://wiki.apache.org/solr/SolrQuerySyntax "Differences From Lucene Query Parser Differences in the Solr Query Parser include Range queries [a TO z], prefix queries a*, and wildcard queries a*b are constant-scoring (all matching documents get an equal score)

restricting a query by a "set" of field values

2013-07-29 Thread Benjamin Ryan
Hi, Is it possible to construct a query in SOLR that is "restricted" to only those documents that have a field value in a particular set of values, similar to what would be done in Postgres with the SQL query: SELECT date_deposited FROM stats

The meaning of the of the doc= on the debugQuery output

2013-07-29 Thread Bruno René Santos
Hello One line on my debugQuery of a query is 2.1706323e-6 = score(doc=49578,freq=1.0 = termfreq=1.0), product of: I wanted to know what the doc= means. It seems to be something used on the fieldWeight but on the other hand it is the same for all fields on the document, regardless of the query m

Re: restricting a query by a "set" of field values

2013-07-29 Thread Jason Hellman
Ben, This could be constructed as so: fl=date_deposited&fq=date[2013-07-01T00:00:00Z TO 2013-07-31T23:59:00Z]&fq=collection_id(1 2 n)&q.op=OR The parentheses around the 1 2 n set indicate a boolean query, and we're ensuring they are an OR boolean by the q.op parameter. This should get you the

SolrCloud and Joins

2013-07-29 Thread David Larochelle
I'm setting up SolrCloud with around 600 million documents. The basic structure of each document is: stories_id: integer, media_id: integer, sentence: text_en We have a number of stories from different media and we treat each sentence as a separate document because we need to run sentence level a

Re: DIH to index the data - 250 millions - Need a best architecture

2013-07-29 Thread Shawn Heisey
On 7/29/2013 6:00 AM, Santanu8939967892 wrote: > Hi, >I have a huge volume of DB records, which is close to 250 millions. > I am going to use DIH to index the data into Solr. > I need a best architecture to index and query the data in an efficient > manner. > I am using windows server 2008 with

Re: The meaning of the of the doc= on the debugQuery output

2013-07-29 Thread fbrisbart
Hi, doc is the internal docId of the index. Each doc in the index has an internal id. It starts from 0 (1st doc inserted in the index), 1 for the 2nd, ... Franck Brisbart On Monday 29 July 2013 at 15:34 +0100, Bruno René Santos wrote: > Hello > > One line on my debugQuery of a query is

Solr 4.3.1 - query does not return documents, just numFounds, 2 shards, replication Factor 1

2013-07-29 Thread Nitin Agarwal
Hi, I am using Solr 4.3.1 with 2 Shards and replication factor of 1, running on apache tomcat 7.0.42 with external zookeeper 3.4.5. When I query "select?q=*:*" I only get the number of documents found, but no actual document. When I query with rows=0, I do get correct count of documents in the in

Re: RAM Usage Debugging

2013-07-29 Thread Shawn Heisey
On 7/29/2013 1:12 AM, Furkan KAMACI wrote: > When I look at my dashboard I see that 27.30 GB available for JVM, 24.77 > GB is gray and 16.50 GB is black. I don't do anything on my machine right > now. Did it cache documents or is there any problem, how can I learn it? This is simple information

Solr Out Of Memory with Field Collapsing

2013-07-29 Thread tushar_k47
Hi, We are using the field collapsing feature with multiple shards. We ran into Out of Memory errors on one of the shards. We use field collapsing on a particular field which has only one specific value on the shard that goes out of memory. Interestingly the Out of Memory error recurred multiple

Re: SolrCloud and Joins

2013-07-29 Thread Walter Underwood
Denormalize. Add media_set_id to each sentence document. Done. wunder On Jul 29, 2013, at 7:58 AM, David Larochelle wrote: > I'm setting up SolrCloud with around 600 million documents. The basic > structure of each document is: > > stories_id: integer, media_id: integer, sentence: text_en > >

solr - set fileds as default search field

2013-07-29 Thread Mysurf Mail
The following query works well for me http://[]:8983/solr/vault/select?q=VersionComments%3AWhite returns all the documents where version comments includes White I try to omit the field name and put it as a default value as follows : In solr config I write explicit 10 PackageName

Re: solr - set fileds as default search field

2013-07-29 Thread Ahmet Arslan
Hi, df is a single valued parameter. Only one field can be a default field. To query multiple fields use (e)dismax query parser :  http://wiki.apache.org/solr/ExtendedDisMax#qf_.28Query_Fields.29 From: Mysurf Mail To: solr-user@lucene.apache.org Sent: Monday

Re: Solr 4.3.1 - query does not return documents, just numFounds, 2 shards, replication Factor 1

2013-07-29 Thread Jason Hellman
Nitin, You need to ensure the fields you wish to see are marked stored="true" in your schema.xml file, and you should include fields in your fl= parameter (fl=*,score is a good place to start). Jason On Jul 29, 2013, at 8:08 AM, Nitin Agarwal <2nitinagar...@gmail.com> wrote: > Hi, I am using

Re: solr - set fileds as default search field

2013-07-29 Thread Jason Hellman
Or use the copyField technique to a single searchable field and set df= to that field. The example schema does this with the field called "text". On Jul 29, 2013, at 8:35 AM, Ahmet Arslan wrote: > Hi, > > > df is a single valued parameter. Only one field can be a default field. > > To query

Re: SolrCloud and Joins

2013-07-29 Thread David Larochelle
We'd like to be able to easily update the media set to source mapping. I'm concerned that if we store the media_sets_id in the sentence documents, it will be very difficult to add additional media set to source mapping. I imagine that adding a new media set would either require reimporting all 600

Re: Solr 4.3.1 - query does not return documents, just numFounds, 2 shards, replication Factor 1

2013-07-29 Thread Nitin Agarwal
Jason, all my fields are set with stored=true and indexed=true, and I used select?q=*:*&fl=*,score but still I get the same response * 0 138 *,score *:* * Here is what my schema looks like *

Re: restricting a query by a "set" of field values

2013-07-29 Thread Chris Hostetter
: fl=date_deposited&fq=date[2013-07-01T00:00:00Z TO 2013-07-31T23:59:00Z]&fq=collection_id(1 2 n)&q.op=OR typo -- the colon is missing... fq=collection_id:(1 2 n) if you don't want the q.op to apply globally to your request, you can also scope it only for that filter. likewise the "field_name
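With the colon in place, the set filter can be assembled and URL-encoded programmatically. The Python sketch below uses only the standard library; set_filter and build_query_string are hypothetical helper names, the values 1, 2, 3 stand in for the "1 2 n" placeholder, and q=*:* is an assumption since the thread does not show the main query.

```python
from urllib.parse import urlencode


def set_filter(field, values):
    """Build a filter such as collection_id:(1 2 3) from a list of values."""
    return f"{field}:({' '.join(str(value) for value in values)})"


def build_query_string(fl, filters, q="*:*", q_op="OR"):
    """URL-encode the request parameters; each filter becomes its own fq."""
    params = [("q", q), ("fl", fl), ("q.op", q_op)]
    params += [("fq", flt) for flt in filters]
    return urlencode(params)


fq = set_filter("collection_id", [1, 2, 3])
print(fq)
print(build_query_string("date_deposited", [fq]))
```

Note that q.op set this way applies to the whole request, which is the global effect the reply warns about.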

Re: Solr 4.3.1 - query does not return documents, just numFounds, 2 shards, replication Factor 1

2013-07-29 Thread Jack Krupansky
Check the "/select" request handler in solrconfig. See if it defaults "start" or "rows". start is the initial document number (e.g., 1), and rows is the number of rows to actually return in the response (nothing to do with numFound). The internal Solr default is rows=10, but you can set it to 20
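The relationship between numFound, start and rows can be modelled with a list slice. This is an illustrative Python sketch, not Solr code: the page function is hypothetical, start is treated as a zero-based offset, and the default rows=10 mirrors the internal default mentioned in the reply.

```python
def page(results, start=0, rows=10):
    """Return a response-like dict: total hit count plus one window of docs."""
    return {
        "numFound": len(results),  # total matches, independent of rows
        "start": start,            # zero-based offset of the first doc returned
        "docs": results[start:start + rows],
    }


hits = list(range(138))
print(page(hits, rows=0))                      # count only, no documents
print(page(hits, start=10, rows=5)["docs"])    # a window of five documents
```

A response with a non-zero numFound and an empty document list is therefore expected when rows is 0 or start is past the end, and needs another explanation otherwise.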

Re: processing documents in solr

2013-07-29 Thread Joe Zhang
I'll try reindexing the timestamp. The id-creation approach suggested by Erick sounds attractive, but the nutch/solr integration seems rather tight. I don't know where to break in to insert the id into solr. On Mon, Jul 29, 2013 at 4:11 AM, Erick Erickson wrote: > No SolrJ doesn't provide this autom

Re: SolrCloud and Joins

2013-07-29 Thread Walter Underwood
A join may seem clean, but it will be slow and (currently) doesn't work in a cluster. You find all the sentences in a media set by searching for that set id and requesting only the sentence_id (yes, you need that). Then you reindex them. With small documents like this, it is probably fairly fas

Re: SolrCloud shard down

2013-07-29 Thread Katie McCorkell
I am using Solr 4.3.1 . I did hard commit after indexing. I think you're right that the node was still recovering. I didn't think so since it didn't show up as yellow "recovering" on the visual display, but after quite a while it went from "Down" to "Active" . Thanks! On Fri, Jul 26, 2013 at 7:5

Re: Solr 4.3.1 - query does not return documents, just numFounds, 2 shards, replication Factor 1

2013-07-29 Thread Nitin Agarwal
Jack, I checked my solrconfig.xml for the "/select" RequestHandler; it had rows=10. I changed it to 20, uploaded the configs to zookeeper, restarted Tomcat, and it didn't work. Then I removed rows altogether from defaults, uploaded the configs to zookeeper and restarted Tomcat, but it still does not work. Here

Using a dictionary to boost queries

2013-07-29 Thread Delip Rao
I have a dictionary of domain specific terms and I want to be able to automatically boost occurrences of those terms in a query. These terms could either be single word or multi-word phrases, like "supreme court", "habeas corpus", etc. So if the query was 'habeas corpus germany' (without the quotes

Re: Solr 4.3.1 - query does not return documents, just numFounds, 2 shards, replication Factor 1

2013-07-29 Thread Chris Hostetter
: Here is what my schema looks like what is your uniqueKey field? I'm going to bet it's "tn_lookup_key_id" and i'm going to bet your "lowercase" fieldType has an interesting analyzer on it. you are probably hitting a situation where the analyzer you have on your uniqueKey field is munging the

Re: SolrCloud shard down

2013-07-29 Thread Mark Miller
On Jul 29, 2013, at 12:49 PM, Katie McCorkell wrote: > I didn't think so > since it didn't show up as yellow "recovering" on the visual display, but > after quite a while it went from "Down" to "Active" . Thanks! Thanks, I think we should improve this! We should publish a recovery state when r

Re: Solr 4.3.1 - query does not return documents, just numFounds, 2 shards, replication Factor 1

2013-07-29 Thread Nitin Agarwal
Hoss, you rock! That was the issue, I changed tn_lookup_key_id, which was my unique key field, to string and reloaded the index and it works. Jason, Jack and Hoss, thanks for your help. Nitin On Mon, Jul 29, 2013 at 12:22 PM, Chris Hostetter wrote: > > : Here is what my schema looks like > >

Re: Performance vs. maxBufferedAddsPerServer=10

2013-07-29 Thread Erick Erickson
Why wouldn't it? Or are you saying that the routing to replicas from the leader also 10/packet? Hmmm, hadn't thought of that... On Mon, Jul 29, 2013 at 7:58 AM, Mark Miller wrote: > SOLR-4816 won't address this - it will just speed up *different* parts. There > are other things that will need to

Re: Solr 4.3.1 - query does not return documents, just numFounds, 2 shards, replication Factor 1

2013-07-29 Thread Erick Erickson
Nitin: What was your tn_lookup_key_id field definition when things didn't work? The stock lowercase is KeywordTokenizerFactory+LowerCaseFilterFactory and if this leads to mis-matches as Hoss outlined, it'd surprise me so I need to file it away in my list of things not to do. Thanks, Erick On Mon

Re: DIH to index the data - 250 millions - Need a best architecture

2013-07-29 Thread Mikhail Khludnev
Mishra, What if you set up DIH with a single SQLEntityProcessor without caching, does it work for you? On Mon, Jul 29, 2013 at 4:00 PM, Santanu8939967892 wrote: > Hi, >I have a huge volume of DB records, which is close to 250 millions. > I am going to use DIH to index the data into Solr. > I

Pentaho Kettle vs DIH

2013-07-29 Thread Mikhail Khludnev
Hello, Does anyone have any experience with using Pentaho Kettle for processing RDBMS data and pouring it into Solr? Isn't it some sort of replacement for the DIH? -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics

Re: Solr 4.3.1 - query does not return documents, just numFounds, 2 shards, replication Factor 1

2013-07-29 Thread Nitin Agarwal
Erick, I had typed tn_lookup_key_id as lowercase and it was defined as Nitin On Mon, Jul 29, 2013 at 1:23 PM, Erick Erickson wrote: > Nitin: > > What was your tn_lookup_key_id field definition when things didn't work? > The stock lowercase is KeywordTokenizer

solr sizing

2013-07-29 Thread Torsten Albrecht
Hi all, we have - 70 million to 100 million documents and we want - 800 requests per second How many servers (Amazon EC2 or real hardware) do we need for this? Solr 4.x with SolrCloud, or better shards with a load balancer? Is anyone here who can give me some information, or who operates a similar

Re: Merged segment warmer Solr 4.4

2013-07-29 Thread Chris Hostetter
: I have a slow storage machine and non sufficient RAM for the whole index to : store all the index. This causes the first queries (~5000) to be very slow ... : Secondly I thought of initiating a new searcher event listener that queries : on docs that were inserted since the last hard commi

Performance question on Spatial Search

2013-07-29 Thread Steven Bower
I've been doing some performance analysis of a spatial search use case I'm implementing in Solr 4.3.0. Basically I'm seeing search times a lot higher than I'd like them to be and I'm hoping people may have some suggestions for how to optimize further. Here are the specs of what I'm doing now: Mach

Re: solr sizing

2013-07-29 Thread Shawn Heisey
On 7/29/2013 2:18 PM, Torsten Albrecht wrote: we have - 70 mio documents to 100 mio documents and we want - 800 requests per second How many servers Amazon EC2/real hardware we Need for this? Solr 4.x with solr cloud or better shards with loadbalancer? Is anyone here who can give me some i

SOLR replication question?

2013-07-29 Thread SolrLover
I am currently using SOLR 4.4. but not planning to use solrcloud in very near future. I have 3 master / 3 slave setup. Each master is linked to its corresponding slave.. I have disabled auto polling.. We do both push (using MQ) and pull indexing using SOLRJ indexing program. I have enabled soft

Re: Performance question on Spatial Search

2013-07-29 Thread Erick Erickson
This is very strange. I'd expect slow queries on the first few queries while these caches were warmed, but after that I'd expect things to be quite fast. For a 12G index and 256G RAM, you have on the surface a LOT of hardware to throw at this problem. You can _try_ giving the JVM, say, 18G but tha

Solr Cloud - How to balance Batch and Queue indexing?

2013-07-29 Thread SolrLover
I need some advice on the best way to implement Batch indexing with soft commit / Push indexing (via queue) with soft commit when using SolrCloud. *I am trying to figure out a way to: * 1. Make the push indexing available almost real time (using soft commit) without degrading the search / indexing

Streaming Updates Using HttpSolrServer.add(Iterator) In Solr 4.3

2013-07-29 Thread Paul, Terry
Hi all. We're in the midst of upgrading from Solr 1.4 to 4.3.1, and we've run into issues with memory on our client side during a mass index operation. We use the approach described on the SolrJ wiki at http://wiki.apache.org/solr/Solrj#Streaming_documents_for_an_update. In the Solr 1.4 days th

Re: Streaming Updates Using HttpSolrServer.add(Iterator) In Solr 4.3

2013-07-29 Thread SolrLover
I am indexing more than 300 million records, it takes less than 7 hours to index all the records.. Send the documents in batches and also use CUSS (ConcurrentUpdateSolrServer) for multi threading support. Ex: ConcurrentUpdateSolrServer server= new ConcurrentUpdateSolrServer(solrServer, queueSi
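The "send the documents in batches" advice can be sketched independently of any client library. The Python generator below only shows the chunking step; batches is a hypothetical helper, and sending each chunk to Solr is left out because it depends on the client in use.

```python
from itertools import islice


def batches(docs, size):
    """Yield lists of at most `size` documents from any iterable, lazily."""
    iterator = iter(docs)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Documents are consumed lazily, so the full set never has to sit in memory.
for chunk in batches(({"id": n} for n in range(5)), 2):
    print(len(chunk), chunk)
```

Because the input is consumed through an iterator, memory use on the client is bounded by the batch size rather than by the total number of records.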

Re: SOLR replication question?

2013-07-29 Thread Shawn Heisey
> I am currently using SOLR 4.4. but not planning to use solrcloud in very near > future. > I have 3 master / 3 slave setup. Each master is linked to its > corresponding > slave.. I have disabled auto polling.. > We do both push (using MQ) and pull indexing using SOLRJ indexing program. > I have en

Re: Streaming Updates Using HttpSolrServer.add(Iterator) In Solr 4.3

2013-07-29 Thread Shawn Heisey
> I am indexing more than 300 million records, it takes less than 7 hours to > index all the records.. > > Send the documents in batches and also use CUSS > (ConcurrentUpdateSolrServer) > for multi threading support. > > Ex: > > ConcurrentUpdateSolrServer server= new > ConcurrentUpdateSolrServer(s

Re: Performance question on Spatial Search

2013-07-29 Thread Bill Bell
Can you compare with the old geo handler as a baseline? Bill Bell Sent from mobile On Jul 29, 2013, at 4:25 PM, Erick Erickson wrote: > This is very strange. I'd expect slow queries on > the first few queries while these caches were > warmed, but after that I'd expect things to > be quite fa

Re: Performance question on Spatial Search

2013-07-29 Thread Steven Bower
@Erick it is a lot of hardware, but basically I'm trying to create a "best case scenario" to take hardware out of the question. Will try increasing heap size tomorrow.. I haven't seen it get close to the max heap size yet.. but it's worth trying... Note that these queries look something like: q=*:* fq=[date range

Re: Performance vs. maxBufferedAddsPerServer=10

2013-07-29 Thread Mark Miller
Yes, the internal document forwarding path is different and does not use the CloudSolrServer. It currently works with a buffer of 10. - Mark On Jul 29, 2013, at 3:10 PM, Erick Erickson wrote: > Why wouldn't it? Or are you saying that the routing to replicas > from the leader also 10/packet? Hm

Re: DIH to index the data - 250 millions - Need a best architecture

2013-07-29 Thread Santanu8939967892
Hi Shawn, Yes, your assumption is correct. The index size is around 250 GB and we index 20/30 meta data and store around 50. We plan a SolrCloud architecture with two nodes, one master and the other a replica of the master (replication factor 2), with multiple zookeeper ensembl

Re: DIH to index the data - 250 millions - Need a best architecture

2013-07-29 Thread Santanu8939967892
Hi, In addition to my last mail one further query. Can we automate the deployment process for multinode environment (N.. nodes)? With Regards, Santanu On Tue, Jul 30, 2013 at 11:53 AM, Santanu8939967892 < mishra.sant...@gmail.com> wrote: > Hi Shawn, > Yes, your assumption is correct. Th

Re: DIH to index the data - 250 millions - Need a best architecture

2013-07-29 Thread Shawn Heisey
On 7/30/2013 12:23 AM, Santanu8939967892 wrote: > Yes, your assumption is correct. The index size is around 250 GB and > we index 20/30 meta data and store around 50. > We have plan for a Solr cloud architecture having two nodes one Master > and other one is replica of the master (replica