Re: 1 main collection or multiple smaller collections?

2017-04-26 Thread Derek Poh
There are some common fields between them. At the source data end (database), the supplier info and product info are updated separately. In this regard, should I separate them? If it's in 1 single collection, when there are updates to only the supplier info, the product info will be indexed again ev

Re: 1 main collection or multiple smaller collections?

2017-04-26 Thread Walter Underwood
Also, 300,000 documents is fairly small for Solr. We handle a million queries per day with a few servers on a collection that size. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Apr 26, 2017, at 10:33 PM, Walter Underwood wrote: > > Do they have

Re: 1 main collection or multiple smaller collections?

2017-04-26 Thread Walter Underwood
Do they have the same fields or different fields? Are they updated separately or together? If they have the same fields and are updated together, I’d put them in the same collection. Otherwise, probably separate. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my

1 main collection or multiple smaller collections?

2017-04-26 Thread Derek Poh
Hi, I am planning a migration of a legacy search engine to Solr. Basically the data can be categorised into suppliers info, suppliers products info and products category info. These sets of data are related to each other. Suppliers products data, which is the largest, has around 300,000 records

Re: Atomic Updates

2017-04-26 Thread Dorian Hoxha
@Chris, According to doc-link-above, only INC,SET are in-place-updates. And only when they're not indexed/stored, while your 'integer-field' is. So still shenanigans in there somewhere (docs,your-code,your-test,solr-code). On Thu, Apr 27, 2017 at 2:04 AM, Chris Ulicny wrote: > That's probably it

Indexing and Querying chinese at SOLR 6.4.2

2017-04-26 Thread Felix Stanley
Hi, I have been facing some issues in indexing and querying a Chinese character field using the CJK analyzer. Here is what I've done: I defined a new field and field type at my schema.xml : and then I indexed the following documents : P_ProductId P_SupplierId test_chinese P_CategoryN

Re: SolrServerException: Invalid use of BasicClientConnManager: connection still allocated.

2017-04-26 Thread sputul
Thank you for the big help! -- Putul On Wed, Apr 26, 2017 at 9:22 AM, Shawn Heisey-2 [via Lucene] < ml+s472066n4331977...@n3.nabble.com> wrote: > On 4/25/2017 1:40 PM, Putul S wrote: > > I am using single instance CloudSolrClient using my HttpClient. > > Problem with using this httpClient is that

Re: Atomic Updates

2017-04-26 Thread Chris Ulicny
That's probably it then. None of the atomic updates that I've tried have been on TextFields. I'll give the TextField atomic update to verify that it will clear the other field. Has this functionality been consistent since atomic updates were introduced, or is this a side effect of some other chang

Re: Atomic Updates

2017-04-26 Thread Ishan Chattopadhyaya
> Hmm, interesting. I can imagine that as long as you're updating > docValues fields, the other_text field would be there. But the instant > you updated a non-docValues field (text_field in your example) the > other_text field would disappear I can confirm this. When in-place updates to DV fields

Re: Shard CPU usage?

2017-04-26 Thread Erick Erickson
Sharding should, in general, _not_ be used as long as the response time for individual queries is acceptable. It imposes a certain amount of overhead. The typical process is two-pass. pass1: get the candidate top N docs from a replica on each shard. pass2: have each shard return its portion of the
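The two-pass flow described above can be illustrated with a toy model. This is only a sketch, not Solr's actual code: shards are plain dictionaries, the names `pass_one` and `pass_two` are made up, and because the description of pass 2 is cut off in the snippet, the sketch assumes pass 2 fetches the stored fields for the winning documents only.

```python
import heapq


def pass_one(shards, n):
    """Pass 1: every shard reports its own top-n (score, id) candidates,
    and the coordinator merges them into a global top-n."""
    candidates = []
    for shard_name, docs in shards.items():
        top = heapq.nlargest(n, docs.items(), key=lambda kv: kv[1]["score"])
        candidates.extend((doc["score"], doc_id, shard_name) for doc_id, doc in top)
    return heapq.nlargest(n, candidates)


def pass_two(shards, winners):
    """Pass 2 (assumed): ask only the owning shard for each winner's fields."""
    return [
        {"id": doc_id, "score": score, **shards[shard_name][doc_id]["fields"]}
        for score, doc_id, shard_name in winners
    ]


# Hypothetical two-shard index with two documents per shard.
shards = {
    "shard1": {
        "a": {"score": 0.9, "fields": {"title": "A"}},
        "b": {"score": 0.2, "fields": {"title": "B"}},
    },
    "shard2": {
        "c": {"score": 0.7, "fields": {"title": "C"}},
        "d": {"score": 0.4, "fields": {"title": "D"}},
    },
}

results = pass_two(shards, pass_one(shards, 2))
```

Even in this toy version every query touches every shard at least once, which is the overhead being referred to.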

Shard CPU usage?

2017-04-26 Thread Jakov Sosic
Hi guys, I was wondering does the introduction of shards actually increase CPU usage? I have a 30GB index split into two shards (15GB each), and by analyzing the logs, I figured out that ~80% of the queries have the "&shard.url=http://10.3.4.12:8080/solr/mycore/|http://10.3.4.14:8080/solr/myco

Re: Atomic Updates

2017-04-26 Thread Erick Erickson
Hmm, interesting. I can imagine that as long as you're updating docValues fields, the other_text field would be there. But the instant you updated a non-docValues field (text_field in your example) the other_text field would disappear. I DO NOT KNOW this for a fact, but I'm asking people who do.

Re: Atomic Updates

2017-04-26 Thread Dorian Hoxha
There are In Place Updates, but according to docs they still shouldn't work in your case: https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents On Wed, Apr 26, 2017 at 10:36 PM, Chris Ulicny wrote: > That's the thing I'm curious about though. As I mentioned in the first > p

Re: Atomic Updates

2017-04-26 Thread Chris Ulicny
That's the thing I'm curious about though. As I mentioned in the first post, I've already tried a few tests, and the value seems to still be present after an atomic update. I haven't exhausted all possible atomic updates, but 'set' and 'add' seem to preserve the non-stored text field. Thanks, Chr

Re: Atomic Updates

2017-04-26 Thread Dorian Hoxha
You'll lose the data in that field. Try doing a commit and it should happen. On Wed, Apr 26, 2017 at 9:50 PM, Chris Ulicny wrote: > Thanks Shawn, I didn't realize docValues were enabled by default now. > That's very convenient and probably makes a lot of the schemas we've been > making excessive

Re: Indexing I/O errors and CorruptIndex messages

2017-04-26 Thread Erick Erickson
Disk space issue? Lucene requires at least as much free disk space as your index size. Note that the disk full issue will be transient, IOW if you look now and have free space it still may have been all used up but had some space reclaimed. Best, Erick On Wed, Apr 26, 2017 at 12:02 PM, simon wro
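The rule of thumb above (free disk space at least equal to the index size) can be checked with a short script. A minimal sketch, assuming the index lives in one local directory; the function names are made up, and since the shortage can be transient, a single check after the fact may not catch it.

```python
import os
import shutil


def index_size_bytes(index_dir):
    """Sum the sizes of all files under the index directory."""
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # Files can vanish between listing and stat, e.g. during a merge.
                pass
    return total


def has_merge_headroom(index_dir):
    """True if free space on the index's filesystem is at least the index size."""
    free = shutil.disk_usage(index_dir).free
    return free >= index_size_bytes(index_dir)
```

Running `has_merge_headroom` periodically while indexing is in progress is more likely to catch a transient shortage than a one-off check.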

Re: Atomic Updates

2017-04-26 Thread Chris Ulicny
Thanks Shawn, I didn't realize docValues were enabled by default now. That's very convenient and probably makes a lot of the schemas we've been making excessively verbose. This is on 6.3.0. Do you know what the first version was that they added the docValues by default for non-Text field? However

Re: Atomic Updates

2017-04-26 Thread Shawn Heisey
On 4/25/2017 1:40 PM, Chris Ulicny wrote: > Hello all, > > Suppose I have the following fields in a document and populate all 4 fields > for every document. > > id: uniqueKey, indexed and stored > integer_field: indexed and stored > text_field: indexed and stored > othertext_field: indexed but not
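For readers unfamiliar with the request under discussion: an atomic update sends only the document id plus a modifier for the field being changed. A minimal sketch that builds such a payload with the field names from the example above; the document id `doc1` and the helper name `atomic_update` are made up, only the modifiers mentioned in this thread are accepted, and the payload would still need to be posted to the collection's update handler and committed.

```python
import json

# Modifiers mentioned in this thread; not a complete list.
SUPPORTED_OPS = {"set", "add", "inc"}


def atomic_update(doc_id, field, value, op="set"):
    """Build one atomic-update document: the id plus {field: {op: value}}."""
    if op not in SUPPORTED_OPS:
        raise ValueError(f"unsupported modifier: {op}")
    return {"id": doc_id, field: {op: value}}


# Update text_field only; othertext_field is deliberately not sent.
payload = json.dumps([atomic_update("doc1", "text_field", "new text")])
```

Because `othertext_field` is not part of the request, whether its indexed value survives depends on how Solr rebuilds the document, which is the question the thread is trying to settle.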

Indexing I/O errors and CorruptIndex messages

2017-04-26 Thread simon
reposting this as the problem described is happening again and there were no responses to the original email. Anyone ? I'm seeing an odd error during indexing for which I can't find any reason. The relevant solr log entry: 2017-03-24 19:09:35.363 ERROR (commitSchedule

After upgrade to Solr 6.5, q.op=AND affects filter query differently than in older version

2017-04-26 Thread Andy C
I'm looking at upgrading the version of Solr used with our application from 5.3 to 6.5. Having an issue with a change in the behavior of one of the filter queries we generate. The field "ctindex" is only present in a subset of documents. It basically contains a user id. For those documents where

Re: DateRangeField and Faceting

2017-04-26 Thread David Smiley
Hi Stephen, I agree that it would be nice if the JSON faceting module worked with DateRangeField. Sadly Solr has several faceting engines (classic, JSON Facets, analytics contrib) and there has yet to be any effort to corral them. My sense is that JSON Faceting is where effort should go, and as yo

Could not find collection , Error while ingesting to Solr using Flume and Morphlines

2017-04-26 Thread Anantharaman, Srinatha (Contractor)
Hi, Though I see Zookeeper is uploaded with the collection, I get below error while Ingesting data to Solr using Flume and Morphline. Kindly let me know if you need more details 017-04-26 18:25:31,767 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.kitesdk.morphline.base.AbstractCo

counting_number_of_term_in_a_doc

2017-04-26 Thread Saman Rasheed
Hi, I've been trying to figure out how to return the (number) of matching words in a regex term lookup with no luck. Basically i have a large text document indexed, next when i do a regex term lookup like the following: http://localhost:8983/solr/core1/terms?terms.fl=content&terms.regex=.*te

Last chance: ApacheCon is just three weeks away

2017-04-26 Thread Rich Bowen
ApacheCon is just three weeks away, in Miami, Florida, May 15th - 18th. http://apachecon.com/ There's still time to register and attend. ApacheCon is the best place to find out about tomorrow's software, today. ApacheCon is the official convention of The Apache Software Foundation, and includes t

Re: Help with facet.limit

2017-04-26 Thread Erick Erickson
The only two canned orderings are "index" which means lexically ordered and the default frequency, the top 500 most frequent facets will be returned. You can always specify facet.query=XXX and I think they are returned in the order you define the facets. If you have a small number of facets you re
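The options mentioned above can be combined in a single request. A minimal sketch that only builds the query string, assuming a hypothetical field `category` and hypothetical values of interest; `facet.limit`, `facet.sort` and `facet.query` are standard Solr parameters, while the helper name is made up.

```python
from urllib.parse import urlencode


def facet_params(field, limit=500, sort="count", extra_queries=()):
    """Build a query string asking for the top `limit` facet values on
    `field`, plus one facet.query count per explicitly requested value."""
    params = [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", str(limit)),
        ("facet.sort", sort),  # "count" (frequency) or "index" (lexical)
    ]
    params.extend(("facet.query", q) for q in extra_queries)
    return urlencode(params)


# Top 500 by frequency, plus counts for two values that must always appear.
query_string = facet_params(
    "category", extra_queries=["category:shoes", "category:hats"]
)
```

The `facet.query` counts come back in their own section of the response, so the client has to merge them with the `facet.field` values itself.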

Re: Poll: Master-Slave or SolrCloud?

2017-04-26 Thread Erick Erickson
Steve: You might be interested in: https://issues.apache.org/jira/browse/SOLR-10233, please comment on whether that JIRA is along the lines you're thinking. Best, Erick On Wed, Apr 26, 2017 at 6:35 AM, Stephen Weiss wrote: > We run both, and we are running the latest versions for both. There a

Re: Managed Schema multiValued Predict Problem

2017-04-26 Thread Rick Leir
Lova, When a search term is "foo*" or similar, you have a multivalue search. In schema.xml you have for a typical field, an index analysis chain and a query analysis chain. In the multivalue case, neither of these chains is followed. There is a wiki page which explains what chain gets followed,

Re: Poll: Master-Slave or SolrCloud?

2017-04-26 Thread Stephen Weiss
We run both, and we are running the latest versions for both. There are different use cases for each one. Where we are using solrcloud, it only has to operate in one datacenter, and sharding is incredibly important because we have billions and billions of documents. In a separate group of ser

Re: SolrServerException: Invalid use of BasicClientConnManager: connection still allocated.

2017-04-26 Thread Shawn Heisey
On 4/25/2017 1:40 PM, Putul S wrote: > I am using single instance CloudSolrClient using my HttpClient. > Problem with using this httpClient is that, whenever I add more than > one document, LBHttpSolrClient complains about connection not > released. Everything works fine if I do not use my own Http

Securing solr web Client

2017-04-26 Thread bay chae
I have secured solr using basic authentication so that php client and curl requests require the password. Using solr cloud as I gave up trying to setup on standalone. However this does not secure the solr web client!!! Where is the documentation to secure solr web client? Any direction gratefu

Help with facet.limit

2017-04-26 Thread kshitij tyagi
Hi Team, I am using facet on particular field along with facet.limit=500, problem I am facing is: 1. As there are more than 500 facets and it is giving me 500 results, I want particular facets to be returned i.e can I specify to solr to return me 500 facets along with ones I require? eg facets

recommended zookeeper version for solr cloud

2017-04-26 Thread David Michael Gang
Hi all, Which version of external ZooKeeper is recommended for use in production environments? 3.4.6, which is the version shipped with Solr, or 3.4.10, which is the latest stable? Thanks, David

Re: Managed Schema multiValued Predict Problem

2017-04-26 Thread Lova
Hello, I have this error org.apache.solr.common.SolrException: can not use FieldCache on multivalued field: post_title I can need specific field as multivalue, it's a bug in my app what I change in solrconfig.xml please? Thanks -- View this message in context: http://lucene.472066.n3.nabble.