Re: SOLR Sizing

2016-10-03 Thread Susheel Kumar
In short, if you want a closer estimate, run an actual ingestion of, say, 1-5% of your total docs and extrapolate, since every search product may have a different schema, a different set of fields, different indexed vs. stored fields, copy fields, a different analysis chain, etc. If you want t
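The extrapolation described above can be sketched as a simple linear scaling. This is only a rough illustration: the function name and all the numbers are hypothetical, and real index growth is not guaranteed to be linear (term dictionaries, merges, and docValues can all bend the curve).

```python
def extrapolate_index_size(sample_docs, sample_index_bytes, total_docs):
    """Linearly scale the index size measured on a sample up to the full corpus."""
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    bytes_per_doc = sample_index_bytes / sample_docs
    return bytes_per_doc * total_docs


GIB = 1024 ** 3

# Hypothetical measurement: a 2% sample (2M of 100M docs) produced a 3 GiB index.
estimate = extrapolate_index_size(2_000_000, 3 * GIB, 100_000_000)
print(f"estimated full index size: {estimate / GIB:.1f} GiB")
```

Running the same calculation for a 1% and a 5% sample and comparing the results gives a feel for how far the linear assumption can be trusted.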

Re: Average of Averages in Solr

2016-10-06 Thread Susheel Kumar
Please look into streaming expressions. I think that is what you are looking for. https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions Thanks, Susheel On Thu, Oct 6, 2016 at 11:56 AM, John Bickerstaff wrote: > This may help? Note the "Bloomberg Analytics" at the bottom of

Re: Advice on implementing SOLR Cloud

2016-10-18 Thread Susheel Kumar
In case you need exact commands etc., you can follow this http://blog.thedigitalgroup.com/susheelk/2015/08/03/solrcloud-2-nodes-solr-1-node-zk-setup/ Thanks, Susheel On Mon, Oct 17, 2016 at 7:17 PM, John Bickerstaff wrote: > I had quite a lot of "fun" figuring out how to install Solr Cloud.

Re: Can we query across collections in SOLR?

2016-10-21 Thread Susheel Kumar
You may want to check out the options below as well https://cwiki.apache.org/confluence/display/solr/Advanced+Distributed+Request+Options https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions#StreamingExpressions-innerJoin On Fri, Oct 21, 2016 at 7:49 AM, Adi wrote: > Hi, >

Re: SUM Function performance

2016-10-23 Thread Susheel Kumar
Hi Ganesh, In general it shouldn't be an issue if you execute sum queries every other hour but you may want to share your cluster configuration (solr version, solr cloud?, # machines, machine configuration, index size) and load(indexing & query load) and perform some tests. Also FYI, there is str

OOM Error

2016-10-24 Thread Susheel Kumar
Hello, I am seeing the OOM script kill Solr (Solr 6.0.0) on a couple of our VMs today. So far our Solr cluster has been running fine, but suddenly today the Solr instances on many of the VMs got killed. I had 8G of heap allocated on 64 GB machines, with 20+ GB of index on each shard. What could be look

Re: OOM Error

2016-10-24 Thread Susheel Kumar
ath/to/the/dump" Is that something we should add to the Solr launch scripts to have it included or may be at least in disabled (comment) mode? Thanks, Susheel On Mon, Oct 24, 2016 at 8:20 PM, Shawn Heisey wrote: > On 10/24/2016 4:27 PM, Susheel Kumar wrote: > > I am seeing OOM s

Re: OOM Error

2016-10-25 Thread Susheel Kumar
: > On Mon, 2016-10-24 at 18:27 -0400, Susheel Kumar wrote: > > I am seeing OOM script killed solr (solr 6.0.0) on couple of our VM's > > today. So far our solr cluster has been running fine but suddenly > > today many of the VM's Solr instance got killed. > > A

Re: OOM Error

2016-10-25 Thread Susheel Kumar
ld look into using docValues. docValues are stored off heap and > hence you would be better off than just bumping up the heap. > > Don't enable docValues on existing fields unless you plan to reindex data > from scratch. > > On Oct 25, 2016 3:04 PM, "Susheel Kumar" wrot

Re: OOM Error

2016-10-26 Thread Susheel Kumar
> On Tue, 2016-10-25 at 15:04 -0400, Susheel Kumar wrote: > > Thanks, Toke. Analyzing GC logs helped to determine that it was a > > sudden > > death. > > > The peaks in last 20 mins... See http://tinypic.com/r/n2zonb/9 > > Peaks yes, but there is a pattern of >

Re: Apache Solr Question

2016-11-03 Thread Susheel Kumar
For media like images etc., there is the LIRE Solr plugin, which can be utilised. I have used it in the past and it may meet your requirement. See http://www.lire-project.net/ Thanks, Susheel On Thu, Nov 3, 2016 at 9:57 AM, Shawn Heisey wrote: > On 11/3/2016 2:49 AM, Chien Nguyen wrote: > > Hi everyone! I'

Re: OOM Error

2016-11-08 Thread Susheel Kumar
ge on large indexes" which can cause this issue. Also we have frequent updates and wondering if not optimizing the index can result into this situation Any thoughts ? GC Viewer https://www.dropbox.com/s/bb29ub5q2naljdl/gc_log_snapshot.png?dl=0 On Wed, Oct 26, 2016 at 10:47 AM, Sus

Re: OOM Error

2016-11-09 Thread Susheel Kumar
mit small batches (like 500 docs max) more frequently, than submitting bigger batches less frequently. So far bigger batch size has not caused any issues except these two incidents. Thanks, Susheel On Wed, Nov 9, 2016 at 10:19 AM, Shawn Heisey wrote: > On 11/8/2016 12:49 PM, Susheel Kumar wr

How to stop long running/memory eating query

2016-11-17 Thread Susheel Kumar
Hello, We found a query which was running forever and thus causing OOM ( q=+AND++AND+Tom+AND+Jerry...). Is there any way, similar to the SQL/NoSQL world, to watch currently executing queries and kill them? This would be a desirable feature in these situations, to avoid the whole cluster going

Re: How to stop long running/memory eating query

2016-11-17 Thread Susheel Kumar
o:m...@apache.org] > > Sent: Thursday, November 17, 2016 6:55 AM > > To: solr-user > > Subject: Re: How to stop long running/memory eating query > > > > There is a circuit breaker > > https://cwiki.apache.org/confluence/display/solr/ > Common+Query+Paramet

Re: Function queries with Json facet

2017-06-05 Thread Susheel Kumar
You are looking for something like below. Please adjust the start and end. You can also give a fixed date instead of NOW etc.. curl http://localhost:8983/solr/techproducts/query -d 'q=*:*& json.facet={ byyeaar:{ type:range, field:"manufacturedate_dt", start : NOW-15YEAR/YEAR,

Re: Function queries with Json facet

2017-06-05 Thread Susheel Kumar
Also change manufacturedate_dt to dob field and price to grade... On Mon, Jun 5, 2017 at 10:33 AM, Susheel Kumar wrote: > You are looking for something like below. Please adjust the start and > end. You can also give a fixed date instead of NOW etc.. > > curl http://localho

Re: composite hash

2017-06-05 Thread Susheel Kumar
It should be _route_=myshard/3! On Mon, Jun 5, 2017 at 12:54 PM, Shawn Feldman wrote: > I am indexing with a composite hash of "myshard/3!myid" > > If i want to query with the _route_ param, what does my route look like > > _route_=myshard/3! > or > _route_=myshard! > ? > > shawn >
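As a small illustration of the answer above, the route key can be passed as the _route_ request parameter on a select request. The sketch below only builds the URL with the Python standard library; the host, the collection name "collection1", and the helper name are assumptions made for the example, not details from the thread.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def routed_select_url(base_url, collection, query, route_key):
    """Build a /select URL that carries the composite-id route key in _route_."""
    params = {"q": query, "_route_": route_key}
    # urlencode percent-encodes characters such as "/" and "!" in the route key.
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical host and collection; the route key is the one from the thread.
url = routed_select_url("http://localhost:8983/solr", "collection1", "*:*", "myshard/3!")
print(url)

# Decoding the query string gives back the original route key.
print(parse_qs(urlsplit(url).query)["_route_"])
```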

Re: composite hash

2017-06-05 Thread Susheel Kumar
ill i need to reindex? > > On Mon, Jun 5, 2017 at 11:50 AM Susheel Kumar > wrote: > > > Its should be _route_=myshard/3! > > > > On Mon, Jun 5, 2017 at 12:54 PM, Shawn Feldman > > wrote: > > > > > I am indexing with a composite hash of "mysh

Re: Slow inserting with SolrCloud when increasing replicas

2017-06-06 Thread Susheel Kumar
Which version of Solr are you using? See https://lucidworks.com/2015/06/10/indexing-performance-solr-5-2-now-twice-fast/ https://issues.apache.org/jira/browse/SOLR-7333 Also, I would suggest indexing using SolrJ with parallelism (multiple threads and/or machines) to increase indexing throughput furt

Re: Slow inserting with SolrCloud when increasing replicas

2017-06-06 Thread Susheel Kumar
the problem seems to be related to replication. It looks like the writes > need to get to all the replicas before the indexing can continue with the > next batch > > Isart > > > > On Tue, Jun 6, 2017 at 2:31 PM, Susheel Kumar > wrote: > > > Which version of Sol

Re: Adding a Basic Authentication user fails with 404

2017-06-06 Thread Susheel Kumar
Please check whether this is the 6.5 issue that is fixed in 6.6: http://issues.apache.org/jira/browse/SOLR-10718 On Tue, Jun 6, 2017 at 8:05 PM, David Parker wrote: > Hello, > > I am running a stand-alone instance of Solr 6.5 (without ZooKeeper). I am > attempting to implement Basic Authentication per

Re: Slow inserting with SolrCloud when increasing replicas

2017-06-07 Thread Susheel Kumar
Is 50K the batch size you are using to ingest into Solr? If that's the case, it may be too high; you may want to start with a batch size of 100-1000, depending on your document size, and gradually increase it until performance starts degrading. On Wed, Jun 7, 2017 at 5:51 AM, Isart Montane
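The batching advice above can be sketched with a small generator that slices any iterable of documents into fixed-size batches. The default of 500 is just a value inside the suggested 100-1000 range, and the sending step is left out because it depends on the client in use.

```python
from itertools import islice


def batches(docs, batch_size=500):
    """Yield successive lists of at most batch_size documents from any iterable."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(docs)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


# Hypothetical corpus of 1200 tiny documents, split into batches of 500.
docs = ({"id": str(i)} for i in range(1200))
sizes = [len(batch) for batch in batches(docs, 500)]
print(sizes)
```

In a real indexer each batch would be sent as one update request, and the batch size tuned upward while indexing throughput keeps improving.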

Re: Face the issues to setup solr with zookeeper

2017-06-07 Thread Susheel Kumar
Also use solr-user mailing list for general issues / queries / questions and please subscribe and repost this to solr-user@lucene.apache.org Refer http://lucene.apache.org/solr/community.html Thanks, Susheel On Wed, Jun 7, 2017 at 9:29 AM, Susheel Kumar wrote: > Please provide more detail

Re: Sharding vs single index vs separate collection

2017-06-08 Thread Susheel Kumar
You mentioned that most of the searches will use document routing with year as the route key, correct? And then you mention a huge number of searches without routing. Can you give some numbers on how many will use routing vs. not? In general, we should try to serve all the queries with one

Re: Sharding vs single index vs separate collection

2017-06-08 Thread Susheel Kumar
correction: shared => sharded On Thu, Jun 8, 2017 at 10:10 PM, Susheel Kumar wrote: > You mentioned most of the searches will use document routing based on year > as route key, correct? and then you mentioning huge amount of searches > again without routing. Can you give some

Re: including a minus sign "-" in the token

2017-06-09 Thread Susheel Kumar
Hi Phil, The WordDelimiterFilterFactory ( https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory) can be used along with WhitespaceTokenizerFactory to avoid splitting at hyphens etc. Use generateWordParts="0"... Thnx On Thu, Jun 8, 2017 at 10:39 PM, Phil Sca

Re: Use of blanks in context filter field with AnalyzingInfixLookupFactory

2017-06-12 Thread Susheel Kumar
Change below type to string and try... text_en text_en Thanks, Susheel On Mon, Jun 12, 2017 at 1:28 PM, Alfonso Muñoz-Pomer Fuentes < amu...@ebi.ac.uk> wrote: > Hi all, > > I was wondering if anybody has experience setting up a suggester with > filtering using a context field that has blan

Re: Multi tenant setup

2017-06-13 Thread Susheel Kumar
Going with a single cluster having multiple collections (one for each client) is what I would try. How many clients do you have? If 10K, that means 10K collections, and then you will need to work out how many documents, their size, etc. to nail down the number of machines and their memory/CPU requirements. Going with si

Re: Multi tenant setup

2017-06-14 Thread Susheel Kumar
I'd suggest raising a JIRA and linking it to https://issues.apache.org/jira/browse/SOLR-7759, but before that see if updating the statsCache settings in solrconfig as described here works: https://issues.apache.org/jira/browse/SOLR-1632 Thanks, Susheel On Tue, Jun 13, 2017 at 5:16 PM, Zisis T. wrot

install solr service possible bug

2017-06-14 Thread Susheel Kumar
Hi, Can anyone confirm whether the "service --version" command works? When installing on a SUSE distribution, "service --version" always fails for me and aborts the Solr installation, printing the error "Script requires the 'service' command". To make it work, I had to change "service --versio

Parallel SQL - column not found intermittent error

2017-06-14 Thread Susheel Kumar
I have set up Solr 6.6.0 locally (local ZK and Solr) and then on servers (3 ZK and 2 machines, 2 shards), and in both environments I see this intermittent error "column not found". The same query sometimes works and other times fails. Is that a bug or am I missing something... Console === -> solr-6.6

Re: Parallel SQL - column not found intermittent error

2017-06-14 Thread Susheel Kumar
s lastName FROM collection1 ORDEr BY dv_sv_userLastName LIMIT 15' against JDBC connection 'jdbc:calcitesolr:'.\nError while executing SQL \"SELECT sr_sv_userFirstName as firstName, sr_sv_userLastName as lastName FROM collection1 ORDEr BY dv_sv_userLastName LIMIT 15\": From lin

Re: Parallel SQL - column not found intermittent error

2017-06-14 Thread Susheel Kumar
Also i tried with both docValues and non docValue fields/column. On Wed, Jun 14, 2017 at 11:42 AM, Susheel Kumar wrote: > Yes, Joel. Kind of every other command runs into this issue. I just > executed below queries and 3 of them failed while 1 succeeded. I just > have 6 documents

Re: Out of Memory Errors

2017-06-14 Thread Susheel Kumar
You may have gc logs saved when OOM happened. Can you draw it in GC Viewer or so and share. Thnx On Wed, Jun 14, 2017 at 11:26 AM, Satya Marivada wrote: > Hi, > > I am getting Out of Memory Errors after a while on solr-6.3.0. > The -XX:OnOutOfMemoryError=/sanfs/mnt/vol01/solr/solr-6.3.0/bin/oom

Re: Parallel SQL - column not found intermittent error

2017-06-14 Thread Susheel Kumar
t have any documents, and when the query > happens to hit such a shard, it does not find the fields it's looking for > and turns this into "column not found". If you resubmit the query and hit > a different shards (with docs), the query will succeed. > > On 6/14/2017 11

Re: Out of Memory Errors

2017-06-14 Thread Susheel Kumar
The attachment will not come thru. Can you upload thru dropbox / other sharing sites etc. On Wed, Jun 14, 2017 at 12:41 PM, Satya Marivada wrote: > Susheel, Please see attached. There heap towards the end of graph has > spiked > > > > On Wed, Jun 14, 2017 at 11:46 AM Sush

Re: Parallel SQL - column not found intermittent error

2017-06-14 Thread Susheel Kumar
Created JIRA https://issues.apache.org/jira/browse/SOLR-10890 Thank you. On Wed, Jun 14, 2017 at 1:59 PM, Joel Bernstein wrote: > Let's create a jira for this. > > Joel Bernstein > http://joelsolr.blogspot.com/ > > On Wed, Jun 14, 2017 at 12:26 PM, Susheel Kumar >

Re: Can't upload pdf file to example Core

2017-06-14 Thread Susheel Kumar
Try using the curl command directly from a terminal/console and it will work; I just tried it on 6.6 on a Mac. Uploading through the UI will not work for PDFs unless more parameters are provided, though it does work directly for XML/JSON files etc. curl ' http://localhost:8983/solr/techproduct

Re: Phrase Exact Match with Margin of Error

2017-06-15 Thread Susheel Kumar
The ComplexPhraseQuery parser is what you need to look at: https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser. See below for e.g. http://localhost:8983/solr/techproducts/select?debugQuery=on&indent=on&q=manu:%22Bridge%20the%20gat~1%20between%20your%20ski

Re: Phrase Exact Match with Margin of Error

2017-06-15 Thread Susheel Kumar
to match > the document title exactly (up to a few characters) and the document title > match the query exactly (up to a few characters). KeywordTokenizer allows > that. But complexphrase does not seem to work with KeywordTokenizer. > > On Thu, Jun 15, 2017 at 10:23 AM, Susheel Kumar &g

Re: Give boost only if entire value is present in Query

2017-06-19 Thread Susheel Kumar
In general, documents in which more or all terms match the query terms will be boosted higher per Lucene tf/idf scoring. So for a document having ABC DEF, queries like ABC DEF XYZ or XYZ ABC DEF will find a match (assuming q.op=AND) and it will be boosted higher compared to documents with ABC

Re: install solr service possible bug

2017-06-21 Thread Susheel Kumar
Thanks for checking and confirming Shawn. I have created JIRA https://issues.apache.org/jira/browse/SOLR-10932 On Tue, Jun 20, 2017 at 10:06 AM, Shawn Heisey wrote: > On 6/14/2017 7:47 AM, Susheel Kumar wrote: > > Can anyone confirm if this "service --version" command wo

Complement Stream function - Invalid ReducerStream - substream comparator (sort) must be a superset of this stream's comparator

2017-06-21 Thread Susheel Kumar
Hi, Two issues with complement function (solr 6.6) 1) When i execute below streaming expression, == let(a=fetch(collection1,having(rollup(over=email, count(email), select(search(collection1, q=*:*, fl="id,business

Re: Complement Stream function - Invalid ReducerStream - substream comparator (sort) must be a superset of this stream's comparator

2017-06-21 Thread Susheel Kumar
id, personal_email as email)), eq(count(email),1)), fl="id,personal_email as email", on="email=personal_email"), c=hashJoin(get(a),hashed=get(b),on="email"), d=hashJoin(get(b),hashed=get(a),on="email"), e=select(get(c),id,email), f=select(get(d),id,email

echo streaming expression - two/multiple fields

2017-06-21 Thread Susheel Kumar
How can we echo two/multiple fields out. The current echo expression emits "echo" field which can be then renamed with select but if i wanted to emit another field like id along echo, is that supported or we need to enhance echo for that... select(echo("A"),echo as email) Thanks, Susheel

Re: echo streaming expression - two/multiple fields

2017-06-21 Thread Susheel Kumar
s a tuple expression which returns a single > tuple: > > tuple(a="hello", b="world") > > You can also do things like: > > tuple(a=search(...), b=search(...)) > > > > > > Joel Bernstein > http://joelsolr.blogspot.com/ > > On Wed, Jun

com.ibm.icu dependency errors when building solr source code

2017-06-22 Thread Susheel Kumar
Hello, Am i missing something or the source code is broken. Took latest code from master and when doing "ant eclipse" or "ant test", getting below error. ivy-configure: [ivy:configure] :: loading settings :: file = /Users/kumars5/src/git/code/lucene-solr/lucene/top-level-ivy-settings.xml res

Re: Complement Stream function - Invalid ReducerStream - substream comparator (sort) must be a superset of this stream's comparator

2017-06-22 Thread Susheel Kumar
; I suspect something is wrong in the syntax but I'm not seeing it. > > Have you tried building up the expression piece by piece until you get the > syntax error? > > Joel Bernstein > http://joelsolr.blogspot.com/ > > On Wed, Jun 21, 2017 at 3:20 PM, Susheel Kumar &g

Index 0, Size 0 - hashJoin Stream function Error

2017-06-22 Thread Susheel Kumar
Hello Joel, Facing a weird behavior when using hashJoin / innerJoin etc. The below expression display tuples from variable a and the moment I use get on innerJoin / hashJoin expr on variable c let(a=fetch(SMS,having(rollup(over=email, count(email), select(searc

Re: Index 0, Size 0 - hashJoin Stream function Error

2017-06-22 Thread Susheel Kumar
a. When data matches during join then its fine but otherwise I am running into this issue and whole next expressions doesn't get evaluated due to this... after uncomment var d === { "result-set": { "docs": [ { "EXCEPTION": "Index: 0, S

Re: Error after moving index

2017-06-22 Thread Susheel Kumar
Usually we index directly into Prod Solr rather than copying from local/lower environments. If that works in your scenario, I would suggest indexing directly into Prod rather than copying/restoring from the local Windows env to Linux. On Thu, Jun 22, 2017 at 12:13 PM, Moritz Michael wrote: > > > > > > > > >

Re: Complement Stream function - Invalid ReducerStream - substream comparator (sort) must be a superset of this stream's comparator

2017-06-22 Thread Susheel Kumar
Please let me know if I shall create a JIRA and i can provide both expressions and data to reproduce. On Thu, Jun 22, 2017 at 11:23 AM, Susheel Kumar wrote: > Yes, i tried building up expression piece by piece but looks like there is > an issue with how complement expects / behave fo

Re: Index 0, Size 0 - hashJoin Stream function Error

2017-06-22 Thread Susheel Kumar
ONSE_TIME": 1 } ] } } ===Complement outside let complement(sort(select(tuple(id=1,email="A"),id,email),by="id asc,email asc"), sort(select(tuple(id=1,email="A"),id,email),by="id asc,email asc"), on="id,email") Result === { "r

Re: Index 0, Size 0 - hashJoin Stream function Error

2017-06-23 Thread Susheel Kumar
cal analysis. > > The problem with joining variables is that is doesn't scale very well > because all the records are read into memory. Also the parallel stream > won't work over variables. > > Joel Bernstein > http://joelsolr.blogspot.com/ > > On Thu, Jun 22,

Re: Index 0, Size 0 - hashJoin Stream function Error

2017-06-23 Thread Susheel Kumar
, 2017 at 11:07 AM, Susheel Kumar > wrote: > > > Hi Joel, > > > > As i am getting deeper, it doesn't look like a problem due to hashJoin > etc. > > > > > > Below is a simple let expr where if search would not find a match and > > return 0 result

Re: Using of Streaming to join between shards

2017-06-26 Thread Susheel Kumar
You may want to start with innerJoin which is the simple typical join in database world. On Mon, Jun 26, 2017 at 1:46 AM, mganeshs wrote: > Hi Erick, > > My scenario goes with two kind of SOLR documents > > Document #1 - Real document > #D_uniqueId #D_documentId(unique), #D_documentname, #D_docu

Re: Dynamic fields vs parent child

2017-06-27 Thread Susheel Kumar
Can you describe your use case in terms of what business functionality you are looking to achieve. Thanks, Susheel On Mon, Jun 26, 2017 at 4:26 PM, Saurabh Sethi wrote: > Number of dynamic fields will be in thousands (millions of users + > thousands of events shared between subsets of users). >

Re: Index 0, Size 0 - hashJoin Stream function Error

2017-06-27 Thread Susheel Kumar
Hi Joel, I have submitted a patch to handle this. Please review. https://issues.apache.org/jira/secure/attachment/12874681/SOLR-10944.patch Thanks, Susheel On Fri, Jun 23, 2017 at 12:32 PM, Susheel Kumar wrote: > Thanks for confirming. Here is the JIRA > > https://issues.apache

Re: Dynamic fields vs parent child

2017-06-27 Thread Susheel Kumar
mic fields. > > On Tue, Jun 27, 2017 at 5:39 AM, Susheel Kumar > wrote: > > > Can you describe your use case in terms of what business functionality > you > > are looking to achieve. > > > > Thanks, > > Susheel > > > > On Mon, Jun 26, 2017

Re: Number of occurrences in Solr Documents

2017-06-29 Thread Susheel Kumar
Checkout Term Vector component https://wiki.apache.org/solr/TermVectorComponent On Thu, Jun 29, 2017 at 10:40 AM, Kaushik wrote: > Hello, > > We are trying to get the most frequently used words in a collection. My > understanding is that using facet.field=content_txt. An e.g. of content_txt > v

Re: Number of occurrences in Solr Documents

2017-06-29 Thread Susheel Kumar
That's even better. Thanks, Shawn. On Thu, Jun 29, 2017 at 11:45 AM, Shawn Heisey wrote: > On 6/29/2017 8:40 AM, Kaushik wrote: > > We are trying to get the most frequently used words in a collection. > > My understanding is that using facet.field=content_txt. An e.g. of > > content_txt value is

Re: Allow Join over two sharded collection

2017-06-30 Thread Susheel Kumar
How many documents do you have currently, and how many will there be after growing drastically? Either you can add hardware and keep one shard until the joins are fully available, or you can shard and distribute using the compositeId router, and that's still better even though some/one shard(s) may get high l

Re: Include JSON facet inside Solr Streaming

2017-06-30 Thread Susheel Kumar
I doubt it can work. Why not utilise the facet stream expression, which uses JSON facet under the covers? On Thu, Jun 29, 2017 at 9:44 PM, Zheng Lin Edwin Yeo wrote: > Hi, > > Is it currently possible to include JSON facet inside Solr Streaming? > > I am trying out with the following query, which comb

Re: Allow Join over two sharded collection

2017-07-01 Thread Susheel Kumar
As Erick said, 1docs/month isn't a big deal. I have 45+ million docs in one shard, but YMMV depending on other factors. Also, there is a lot of confusion in the terminology. The default routing is compositeId routing. The implicit routing which Erick mentioned is the manual routing. https://issues.apa

Re: Allow Join over two sharded collection

2017-07-01 Thread Susheel Kumar
Depending on your use case people also use collection aliasing for time series data. See below https://blog.cloudera.com/blog/2013/10/collection-aliasing-near-real-time-search-for-really-big-data/ On Sat, Jul 1, 2017 at 7:13 PM, Susheel Kumar wrote: > As Eric said 1docs/month isn't a

Re: Include JSON facet inside Solr Streaming

2017-07-01 Thread Susheel Kumar
1 July 2017 at 03:30, Susheel Kumar wrote: > > > I doubt it can work. Why not utilise facet stream expression which use > > JSON facet under the cover. > > > > On Thu, Jun 29, 2017 at 9:44 PM, Zheng Lin Edwin Yeo < > edwinye...@gmail.com > > > > > wrot

Re: Automatically Restart Solr

2017-07-03 Thread Susheel Kumar
I am curious why you need to restart Solr every week. Our Prod Solr instance (6.0) has been running since Nov,16 with no restart On Sun, Jul 2, 2017 at 12:55 PM, Furkan KAMACI wrote: > Hi Jeck, > > Here is the documentation about how you can run Solr as service: > https://lucene.apache.org/solr/

Re: Automatically Restart Solr

2017-07-03 Thread Susheel Kumar
nt an automated restart to refresh Solr, just another > proactive initiative. > > Best Regards, > Jeck > > > On 3 Jul 2017, at 7:02 PM, Susheel Kumar wrote: > > > > I am curios why you need to restart solr every week. Our Prod solr > > instance (6.0) has been runn

Re: Work-around for "indexed without position data"

2017-07-04 Thread Susheel Kumar
Did you try to reproduce this on latest Solr (6.6) just to rule out any bug with that version (though less likely). Pls download and do a quick test. On Mon, Jul 3, 2017 at 5:01 PM, Solr User wrote: > Not sure if it helps beyond the steps to reproduce that I supplied above, > but I also see tha

Re: Unique() metrics not supported in Solr Streaming facet stream source

2017-07-04 Thread Susheel Kumar
Hello Joel, I tried to create a patch to add UniqueMetric and it works, but soon realized, we have UniqueStream as well and can't load both of them (like below) when required, since both uses "unique" keyword. Any advice how we can handle this. Come up with different keyword for UniqueMetric or

Re: Unique() metrics not supported in Solr Streaming facet stream source

2017-07-05 Thread Susheel Kumar
Does the "uniq" expression sound good to use for the UniqueMetric class? Thanks, Susheel On Tue, Jul 4, 2017 at 5:45 PM, Susheel Kumar wrote: > Hello Joel, > > I tried to create a patch to add UniqueMetric and it works, but soon > realized, we have UniqueStream as well and c

Re: Allow Join over two sharded collection

2017-07-05 Thread Susheel Kumar
How are you planning to route manually? What key(s) are you thinking of using? Second, the link I shared was about collection aliasing, and if you use that, you will end up with multiple collections. Just want to clarify, as you said above "...manual routing and creating alias" Again until the join feature is

Re: Unique() metrics not supported in Solr Streaming facet stream source

2017-07-05 Thread Susheel Kumar
come to an agreement yet > on the best way forward for this yet. I think we should open a separate > ticket to discuss how best to handle this issue. > > > Joel Bernstein > http://joelsolr.blogspot.com/ > > On Wed, Jul 5, 2017 at 10:04 AM, Susheel Kumar > wrote: > >

Re: help on implicit routing

2017-07-06 Thread Susheel Kumar
Erick has provided the details in another email. See below Use the _route_ field and put in "day_1" or "day_2". You've presumably named the shards (the "shard" parameter) when you added them with the CREATESHARD command so use the value you specified there. Best, Erick On Wed, Jul 5, 2017 at

Re: Boolean Field

2017-07-06 Thread Susheel Kumar
and how do you create the field? Share the line where you are creating the above field1 wrote: > Do we need to store boolean field in order to query it? > > The query I am running is "field1:true" > > With the following field type, where "stored=false", query returns 0 > result. > > stored

Re: Slowly running OOM due to Query instances?!

2017-07-07 Thread Susheel Kumar
Xms 800m sounds low. Regardless, do you know how much total cache consumption there may be based on your current solrconfig.xml settings? Also, I assume the 2 shards and 3 replicas are on 6 such machines. Thanks, Susheel On Fri, Jul 7, 2017 at 7:01 AM, Markus Jelsma wrote: > Hello, > > This morning i spott

Re: Slowly running OOM due to Query instances?!

2017-07-07 Thread Susheel Kumar
Around 300MB will be spent on your filterCache and query cache (taking an average size for the query string; see https://teaspoon-consulting.com/articles/solr-cache-tuning.html). So during continuous indexing and (complex) queries, your cache and thus heap utilization may go up. On Fri, Jul 7, 2017 at 9:41 AM, Mark
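A back-of-the-envelope version of this kind of cache estimate can be sketched as follows. It assumes that every filterCache entry is stored as a full bitset, one bit per document in the index, which makes it an upper bound rather than an exact figure (small result sets are typically stored more compactly). The function name and the numbers are hypothetical and are not the ones behind the 300MB figure in the thread.

```python
import math


def filter_cache_upper_bound_bytes(max_doc, cache_size):
    """Worst-case filterCache footprint: one bit per document for every cached entry."""
    bytes_per_entry = math.ceil(max_doc / 8)  # bitset with one bit per document
    return bytes_per_entry * cache_size


# Hypothetical core with 8M documents and a filterCache size of 128 entries.
upper_bound = filter_cache_upper_bound_bytes(8_000_000, 128)
print(f"filterCache upper bound: {upper_bound / 1_000_000:.0f} MB")
```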

Re: mm = 1 and multi-field searches

2017-07-10 Thread Susheel Kumar
How are you specifying multiple fields? Use the qf parameter to specify multiple fields, e.g. http://localhost:8983/solr/techproducts/select?indent=on&q=Samsung%20Maxtor%20hard&wt=json&defType=edismax&qf=name%20manu&debugQuery=on&mm=1 On Mon, Jul 10, 2017 at 4:51 PM, Michael Joyner wrote: > Hello a
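The same kind of request can be assembled programmatically, which avoids hand-encoding the spaces in q and qf. The sketch below uses only the Python standard library and reuses the host, collection, fields, and query terms from the URL above; the helper name is made up for the example.

```python
from urllib.parse import quote, urlencode


def edismax_url(base_url, collection, q, fields, mm="1"):
    """Build an edismax /select URL that searches several fields via qf."""
    params = {
        "q": q,
        "defType": "edismax",
        "qf": " ".join(fields),  # qf takes a space-separated list of fields
        "mm": mm,
        "wt": "json",
    }
    # quote_via=quote encodes spaces as %20 rather than "+".
    return f"{base_url}/{collection}/select?{urlencode(params, quote_via=quote)}"


url = edismax_url("http://localhost:8983/solr", "techproducts",
                  "Samsung Maxtor hard", ["name", "manu"])
print(url)
```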

Re: How to "chain" import handlers: import from DB and from file system

2017-07-10 Thread Susheel Kumar
Use SolrJ if you end up developing the indexer in Java to send documents to Solr. It's been a long time since I used DIH, but you can give it a try first; otherwise, as Walter suggested, developing an external indexer is best. On Sun, Jul 9, 2017 at 6:46 PM, Walter Underwood wrote: > 4. Write an external progra

Re: Is JSON facet output removing characters like \t from output

2017-07-12 Thread Susheel Kumar
I checked on 6.6 and don't see any such issues. I assume the field you are bucketing on is string/keywordtokenizer not text/analyzed field. === "facets":{ "count":5, "myfacet":{ "buckets":[{ "val":"A\t\t\t", "count":2}, { "val":"L\t\t\t"

Re: accessing numfound value

2017-07-13 Thread Susheel Kumar
You get back QueryResponse after executing a query. Then you can simply use below to get qTime, ElapsedTime and numFound. response.getQTime(), response.getElapsedTime() response.getResults().getNumFound() Thanks, Susheel On Wed, Jul 12, 2017 at 4:29 PM, Steve Pruitt wrote: > I'm having diffi
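For clients that do not go through SolrJ, the same two server-side values can be read from the JSON response body: QTime sits under responseHeader and numFound under response. The sketch below parses a hand-written sample body, so the numbers are made up; the elapsed time reported by SolrJ is measured on the client and has no counterpart in the body.

```python
import json


def summarize_response(raw_json):
    """Extract QTime (ms) and numFound from a Solr JSON select response."""
    body = json.loads(raw_json)
    return {
        "qtime_ms": body["responseHeader"]["QTime"],
        "num_found": body["response"]["numFound"],
    }


# Hand-written sample body with hypothetical values.
sample = (
    '{"responseHeader": {"status": 0, "QTime": 4}, '
    '"response": {"numFound": 32, "start": 0, "docs": []}}'
)
summary = summarize_response(sample)
print(summary)
```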

Re: Need domain configuration assistance

2017-07-13 Thread Susheel Kumar
This Windows Solr server must have a name and IP address associated with it. Check from the external content delivery servers whether port 8983 on the Solr server is open, and if so you can refer to Solr via http://:/solr. If port 8983 is not open, then try to run Solr on 80/8080 or work with the network team to open the por

Re: Need domain configuration assistance

2017-07-13 Thread Susheel Kumar
But don't expose Solr outside to public... On Thu, Jul 13, 2017 at 1:41 PM, Susheel Kumar wrote: > This window solr server must have a name and IP address associated with > it. Check from external content deliver servers if port 8983 to Solr server > is open and if so you can

Re: NullPointerException on openStreams

2017-07-13 Thread Susheel Kumar
This is the working code snippet I have, if that helps public static void main(String []args) throws IOException { String clause; TupleStream stream; List tuples; StreamContext streamContext = new StreamContext(); SolrClientCache solrClientCache = new SolrClientCache(); streamContext.s

Re: How to determine the user Solr is using

2017-07-14 Thread Susheel Kumar
The first column is the UID/user column as output of ps -ef on linux machines... On Fri, Jul 14, 2017 at 1:38 PM, Iridian Group wrote: > How do I determine which user the Solr server starts up with and is > running under? > > Thanks > > Keith Savoie > >
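To illustrate reading that first column, the sketch below picks the UID out of ps -ef style output for any process whose command line mentions Solr. The sample output is made up for the example, and matching on the substring "solr" is a simplification that would also catch unrelated commands containing that word.

```python
def solr_process_users(ps_output):
    """Return the set of users (first ps -ef column) running Solr-looking commands."""
    users = set()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # ps -ef prints UID PID PPID C STIME TTY TIME CMD; CMD may contain spaces.
        fields = line.split(None, 7)
        if len(fields) == 8 and "solr" in fields[7].lower():
            users.add(fields[0])
    return users


# Made-up sample of `ps -ef` output.
sample = """UID        PID  PPID  C STIME TTY          TIME CMD
solr      1234     1  2 10:01 ?        00:03:12 java -Dsolr.solr.home=/var/solr/data -jar start.jar
root      2222     1  0 09:58 ?        00:00:00 /usr/sbin/sshd -D
"""
users = solr_process_users(sample)
print(users)
```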

Re: How to determine the user Solr is using

2017-07-14 Thread Susheel Kumar
’t the > service also not start on boot? > 2) Are the permissions of the Solr instance supposed to be something other > than root:root? > > Apologies for all the 101 questions. > > > > Thanks > > Keith > > > > > On Jul 14, 2017, at 1:14 PM, Susheel K

Re: How to determine the user Solr is using

2017-07-14 Thread Susheel Kumar
It should be 2) solr:solr On Fri, Jul 14, 2017 at 3:18 PM, Susheel Kumar wrote: > If you setup solr using install_service which comes with solr, it sets up > solr running as "solr" user. Solr is not recommended to run as root user > due to security concerns. You either launc

Re: How to determine the user Solr is using

2017-07-15 Thread Susheel Kumar
2) solr:solr > > As my install is running as a service, everything under /opt/solr should > be solr:sold? Currently root:root but seems to be working correctly. > My /var/solr is already solr:solr. > > K > > > > > On Jul 14, 2017, at 2:19 PM, Susheel Kumar > wrote:

Re: CDCR - how to deal with the transaction log files

2017-07-17 Thread Susheel Kumar
I just voted for https://issues.apache.org/jira/browse/SOLR-11069 to get it resolved, as we are discussing starting to use CDCR soon. On Fri, Jul 14, 2017 at 5:21 PM, Varun Thacker wrote: > https://issues.apache.org/jira/browse/SOLR-11069 is tracking why > LASTPROCESSEDVERSION=-1 > on the sour

Re: Cant stop/start server

2017-07-17 Thread Susheel Kumar
Exactly. The two are different and each serves a purpose, as you can see from their content. The latter refers to the previous one. On Mon, Jul 17, 2017 at 9:15 AM, Iridian Group wrote: > So I installed SOLR on another server using just the service install > script and am experiencing the same issue when starting/stopping

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Susheel Kumar
There is also an analysis error. I would suggest testing the indexer on a single-shard setup first, then with a replica (1 shard and 1 replica), and then with 2 shards and 2 replicas. This would confirm whether there is a basic issue with the indexing / cluster setup. On Mon, Jul 17, 2017 at 9:04 A

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Susheel Kumar
st overloading HDFS? The same nodes that run > Solr > > also run HDFS datanodes, but they are pretty beefy machines; we're not > > swapping. > > > > As Shawn pointed out, I will be checking the HDFS version (we're using > > Cloudera CDH 5.10.2), and t

Re: How to exclude stop words in spellcheck collations

2017-07-17 Thread Susheel Kumar
The field you are using for spellcheck suggestions can use the stop word filter factory (StopFilterFactory) in its analysis chain. Thanks, Susheel On Sun, Jul 16, 2017 at 12:47 PM, Naveen Pajjuri wrote: > Hi, > Is there any way I can exclude stop words from the collations and > suggestions from the spellcheck component? > > Regards,

Re: CloudSolrClient preferred over LBHttpSolrClient

2017-07-17 Thread Susheel Kumar
Also, per the definition of CloudSolrClient: SolrJ client class to communicate with SolrCloud. Instances of this class communicate with Zookeeper to discover Solr endpoints for SolrCloud collections, and then use the LBHttpSolrClient

Re: Embedded documents in solr

2017-07-18 Thread Susheel Kumar
How many availabilities.day values can there be for a single document? Is it for a week/month/year? On Tue, Jul 18, 2017 at 4:21 AM, Swapnil Pande wrote: > Hi, > I am new to Solr. I am facing a problem embedding documents in Solr. I > don't want to use Solr joins. > The document is similar to > {"n

Re: SolrJ 6.6.0 Connection pool shutdown now with stack trace

2017-07-18 Thread Susheel Kumar
Do you see any errors etc. in solr.log during this time? On Tue, Jul 18, 2017 at 7:10 AM, Markus Jelsma wrote: > The problem was never resolved but Shawn asked for the stack trace, here > it is: > > org.apache.solr.client.solrj.SolrServerException: > java.lang.IllegalStateException: > Connectio

Re: Get results in multiple orders (multiple boosts)

2017-07-18 Thread Susheel Kumar
As Eric suggested, it's possible by sorting with a custom function. You may have to use the if, sum and exists functions etc. to come up with a custom score and sort on it. The if conditions would check for the conditions mentioned and keep adding to the score. Thanks, Susheel On Tue, Jul
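As a rough sketch of that idea, the helper below builds a Solr sort clause from the if, sum and exists function queries: each field adds its weight to the score when the document has a value for it. The class BoostSort, the method sortClause, and the field names and weights used with it are hypothetical; the real conditions from the thread would replace the exists(...) checks.

```java
import java.util.ArrayList;
import java.util.List;

class BoostSort {
    // Builds a sort clause of the form
    //   sum(if(exists(f1),w1,0),if(exists(f2),w2,0),...) desc
    // so that documents satisfying more (or heavier) conditions sort first.
    static String sortClause(String[] fields, int[] weights) {
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < fields.length; i++) {
            terms.add("if(exists(" + fields[i] + ")," + weights[i] + ",0)");
        }
        return "sum(" + String.join(",", terms) + ") desc";
    }
}
```

The returned string would be passed as the sort parameter of the query, for example through SolrQuery.set("sort", ...) in SolrJ, with a secondary sort added if ties need a stable order.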

Re: SolrJ 6.6.0 Connection pool shutdown now with stack trace

2017-07-18 Thread Susheel Kumar
Then most likely it's due to the connection being closed, as mentioned above, though you said it's not happening in that part of your code. To rule out the firewall possibility, you can test in some other/local env. Also, how many requests/clients/connections are happening concurrently? Thanks, Susheel On Tue, Ju

Re: Solr Issue While indexing Data

2017-07-19 Thread Susheel Kumar
a) What are your current softcommit and hardcommit settings? You can share them as-is from the config. And how are you committing? b) How much of the 124gb is heap? c) How many documents are you adding that take so long, and approximately how many fields, including copy fields? Thnx On Wed, Jul 19, 2017 at 7
