RE: Delete all, index all, end up with 1 segment with 50% deletes

2018-11-28 Thread Markus Jelsma
Hello Shawn, Erick,

I thought about that too, but dismissed it; other similar batched processes 
don't show this problem. Nonetheless I reset cumulativeAdds and watched a batch 
being indexed: it got indexed twice!

Thanks!
Markus
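
A quick way to confirm this kind of double indexing from the command line: the
terms component reports raw term frequencies, which still include deleted
documents until segments are merged away, so a doubly indexed batch shows every
uniqueKey value with a count of 2. A minimal sketch, assuming the uniqueKey
field is named "id" and the collection is "collection1":

  curl "http://localhost:8983/solr/collection1/terms?terms=true&terms.fl=id&terms.limit=20&terms.sort=count"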
 
-Original message-
> From: Erick Erickson 
> Sent: Wednesday 28th November 2018 2:59
> To: solr-user 
> Subject: Re: Delete all, index all, end up with 1 segment with 50% deletes
> 
> Shawn's comment seems likely, somehow you're adding all the docs twice
> and only committing at the end. In that case there'd be only 1
> segment. That's about the only way I can imagine your index has
> exactly one segment with exactly half the docs deleted.
> 
> It'd be interesting for you to look at the admin UI >> schema browser
> for your uniqueKey field. It'll report the most frequent entries, and
> if every uniqueKey value has exactly 2 entries, then you're indexing the
> same docs twice in one go.
> 
> Plus, the default TieredMergePolicy doesn't necessarily kick in unless
> there are multiple segments of roughly the same size. With an index
> this small it's perfectly possible that TMP is getting triggered and
> saying, in essence, "there's not enough work to do here to bother".
> 
> In Solr 7.5, you can optimize/forceMerge without any danger of
> creating massive segments, see:
> https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/
> (pre Solr 7.5)
> and
> https://lucidworks.com/2018/06/20/solr-and-optimizing-your-index-take-ii/
> (Solr 7.5+).
> 
> Best,
> Erick
> On Tue, Nov 27, 2018 at 4:29 AM Markus Jelsma
>  wrote:
> >
> > Hello,
> >
> > A background batch process compiles a data set; when finished, it sends a 
> > delete-all to its target collection, then everything gets sent by SolrJ, 
> > followed by a regular commit. When inspecting the core I notice it has one 
> > segment with 9578 documents, of which exactly half are deleted.
> >
> > That Solr node is on 7.5. How can I encourage the merge scheduler to do its 
> > job and merge away all those deletes?
> >
> > Thanks,
> > Markus
> 


Re: Is reload necessary for updates to files referenced in schema, like synonyms, protwords, etc?

2018-11-28 Thread Vincenzo D'Amore
Very likely I'm late to this party :) I'm not sure about standalone Solr, but
with SolrCloud (7.3.1) you have to reload the core every time synonyms
referenced by a schema are changed.
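
For SolrCloud the changed file also has to go back into the configset in
ZooKeeper before the reload. A minimal sketch, assuming a configset named
"myconf" and a collection named "mycollection":

  # upload the changed file into the configset in ZooKeeper
  bin/solr zk cp file:/path/to/synonyms.txt zk:/configs/myconf/synonyms.txt -z localhost:2181
  # then reload the collection so the analyzers re-read it
  curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection"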

On Mon, Nov 26, 2018 at 8:51 PM Walter Underwood 
wrote:

> Should be easy to check with the analysis UI. Add a synonym and see if it
> is used.
>
> I seem to remember some work on reloading synonyms on the fly without a
> core reload. These seem related...
>
> https://issues.apache.org/jira/browse/SOLR-5200
> https://issues.apache.org/jira/browse/SOLR-5234
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Nov 26, 2018, at 11:43 AM, Shawn Heisey  wrote:
> >
> > I know that changes to the schema require a reload.  But do changes to
> files referenced by a schema also require a reload?  So if for instance I
> were to change the contents of a synonym file, would I need to reload the
> core before Solr would use the new file?  Synonyms in this case are at
> query time, but other files like protwords are used at index time.
> >
> > I *THINK* that a reload is required, but I can't be sure without
> checking the code, and it would probably take me more than a couple of
> hours to unravel the code enough to answer the question myself.
> >
> > It is not SolrCloud, so there's no ZK to worry about.
> >
> > Thanks,
> > Shawn
> >
>
>

-- 
Vincenzo D'Amore


Enable SSL for the existing SOLR Cloud Cluster

2018-11-28 Thread Tech Support
Dear Solr Team, 

In my SolrCloud cluster, I am using a 3-node external ZooKeeper ensemble and
2 Solr instances.

I already created a collection using port 8983, and it has a lot of data.

Now I want to enable SSL. 

As per your help document, I have enabled SSL, but it is using port 8984.
When I log in to the Admin GUI, I am unable to view and search the existing
data.

How can I access the existing data?

If I start on port 8983 with the command below, I can log in to the Admin
GUI, but the collections are in "Down" status.

bin\solr.cmd -cloud -s cloud\node1 -p 8983

Please guide me on how to enable SSL for the existing SolrCloud cluster.

Thanks,

Karthick Ramu
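
One step that is easy to miss when enabling SSL on an existing SolrCloud
cluster is telling ZooKeeper that nodes should now register with https. A
minimal sketch, assuming a local ZK at 2181 (point -zkhost at the real
ensemble), run before restarting the nodes with SSL enabled:

  server\scripts\cloud-scripts\zkcli.bat -zkhost localhost:2181 -cmd clusterprop -name urlScheme -val https

After that, the nodes can be restarted on the original port; SSL does not
require 8984 (that is just the example port in the reference guide), so
starting with -p 8983 keeps the existing collection addresses intact.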

 



Re: Is reload necessary for updates to files referenced in schema, like synonyms, protwords, etc?

2018-11-28 Thread Shawn Heisey

On 11/28/2018 6:37 AM, Vincenzo D'Amore wrote:

> Very likely I'm late to this party :) I'm not sure about standalone Solr, but
> with SolrCloud (7.3.1) you have to reload the core every time synonyms
> referenced by a schema are changed.


I have a 7.5.0 download on my workstation, so I fired that up, created a 
core, and tried it out.  I did learn that a reload is required when 
changing files referenced by analysis components in the schema.  That's 
what I had thought was probably the case; now I know for sure.


Thanks,
Shawn



Re: Time-Routed Alias Not Distributing Wrongly Placed Docs

2018-11-28 Thread Jason Gerlowski
Hi John,

I'm not an expert on TRAs, but I don't think so.  The TRA functionality
I'm familiar with involves creating and deleting the underlying
collections and then routing documents based on that information.  As
far as I know, that happens at the UpdateRequestProcessor level: once
your data is indexed, there's nothing available to move it around.

Best,

Jason
On Tue, Nov 27, 2018 at 12:42 PM John Nashorn  wrote:
>
> Hello Everyone,
> I'm using "hive-solr" from Lucidworks to index my data into Solr (v:7.5, 
> cloud mode). As written in the Solr Manual, TRA expects documents to be 
> indexed using its alias name, and not directly into the collections under it. 
> Unfortunately, hive-solr doesn't allow using TRA names as indexing targets. 
> So what I do is: I index data using the first collection created by TRA and 
> expect Solr to distribute my data into its respective collection under the 
> hood. This works to some extent, but a big portion of the data stays where 
> it was indexed, i.e. the first collection of the TRA. For example 
> (approximate numbers):
>
> * coll_2018-07-01 => 800.000.000 docs
> * coll_2018-08-01 => 0 docs
> * coll_2018-09-01 => 0 docs
> * coll_2018-10-01 => 150.000.000 docs
> * coll_2018-11-01 => 0 docs
>
> Here, coll_2018-07-01 contains data that should normally be in the other four 
> collections.
>
> Is there a way to make TRA scan (somehow intentionally) misplaced data and 
> send them to their correct places?
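
If reindexing everything through the alias is not an option, one manual
cleanup is to re-route the misplaced docs yourself. A rough, untested sketch,
with every name hypothetical (alias "myalias", router field "timestamp"):

  # 1) pull the misplaced docs out of the first collection
  curl -G "http://localhost:8983/solr/coll_2018-07-01/select" \
    --data-urlencode "q=timestamp:[2018-08-01T00:00:00Z TO *]" \
    -d "rows=10000" -d "wt=json" > misplaced.json
  # 2) strip the response wrapper and the _version_ fields, then re-post the
  #    docs through the alias so the TRA's URP routes each one correctly
  curl "http://localhost:8983/solr/myalias/update?commit=true" \
    -H "Content-Type: application/json" -d @docs.json
  # 3) once verified, delete them from the first collection
  curl "http://localhost:8983/solr/coll_2018-07-01/update?commit=true" \
    -H "Content-Type: text/xml" \
    -d "<delete><query>timestamp:[2018-08-01T00:00:00Z TO *]</query></delete>"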


PSA: Activate 2018 videos are now available

2018-11-28 Thread Alexandre Rafalovitch
For all those who wanted to be at the conference for the talks :-) but
could not:
https://www.youtube.com/watch?v=Hm98XL0Mw5c&list=PLU6n9Voqu_1HW8-VavVMa9lP8-oF8Oh5t

(Plug) Mine was: "JSON in Solr: from top to bottom", video at:
https://www.youtube.com/watch?v=WzYbTe3-nFI , slides at:
https://www.slideshare.net/arafalov/json-in-solr-from-top-to-bottom .
See if you can figure out which word I totally cannot pronounce under
stress :-)

Regards,
   Alex.


Re: PSA: Activate 2018 videos are now available

2018-11-28 Thread Doug Turnbull
Thanks Alex, and thanks to everyone who was part of organizing the
conference!

On Wed, Nov 28, 2018 at 12:28 PM Alexandre Rafalovitch 
wrote:

> For all those who wanted to be at the conference for the talks :-) but
> could not:
>
> https://www.youtube.com/watch?v=Hm98XL0Mw5c&list=PLU6n9Voqu_1HW8-VavVMa9lP8-oF8Oh5t
>
> (Plug) Mine was: "JSON in Solr: from top to bottom", video at:
> https://www.youtube.com/watch?v=WzYbTe3-nFI , slides at:
> https://www.slideshare.net/arafalov/json-in-solr-from-top-to-bottom .
> See if you can figure out which word I totally cannot pronounce under
> stress :-)
>
> Regards,
>Alex.
>
-- 
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug


Re: PSA: Activate 2018 videos are now available

2018-11-28 Thread Gus Heck
I noticed some were out a few days ago, but I don't think they're all there
yet (mine isn't).

On Wed, Nov 28, 2018 at 12:46 PM Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:

> Thanks Alex, and thanks to everyone who was part of organizing the
> conference!
>
> On Wed, Nov 28, 2018 at 12:28 PM Alexandre Rafalovitch  >
> wrote:
>
> > For all those who wanted to be at the conference for the talks :-) but
> > could not:
> >
> >
> https://www.youtube.com/watch?v=Hm98XL0Mw5c&list=PLU6n9Voqu_1HW8-VavVMa9lP8-oF8Oh5t
> >
> > (Plug) Mine was: "JSON in Solr: from top to bottom", video at:
> > https://www.youtube.com/watch?v=WzYbTe3-nFI , slides at:
> > https://www.slideshare.net/arafalov/json-in-solr-from-top-to-bottom .
> > See if you can figure out which word I totally cannot pronounce under
> > stress :-)
> >
> > Regards,
> >Alex.
> >
> --
> CTO, OpenSource Connections
> Author, Relevant Search
> http://o19s.com/doug
>


-- 
http://www.the111shift.com


Moving Solr index from Staging to Production

2018-11-28 Thread Arunan Sugunakumar
Hi,

I have deployed Solr 7.2 in a staging server in standalone mode. I want to
move it to the production server.

I would like to know whether I need to run the indexing process again, or
whether there is an easier way to move the existing index.

I went through this documentation, but I couldn't figure out whether it is
the right way:
https://lucene.apache.org/solr/guide/6_6/making-and-restoring-backups.html

Thanks in advance!!

Regards
Arunan


Re: Moving Solr index from Staging to Production

2018-11-28 Thread David Hastings
You just set up the Solr install on the production server as a slave to
your current install and hit the replicate button in the admin interface
on the production server.
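
A one-off variant of this that needs no permanent slave configuration is the
replication handler's fetchindex command, pointed at the staging core. A
sketch, with hostnames and core name hypothetical:

  curl -G "http://prod:8983/solr/mycore/replication" \
    -d "command=fetchindex" \
    --data-urlencode "masterUrl=http://staging:8983/solr/mycore/replication"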

On Wed, Nov 28, 2018 at 1:34 PM Arunan Sugunakumar 
wrote:

> Hi,
>
> I have deployed Solr 7.2 in a staging server in standalone mode. I want to
> move it to the production server.
>
> I would like to know whether I need to run the indexing process again or is
> there any easier way to move the existing index?
>
> I went through this documentation but I couldn't figure out whether it is
> the right way.
> https://lucene.apache.org/solr/guide/6_6/making-and-restoring-backups.html
>
> Thanks in advance!!
>
> Regards
> Arunan
>


Re: Solr Delta Import Issue

2018-11-28 Thread ~$alpha`
Time for import: 5-6 minutes
Warmup time: 40 seconds

autoCommit and autoSoftCommit are both disabled, and we fire a commit only
after the import is completed.

I have some more doubts:
1. In a master-slave setup, is an autowarm strategy available for the slaves?

2. Should I also have a limit on max connections so that the Solr server
won't go down, and if yes, how do I choose that max number?



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Period on-line index optimization

2018-11-28 Thread Christopher Schultz
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Erick,

On 11/27/18 20:47, Erick Erickson wrote:
> And do note one implication of the link Shawn gave you. Now that 
> you've optimized, you probably have one huge segment. It _will not_
> be merged unless and until it has < 2.5G "live" documents. So you
> may see your percentage of deleted documents get quite a bit larger
> than you've seen before merging kicks in. Solr 7.5 will rewrite
> this segment (singleton merge) over time as deletes accumulate, or
> you can optimize/forceMerge and it'll gradually shrink (assuming
> you do not merge down to 1 segment).

Ack. It sounds like I shouldn't worry too much about "optimization" at
all. If I find that I have a performance problem (hah! I'm comparing
the performance to a relational table-scan, which was intolerably
long), I can investigate whether or not optimization will help me.

> Oh, and the admin UI segments view is misleading prior to Solr
> 7.5. Hover over each one and you'll see the number of deleted docs.
> It's _supposed_ to be proportional to the number of deleted docs,
> with light gray being live docs and dark gray being deleted, but
> the calculation was off. If you hover over you'll see the raw
> numbers and see what I mean.

Thanks for this clarification. I'm using 7.4.0, so I think that's what
was confusing me.

I'm fairly certain to upgrade to 7.5 in the next few weeks. For me,
it's basically an untar/stop/ln/start operation, as long as testing goes
well.

- -chris

> On Tue, Nov 27, 2018 at 2:11 PM Shawn Heisey 
> wrote:
>> 
>> On 11/27/2018 10:04 AM, Christopher Schultz wrote:
>>> So, it's pretty much like GC promotion: the number of live
>>> objects is really the only things that matters?
>> 
>> That's probably a better analogy than most anything else I could
>> come up with.
>> 
>> Lucene must completely reconstruct all of the index data from
>> the documents that haven't been marked as deleted.  The fastest
>> I've ever seen an optimize proceed is about 30 megabytes per
>> second, even on RAID10 disk subsystems that are capable of far
>> faster sustained transfer rates.  The operation strongly impacts
>> CPU and garbage generation, in addition to the I/O impact.
>> 
>>> I was thinking once per day. AFAIK, this index hasn't been
>>> optimized since it was first built which was a few months ago.
>> 
>> For an index that small, I wouldn't expect a once-per-day
>> optimization to have much impact on overall operation.  Even for
>> big indexes, if you can do the operation when traffic on your
>> system is very low, users might never even notice.
>> 
>>> We aren't explicitly deleting anything, ever. The only deletes 
>>> occurring should be when we perform an update() on a document,
>>> and Solr/Lucene automatically deletes the existing document
>>> with the same id
>> 
>> If you do not use deleteByQuery, then ongoing index updates and
>> segment merging (which is what an optimize is) will not interfere
>> with each other, as long as you're using version 4.0 or later.
>> 3.6 and earlier were not able to readily mix merging with ongoing
>> indexing operations.
>> 
>>> I'd want to schedule this thing with cron, so curl is better
>>> for me. "nohup optimize &" is fine with me, especially if it
>>> will give me stats on how long the optimization actually took.
>> 
>> If you want to know how long it takes, it's probably better to
>> throw the whole script into the background rather than the curl
>> itself.  But you're headed in the right general direction.  Just
>> a few details to think about.
>> 
>>> I have dev and test environments so I have plenty of places to 
>>> play-around. I can even load my production index into dev to
>>> see how long the whole 1M document index will take to optimize,
>>> though the number of segments in the index will be different,
>>> unless I just straight-up copy the index files from the disk. I
>>> probably won't do that because I'd prefer not to take-down the
>>> index long enough to take a copy.
>> 
>> If you're dealing with the small index, I wouldn't expect copying
>> the index data while the machine is online to be problematic --
>> the I/O load would be small.  But if you're running on Windows, I
>> wouldn't be 100% sure that you could copy index data that's in
>> use -- Windows does odd things with file locking that aren't a
>> problem on most other operating systems.
>> 
>>> You skipped question 4 which was "can I update my index during
>>> an optimization", but you did mention in your answer to
>>> question 3 ("can I still query during optimize?") that I
>>> "should" be able to update the index (e.g. add/update). Can you
>>> clarify why you said "should" instead of "can"?
>> 
>> I did skip it, because it was answered with question 3, as you
>> noticed.
>> 
>> If that language is there in my reply, I am not really
>> surprised. Saying "should" rather than "can" is just part of a
>> general "cover my ass" stance that I adopt whenever I'm answering
> >> questions like this.
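
For the cron-plus-curl approach discussed above, a minimal timing script; the
core name "mycore" and log path are hypothetical, and if the call returns
early on your version, poll the segment count instead (see the "solr optimize
command" thread below):

  #!/bin/sh
  # waitSearcher=true keeps the request blocking, so elapsed time ~= merge time
  START=$(date +%s)
  curl -s "http://localhost:8983/solr/mycore/update?optimize=true&waitSearcher=true" > /dev/null
  echo "$(date) optimize took $(( $(date +%s) - START ))s" >> /var/log/solr-optimize.log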

Boosting score based off a match in a particular field

2018-11-28 Thread Tanya Bompi
Hi,
  I have an index that is built using a combination of fields (Title,
Description, Phone, Email, etc.). I have indexed all the fields and the
combined copy field as well.
The query I have is a combination of all the fields as input
(Title + Description + Phone + Email).
There are some samples where, even though the Email/Phone has a match, the
resulting Solr score is still low. I have tried boosting the fields, say
Email^2, but that results in any token in the input query being matched
against the email, which produces erroneous results.

How can I formulate a query that boosts a match against the Email field,
alongside the combined-field match against the combined-field index?

Thanks,
Tanya


Re: Moving Solr index from Staging to Production

2018-11-28 Thread Toke Eskildsen
Arunan Sugunakumar  wrote:

> https://lucene.apache.org/solr/guide/6_6/making-and-restoring-backups.html

We (also?) prefer to keep our stage/build setup separate from production. 
Backup + restore works well for us. It is very fast, as it is basically just 
copying the segment files.

- Toke Eskildsen
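
For reference, the backup/restore route from that page boils down to a few
replication-handler calls; a sketch, with core name, hosts, and paths
hypothetical (the location must be writable by the Solr process):

  curl "http://staging:8983/solr/mycore/replication?command=backup&location=/backups&name=migrate"
  # copy /backups/snapshot.migrate to the production host, then:
  curl "http://prod:8983/solr/mycore/replication?command=restore&location=/backups&name=migrate"
  # poll until the restore reports success
  curl "http://prod:8983/solr/mycore/replication?command=restorestatus"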


Re: Boosting score based off a match in a particular field

2018-11-28 Thread Doug Turnbull
The terminology we use at my company is that you want to *gate* the effect of
the boost to only very precise scenarios. A lot of this depends on how your
Email and Phone values are being tokenized/analyzed (i.e. what analyzer is
on the field type), because you really only want to boost when you have
high-confidence email/phone number matches. You may actually have more of a
matching problem than a relevance problem. You can debug this in the Solr
analysis screen.

Another tool you can use is putting a mm on just the boost query. This
gates that specific boost based on how many query terms match that field.
It's good for doing a kind of poor-man's entity recognition (how much does
the query correspond to one kind of entity)

Something like

bq={!edismax mm=80% qf=Email^100 v=$q} <--Boost emails only when there's a
strong match, 80% of query terms match the email

alongside your main qf with the combined field

qf=text_all

There's a lot of strategies, and it usually involves a combination of query
and analysis work (and lots of good test data to prove your approach works)

(shameless plug is we cover a lot of this in Solr relevance training
https://opensourceconnections.com/events/training/)

Hope that helps
-Doug


On Wed, Nov 28, 2018 at 3:21 PM Tanya Bompi  wrote:

> Hi,
>   I have an index that is built using a combination of fields (Title,
> Description, Phone, Email etc). I have an indexed all the fields and the
> combined copy field as well.
> In the query that i have which is a combination of all the fields as input
> (Title + Description+Phone+email).
> There are some samples where if the Email/Phone has a match the resulting
> Solr score is lower still. I have tried boosting the fields say Email^2 but
> that results in any token in the input query being matched against the
> email which results in erroneous results.
>
> How can i formulate a query that I can boost for Email to match against
> Email with a boost along with the combined field match against the combined
> field index.
>
> Thanks,
> Tanya
>
-- 
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug


Re: Boosting score based off a match in a particular field

2018-11-28 Thread Doug Turnbull
Ah yes, this is a common gotcha; it's because the bq is recursively applied
to itself.

So you have to change that bq to have itself a bq that's empty

bq={!edismax bq='' mm=80% qf=Email^100 v=$q}

v is simply the 'q' for this subquery; by passing v=$q you explicitly set
it to whatever was passed in q.

Best
-Doug
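
Putting the thread together, an end-to-end request sketch; the collection
name "mycoll" and the query text are hypothetical, and the field names follow
the discussion above:

  curl -G "http://localhost:8983/solr/mycoll/select" \
    --data-urlencode "defType=edismax" \
    --data-urlencode "qf=text_all" \
    --data-urlencode "q=jane doe jane.doe@example.com" \
    --data-urlencode "bq={!edismax bq='' mm=80% qf=Email^100 v=\$q}"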

On Wed, Nov 28, 2018 at 4:30 PM Tanya Bompi  wrote:

> Hi Doug,
>   Thank you for your response. I tried the above boost syntax but I get
> the following error of going into an infinite loop. In the wiki page I
> couldnt figure out what the 'v' parameter is. (
> https://lucene.apache.org/solr/guide/7_0/the-extended-dismax-query-parser.html).
> I will try the analysis tool as well.
>
> "bq":"{!edismax mm=80% qf=ContactEmail^100 v=$q}"}},
> "error":{ "metadata":[ "error-class",
> "org.apache.solr.common.SolrException", "root-error-class",
> "org.apache.solr.search.SyntaxError"], 
> "msg":"org.apache.solr.search.SyntaxError:
> Infinite Recursion detected parsing query
>
> Thank you,
> Tanya
>
> On Wed, Nov 28, 2018 at 12:36 PM Doug Turnbull <
> dturnb...@opensourceconnections.com> wrote:
>
>> The terminology we use at my company is you want to *gate* the effect of
>> boost to only very precise scenarios. A lot of this depends on how your
>> Email and Phone numbers are being tokenized/analyzed (ie what analyzer is
>> on the field type), because you really only want to boost when you have
>> high confidence email/phone number matches. You may actually have more of
>> a
>> matching problem than a relevance problem. You can debug this in the Solr
>> analysis screen.
>>
>> Another tool you can use is putting a mm on just the boost query. This
>> gates that specific boost based on how many query terms match that field.
>> It's good for doing a kind of poor-man's entity recognition (how much does
>> the query correspond to one kind of entity)
>>
>> Something like
>>
>> bq={!edismax mm=80% qf=Email^100 v=$q} <--Boost emails only when there's a
>> strong match, 80% of query terms match the email
>>
>> alongside your main qf with the combined field
>>
>> qf=text_all
>>
>> There's a lot of strategies, and it usually involves a combination of
>> query
>> and analysis work (and lots of good test data to prove your approach
>> works)
>>
>> (shameless plug is we cover a lot of this in Solr relevance training
>> https://opensourceconnections.com/events/training/)
>>
>> Hope that helps
>> -Doug
>>
>>
>> On Wed, Nov 28, 2018 at 3:21 PM Tanya Bompi 
>> wrote:
>>
>> > Hi,
>> >   I have an index that is built using a combination of fields (Title,
>> > Description, Phone, Email etc). I have an indexed all the fields and the
>> > combined copy field as well.
>> > In the query that i have which is a combination of all the fields as
>> input
>> > (Title + Description+Phone+email).
>> > There are some samples where if the Email/Phone has a match the
>> resulting
>> > Solr score is lower still. I have tried boosting the fields say Email^2
>> but
>> > that results in any token in the input query being matched against the
>> > email which results in erroneous results.
>> >
>> > How can i formulate a query that I can boost for Email to match against
>> > Email with a boost along with the combined field match against the
>> combined
>> > field index.
>> >
>> > Thanks,
>> > Tanya
>> >
>> --
>> CTO, OpenSource Connections
>> Author, Relevant Search
>> http://o19s.com/doug
>>
> --
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug


Re: Boosting score based off a match in a particular field

2018-11-28 Thread Tanya Bompi
Hi Doug,
  Thank you for your response. I tried the above boost syntax, but I get the
infinite-recursion error below. From the wiki page I couldn't figure out what
the 'v' parameter is. (
https://lucene.apache.org/solr/guide/7_0/the-extended-dismax-query-parser.html).
I will try the analysis tool as well.

"bq":"{!edismax mm=80% qf=ContactEmail^100 v=$q}"}},
"error":{ "metadata":[ "error-class","org.apache.solr.common.SolrException",
"root-error-class","org.apache.solr.search.SyntaxError"],
"msg":"org.apache.solr.search.SyntaxError:
Infinite Recursion detected parsing query

Thank you,
Tanya

On Wed, Nov 28, 2018 at 12:36 PM Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:

> The terminology we use at my company is you want to *gate* the effect of
> boost to only very precise scenarios. A lot of this depends on how your
> Email and Phone numbers are being tokenized/analyzed (ie what analyzer is
> on the field type), because you really only want to boost when you have
> high confidence email/phone number matches. You may actually have more of a
> matching problem than a relevance problem. You can debug this in the Solr
> analysis screen.
>
> Another tool you can use is putting a mm on just the boost query. This
> gates that specific boost based on how many query terms match that field.
> It's good for doing a kind of poor-man's entity recognition (how much does
> the query correspond to one kind of entity)
>
> Something like
>
> bq={!edismax mm=80% qf=Email^100 v=$q} <--Boost emails only when there's a
> strong match, 80% of query terms match the email
>
> alongside your main qf with the combined field
>
> qf=text_all
>
> There's a lot of strategies, and it usually involves a combination of query
> and analysis work (and lots of good test data to prove your approach works)
>
> (shameless plug is we cover a lot of this in Solr relevance training
> https://opensourceconnections.com/events/training/)
>
> Hope that helps
> -Doug
>
>
> On Wed, Nov 28, 2018 at 3:21 PM Tanya Bompi  wrote:
>
> > Hi,
> >   I have an index that is built using a combination of fields (Title,
> > Description, Phone, Email etc). I have an indexed all the fields and the
> > combined copy field as well.
> > In the query that i have which is a combination of all the fields as
> input
> > (Title + Description+Phone+email).
> > There are some samples where if the Email/Phone has a match the resulting
> > Solr score is lower still. I have tried boosting the fields say Email^2
> but
> > that results in any token in the input query being matched against the
> > email which results in erroneous results.
> >
> > How can i formulate a query that I can boost for Email to match against
> > Email with a boost along with the combined field match against the
> combined
> > field index.
> >
> > Thanks,
> > Tanya
> >
> --
> CTO, OpenSource Connections
> Author, Relevant Search
> http://o19s.com/doug
>


solr optimize command

2018-11-28 Thread Wei
Hi,

 I use the following http request to start solr index optimization:

http://localhost:8983/solr//update?skipError=true -F stream.body='<optimize/>'


 The request returns status code 200 quickly, but when looking at the Solr
instance I noticed that the actual optimization has not completed yet, as
there is still more than one segment. Is the optimize command async? What is
the best way to validate that the optimize has truly completed?


Thanks,

Wei
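
One hedged way to check from the outside is to poll the Luke handler, which
reports the live segment count; the core name here is hypothetical:

  curl -s "http://localhost:8983/solr/mycore/admin/luke?numTerms=0&wt=json" | grep segmentCount
  # re-run until segmentCount drops to 1; at that point the forceMerge is done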


Re: solr optimize command

2018-11-28 Thread Zheng Lin Edwin Yeo
Hi,

How big is your index, and do you have enough free disk space to do the
optimization? You need free space of at least twice the index size for the
optimization to be successful, and even more if you are still doing
indexing during the optimization.

Also, which Solr version are you using?

Regards,
Edwin

On Thu, 29 Nov 2018 at 09:23, Wei  wrote:

> Hi,
>
>  I use the following http request to start solr index optimization:
>
> http://localhost:8983/solr//update?skipError=true -F stream.body='<optimize/>'
>
>
>  The request returns status code 200 shortly, but when looking at the solr
> instance I noticed that actual optimization has not completed yet as there
> are more than 1 segments. Is the optimize command async? What is the best
> approach to validate that optimize is truly completed?
>
>
> Thanks,
>
> Wei
>


Re: solr optimize command

2018-11-28 Thread Walter Underwood
Why do you think you need to optimize? Most configurations don’t need that.

And no, there is no synchronous optimize request.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Nov 28, 2018, at 6:50 PM, Zheng Lin Edwin Yeo  wrote:
> 
> Hi,
> 
> How big is your index size, and do you have enough space in your disk to do
> the optimization? You need at least twice the disk space in order for the
> optimization to be successful, and even more if you are still doing
> indexing during the optimization.
> 
> Also, which Solr version are you using?
> 
> Regards,
> Edwin
> 
> On Thu, 29 Nov 2018 at 09:23, Wei  wrote:
> 
>> Hi,
>> 
>> I use the following http request to start solr index optimization:
>> 
>> http://localhost:8983/solr//update?skipError=true -F stream.body='<optimize/>'
>> 
>> 
>> The request returns status code 200 shortly, but when looking at the solr
>> instance I noticed that actual optimization has not completed yet as there
>> are more than 1 segments. Is the optimize command async? What is the best
>> approach to validate that optimize is truly completed?
>> 
>> 
>> Thanks,
>> 
>> Wei
>> 



Re: Enable SSL for the existing SOLR Cloud Cluster

2018-11-28 Thread Zheng Lin Edwin Yeo
Hi,

Have you tried with the steps in this document?
https://lucene.apache.org/solr/guide/7_5/enabling-ssl.html

This is from the guide in the latest Solr 7.5.0 version. Which version are
you using?

Regards,
Edwin

On Wed, 28 Nov 2018 at 22:41, Tech Support  wrote:

> Dear Solr Team,
>
> In my SolrCloud cluster, I am using a 3-node external ZooKeeper ensemble and
> 2 Solr instances.
>
> I already created a collection using port 8983, and it has a lot of data.
>
> Now I want to enable SSL.
>
> As per your help document, I have enabled SSL, but it is using port 8984.
> When I log in to the Admin GUI, I am unable to view and search the existing
> data.
>
> How can I access the existing data?
>
> If I start on port 8983 with the command below, I can log in to the Admin
> GUI, but the collections are in "Down" status.
>
> bin\solr.cmd -cloud -s cloud\node1 -p 8983
>
> Please guide me on how to enable SSL for the existing SolrCloud cluster.
>
> Thanks,
>
> Karthick Ramu
>


Enquiry about scheduling for re-indexing

2018-11-28 Thread Ma Man
To whom it may concern,

Recently, I have been studying whether Apache Solr is able to re-index (Full
Import / Delta Import) periodically by configuration, instead of triggering it
by URL (e.g.
http://localhost:8983/solr/{collection_name}/dataimport?command=full-import
) from a scheduler tool.

The Solr version in use is 7.3.1.

Please advise.


Thanks for answering.

Best Regards,
Man Ma
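
Stock Solr's DataImportHandler has no built-in scheduler, so the usual
workaround is an external cron job hitting the same URL. A crontab sketch,
with the collection name hypothetical (a nightly full import at 02:00):

  0 2 * * * curl -s "http://localhost:8983/solr/mycollection/dataimport?command=full-import" > /dev/null 2>&1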