Streaming Expressions (/stream) NPE

2015-12-22 Thread Jason Gerlowski
I'll preface this email by saying that I wasn't sure which mailing list it
belonged on.  It might fit on the dev list (since it involves a potential
Solr bug), but maybe the solr-users list is a better choice (since I'm
probably just misusing Solr).  I settled on the solr-users list.  Sorry if
I chose incorrectly.

Moving on...

I've run into a NullPointerException when trying to use the /stream
handler.  I'm not sure whether I'm doing something wrong with the commands
I'm sending to Solr via curl, or if there's an underlying bug causing this
behavior.

I'm making the stream request:

curl --data-urlencode 'stream=search(gettingstarted, q="*:*",
fl="title,url", sort="_version_ asc", rows="10")'
"localhost:8983/solr/gettingstarted/stream"

Solr responds with:

{"result-set":{"docs":[
{"EXCEPTION":null,"EOF":true}]}}

At this point, I assumed that something was wrong with my command, so I
checked the solr-logs for a hint at the problem.  I found:

ERROR - 2015-12-23 01:32:32.535; [c:gettingstarted s:shard2 r:core_node2
x:gettingstarted_shard2_replica2] org.apache.solr.common.SolrException;
java.lang.NullPointerException
  at
org.apache.solr.client.solrj.io.stream.expr.StreamExpressionParser.generateStreamExpression(StreamExpressionParser.java:47)
  at
org.apache.solr.client.solrj.io.stream.expr.StreamExpressionParser.parse(StreamExpressionParser.java:38)
  at
org.apache.solr.client.solrj.io.stream.expr.StreamFactory.constructStream(StreamFactory.java:168)
  at
org.apache.solr.handler.StreamHandler.handleRequestBody(StreamHandler.java:155)
  at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)

Has anyone seen this behavior before?  Is Solr reacting to something amiss
in my request, or is there maybe a bug here?  I'll admit this is my first
attempt at using the /stream API, so I might be getting something wrong
here.  I consulted the reference guide's examples on using the streaming
API (https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions)
when coming up with my curl command, but I might've missed something.

Anyways, I'd appreciate any insight that anyone can offer on this.  If it
helps, I've included reproduction steps below.

1.) Download and compile Solr trunk.
2.) Start Solr using one of the examples (bin/solr start -e cloud).  Accept
default values.
3.) Index some docs (bin/post -c gettingstarted
http://lucene.apache.org/solr -recursive 1 -delay 1)
4.) Do a search to sanity check the ingestion (curl
"localhost:8983/solr/gettingstarted/select?q=*:*&wt=json")
5.) Make a /stream request for some docs (curl --data-urlencode
'stream=search(gettingstarted, q="*:*", fl="title,url", sort="_version_
asc", rows="10")' "localhost:8983/solr/gettingstarted/stream")

Thanks again for any ideas/help anyone can give.

Best,

Jason


Re: Streaming Expressions (/stream) NPE

2015-12-23 Thread Jason Gerlowski
Thanks for the heads up Joel.  Glad this was just user error, and not an
actual problem.

Though it is interesting that Solr's response didn't contain any
information about what was wrong.  I probably would've expected a message
to the effect of: "the required parameter 'expr' was not found".
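For anyone who hits the same NPE: per Joel's note, the parameter was renamed in SOLR-8443, so the working form of the request is presumably the original curl command with "stream=" replaced by "expr=" (the "gettingstarted" collection name comes from the reproduction steps in this thread):

```shell
# Hedged sketch: after SOLR-8443 the /stream handler reads the streaming
# expression from the "expr" parameter rather than "stream".
curl --data-urlencode 'expr=search(gettingstarted, q="*:*", fl="title,url", sort="_version_ asc", rows="10")' \
  "http://localhost:8983/solr/gettingstarted/stream"
```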

Also, it was a little disappointing that when the thrown exception has no
message, ExceptionStream puts 'null' in the EXCEPTION Tuple (i.e.
{"EXCEPTION":null,"EOF":true}).  It might be nice if the name/type of the
exception was used when no message can be found.

I'd be happy to create JIRAs and push up a patch for one/both of those
behaviors if people agree that this would make the API a little nicer.

Thanks again Joel.

Best,

Jason

On Tue, Dec 22, 2015 at 10:06 PM, Joel Bernstein  wrote:

> The http parameter "stream" was recently changed to "expr" in SOLR-8443.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/


Request for SOLR-wiki edit permissions

2016-02-08 Thread Jason Gerlowski
Hi all,

Can someone please give me edit permissions for the Solr wiki.  Is
there anything I should or need to do to get these permissions?  My
wiki username is "Jason.Gerlowski", and my wiki email is
"gerlowsk...@gmail.com".

I spotted a few things that could use some clarification on the
HowToContribute page (https://wiki.apache.org/solr/HowToContribute)
and wanted to make them a bit clearer.

Jason


Re: Request for SOLR-wiki edit permissions

2016-02-08 Thread Jason Gerlowski
Thanks Anshum!

On Mon, Feb 8, 2016 at 1:01 PM, Anshum Gupta  wrote:
> Done.
>
> --
> Anshum Gupta


Re: ConcurrentUpdateSolrClient example?

2017-08-13 Thread Jason Gerlowski
Hi Paul,

I'll try reproducing this with the snippet provided, but I don't see
anything inherently wrong with the Builder usage you mentioned, assuming
the Solr base URL you provided is correct.

It would be easier to troubleshoot your issue though if you included some
more information about the NPE you're seeing. Could you post the stacktrace
to help others investigate please?

Best,

Jason

On Aug 13, 2017 5:43 AM, "Paul Smith Parker" 
wrote:

> Hello,
>
> I can’t find an example on how to properly instantiate/configure an
> instance of ConcurrentUpdateSolrClient.
>
> I tried this but it gives me a NPE:
>
> ConcurrentUpdateSolrClient solrClient = new ConcurrentUpdateSolrClient
>     .Builder("http://localhost:8389/solr/core").build();
>
> While this seems to work (it should use an internal httpClient):
>
> ConcurrentUpdateSolrClient solrClient = new ConcurrentUpdateSolrClient
>     .Builder("http://localhost:8389/solr/core")
>     .withHttpClient(null)
>     .withQueueSize(1000)
>     .withThreadCount(20)
>     .build();
>
> Is this the correct way to set it up?
>
> Thanks,
> P.


Re: ConcurrentUpdateSolrClient example?

2017-08-14 Thread Jason Gerlowski
Ah, glad you figured it out.  And thanks for the clarification.  That
does look like something that could/should be improved though.
QueueSize could be given a reasonable (and documented) default, to
save people from the IAE.  I'll take a look this afternoon and create
a JIRA if there's not a rationale behind this (which there might be).

On Mon, Aug 14, 2017 at 2:23 AM, Paul Smith Parker
 wrote:
> Hello Jason,
>
> I figured it out:
>
> 1) ConcurrentUpdateSolrClient build = new
> ConcurrentUpdateSolrClient.Builder("http://localhost:8389/solr/core").build();
> 2) ConcurrentUpdateSolrClient build = new
> ConcurrentUpdateSolrClient.Builder("http://localhost:8389/solr/core")
> .withQueueSize(20)
> .build();
>
> 1) fails with an IllegalArgumentException due to the fact that the queue
> size is not specified.
> 2) works as expected.
>
> Cheers,
> P.


Re: Issue found with install_solr_service.sh

2017-08-17 Thread Jason Gerlowski
Hi Eddie, thanks for reporting.

This is a common issue with "xargs".  When xargs doesn't receive any
input through the pipe (i.e. if "find" doesn't find anything), it
isn't smart enough to exit early and still tries to run the "chmod"
command without a filename.  The "-r" flag is present in most versions
of xargs to prevent this behavior.  I'll create a JIRA to suggest this
change.  (For context:
https://stackoverflow.com/questions/36617999/error-rm-missing-operand-when-using-along-with-find-command)
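The difference is easy to see locally, without Solr involved.  A minimal sketch (the temporary directory is just a stand-in for an empty install dir, and the -r behavior shown is GNU xargs):

```shell
# With an empty pipe, plain xargs still invokes chmod once (which then fails
# with "missing operand"); the -r/--no-run-if-empty flag skips the invocation.
tmpdir=$(mktemp -d)   # empty directory: "find -type f" matches nothing
find "$tmpdir" -type f -print0 | xargs -0 chmod 0644 2>/dev/null \
  || echo "plain xargs: chmod ran with no operands and failed"
find "$tmpdir" -type f -print0 | xargs -0 -r chmod 0644 \
  && echo "xargs -r: chmod never ran, pipeline succeeded"
rmdir "$tmpdir"
```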


Also concerning though is that "find" isn't outputting any files in
the first place.  I'll also take a look at that.  Can you provide any
information on what environment you're seeing this on (OS, version,
Solr version, etc.)?

On Wed, Aug 16, 2017 at 11:55 PM, Eddie Trejo
 wrote:
> Hi There
>
> Not sure if this is the right channel to report a possible bug, but I think 
> there is a syntax error on lines 280 - 281
>
>   find "$SOLR_INSTALL_DIR" -type d -print0 | xargs -0 chmod 0755
>   find "$SOLR_INSTALL_DIR" -type f -print0 | xargs -0 chmod 0644
>
> The below is printed on screen during the execution of the script:
>
> chmod: missing operand after ‘0750’
> Try 'chmod --help' for more information.
> chmod: missing operand after ‘0640’
> Try 'chmod --help' for more information.
>
> Downloaded source code from here 
> http://www.apache.org/dyn/closer.lua/lucene/solr/6.6.0 
> 
>
> Thanks
>
> ---
> Eddie Trejo - Infrastructure and DevOps Manager
>
> AUS:  +61 3 8618 7800
> ---
>


Re: Issue found with install_solr_service.sh

2017-08-17 Thread Jason Gerlowski
Also, can you confirm whether there are files in your install/extract directory?

On Thu, Aug 17, 2017 at 2:22 PM, Jason Gerlowski  wrote:


Re: ConcurrentUpdateSolrClient example?

2017-08-17 Thread Jason Gerlowski
Created SOLR-11256 for giving "queueSize" a default.  There's a patch
attached on that JIRA with 10 as the chosen default.  Whether that
particular value sticks or not, at least there's a fix in the works!

On Mon, Aug 14, 2017 at 9:36 AM, Jason Gerlowski  wrote:


Re: Solr returning same object in different page

2017-09-12 Thread Jason Gerlowski
Is it possible that your indexed data contains duplicated or
nearly-duplicated documents?  (See:
https://cwiki.apache.org/confluence/display/solr/De-Duplication)

Also, I'm curious whether you see the same duplicates when making a single,
larger query.  Can you run a single query that returns the number of
results normally found in two "pages" of results, and check whether you see
duplicates with that single query?
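One rough way to check for overlap between pages: save the returned ids from two consecutive pages, one id per line (something like wt=csv with fl=id gets close to this), and compare.  The file names and sample ids below are stand-ins, not real output:

```shell
# Compare the document ids returned by two consecutive pages; any id printed
# by "comm -12" appeared in both pages.
printf 'doc1\ndoc2\ndoc3\n' > page1.ids
printf 'doc3\ndoc4\ndoc5\n' > page2.ids
sort page1.ids > page1.sorted
sort page2.ids > page2.sorted
comm -12 page1.sorted page2.sorted   # lines common to both pages
rm page1.ids page2.ids page1.sorted page2.sorted
```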

On Tue, Sep 12, 2017 at 3:35 PM, ruby  wrote:

> Hi Shawn,
> No index change is happening in this case.
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: CloudSolrServer set http request timeout

2017-09-21 Thread Jason Gerlowski
Hi Vincenzo,

Have you tried setting the read/socket timeout on your client?
CloudSolrServer uses an LBHttpSolrServer under the hood, which you can
get with the getLbServer method
(https://lucene.apache.org/solr/4_1_0/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrServer.html#getLbServer()).
Once you have access to LBHttpSolrServer, you can use the
"setSoTimeout" method
(https://lucene.apache.org/solr/4_1_0/solr-solrj/org/apache/solr/client/solrj/impl/LBHttpSolrServer.html#setSoTimeout(int))
to choose an appropriate maximum timeout.

At least, that's how the Javadocs make it look in 4.x, and how I know
it works in more recent versions.  Hope that helps.

Jason

On Thu, Sep 21, 2017 at 1:07 PM, Vincenzo D'Amore  wrote:
> Hi,
>
> I have a huge problem with few queries in SolrCloud 4.8.1 that hangs the
> client.
>
> Actually I'm unable to understand even if the cluster really receives the
> requests.
>
> How can I set a timeout when the Solrj client waits too long?
>
> Best regards,
> Vincenzo
>
> --
> Vincenzo D'Amore
> email: v.dam...@gmail.com
> skype: free.dev
> mobile: +39 349 8513251


Re: Expected mime type application/octet-stream but got text/html

2017-10-17 Thread Jason Gerlowski
At a glance, I'd guess that your SolrClient object isn't set up correctly,
probably because it has the wrong "baseURL" specified.  Solr has a
"/solr/<collection>/update" URL, but the error above makes it look like
your application is reaching out to "/solr/update", which isn't a valid
endpoint.

If your SolrClient is set up with a baseUrl like "http://localhost:8983/solr",
add a collection or core to the end of the URL, such as:
"http://localhost:8983/solr/some-valid-collection".
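As an illustration, the difference is just the path segment.  The collection name "merchants" and the document fields below are hypothetical, taken loosely from the model in the original question:

```shell
# 404s: no collection/core in the path, so /solr/update does not exist:
#   http://127.0.0.1:8983/solr/update
# Works, assuming a collection named "merchants" exists:
curl "http://127.0.0.1:8983/solr/merchants/update?commit=true" \
  -H 'Content-Type: application/json' \
  -d '[{"id":"2","title":"khawaja","location":"31.5287,74.4121"}]'
```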

On Tue, Oct 17, 2017 at 2:07 AM, Shoaib  wrote:

> I have been following tutorial from below link to implement Spring data
> Solr
> http://www.baeldung.com/spring-data-solr
>
> Attached is my config file, model and repository for spring data solr.
>
> when i make any query or save my model i receive the below exception.
> my solr is working fine when i ping from browser "
> http://127.0.0.1:8983/solr/";
>
> MerchantModel model = new MerchantModel();
>
> model.setId("2");
>
> model.setLocation("31.5287,74.4121");
>
> model.setTitle("khawaja");
>
> merchantRepository.save(model);
>
> upon save i am getting the below exception
> ###
> org.springframework.data.solr.UncategorizedSolrException: Error from
> server at http://127.0.0.1:8983/solr: Expected mime type
> application/octet-stream but got text/html:
>
> Error 404 Not Found
> HTTP ERROR 404
> Problem accessing /solr/update. Reason: Not Found
>
> ; nested exception is
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error from server at http://127.0.0.1:8983/solr: Expected mime type
> application/octet-stream but got text/html:
>
> Error 404 Not Found
> HTTP ERROR 404
> Problem accessing /solr/update. Reason: Not Found
>
> ###
>
>
>
>
>
>
>
> Regards,
>
>
>
> Khawaja MUHAMMAD Shoaib
>
> Software Engineering Department
>
> *Inov8 Limited* - *Enabling the mobile payments ecosystem since 2004*
>
>
> GSM: +92 (322) 4001158
>
> Email:   *m.sho...@inov8.com.pk *
>
> URL: www.inov8.com.pk
>
>
>
>
>


Re: authentication

2017-11-18 Thread Jason Gerlowski
Hey Arkadi,

Your "nagios" user is under "role_monitoring", which has "config-read"
permissions.  The default config-read gets you access to the Config
API and Request Parameters API, but not the /admin/mbeans API (afaik).

See 
https://lucene.apache.org/solr/guide/6_6/rule-based-authorization-plugin.html#Rule-BasedAuthorizationPlugin-PredefinedPermissions
for a bit more explanation.

I think you'll need to update the configured permissions to allow
access to /admin/mbeans.  (The linked page above is a good reference
for that as well).

Best,

Jason

On Thu, Nov 16, 2017 at 8:06 AM, Arkadi Colson  wrote:
> Hi
>
> I'm having trouble with setting up authentication. My security.json looks
> like this:
>
> {
> "authentication":{
> "class":"solr.BasicAuthPlugin",
> "blockUnknown": false,
> "credentials":{
> "admin":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=
> Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c=",
> "nagios":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=
> Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c=",
> "smsc":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=
> Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="
> }
> },
> "authorization":{
> "class":"solr.RuleBasedAuthorizationPlugin",
> "user-role":{
> "admin":"role_admin",
> "nagios":"role_monitoring",
> "smsc":"role_smsc"
> },
> "permissions":[
> {
> "name":"all",
> "role":"role_admin"
> },
> {
> "name":"config-read",
> "role":"role_monitoring"
> },
> {
> "name":"read",
> "role":"role_smsc"
> },
> {
> "name":"update",
> "role":"role_smsc"
> }
> ]
> }
> }
>
> When trying to login with for example check_solr_metrics.pl and the nagios
> user the output is "CRITICAL: 403 Unauthorized request, Response code: 403".
> Solr logging is showing these lines:
>
> DEBUG - 2017-11-16 13:42:51.785; [c:smsc_lvs s:shard2 r:core_node1
> x:smsc_lvs_shard2_replica1] org.apache.solr.servlet.SolrDispatchFilter;
> Request to authenticate: Request(GET
> //solr01:8983/solr/mydoc/admin/mbeans?stats=true&cat=UPDATE&key=%2Fupdate&omitHeader=off&wt=json&start=0&rows=3)@2722dc57,
> domain: 10.1.1.42, port: 8983
> DEBUG - 2017-11-16 13:42:51.786; [c:smsc_lvs s:shard2 r:core_node1
> x:smsc_lvs_shard2_replica1] org.apache.solr.servlet.SolrDispatchFilter; User
> principal: [principal: nagios]
> DEBUG - 2017-11-16 13:42:51.786; [c:smsc_mydoc s:shard1 r:core_node2
> x:smsc_mydoc_shard1_replica1] org.apache.solr.servlet.HttpSolrCall;
> PkiAuthenticationPlugin says authorization required : true
> DEBUG - 2017-11-16 13:42:51.786; [c:smsc_mydoc s:shard1 r:core_node2
> x:smsc_mydoc_shard1_replica1] org.apache.solr.servlet.HttpSolrCall;
> AuthorizationContext : userPrincipal: [[principal: nagios]] type: [UNKNOWN],
> collections: [smsc_mydoc, smsc_mydoc,], Path: [/admin/mbeans] path :
> /admin/mbeans params
> :stats=true&omitHeader=off&cat=UPDATE&start=0&rows=3&wt=json&key=/update&collection=smsc_mydoc
> INFO  - 2017-11-16 13:42:51.786; [c:smsc_mydoc s:shard1 r:core_node2
> x:smsc_mydoc_shard1_replica1]
> org.apache.solr.security.RuleBasedAuthorizationPlugin; This resource is
> configured to have a permission {
>   "name":"all",
>   "role":"role_admin"}, The principal [principal: nagios] does not have the
> right role
> INFO  - 2017-11-16 13:42:51.787; [c:smsc_mydoc s:shard1 r:core_node2
> x:smsc_mydoc_shard1_replica1] org.apache.solr.servlet.HttpSolrCall;
> USER_REQUIRED auth header Basic bmFnaW9zOlNvbHJSb2Nrcw== context :
> userPrincipal: [[principal: nagios]] type: [UNKNOWN], collections:
> [smsc_mydoc, smsc_mydoc,], Path: [/admin/mbeans] path : /admin/mbeans params
> :stats=true&omitHeader=off&cat=UPDATE&start=0&rows=3&wt=json&key=/update&collection=smsc_mydoc
> DEBUG - 2017-11-16 13:42:51.787; [c:smsc_mydoc s:shard1 r:core_node2
> x:smsc_mydoc_shard1_replica1] org.apache.solr.servlet.HttpSolrCall; Closing
> out SolrRequest:
> {stats=true&omitHeader=off&cat=UPDATE&start=0&rows=3&wt=json&key=/update&collection=smsc_mydoc}
>
> Anybody an idea what I'm doing wrong here?
>
> Thx!
> Arkadi


Re: Performance warning: Overlapping onDeskSearchers=2 solr

2017-05-17 Thread Jason Gerlowski
Hey Shawn, others.

This is a pitfall that Solr users seem to run into with some
frequency.  (Anecdotally, I've bookmarked the Lucidworks article you
referenced because I end up referring people to it often enough.)

The immediate first advice when someone encounters these
onDeckSearcher error messages is to examine their commit settings.  Is
there any other possible cause for those messages?  If not, can we
consider changing the log/exception error message to be more explicit
about the cause?

A strawman new message could be: "Performance warning: Overlapping
onDeckSearchers=2; consider reducing commit frequency if performance
problems are encountered"

Happy to create a JIRA/patch for this; just wanted to get some
feedback first in case there's an obvious reason the messages don't
get explicit about the cause.

Jason

On Wed, May 17, 2017 at 8:49 AM, Shawn Heisey  wrote:
> On 5/17/2017 5:57 AM, Srinivas Kashyap wrote:
>> We are using Solr 5.2.1 version and are currently experiencing below Warning 
>> in Solr Logging Console:
>>
>> Performance warning: Overlapping onDeskSearchers=2
>>
>> Also we encounter,
>>
>> org.apache.solr.common.SolrException: Error opening new searcher. exceeded 
>> limit of maxWarmingSearchers=2, try again later.
>>
>>
>> The reason being, we are doing mass update on our application and solr 
>> experiencing the higher loads at times. Data is being indexed using DIH(sql 
>> queries).
>>
>> In solrconfig.xml below is the code.
>>
>> 
>>
>> Should we be uncommenting the above lines and try to avoid this error? 
>> Please help me.
>
> This warning means that you are committing so frequently that there are
> already two searchers warming when you start another commit.
>
> DIH does a commit exactly once -- at the end of the import.  One import will 
> not cause the warning message you're seeing, so if there is one import 
> happening at a time, either you are sending explicit commit requests during 
> the import, or you have autoSoftCommit enabled with values that are far too 
> small.
>
> You should definitely have autoCommit configured, but I would remove
> maxDocs and set maxTime to something like 60000 -- one minute.  The
> autoCommit should also set openSearcher to false.  This kind of commit
> will not make new changes visible, but it will start a new transaction
> log frequently.
>
> <autoCommit>
>   <maxTime>60000</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
> An automatic commit (soft or hard) with a one second interval is going to 
> cause that warning you're seeing.
>
> https://lucidworks.com/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> Thanks,
> Shawn
>


Re: Want to start contributing.

2018-08-22 Thread Jason Gerlowski
Hi Rohan,

Welcome!

To add just a bit to the good advice from Erick and Alex:

If you want a smaller issue to familiarize yourself with our
development workflow, the "newdev" label is a good place to start (as
the wiki page referenced above points out).  But I would also suggest
taking a look at issues with the "documentation" label, or creating
your own for any inconsistencies you find.  Not all documentation
issues make great starting points, but there's a few reasons I think
they tend to be good first tasks:

- we need our examples to be rock solid, since that's a common first
starting point for those using Solr (as Alex pointed out).  If
anything is unclear as you read through those, getting the docs
improved is really important.
- improving a small piece of the docs is a great way to absorb that
info as you work.
- doc JIRA's tend to be limited in scope to a particular topic or
page, which is great for new contributors
- my gut feeling is that documentation reviews are less intimidating
and are more likely to draw the eyes of a committer

Good luck, and again, welcome.

Jason
On Mon, Aug 20, 2018 at 6:41 PM Alexandre Rafalovitch
 wrote:
>
> And I would recommend to start by trying all the (10?) examples that ship
> with Solr and going through their config files. Even briefly. That may help
> you find the area to focus on, perhaps by something not being clear, etc.
>
> Regards,
>  Alex
> P.s. And Solr on Windows could always get more love, if that is your
> platform
>
> On Mon, Aug 20, 2018, 5:40 PM Erick Erickson, 
> wrote:
>
> > Rohan:
> >
> > Here's the place everybody starts ;)
> >
> > https://wiki.apache.org/solr/HowToContribute
> >
> > There's a _lot_ to Solr/Lucene, so I'd advise picking something you're
> > interested in to start rather than trying to understand _everything_.
> >
> > Best,
> > Erick
> >
> > On Mon, Aug 20, 2018 at 10:45 AM, Rohan Chhabra
> >  wrote:
> > > Hi all,
> > >
> > > I am an absolute beginner (dummy) in the field of contributing open
> > source.
> > > But I am interested in contributing to open source. How do i start? Solr
> > is
> > > a java based search engine based on Lucene. I am good at Java and
> > therefore
> > > chose this to start.
> > >
> > > I need guidance. Help required!!
> >


Re: Non-Essential Components

2018-09-20 Thread Jason Gerlowski
Hi Seamus,

Have you looked at the "contrib" module?  The "contrib" module is a
collection of commonly used packages that aren't strictly required for
Solr's "core".  What's more, "contrib" is pretty large: it takes up
~82mb of a ~185mb Solr distribution.  If you don't plan on using any
contrib modules, that's where I'd start.

Best,

Jason
On Thu, Sep 20, 2018 at 3:08 AM Holland, Seamus Martin
 wrote:
>
> Hello,
>
>
> I am helping implement solr for a "downloadable library" of sorts. The 
> objective is that communities without internet access will be able to access 
> a library's worth of information on a small, portable device. As such, I am 
> working within strict space constraints. What are some non-essential 
> components of solr that can be cut to conserve space for more information?
>
>
> Best,
>
> Seamus Holland
>
>
> ---
>
>
> Seamus Holland
>
> Executive Board, Parr Center for Ethics
>
> President, UNC Irish Sports
>
> UNC-Chapel Hill, Class of 2020
>
> Computer Science (B.S.) & Philosophy (B.A.)
>
> sea...@live.unc.edu | (919)780-8252


Querying with ConcurrentUpdateSolrClient

2018-09-25 Thread Jason Gerlowski
Hi all,

The Javadocs for ConcurrentUpdateSolrClient steer users away from
using it for query requests:

"Although any SolrClient request can be made with this implementation,
it is only recommended to use ConcurrentUpdateSolrClient with /update
requests. The class HttpSolrClient is better suited for the query
interface."

Looking at CUSC's code though, it immediately defers all non-update
requests to an internal HttpSolrClient.
(https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrClient.java#L477)
 I can't see how this would be any better or worse than using an
unwrapped HttpSolrClient instead.
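For anyone following along, the split the javadocs seem to recommend would look roughly like the sketch below (URLs, queue size, and thread count are made-up placeholders, not a recommendation):

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

// Hedged sketch: CUSC for updates, HttpSolrClient for queries, per the
// javadoc advice quoted above.  Since CUSC delegates query requests to
// an internal HttpSolrClient anyway, the second client may be redundant.
ConcurrentUpdateSolrClient updateClient =
    new ConcurrentUpdateSolrClient.Builder("http://localhost:8983/solr/techproducts")
        .withQueueSize(100)
        .withThreadCount(2)
        .build();
HttpSolrClient queryClient =
    new HttpSolrClient.Builder("http://localhost:8983/solr/techproducts").build();

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "1");
updateClient.add(doc);                    // buffered, sent asynchronously
updateClient.commit();
queryClient.query(new SolrQuery("*:*"));  // query via the "recommended" client
```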

Is there something I'm missing that changes how this internal
HttpSolrClient behaves?  Or is the advice in CUSC's javadocs maybe
outdated and ripe for removal?

Best,

Jason


Re: Making Solr Indexing Errors Visible

2018-09-30 Thread Jason Gerlowski
Hi

Also worth mentioning that bin/post only handles certain file
extensions, and AFAIR it doesn't mention specifically when it skips
over a file because of the extension. You mentioned you're trying to
index Word docs and pdf's.  Are there any other formats in the
directory that might be messing up your counts?

I also second Shawn's suggestion that you post the "bin/post" output
and a directory listing.  Additionally, if you're able to clean up the
output a bit, you might be able to diff the two lists of files and see
if the ones missing have anything particular in common.

Good luck,

Jason
On Thu, Sep 27, 2018 at 9:58 AM Shawn Heisey  wrote:
>
> On 9/26/2018 2:39 PM, Terry Steichen wrote:
> > Let me try to clarify a bit - I'm just using bin/post to index the files
> > in a directory.  That indexing process produces a lengthy screen display
> > of files that were indexed.  (I realize this isn't production-quality,
> > but I'm not ready for production just yet, so that should be OK.)
>
> I see a previous message on the list from you indicating solr 6.6.0.
> FYI there are five bugfix releases after 6.6.0 -- the latest 6.x release
> is 6.6.5.  I don't see any fixes related to the post tool, but maybe one
> of the problems that did get fixed might help your server behave better.
>
> Switching my source checkout to the 6.6.0 tag and checking that version...
>
> Each time a file is sent, you should get a log line starting with
> "POSTing file".
>
> The error detection in SimplePostTool has a bunch of parts.  It seems
> that *most* errors will abort the tool entirely, skipping any files that
> have not yet been processed, and logging a message with "FATAL" included.
>
> Can you show us a directory listing and all the output that you get from
> bin/post when processing that directory?
>
> Thanks,
> Shawn
>


Re: Metrics API via Solrj

2018-10-03 Thread Jason Gerlowski
Hi Deniz,

I don't think there are any classes that simplify accessing the
metrics API like there are for other APIs (e.g.
CollectionAdminRequest, CoreAdminRequest, ..).  But sending metrics
requests in SolrJ is still possible; it's just a little bit more
complicated.

Anytime you want to make an API call that doesn't have specific
objects for it, you can use one of the general-purpose SolrRequest
objects.  I've included an example below that reads the
"classes.loaded" JVM metric:

final SolrClient client =
    new HttpSolrClient.Builder("http://localhost:8983/solr").build();

final ModifiableSolrParams params = new ModifiableSolrParams();
params.set("group", "jvm");
final GenericSolrRequest req =
    new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/metrics", params);

SimpleSolrResponse response = req.process(client);
NamedList<Object> respNL = response.getResponse();
NamedList<Object> metrics = (NamedList<Object>) respNL.get("metrics");
NamedList<Object> jvmMetrics = (NamedList<Object>) metrics.get("solr.jvm");
Long numClassesLoaded = (Long) jvmMetrics.get("classes.loaded");
System.out.println("Num classes loaded was: " + numClassesLoaded);

It's a little more painful to have to dig through the NamedList
yourself, but it's still very do-able.  Hope that helps.

Best,

Jason
On Wed, Oct 3, 2018 at 3:03 AM deniz  wrote:
>
> Are there anyway to get the metrics via solrj ? all of the examples seem like
> using plain curl or http reqs with json response. I have found
> org.apache.solr.client.solrj.io.stream.metrics package, but couldnt figure
> out how to send the requests via solrj...
>
> could anyone help me to figure out how to deal with metrics api on solrj?
>
>
>
> -
> Zeki ama calismiyor... Calissa yapar...
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: How to do rollback from solrclient using python

2018-10-03 Thread Jason Gerlowski
Hi Chetra,

The syntax that you're looking for is "/solr/someCoreName/update?rollback=true".

But I'm afraid Rollback might not be quite what you think it is.  You
mentioned: "but it doesn't work, whenever there is a commit the
request still updates on the server".  Yes, that is the expected
behavior with rollbacks.  Rollbacks reset your index to the last
commit point.  If there was a commit right before a rollback, the
rollback will have no effect.

One last point is that you should be very careful using rollbacks.
Rollbacks are going to undo all changes to your index since the last
commit.  If you have more than one client thread changing documents,
this can be very dangerous as you will reset a lot of things you
didn't intend.  Even if you can guarantee that there's only one client
making changes to your index, and that client is itself
single-threaded, the result of a rollback is still indeterminate if
you're using server auto-commit settings.  The client-triggered
rollback will occasionally race against the server-triggered commit.
Will your doc changes get rolled back?  They will if the rollback
happens first, but if the commit happens right before the rollback,
your rollback won't do anything!  Anyways rollbacks have their place,
but be very careful when using them!
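In case it helps to see the sequence end-to-end, the SolrJ (Java) equivalent of that HTTP call looks roughly like this; only the core name "mysolr" comes from your mail, everything else is a sketch:

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

// Hedged sketch: add a doc, then undo all uncommitted changes.  The
// rollback only helps if no commit (manual or auto) landed in between.
SolrClient client =
    new HttpSolrClient.Builder("http://localhost:8983/solr").build();
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "doomed-doc");
client.add("mysolr", doc);     // uncommitted update
client.rollback("mysolr");     // resets the index to the last commit point
```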

Hope that helps,

Jason
On Wed, Oct 3, 2018 at 4:41 AM Chetra Tep  wrote:
>
> Hi Solr team,
> Current I am creating a python application that accesses to solr server.
> I have to handle updating document and need a rollback function.
> I want to send a rollback request whenever exception occurs.
> first I try sth like this from curl command :
> curl http://localhost:8983/solr/mysolr/update?command=rollback
> and I also try
> curl http://localhost:8983/solr/mysolr/update?rollback true
>
> but it doesn't work. whenever there is a commit the request still updates
> on the server.
>
> I also try to submit xml document  , but it doesn't work, too.
>
> Could you guide me how to do this?  I haven't found much documentation
> about this on the internet.
>
> Thanks you in advance.
> Best regards,
> Chetra


Re: Is there a tool to directly index hdfs files to solr?

2018-10-21 Thread Jason Gerlowski
Not familiar with the contrib you mentioned, or the rationale behind
its removal.  But as to your first question, you might be interested
in looking at: https://github.com/lucidworks/hadoop-solr

Disclaimer: I help maintain the "hadoop-solr" project mentioned.
On Thu, Oct 18, 2018 at 8:17 AM shreck  wrote:
>
> why  remove "\solr\contrib\map-reduce" lib from solr6.6.1?
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Does ConcurrentUpdateSolrClient apply for SolrCloud ?

2018-10-25 Thread Jason Gerlowski
One comment to complicate Erick's already-good advice.

> If a doc that needs to go to shard2 is received by a replica on shard1, it 
> must be forwarded to the leader of shard2, introducing an extra hop.

Definitely true, but I don't think that's the only factor in the
relative performance of CUSC vs CSC.  CUSC responds asynchronously
when you're using it for updates, which lets users continue on to
prepare the next set of docs while a CloudSolrClient might still be
waiting to hear back from Solr.  I benchmarked this recently and was
surprised to see that ConcurrentUpdateSolrClient actually came out
ahead in some setups.

Now I'm not trying to say that CUSC performs better than CSC, just
that "It Depends" (Erick's TM) on the rest of your ETL code, on the
topology of your SolrCloud cluster, etc.

Good luck!

Jason



On Wed, Oct 24, 2018 at 6:49 PM shamik  wrote:
>
> Thanks Erick, appreciate your help
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


JSON Query DSL Param Dropping/Overwriting

2018-10-28 Thread Jason Gerlowski
Hi all,

Had a question about how parameters are combined/overlaid in the JSON
Query DSL.  Ran into some behavior that struck me as odd/maybe-buggy.

The query DSL allows params to be provided a few different ways:
1. As query-params in the URI (e.g. "/select?fq=inStock:true")
2. In the JSON request proper (e.g. "json={'filter': 'inStock:true'}"
3. In a special "params" block in the JSON (e.g. "json={ params:
{'fq': 'inStock:true'}}")

When the same parameter (e.g. fq/filter) is provided in more than one
syntax, Solr generally respects each of them, adding all filters to
the request.  But when a filter is present in the JSON "params" block,
it gets ignored or overwritten. (This is reflected in the results, but
it's also easy to tell from the "parsed_filter_queries" block when
requests are run with debug=true)  Does anyone know if this is
expected or a mistake?  It seems like a bug, but maybe I'm just not
seeing the use-case.

I've got a more detailed script showing the difference here for anyone
interested: https://pastebin.com/u1pdMvrq
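For a quicker look than the pastebin, the kind of request I'm testing with looks roughly like this in SolrJ (field and filter values are placeholders):

```java
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

// Hedged sketch: the same logical kind of filter supplied all three
// ways on a single request.
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("q", "*:*");
params.add("fq", "inStock:true");                         // 1. plain query param
params.set("json", "{\"filter\": \"price:[0 TO 100]\","   // 2. JSON request body
    + " \"params\": {\"fq\": \"popularity:[5 TO *]\"}}"); // 3. JSON "params" block
params.set("debug", "true");  // parsed_filter_queries shows which filters survive
QueryRequest req = new QueryRequest(params);
```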

Best,

Jason


Re: CloudSolrClient produces tons of CLUSTERSTATUS commands against single server in Cloud

2018-11-06 Thread Jason Gerlowski
My understanding was that we always tried to use the cached version of
this information until either (a) Solr responds in a way that
indicates our cache is out of date, or (b) the TTL on the cache entry
expires.  Though there might very well be a code path that behaves
differently as Erick suggests above.

A few more questions that might shed light on this for you (or for us):
1. How are you creating your CloudSolrClient?  Can you share the
relevant code?
2. Did you modify the TTL on your cache via CloudSolrClient's
"setCollectionCacheTTl" method?
3. Are all of the CLUSTERSTATUS requests you're seeing for the same
collection, or different collections?  How many collections do you
have on your cluster?

Best,

Jason

On Tue, Nov 6, 2018 at 11:25 AM Erick Erickson  wrote:
>
> Is the box you're seeing this on the Overseer? Or is it in any other
> way "special", like has all the leaders? And I'm assuming all these
> are NRT replicas, not TLOG or PULL.
>
> What are you doing when these occur? Queries? Updates? If you're doing
> updates, are these coincident with each request? Each commit (which
> you shouldn't be doing from the client anyway)? If they're coincident
> with updating, are you updating in batches or a single doc at a time?
>
> I can imagine each update or commit gets the status, although even
> that seems questionable.
>
> If you can pin down a bit what actions trigger the request that'd help a lot.
>
> Best,
> Erick
>
>
>
> On Tue, Nov 6, 2018 at 8:06 AM Zimmermann, Thomas
>  wrote:
> >
> > Question about CloudSolrClient and CLUSTERSTATUS. We just deployed a 3 
> > server ZK cluster and a 5 node solr cluster using the CloudSolrClient in 
> > Solr 7.4.
> >
> > We're seeing a TON of traffic going to one server with just cluster status 
> > commands. Every single query seems to be hitting this box for status, but 
> > the rest of the query load is divided evenly amongst the servers. Is this 
> > an expected interaction in this client?
> >
> > For example - 75k request per minute going to this one box, and 3.5k RPM to 
> > all other nodes in the cloud.
> >
> > All of those extra requests on the one box are 
> > "/solr/admin/collections?collection=collectionName&action=CLUSTERSTATUS&wt=javabin&version=2"
> >
> > Our plan right now is to roll back to the basic HTTP client and pass all 
> > traffic through our load balancer, but would like to understand if this is 
> > an expected interaction for the Cloud Client, a misconfiguration on our 
> > end, or a bug


Re: CloudSolrClient produces tons of CLUSTERSTATUS commands against single server in Cloud

2018-11-08 Thread Jason Gerlowski
Hey Shawn,

A few answers, where I can give them.

1. It's easy to miss in the thread, but the user mentioned that
they're creating their CloudSolrClient via solr URLs.
2. When you create a CloudSolrClient with a Solr URL, it's not just
used to fetch the ZK connString so that it can use ZK from then on.
It continues to use the Solr URL for fetching cluster state.  If
you're curious, the codepaths are "ZkClientClusterStateProvider", and
"HttpClusterStateProvider", respectively.
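For concreteness, the two construction modes look roughly like this (hosts are placeholders):

```java
import java.util.Collections;
import java.util.Optional;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

// Hedged sketch: URL-based construction gets an HttpClusterStateProvider,
// which keeps fetching cluster state over HTTP (hence CLUSTERSTATUS calls)...
CloudSolrClient viaUrls = new CloudSolrClient.Builder(
        Collections.singletonList("http://solr1:8983/solr")).build();

// ...while zkHost-based construction gets a ZkClientClusterStateProvider,
// which watches cluster state in ZooKeeper instead.
CloudSolrClient viaZk = new CloudSolrClient.Builder(
        Collections.singletonList("zk1:2181"), Optional.empty()).build();
```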

I don't know much about the ClusterStateProvider implementations to
say anything more helpful about the original problem, but figured I'd
chime in where I could.

Best,

Jason
On Wed, Nov 7, 2018 at 12:42 PM Shawn Heisey  wrote:
>
> On 11/6/2018 10:06 PM, Gus Heck wrote:
> > One thing that causes a clusterstatus call is alias resolution if the
> > HttpClusterStateProvider is in use instead of the ZkClusterStateProvider.
> > I've just been fixing spurious error messages generated by this
> > in SOLR-12938.
>
> Gus,
>
> If CloudSolrClient is created using URLs instead of a zkHost, does it do
> EVERYTHING with http instead of ZK?  Because if it does, it might
> actually make sense for it to initiate lot of http requests.  The ZK
> client is capable of nearly instantaneous notification of cluster
> changes ... duplicating that with HTTP would require constant checking.
>
> What I would have hoped for when constructing CloudSolrClient with URLs
> was that it would ask the server(s) for the zkHost setting, and then
> proceed to use ZK.  But I suppose if you're using URLs because ZK isn't
> reachable, that would be problematic.
>
> Thomas,
>
> Are you initializing CloudSolrClient with your ZK server info, or using
> one or more Solr URLs?
>
> Thanks,
> Shawn
>


Re: Query regarding SolrJ

2018-11-12 Thread Jason Gerlowski
Hi,

SolrJ is a client library that helps your application talk to Solr.
It's not a full application that can be run on its own.  So the error
message you got is correct.  It's not a standalone application.   For
more information on using SolrJ, see the documentation here:
https://lucene.apache.org/solr/guide/6_6/using-solrj.html .

If you can be a little more explicit on what you're trying to
accomplish, maybe someone here can help you further as well.

Best,

Jason
On Mon, Nov 12, 2018 at 12:37 AM Harshit Arora
 wrote:
>
> Hi,
>
> i am currently using Apache Solr 6. i tried to launch SolrJ6 jar file in
> eclipse oxygen by building a path and running it on jetty server but it
> said selection does not contain main type. i also tried running java -jar
> solr-solr-6.0.0.jar but it also said selection does not contaon main type.
> Please let me know about correct procedure if what i am doing is wrong.
> thanks.


Re: Upgrade 6.2.1 to 7.5.0 - "Connection evictor" Threads not closed

2018-11-26 Thread Jason Gerlowski
Hey Sebastian,

As for how Solr/SolrJ compatibility is handled, the story for SolrJ
looks a lot like the story for Solr itself - major version changes can
introduce breaking changes, so it is best to avoid using SolrJ 6.x
with Solr 7.x.  In practice I think changes that break Solr/SolrJ
compatibility are relatively rare though, so it might be possible if
your hand is forced.

As for the behavior you described...I think I understand what you're
describing, but to make sure:  Are the "connection-evictor" threads
accumulating in your client application, on the Solr server itself, or
both?

I suspect you're seeing this in your client code.  If so, it'd really
help us to help you if you could provide some more details on how
you're using SolrJ.  Can you share a small snippet (JUnit test?) that
reproduces the problem?  How are you creating the SolrClient you're
using to send requests?  Which SolrClient implementation(s) are you
using?  Are you providing your own HttpClient, or letting SolrClient
create its own?  It'll be much easier for others to help with a little
more detail there.

Best,

Jason

On Fri, Nov 23, 2018 at 10:38 AM Sebastian Riemer  wrote:
>
> Hi,
>
> we've recently changed our Solr-Version from 6.2.1 to 7.5.0, and since then, 
> whenever we execute a query on solr, a new thread is being created and never 
> closed.
>
> These threads are all labelled "Connection evictor" and the gather until a 
> critical mass is reached and either the OS cannot create anymore OS threads, 
> or an out of memory error is being produced.
>
> First I thought, that this might have as cause we were using a higher 
> SolrJ-Version than our Solr-Server (by mistakenly forgetting to uprade the 
> server version too):
>
> So we had for SolrJ: 7.4.0
>
> 
> org.apache.solr
> solr-solrj
> 7.4.0
> 
>
> And for Solr-Server:  6.2.1
>
> But now I just installed the newest Solr-Server-Version 7.5.0 and still I see 
> with each Solr-Search performed an additional Thread being created and never 
> released.
>
> When downgrading SolrJ to 6.2.1 I can verify, that no new threads are created 
> when doing a solr search.
>
> What do you think about this? Are there any known pitfalls? Maybe I missed 
> some crucial changes necessary when upgrading to 7.5.0?
>
> What about differing versions in SolrJ and Solr-Server? As far as I recall 
> the docs, one major-version-difference up/down in both ways should be o.k.
>
> Thanks for all your feedback,
>
> Yours sincerely
>
> Sebastian Riemer


Re: Time-Routed Alias Not Distributing Wrongly Placed Docs

2018-11-28 Thread Jason Gerlowski
Hi John,

I'm not an expert on TRA, but I don't think so.  The TRA functionality
I'm familiar with involves creating and deleting underlying
collections and then routing documents based on that information.  As
far as I know that happens at the UpdateRequestProcessor level - once
your data is indexed there's nothing available to move it around.

Best,

Jason
On Tue, Nov 27, 2018 at 12:42 PM John Nashorn  wrote:
>
> Hello Everyone,
> I'm using "hive-solr" from Lucidworks to index my data into Solr (v:7.5, 
> cloud mode). As written in the Solr Manual, TRA expects documents to be 
> indexed using its alias name, and not directly into the collections under it. 
> Unfortunately, hive-solr doesn't allow using TRA names as indexing targets. 
> So what I do is: I index data using the first collection created by TRA and 
> expect Solr to distribute my data into its respective collection under the 
> hood. This works to some extent, but a big portion of data stays in where 
> they were indexed, ie. the first collection of the TRA. For example 
> (approximate numbers):
>
> * coll_2018-07-01 => 800.000.000 docs
> * coll_2018-08-01 => 0 docs
> * coll_2018-09-01 => 0 docs
> * coll_2018-10-01 => 150.000.000 docs
> * coll_2018-11-01 => 0 docs
>
> Here, coll_2018-07-01 contains data that should normally be in the other four 
> collections.
>
> Is there a way to make TRA scan (somehow intentionally) misplaced data and 
> send them to their correct places?


Re: Documentation on SolrJ

2018-11-30 Thread Jason Gerlowski
Hi Thomas,

I recently added a first pass at JSON faceting support to SolrJ.  The
main classes are "JsonQueryRequest" and "DirectJsonQueryRequest" and
live in the package "org.apache.solr.client.solrj.request.json"
(https://github.com/apache/lucene-solr/tree/master/solr/solrj/src/java/org/apache/solr/client/solrj/request/json).
I've also added examples of how to use this code on the "JSON
Faceting" page in the Solr ref guide.  Unfortunately, since this is a
recent addition it hasn't been released yet.  These classes will be in
the next 7x release (if there is one), or in 8.0 when that arrives.
This probably isn't super helpful for you.

Without this code, you have a few options:

1. If the facet requests you'd like to make are relatively
structured/similar, you can subclass QueryRequest and override
getContentWriter().  "ContentWriters" are the abstraction SolrJ is
using to write out the request body.  So you can trivially implement
getContentWriter to wrap a hardcoded string with some templated
variables. If interested, also checkout
"RequestWriter.StringPayloadContentWriter".  This'll be sufficient for
very cookie cutter facet requests, where maybe only a few parameters
change but nothing else.
2. If hardcoding a string JSON body is too inflexible, the JSON
faceting API is "just query params" like everything else.  You can
build your facet request and attach it to the request as a SolrParams
entry.  Doing this wouldn't be the most fun code to write, but it's
always possible.
3. You can copy-paste the unreleased JSON faceting helper classes I
mentioned above into your codebase.  They're not released in SolrJ but
you can still use them by copying them locally and using those copies
until you're able to use a SolrJ that contains these classes.  If you
go this route, please let me or someone else in the community know
your thoughts.  Their being unreleased makes them a bit more of a pain
to use, but it also gives us an opportunity to iterate and improve
them before a release comes and ties us to the existing (maybe awful)
interfaces.
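To make option 1 a bit more concrete, a subclass might look roughly like this (an untested sketch; treat the details as my assumptions rather than a blessed recipe):

```java
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.client.solrj.request.RequestWriter;
import org.apache.solr.common.params.SolrParams;

// Hedged sketch of option 1: override getContentWriter() so the request
// body is a (possibly templated) JSON facet string.
public class JsonFacetQueryRequest extends QueryRequest {
    private final String jsonBody;

    public JsonFacetQueryRequest(SolrParams params, String jsonBody) {
        super(params, METHOD.POST);
        this.jsonBody = jsonBody;
    }

    @Override
    public RequestWriter.ContentWriter getContentWriter(String expectedType) {
        return new RequestWriter.StringPayloadContentWriter(jsonBody,
            "application/json");
    }
}
```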

> It would be wonderful if a document of this caliber was provided solely for 
> SolrJ in the form of a tutorial.
We definitely need more "SolrJ Examples" coverage, though I'm not sure
the best way to expose/structure that.  Solr has a *ton* of API
surface area, and SolrJ is responsible for covering all of it.  Even
if I imagine a SolrJ version of the standard "Getting Started"
tutorial which shows users how to create a collection, index docs, do
a query, and do a faceting request...that'd only cover a fraction of
what's out there.  It might be easier to scale our SolrJ examples by
integrating them into the pages we already have for individual APIs
instead.  I'm all for a SolrJ tutorial, or SolrJ Cookbook sort of
thing if you like those ideas better though, and would also volunteer
to help edit or review things in that area.

Sorry, this got a little long.  But hope that helps.

Best,

Jason
On Fri, Nov 30, 2018 at 11:31 AM Cassandra Targett
 wrote:
>
> Support for the JSON Facet API in SolrJ was very recently committed via 
> https://issues.apache.org/jira/browse/SOLR-12965 
> . This missed the cut-off 
> for 7.6 but will be included in 7.7 (if there is one) and/or 8.0. You may be 
> able to use the patch there to see if there are gaps or bugs that could be 
> fixed before 7.7 / 8.0.
>
> Jason, who did the work on that issue, also presented on SolrJ at the 
> Activate conference, you may find it interesting:
> https://www.youtube.com/watch?v=ACPUR_GL5zM 
> 
>
> If you do find the time to write some docs, I’d be happy to give you some 
> editing help. Just open a Jira issue when/if you’ve got something and we can 
> go from there.
>
> > On Nov 30, 2018, at 9:53 AM, Thomas L. Redman  wrote:
> >
> > Hi Shawn, thanks for the prompt reply!
> >
> >> On Nov 29, 2018, at 4:55 PM, Shawn Heisey  wrote:
> >>
> >> On 11/29/2018 2:01 PM, Thomas L. Redman wrote:
> >>> Hi! I am wanting to do nested facets/Grouping/Expand-Collapse using 
> >>> SolrJ, and I can find no API for that. I see I can add a pivot field, I 
> >>> guess to a query in general, but that doesn’t seem to work at all, I get 
> >>> an NPE. The documentation on SolrJ is sorely lacking, the documentation I 
> >>> have found is less than a readme. Are there any books that provided a 
> >>> good tretise on SolrJ specifically? Does SolrJ support these more 
> >>> advanced features?
> >>
> >> I don't have any specific details for that use case.
> >
> > Check out page 498 of the PDF, that includes a brief but powerful 
> > discussion of the JSON Facet API. For just one example, I am interested in 
> > faceting a nominal field within a date range bucket. Example: I want to 
> > facet publication_date field into YEAR buckets, and within each YEAR 
> > bucket, facet on author to get the most prolific authors in that year, AND 
> > to also fa

Re: Documentation on SolrJ

2018-12-01 Thread Jason Gerlowski
> You guys are in need of more documentation. I hope I’m not hurting any 
> feelings, that is not my intention.
I can't imagine anyone would.  In a sense that's what this mailing
list is for-  to share what works, what doesn't, and what's really
lacking.  Thanks for chiming in.

Jason
On Sat, Dec 1, 2018 at 12:56 PM Erick Erickson  wrote:
>
> Thomas:
>
> All contributions welcome! Opensource software lives and dies by
> people stepping up and contributing when they see something they want
> to improve, come join the club and help make it better.
>
> Here's the basics of getting started:
> https://wiki.apache.org/solr/HowToContribute
>
> The part you care most about is probably getting the source code,
> which includes the reference guide source (all the *.adoc files). In
> essence, the process is
> > create a logon for the Lucene/Solr JIRA, this should get you in the right 
> > vicinity: https://issues.apache.org/jira/projects/SOLR/issues/
> > pull the source (including docs)
> > make whatever changes you want
> > create a JIRA describing your changes
> > attach a patch (or a pull request if you're git-savvy) to the JIRA
> > prompt for a committer to push it to the repo.
>
> Here's a bit about documentation in particular:
> https://lucene.apache.org/solr/guide/7_0/how-to-contribute.html
>
> Don't be too worried about working with AsciiDoc, just download Atom.
> Or if you use IntelliJ (and I assume Eclipse) or your favorite editor
> supports an AsciiDoc plugin use that.
>
> Best,
> Erick
> On Sat, Dec 1, 2018 at 9:31 AM Thomas L. Redman  wrote:
> >
> > Hi Jason. You Solr folks are really on top of things, I thank you Cassandra 
> > and Shawn for all the excellent support.
> >
> > Short story, I can wait. I am building a 1.0 version of a new tool to query 
> > our very complex and large (100M docs) datastore, not to find individual 
> > documents, but to find subsets of the data suitable for end users (Social 
> > Science mostly) researchers. As soon as we get to 7.6/8.0, I will work 
> > toward a 1.1 release to include the improved grouping, nested faceting and 
> > so on.  To know this is even in the pipe makes my day.
> >
> > You guys are in need of more documentation. I hope I’m not hurting any 
> > feelings, that is not my intention. Solr is a top shelf product, and I 
> > would not be one to minimize all the hard work. I think I agree with you 
> > Jason, some additions to the existing tutorial to cover more complex query 
> > capabilities would probably do the trick. I don’t think you need 600 pages 
> > like the Solr Ref Guide document. This will make more sense to do when we 
> > get to the 8.0 release (or the next release including JSON API support). I 
> > retire next year, may have some free time to build a more extensive query 
> > exemplar and document that. Is there a formal procedure I need to adhere to 
> > if I want to contribute?
> >
> >
> >
> > > On Nov 30, 2018, at 10:40 AM, Jason Gerlowski  
> > > wrote:
> > >
> > > Hi Thomas,
> > >
> > > I recently added a first pass at JSON faceting support to SolrJ.  The
> > > main classes are "JsonQueryRequest" and "DirectJsonQueryRequest" and
> > > live in the package "org.apache.solr.client.solrj.request.json"
> > > (https://github.com/apache/lucene-solr/tree/master/solr/solrj/src/java/org/apache/solr/client/solrj/request/json).
> > > I've also added examples of how to use this code on the "JSON
> > > Faceting" page in the Solr ref guide.  Unfortunately, since this is a
> > > recent addition it hasn't been released yet.  These classes will be in
> > > the next 7x release (if there is one), or in 8.0 when that arrives.
> > > This probably isn't super helpful for you.
> > >
> > > Without this code, you have a few options:
> > >
> > > 1. If the facet requests you'd like to make are relatively
> > > structured/similar, you can subclass QueryRequest and override
> > > getContentWriter().  "ContentWriters" are the abstraction SolrJ is
> > > using to write out the request body.  So you can trivially implement
> > > getContentWriter to wrap a hardcoded string with some templated
> > > variables. If interested, also checkout
> > > "RequestWriter.StringPayloadContentWriter".  This'll be sufficient for
> > > very cookie cutter facet requests, where maybe only a few parameters
> > > change but nothing else.
> > > 2. If hardcoding a

Re: Last Modified Timestamp

2019-01-02 Thread Jason Gerlowski
Hi Antony,

I don't know a ton about DIH, so I can't answer your question myself.
But you might have better luck getting an answer from others if you
include more information about the behavior you're curious about.
Where do you see this Last Modified timestamp (in the Solr admin UI?
on your filesystem?  If so, on what files?) . How are you importing
documents (what is your DIH config?). etc.

Best,

Jason

On Wed, Dec 19, 2018 at 11:56 AM Antony A  wrote:
>
> Hello Solr Users,
>
> I am trying to figure out if there was a reason for "Last Modified: about
> 20 hours ago" remaining unchanged after a full data import into solr. I am
> running solr cloud on 7.2.1.
>
> I do see this value and also the numDocs value change on a Delta import.
>
> Thanks,
> Antony


Re: The parent shard will never be delete/clean?

2019-01-22 Thread Jason Gerlowski
Hi,

You might want to check out the documentation, which goes over
split-shard in a bit more detail:
https://lucene.apache.org/solr/guide/7_6/collections-api.html#CollectionsAPI-splitshard

To answer your question directly though, no.  Split-shard creates two
new subshards, but it doesn't do anything to remove or cleanup the
original shard.  The original shard remains with its data and will
delegate future requests to the result shards.
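One thing beyond your direct question (an assumption on my part, since you didn't mention cleanup plans): if you do eventually want the inactive parent gone, that's a separate, explicit DELETESHARD call, e.g. via SolrJ:

```java
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

// Hedged sketch: explicitly removing the inactive parent shard after a
// split has finished.  URL, collection, and shard names are placeholders.
HttpSolrClient client =
    new HttpSolrClient.Builder("http://localhost:8983/solr").build();
CollectionAdminRequest.DeleteShard deleteParent =
    CollectionAdminRequest.deleteShard("myCollection", "shard1");
deleteParent.process(client);
```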

Hope that helps,

Jason

On Tue, Jan 22, 2019 at 4:17 AM zhenyuan wei  wrote:
>
> Hi,
>If I split shard1 to shard1_0,shard1_1, Is the parent shard1 will
> never be clean up?
>
>
> Best,
> Tinswzy


Re: Asynchronous Calls to Backup/Restore Collections ignoring errors

2019-02-04 Thread Jason Gerlowski
Hi Steffen,

There are a few "known issues" in this area.  Probably most relevant
is SOLR-6595, which covers a few error-reporting issues for
"collection-admin" operations.  I don't think we've gotten any reports
yet of success/failure determination being broken for asynchronous
operations, but that's not too surprising given my understanding of
how that bit of the code works.  So "yes", this is a known issue.
We've made some progress towards improving the situation, but there's
still work to be done.

As for workarounds, I can't think of any clever suggestions.  You
might be able to issue a query to the collection to see if it returns
any docs, or a particular number of expected docs.  But that may not
be possible, depending on what you meant by the collection being
"unusable" above.
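If it's any use, that doc-count check could be as simple as the sketch below (URL, collection name, and expected count are placeholders):

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

// Hedged sketch of the workaround: after REQUESTSTATUS says "completed",
// verify the restored collection actually answers with the expected count.
HttpSolrClient client =
    new HttpSolrClient.Builder("http://localhost:8983/solr").build();
QueryResponse rsp =
    client.query("gettingstarted", new SolrQuery("*:*").setRows(0));
long numFound = rsp.getResults().getNumFound();
if (numFound < 1000) {  // whatever count you expect post-restore
    throw new IllegalStateException("Restore looks incomplete: " + numFound);
}
```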

Best,

Jason

On Thu, Jan 31, 2019 at 10:10 AM Steffen Moldenhauer
 wrote:
>
> Hi all,
>
> we are using the collection API backup and restore to transfer collections 
> from a pre-prod to a production system. We are currently using Solr version 
> 6.6.5
> But sometimes that automated process fails and collections are not working on 
> the production system.
>
> It seems that the asynchronous API calls backup and restore do not report 
> some errors/exceptions.
>
> I tried it with the solrcloud gettingstarted example:
>
> http://localhost:8983/solr/admin/collections?action=BACKUP&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup
>
> http://localhost:8983/solr/admin/collections?action=DELETE&name=gettingstarted
>
> Now I simulate an error just by deleting somthing from the backup in the 
> file-system and try to restore the incomplete backup:
>
> http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup&async=1000
>
> http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=1000
> <int name="status">0</int><int name="QTime">2</int><str 
> name="state">completed</str><str name="msg">found [1000] in completed 
> tasks</str>
>
> The status is completed but the collection is not usable.
>
> With a synchronous restore call I get:
>
> http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup
> <int name="status">500</int><int name="QTime">6456</int>
> <str name="error-class">org.apache.solr.common.SolrException</str>
> <str name="root-error-class">org.apache.solr.common.SolrException</str>
> <str name="msg">Could not restore core</str>
> <str name="trace">org.apache.solr.common.SolrException: Could not restore core
>at 
> org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:300)
>at 
> org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:237)
>at 
> org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:215)
>at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
>at 
> org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:748)
>at 
> org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:729)
>at 
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:510)
>at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
>at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
>at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
>at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>at 
> org.eclipse.jetty.rewrite

Re: CloudSolrClient getDocCollection

2019-02-08 Thread Jason Gerlowski
Hi Henrik,

I'll try to answer, and let others correct me if I stray.  I wasn't
around when CloudSolrClient was written, so take this with a grain of
salt:

"Why does the client need that timeout?  Wouldn't it make sense to
use a watch?"

You could probably write a CloudSolrClient that uses watch(es) to keep
track of changing collection state.  But I suspect you'd need a
watch-per-collection, instead of just a single watch.

Modern versions of Solr store the state for each collection in
individual "state.json" ZK nodes
("/solr/collections/<collection-name>/state.json").  To catch changes
to all of these collections, you'd need to watch each of those nodes.
Which wouldn't scale well for users who want lots of collections.  I
suspect this was one of the concerns that nudged the author(s) to use
a cache-based approach.

(Even when all collection state was stored in a single ZK node, a
watch-based CloudSolrClient would likely have scaling issues for the
many-collection use case.  The client would need to recalculate its
state information for _all_ collections any time that _any_ of the
collections changed, since it has no way to tell which collection was
changed.)
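For illustration, the cache-based approach Hendrik describes (a 60-second
expiry guarded by a small pool of striped locks) can be sketched in a few
lines.  This is a toy model, not the actual SolrJ code; the fetch function
stands in for a ZooKeeper read:

```python
import threading
import time

class CollectionStateCache:
    """Toy model of CloudSolrClient's cache-based approach (not the real
    SolrJ code): entries expire after a TTL, and refreshes are guarded by
    a small fixed pool of locks ("striped" locking) instead of one lock
    per collection, so unrelated collections can contend for a lock."""

    def __init__(self, fetch_state, ttl_seconds=60, num_locks=3):
        self._fetch = fetch_state          # stand-in for a ZooKeeper read
        self._ttl = ttl_seconds
        self._cache = {}                   # name -> (state, fetched_at)
        self._locks = [threading.Lock() for _ in range(num_locks)]

    def get(self, collection):
        entry = self._cache.get(collection)
        if entry and time.monotonic() - entry[1] < self._ttl:
            return entry[0]                # fresh: no ZK round trip
        lock = self._locks[hash(collection) % len(self._locks)]
        with lock:
            entry = self._cache.get(collection)   # re-check under the lock
            if entry and time.monotonic() - entry[1] < self._ttl:
                return entry[0]
            state = self._fetch(collection)
            self._cache[collection] = (state, time.monotonic())
            return state

fetches = []
cache = CollectionStateCache(lambda name: fetches.append(name) or {"name": name})
cache.get("gettingstarted")
cache.get("gettingstarted")    # second call is served from the cache
print(len(fetches))            # 1
```

The trade-off is visible even in the toy: state can be up to one TTL stale,
but a burst of requests costs at most one fetch per collection.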

Best,

Jason

On Thu, Feb 7, 2019 at 11:44 AM Hendrik Haddorp  wrote:
>
> Hi,
>
> when I perform a query using the CloudSolrClient the code first
> retrieves the DocCollection to determine to which instance the query
> should be send [1]. getDocCollection [2] does a lookup in a cache, which
> has a 60s expiration time [3]. When a DocCollection has to be reloaded
> this is guarded by a lock [4]. Per default there are 3 locks, which can
> cause some congestion. The main question though is why does the client
> need that timeout? According to this [5] comment the code does not use a
> watch. Wouldn't it make sense to use a watch? I thought the big
> advantage of the CloudSolrClient is that it knows where to send requests
> to, so that no extra hop needs to be done on the server side. Having to
> query ZooKeeper though for the current state does however take some of
> that advantage.
>
> regards,
> Hendrik
>
> [1]
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrClient.java#L849
> [2]
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrClient.java#L1180
> [3]
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrClient.java#L162
> [4]
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrClient.java#L1200
> [5]
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrClient.java#L821


Re: Java object binding not working

2019-02-08 Thread Jason Gerlowski
Hi Swapnil,

Ray did suggest a potential cause.  Your Java object has "name" as a
String, but Solr returns the "name" value as an ArrayList.
Usually Solr returns ArrayLists when the field in question is
multivalued, so it's a safe bet that Solr is treating your "name"
field as multivalued.

You can check this by opening Solr's admin UI, selecting your
collection from the collection dropdown menu, and clicking on the
Schema tab.  In the "Schema" window you can select your "name" field
from the dropdown and see if the table that appears shows it as
"multivalued".

If the field is multivalued, you've got a few options:
- you can start fresh with a new collection, and modify your schema so
that "name" is single-valued
- you can try to change the field-definition in place.  I'm not sure
whether Solr will allow this, but the API to try is here:
https://lucene.apache.org/solr/guide/7_6/schema-api.html#replace-a-field
- you can just change your Java object to represent "name" as a
List<String> instead of a String.

If the field _isn't_ multivalued, then I'm not sure what's going on.
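To make the failure mode concrete, here's a toy binder in the same spirit
as SolrJ's getBeans (names and shapes invented for illustration): a
multivalued field arrives as a list, and assigning it to a string-typed
field is exactly the kind of mismatch behind the IllegalArgumentException
in this thread.

```python
def bind(doc, field_types):
    """Toy bean binder in the spirit of SolrJ's getBeans (names invented):
    assign each response field to a typed attribute, refusing mismatches
    the way reflection does in the stack trace from this thread."""
    bound = {}
    for name, expected in field_types.items():
        value = doc.get(name)
        if value is not None and not isinstance(value, expected):
            raise TypeError("cannot set %s field '%s' to %s"
                            % (expected.__name__, name, type(value).__name__))
        bound[name] = value
    return bound

doc = {"name": ["Swapnil"]}       # multivalued field: Solr returns a list

try:
    bind(doc, {"name": str})      # POJO declares a plain String
except TypeError as err:
    print(err)                    # cannot set str field 'name' to list

print(bind(doc, {"name": list}))  # option 3: declare the field as a list
```

The third option in the list above corresponds to the last line: accept the
list-shaped value rather than fight the schema.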

Best,

Jason

On Fri, Feb 8, 2019 at 1:40 PM Swapnil Katkar  wrote:
>
> Hi,
>
> It would be beneficial to me if you provide me at least some hint to
> resolve this problem. Thanks in advance!
>
> Regards,
> Swapnil Katkar
>
>
>
> -- Forwarded message -
> From: Swapnil Katkar 
> Date: Tue, Feb 5, 2019 at 10:58 PM
> Subject: Fwd: Java object binding not working
> To: 
>
>
> Hello,
>
> Could you please let me know how can I get the resolution of the mentioned
> issue?
>
> Regards,
> Swapnil Katkar
>
> -- Forwarded message -
> From: Swapnil Katkar 
> Date: Sun, Feb 3, 2019, 17:31
> Subject: Java object binding not working
> To: 
>
>
> Greetings!
>
> I am working on a requirement where I want to query the data and want to do
> the object mapping for the retrieved result using Solrj. For this, I am
> referring to the official document at
> *https://lucene.apache.org/solr/guide/7_6/using-solrj.html#java-object-binding
> .* I set up the necessary class files and the collections.
>
> With the help of this document, I can create the documents in the Solr DB,
> but it is not working for fetching and mapping the fields to the Java POJO
> class. To do the mapping, I used @Field annotation.
>
> Details are as below:
> *1)* Solrj version: 7.6.0
> *2)* The line of code which is not working: *List employees =
> response.getBeans(Employee.class);*
> *3)* Exception stack trace:
> *Caused by: java.lang.IllegalArgumentException: Can not set
> java.lang.String field demo.apache.solr.vo.Employee.name
>  to java.util.ArrayList*
> * at
> sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(Unknown
> Source)*
> * at
> sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(Unknown
> Source)*
> * at sun.reflect.UnsafeObjectFieldAccessorImpl.set(Unknown Source)*
> * at java.lang.reflect.Field.set(Unknown Source)*
> *4)* Collection was created using
> *solr.cmd create -c employees -s 2 -rf 2*
>
> Please find the attached source code files. Also, I attached the stack
> trace file. Can you please help me on how to resolve them?
>
> Regards,
> Swapnil Katkar
>
>


Re: Load balance writes

2019-02-11 Thread Jason Gerlowski
> On the other hand, the CloudSolrClient ignores errors from Solr, which makes 
> it unacceptable for production use.

Did you mean "ConcurrentUpdateSolrClient"?  I don't think
CloudSolrClient does this, though I've been surprised before, and it's
possible I just missed something.  Just wondering.

Jason
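
As an aside, the state.json-based leader lookup Kyle sketches below could
look something like this.  It's a rough sketch: the state.json shape is
simplified, the host names are made up, and Solr's default router hashes
ids with MurmurHash3, which is out of scope here, so the hash is taken as
an input.

```python
def _signed32(hex_str):
    """Solr hash ranges are 32-bit signed ints printed as hex."""
    v = int(hex_str, 16)
    return v - (1 << 32) if v >= (1 << 31) else v

def leader_for_hash(state, doc_hash):
    """Given (simplified) state.json contents for one collection and the
    routing hash of a document id, return the base_url of the leader of
    the shard whose range covers that hash."""
    for shard in state["shards"].values():
        lo, hi = (_signed32(part) for part in shard["range"].split("-"))
        if lo <= doc_hash <= hi:
            for replica in shard["replicas"].values():
                if replica.get("leader") == "true":
                    return replica["base_url"]
    raise LookupError("no leader found for hash %d" % doc_hash)

# Made-up cluster state in the shape described below (two shards):
state = {"shards": {
    "shard1": {"range": "80000000-ffffffff",   # negative half of the ring
               "replicas": {"core_node1": {
                   "base_url": "http://host1:8983/solr", "leader": "true"}}},
    "shard2": {"range": "0-7fffffff",
               "replicas": {"core_node2": {
                   "base_url": "http://host2:8983/solr", "leader": "true"}}}}}

print(leader_for_hash(state, -42))   # http://host1:8983/solr
print(leader_for_hash(state, 42))    # http://host2:8983/solr
```

A real client would also re-read state on a failed send, as Walter notes.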

On Mon, Feb 11, 2019 at 2:14 PM Walter Underwood  wrote:
>
> The update router would also need to look for failures indexing at each 
> leader,
> then re-read the cluster state to see if the leader had changed. Also re-send 
> any
> failed updates, and so on.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Feb 11, 2019, at 11:07 AM, lstusr 5u93n4  wrote:
> >
> > Hi Boban,
> >
> > First of all: I agree with Walter here. Because the bottleneck is during
> > indexing on the leader, a basic round robin load balancer will perform just
> > as well as a custom solution. With far less headache. A custom solution
> > will be far more work than it's worth.
> >
> > But, should you really want to write this yourself, you can get all of the
> > information you need from zookeeper, from the path:
> >
> > /collections//state.json
> >
> > There, for each shard you'll see:
> >  - the "range" parameter that tells you which subset of documents this
> > shard is responsible for (see
> > https://lucene.apache.org/solr/guide/7_6/shards-and-indexing-data-in-solrcloud.html#document-routing
> > for details on routing)
> >  - the list of all replicas. On each replica it will tell you:
> >  - the host name (base_url)
> >  - if it is the leader (has the property leader: true)
> >
> > So your go-based solution would be to watch the state.json file from
> > zookeeper, and build up a function that, given the proper routing structure
> > for your document (the hash of the id by default, I think) will return the
> > hostname of the replica that's the leader.
> >
> > Kyle
> >
> > On Mon, 11 Feb 2019 at 13:30, Boban Acimovic  wrote:
> >
> >> Like I said before, nginx is not a load balancer or at least not a clever
> >> load balancer. It does not talk to ZK. Please give me advanced solutions.
> >>
> >>
> >>
> >>
> >>> On 11. Feb 2019, at 18:32, Walter Underwood 
> >> wrote:
> >>>
> >>> I haven’t used Kubernetes, but a web search for “helm nginx” seems to
> >> give some useful pages.
> >>>
> >>> wunder
> >>> Walter Underwood
> >>> wun...@wunderwood.org
> >>> http://observer.wunderwood.org/  (my blog)
> >>>
>  On Feb 11, 2019, at 9:13 AM, Davis, Daniel (NIH/NLM) [C] <
> >> daniel.da...@nih.gov> wrote:
> 
>  I think that the container orchestration framework takes care of that
> >> for you, but I am not an expert.  In Kubernetes, NGINX is often the Ingress
> >> controller, and as long as the services are running within the Kubernetes
> >> cluster, it can also serve as a load balancer, AFAICT.   In Kubernetes, a
> >> "Load Balancer" appears to be a concept for accessing services outside the
> >> cluster.
> 
>  I presume you are using Kubernetes because of your reference to helm,
> >> but for what it's worth, here's an official haproxy image -
> >> https://hub.docker.com/_/haproxy
> >>
>


Re: Get details about server-side errors

2019-02-13 Thread Jason Gerlowski
Hey Chris,

Unfortunately I think you covered the main/only options above.

HTTP status code isn't the most useful, but it's worth pointing out
that there are a few things you can do with it.  Some status codes are
easy to identify and come up with a good message to display to your
end user, e.g. 403 codes.  But of course it doesn't do anything to help
you disambiguate 400 error messages you get.

Error handling has always been one of SolrJ's weak spots.  One thing
people have suggested before is adding some sort of enum to error
responses that is less ambiguous and easier to interpret
programmatically, but it's never been picked up.  There's a bit more
information on SOLR-7170.  Feel free to vote for it or chime in there
if you think that'd be an improvement.

But unfortunately there's nothing like this to help you out now, that
I know of at least.
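In the meantime, the only handle is the message text itself.  Here's a
sketch of that (admittedly brittle) workaround, matching the "undefined
field" message quoted below; the exact wording of Solr's message is not a
stable API, so treat the pattern as an assumption to verify against your
version:

```python
import re

def bad_field_name(message):
    """Extract the offending field from Solr's 'undefined field' message.
    Brittle by design: the message wording is not a stable API, so the
    pattern here is an assumption to be verified against your version."""
    match = re.search(r'undefined field:? "?([^\s"]+)"?', message)
    return match.group(1) if match else None

msg = ("Error from server at http://localhost:8983/solr/users: "
       "undefined field bad_field")
print(bad_field_name(msg))                  # bad_field
print(bad_field_name("some other error"))   # None
```

If the pattern doesn't match, fall back to showing a generic "bad query"
message rather than the raw exception text.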

Best,

Jason

On Tue, Feb 12, 2019 at 5:09 PM Christopher Schultz
 wrote:
>
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> Hello, everyone.
>
> I'm trying to get some information about a (fairly) simple case when a
> user is searching using a wide-open query where they can type in
> anything they want, including field-names. Of course, it's possible
> that they will try to enter a field-name that does not exist and Solr
> will complain, like this:
>
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error from server at http://localhost:8983/solr/users: undefined field
> bad_field
>
> (This is what happens when I search my user database for "bad_field:foo"
> .)
>
> What is the best way to discover what happened on the server -- from a
> code perspective. I can certainly read the above as a human and see
> what the problem is. But my users won't understand (exactly) what that
> means and I don't always have English-language searching my user databas
> e.
>
> Is there a way to check for "was the error a bad field name?" and
> "what was the bad field name (or names) detected?"
>
> I looked at javadoc and saw two hopefuls:
>
> 1.   code -- unfortunately, this is the HTTP response code
>
> 2.  metadata -- unfortunately, this just returns
> {error-class=org.apache.solr.common.SolrException,root-error-class=org.a
> pache.solr.common.SolrException},
> which is already obvious from the exception type.
>
> Is there something in SolrJ that I'm overlooking, here, or am I
> limited to what I can parse out of the exception's "getMessage" string?
>
> Thanks,
> - -chris
> -BEGIN PGP SIGNATURE-
> Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>
> iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAlxjRDIACgkQHPApP6U8
> pFihjBAAty32GuiOj8XnwJu55Y9tYWFoQOhNEEJEGmeh1mOv4fxj5D4Rh+7MXTJB
> 7APLZ5IlNjpGMQ5ygLpfFTrLIEljn/f/a8hRslH/g+H3p/y4EJgeyvbNHaQZdkuh
> HlKQ9Z/M6HK+1KGvVNB+9onU3hs7+Tct7TjWO/cZ031CPovDknsYTbOBoLW+tszS
> BrsR7up0s7AOWYNkXTu8i0tf6A6nkF8+YJvml2mxNvXUCZrhHh71eL3R+v1/zGun
> 6yYyGCPm5rO9Pkxq+It4Fo8pkvo3z6k65NAflMXsFcEwWaf/5OmzAjE+TrDdqfeQ
> InKDsXj3w6ZOHOEWN/lq8kK1alZUP0i8MQJHpAXzlPL213joP9mN2AeNk7airIXE
> hPPmUGKjOVlMDJg6ICJiPVibMjwLBiy68TQJj2DX+dMVeYTQSroPBw5VUJhrxinV
> +4y6podDJ6xs+27LxfI8DZ8nGAZP/tFYMCLNIdnhOg682PfaiD3ZiDDu5dJvm871
> 7N0EK3oCkoAmQ3l7xQNtz/0nDdI5TKSOtI3KBXTY72/8dfZlSoE4kwmBh56SrKQJ
> KNfT54Cj329p5qKoNBy1bKxw4GyUx0UbKQo8HyFqzK0gQHlH+23taq5IePhocW12
> uUMGSvVUnm/E+C5w3OGLJ96Y6a3aiNUORinkTJePz+sJoUbCIwY=
> =Ril5
> -END PGP SIGNATURE-


Re: Getting repeated Error - RunExecutableListener java.io.IOException

2019-02-18 Thread Jason Gerlowski
Hi Hemant,

configoverlay.json is not a file with content provided by Solr out of
the box.  Instead, it's used to hold any changes you make to Solr's
default configuration using the config API (/config).  More details at
the top of the article here:
https://lucene.apache.org/solr/guide/6_6/config-api.html

So the fact that you see this in your configoverlay.json means that
someone with access to your Solr cluster specifically used the API to
request that RunExecutableListener configuration. (This is true of
everything in configoverlay.json).  Another possibility is that these
settings were requested via the API on a different cluster you had,
and then copied over to your new cluster by whoever set it up.  This
latter possibility could explain why your RunExecutableListener config
seems to be set up to run Linux commands, even though it runs on
Windows.

If you want you can delete configoverlay.json, or create a new
collection without it.  But since these configuration options were
chosen by a cluster admin on your end at some point (and aren't Solr
"defaults"), then your real first concern should be auditing what's in
configoverlay.json and seeing what still makes sense for use in your
cluster.
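If auditing turns up only the RunExecutableListener as unwanted, you may
not need to delete the whole file: the same Config API has a
delete-listener command (alongside add-listener and update-listener).
Assuming the listener was registered under a name like "exe-listener"
(made up here -- check your configoverlay.json for the real name), the
payload would look roughly like this, POSTed with Content-Type
application/json to /solr/<collection>/config:

```json
{"delete-listener": "exe-listener"}
```

That removes just the one entry from configoverlay.json and leaves any
settings you do still want in place.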

Hope that helps,

Jason

On Fri, Feb 15, 2019 at 2:05 PM Hemant Verma  wrote:
>
> Thanks Jan
> We are using Solr 6.6.3 version.
> We didn't configure RunExecutableListener in solrconfig.xml, it seems
> configured in configoverlay.json as default. Even we don't want to configure
> RunExecutableListener.
>
> Is it mandatory to use configoverlay.json or can we get rid of it? If yes
> can you share details.
>
> Attached the solrconfig.xml
>
> solrconfig.xml
> 
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr 7.7 UpdateRequestProcessor broken

2019-02-18 Thread Jason Gerlowski
Hey all,

I have a proposed update which adds a 7.7 section to our "Upgrade
Notes" ref-guide page.  I put a mention of this in there, but don't
have a ton of context on the issue.  Would appreciate a review from
anyone more familiar.  Check out SOLR-13256 if you get a few minutes.

Best,

Jason

On Mon, Feb 18, 2019 at 9:06 AM Jan Høydahl  wrote:
>
> Thanks for chiming in Markus. Yea, same with the langid tests, they just work 
> locally with manually constructed SolrInputDocument objects.
> This breaking change sounds really scary, and we should add an UPGRADE 
> NOTE somewhere.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > 15. feb. 2019 kl. 10:34 skrev Markus Jelsma :
> >
> > I stumbled upon this too yesterday and created SOLR-13249. In local unit 
> > tests we get String but in distributed unit tests we get a 
> > ByteArrayUtf8CharSequence instead.
> >
> > https://issues.apache.org/jira/browse/SOLR-13249
> >
> >
> >
> > -Original message-
> >> From:Andreas Hubold 
> >> Sent: Friday 15th February 2019 10:10
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Solr 7.7 UpdateRequestProcessor broken
> >>
> >> Hi,
> >>
> >> thank you, Jan.
> >>
> >> I've created https://issues.apache.org/jira/browse/SOLR-13255. Maybe you
> >> want to add your patch to that ticket. I did not have time to test it yet.
> >>
> >> So I guess, all SolrJ usages have to handle CharSequence now for string
> >> fields? Well, this really sounds like a major breaking change for custom
> >> code.
> >>
> >> Thanks,
> >> Andreas
> >>
> >> Jan Høydahl schrieb am 15.02.19 um 09:14:
> >>> Hi
> >>>
> >>> This is a subtle change which is not detected by our langid unit tests, 
> >>> as I think it only happens when document is trasferred with SolrJ and 
> >>> Javabin codec.
> >>> Was introduced in https://issues.apache.org/jira/browse/SOLR-12992
> >>>
> >>> Please create a new JIRA issue for langid so we can try to fix it in 7.7.1
> >>>
> >>> Other SolrInputDocument users assuming String type for strings in 
> >>> SolrInputDocument would also be vulnerable.
> >>>
> >>> I have a patch ready that you could test:
> >>>
> >>> Index: 
> >>> solr/contrib/langid/src/java/org/apache/solr/update/processor/LangDetectLanguageIdentifierUpdateProcessor.java
> >>> IDEA additional info:
> >>> Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
> >>> <+>UTF-8
> >>> ===
> >>> --- 
> >>> solr/contrib/langid/src/java/org/apache/solr/update/processor/LangDetectLanguageIdentifierUpdateProcessor.java
> >>>   (revision 8c831daf4eb41153c25ddb152501ab5bae3ea3d5)
> >>> +++ 
> >>> solr/contrib/langid/src/java/org/apache/solr/update/processor/LangDetectLanguageIdentifierUpdateProcessor.java
> >>>   (date 1550217809000)
> >>> @@ -60,12 +60,12 @@
> >>>Collection fieldValues = doc.getFieldValues(fieldName);
> >>>if (fieldValues != null) {
> >>>  for (Object content : fieldValues) {
> >>> -  if (content instanceof String) {
> >>> -String stringContent = (String) content;
> >>> +  if (content instanceof CharSequence) {
> >>> +CharSequence stringContent = (CharSequence) content;
> >>>  if (stringContent.length() > maxFieldValueChars) {
> >>> -  detector.append(stringContent.substring(0, 
> >>> maxFieldValueChars));
> >>> +  detector.append(stringContent.subSequence(0, 
> >>> maxFieldValueChars).toString());
> >>>  } else {
> >>> -  detector.append(stringContent);
> >>> +  detector.append(stringContent.toString());
> >>>  }
> >>>  detector.append(" ");
> >>>} else {
> >>> Index: 
> >>> solr/contrib/langid/src/java/org/apache/solr/update/processor/LanguageIdentifierUpdateProcessor.java
> >>> IDEA additional info:
> >>> Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
> >>> <+>UTF-8
> >>> ===
> >>> --- 
> >>> solr/contrib/langid/src/java/org/apache/solr/update/processor/LanguageIdentifierUpdateProcessor.java
> >>> (revision 8c831daf4eb41153c25ddb152501ab5bae3ea3d5)
> >>> +++ 
> >>> solr/contrib/langid/src/java/org/apache/solr/update/processor/LanguageIdentifierUpdateProcessor.java
> >>> (date 1550217691000)
> >>> @@ -413,10 +413,10 @@
> >>>  Collection fieldValues = doc.getFieldValues(fieldName);
> >>>  if (fieldValues != null) {
> >>>for (Object content : fieldValues) {
> >>> -if (content instanceof String) {
> >>> -  String stringContent = (String) content;
> >>> +if (content instanceof CharSequence) {
> >>> +  CharSequence stringContent = (CharSequence) content;
> >>>if (stringContent.length() > maxFieldValueChars) {
> >>> -sb.append(str

Re: Suppress stack trace in error response

2019-02-22 Thread Jason Gerlowski
Hi Jeremy,

Unfortunately Solr doesn't offer anything like what you're looking
for, at least that I know of.  There's no sort of global "quiet" or
"suppressStack" option that you can pass on a request to _not_ get the
stacktrace information back.  There might be individual APIs which
offer something like this, but I've never run into them, so I doubt
it.

Best,

Jason

On Thu, Feb 21, 2019 at 10:53 PM Zheng Lin Edwin Yeo
 wrote:
>
> Hi,
>
> There's too little information provided in your questions.
> You can explain more on the issue or the exception that you are facing.
>
> Regards,
> Edwin
>
> On Thu, 21 Feb 2019 at 23:45, Branham, Jeremy (Experis) 
> wrote:
>
> > When Solr throws an exception, like when a client sends a badly formed
> > query string, is there a way to suppress the stack trace in the error
> > response?
> >
> >
> >
> > Jeremy Branham
> > jb...@allstate.com
> > Allstate Insurance Company | UCV Technology Services | Information
> > Services Group
> >
> >


Re: Spring Boot Solr+ Kerberos+ Ambari

2019-03-01 Thread Jason Gerlowski
Hi Rushikesh,

Solr's Kerberos authentication is completely independent of Ranger.
You can set it up to use Ranger, as is common with Hortonworks HDP,
but it's also possible to setup Kerberos+Solr without Ranger in the
picture at all.  I haven't come across a concise explanation of _how_
to do this within Ambari online anywhere.  But there are several
useful resources for configuring Solr+Kerberos outside Ambari that
apply just as well in an Ambari/HDP environment.  (Since Ambari gives
users ultimate flexibility in configuring components, almost anything
you do outside of Ambari can be done inside Ambari.)  See:

https://github.com/chatman/solr-kerberos-docker - a set of demo docker
containers that run a KDC and configure Solr to use it.  A helpful
starting place for configuration.
https://lucene.apache.org/solr/guide/7_6/kerberos-authentication-plugin.html
- Solr's documentation on enabling Kerberos.

That should help you get Kerberos configured.  If your questions are
more around how to do particular operations or how to change
particular configuration options in the Ambari UI, those questions are
better addressed to the Ambari mailing lists or by Hortonworks
support.

Hope that helps,

Jason

On Tue, Feb 26, 2019 at 7:58 AM Rushikesh Garadade
 wrote:
>
> Hi,
> Thanks for the links. I have followed these steps earlier as well, however
> I did not excuted  steps from Ranger as I don't want authorization.
>  I didn't get any success.
>
> Thats why My question is
> *Is Ranger mandatory when you just want authentication with Kerberos?*
>
>
> Thank you,
> Rushikesh Garadade
>
> On Thu, Feb 21, 2019, 6:34 PM Furkan KAMACI  wrote:
>
> > Hi,
> >
> > You can also check here:
> >
> > https://community.hortonworks.com/articles/15159/securing-solr-collections-with-ranger-kerberos.html
> > On
> > the other hand, we have a section for Solr Kerberos at documentation:
> >
> > https://lucene.apache.org/solr/guide/6_6/kerberos-authentication-plugin.html
> > For
> > any Ambari specific questions, you can ask them at this forum:
> > https://community.hortonworks.com/topics/forum.html
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Thu, Feb 21, 2019 at 1:43 PM Rushikesh Garadade <
> > rushikeshgarad...@gmail.com> wrote:
> >
> > > Hi Furkan,
> > > I think the link you provided is for ranger audit setting, please correct
> > > me if wrong?
> > >
> > > I use HDP 2.6.5. which has Solr 5.6
> > >
> > > Thank you,
> > > Rushikesh Garadade
> > >
> > >
> > > On Thu, Feb 21, 2019, 2:57 PM Furkan KAMACI 
> > > wrote:
> > >
> > > > Hi Rushikesh,
> > > >
> > > > Did you check here:
> > > >
> > > >
> > >
> > https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/solr_ranger_configure_solrcloud_kerberos.html
> > > >
> > > > By the way, which versions do you use?
> > > >
> > > > Kind Regards,
> > > > Furkan KAMACI
> > > >
> > > > On Thu, Feb 21, 2019 at 11:41 AM Rushikesh Garadade <
> > > > rushikeshgarad...@gmail.com> wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I am trying to set Kerberos for Solr which is installed on
> > Hortonworks
> > > > > Ambari.
> > > > >
> > > > > Q1. Is Ranger a mandatory component for Solr Kerberos configuration
> > on
> > > > > ambari.?
> > > > >
> > > > > I am getting little confused with documents available on internet for
> > > > this.
> > > > > I tried to do without ranger but not getting any success.
> > > > >
> > > > > If is there any good document for the same, please let me know.
> > > > >
> > > > > Thanks,
> > > > > Rushikesh Garadade.
> > > > >
> > > >
> > >
> >


Re: %solr_logs_dir% does not like spaces

2019-03-01 Thread Jason Gerlowski
+1 to submitting a JIRA, even if you cannot find an edit to solr.cmd
to fix the issue.

And +1 to the issue likely just being a lack of double-quotes around
the reference to SOLR_LOGS_DIR.
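
For illustration, the quoting fix might look like this in solr.cmd.  This
is a sketch only (untested on Windows); the idea is just to quote the
variable wherever solr.cmd expands it so paths with spaces survive the IF
comparison:

```bat
@REM Sketch of the suggested fix (untested): quote %SOLR_LOGS_DIR% so a
@REM value like "C:\Program Files (x86)\..." doesn't break the parser.
IF "%SOLR_LOGS_DIR%" == "" (
  set "SOLR_LOGS_DIR=%SOLR_SERVER_DIR%\logs"
) ELSE (
  set "SOLR_LOGS_DIR=%SOLR_LOGS_DIR:"=%"
)
```

A JIRA with a patch along these lines would be the right next step.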

Best,

Jason Gerlowski

On Tue, Feb 26, 2019 at 11:56 AM Erick Erickson  wrote:
>
> If you can munge the solr.cmd file and it works for you, _please_ submit a 
> JIRA and a patch!
>
> most of the Solr devs develop on *nix boxes, so this kind of thing creeps in 
> and we need to fix it.
>
> Best,
> Erick
>
> > On Feb 26, 2019, at 6:38 AM, paul.d...@ub.unibe.ch wrote:
> >
> > Perhaps the instances of %SOLR_LOGS_DIR% in the solr.cmd files should be 
> > quoted i.e. "%SOLR_LOGS_DIR%" ??
> >
> >
> >
> > Gesendet von Mail<https://go.microsoft.com/fwlink/?LinkId=550986> für 
> > Windows 10
> >
> >
> >
> > Von: Arturas Mazeika<mailto:maze...@gmail.com>
> > Gesendet: Dienstag, 26. Februar 2019 15:10
> > An: solr-user@lucene.apache.org<mailto:solr-user@lucene.apache.org>
> > Betreff: Re: %solr_logs_dir% does not like spaces
> >
> >
> >
> > Hi Paul,
> >
> > getting rid of space in "program files" is doable, you are right. One way
> > to do it is through
> >
> >   - echo %programfiles% ==> C:\Program Files
> >   - echo %programfiles(x86)% ==> C:\Program Files (x86)
> >
> > Getting rid of spaces in sub directories is very difficult as we use tons
> > of those for different components of our suite.
> >
> > Any other options to set it in some XML file or something?
> >
> > Cheers,
> > Arturas
> >
> >
> > On Tue, Feb 26, 2019 at 3:03 PM  wrote:
> >
> >> Looks like a bug in solr.cmd. You could try eliminating the spaces and/or
> >> opening an issue.
> >>
> >>
> >>
> >> Instead of ‘Program Files (x86)’ use ‘PROGRA~2’
> >>
> >> And don’t have spaces in your subdirectory…
> >>
> >>
> >>
> >> NB: Depending on your Windows Version you may Have another alias for
> >> ‘Program Files (x86)’; use «dir /X» to view the aliases.
> >>
> >>
> >>
> >> Gesendet von Mail<https://go.microsoft.com/fwlink/?LinkId=550986> für
> >> Windows 10
> >>
> >>
> >>
> >> Von: Arturas Mazeika<mailto:maze...@gmail.com>
> >> Gesendet: Dienstag, 26. Februar 2019 14:41
> >> An: solr-user@lucene.apache.org<mailto:solr-user@lucene.apache.org>
> >> Betreff: %solr_logs_dir% does not like spaces
> >>
> >>
> >>
> >> Hi All,
> >>
> >> I am testing solr 7.7 (and 7.6) under windows. My aim is to set logging
> >> into a subdirectory that contains spaces of a directory that contains
> >> spaces.
> >>
> >> If I set on windows:
> >>
> >> setx /m SOLR_LOGS_DIR "f:\solr_deployment\logs"
> >>
> >> and start a solr instance:
> >>
> >> F:\solr_deployment\solr-7.7.0\bin\solr.cmd start -h localhost -p 8983 -s
> >> F:\solr_deployment\solr_data -m 1g
> >>
> >> this goes smoothly.
> >>
> >> However If I set the logging directory to:
> >>
> >> setx /m SOLR_LOGS_DIR  "C:\Program Files (x86)\My Directory\Another
> >> Directory\logs\solr"
> >>
> >> then I get a cryptic error:
> >>
> >> F:\solr_deployment\solr-7.7.0\bin\solr.cmd start -h localhost -p 8983 -s
> >> F:\solr_deployment\solr_data -m 1g
> >> Files was unexpected at this time.
> >>
> >> If I comment "@echo off" in both solr.cmd and solr.cmd.in, it shows that
> >> it
> >> dies around those lines in solr.cmd:
> >>
> >> F:\solr_deployment\solr-7.7.0\bin>IF "" == "" set STOP_KEY=solrrocks
> >> Files was unexpected at this time.
> >>
> >> In the solr.cmd the following block is shown:
> >>
> >> IF "%STOP_KEY%"=="" set STOP_KEY=solrrocks
> >>
> >> @REM This is quite hacky, but examples rely on a different log4j2.xml
> >> @REM so that we can write logs for examples to %SOLR_HOME%\..\logs
> >> IF [%SOLR_LOGS_DIR%] == [] (
> >>  set "SOLR_LOGS_DIR=%SOLR_SERVER_DIR%\logs"
> >> ) ELSE (
> >>  set SOLR_LOGS_DIR=%SOLR_LOGS_DIR:"=%
> >> )
> >>
> >> comments?
> >>
> >> Cheers,
> >> Arturas
> >>
>


Re: Giving SolrJ credentials for Zookeeper

2019-03-01 Thread Jason Gerlowski
Hi Ryan,

I haven't tried this myself, but wanted to offer a sanity check based
on how I understand those instructions.

Are you setting the "zkCredentialsProvider", "zkDigestUsername", and
"zkDigestPassword" system-properties on your client app/process as
well as on your Solr/ZK servers?  Or are you just setting it in the
config for your Solr/ZK servers?  I expect those system properties
need to be set for the client process as well, though the ref-guide
page doesn't explicitly say so.
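To make that concrete, the client JVM would pass something like the
following.  This is a sketch based on the VM-params approach in that
ref-guide page: the jar name is made up, and you should double-check the
provider class name against your Solr version before relying on it:

```shell
java \
  -DzkCredentialsProvider=org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider \
  -DzkDigestUsername=admin-user \
  -DzkDigestPassword=admin-password \
  -jar my-indexer.jar
```

The username/password must match the digest credentials your ZK ACLs were
created with.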

Best,

Jason

On Tue, Feb 26, 2019 at 12:56 PM Snead, Ryan [USA]  wrote:
>
> I am following along with the example found in Zookeeper Access Control of 
> the Apache Solr 7.5 Reference Guide. I have gotten to the point where I can 
> use the zkcli.sh control script to access my secured Zookeeper environment. I 
> can also connect using Zookeeper's zkCli.sh and then authenticate using the 
> auth command. The point where I run into trouble is having completed the 
> steps in the article, how do I find what parameters to set with SolrJ to 
> allow my indexer code to communicate with Zookeeper.
>
> The error my Java code is returning when I try to process a QueryRequest is: 
> Error reading cluster properties from zookeeper 
> org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = 
> NoAuth for /clusterprops.json
>
> My code is:
> solrClient = new CloudSolrClient.Builder("localhost:2181", 
> Optional.of("/")).build();
> String solrQuery = String.format("PRODUCT_TYPE:USER and PRODUCT_SK:%s", 
> productSk);
> SolrQuery q = new SolrQuery();
> q.set("q", solrQuery);
> QueryRequest request = new QueryRequest(q);
> numfound = request.process(solrClient).getResults().getNumFound();
> Error occurs at the last line. I suspect that I need to set a property in 
> solrClient, but it is not clear to me what that would be.
>
> References:
> https://lucene.apache.org/solr/guide/7_5/zookeeper-access-control.html
> ZooKeeper Access Control | Apache Solr Reference Guide 
> 7.5
> Content stored in ZooKeeper is critical to the operation of a SolrCloud 
> cluster. Open access to SolrCloud content on ZooKeeper could lead to a 
> variety of problems.
> lucene.apache.org
>
>


Re: Python Client for Solr Cloud - Leader aware

2019-03-01 Thread Jason Gerlowski
Hi Ganesh,

I'm not an expert on pysolr, but from a quick scan of their update
code, it does look like pysolr attempts to send update requests to _a_
leader node for a particular collection.  But that's all it does.  It
doesn't check which shard the document(s) will belong to and try to
pick the _correct_ leader. If your collections only have 1 shard, this
is still pretty great.  But if your collections have multiple shards
(and multiple leaders), then this will perform worse than SolrJ.

(This is based on what I gleaned from the code here:
https://github.com/django-haystack/pysolr/blob/master/pysolr.py#L1268
. Happy to be corrected by someone with more context.)

Best,

Jason

On Tue, Feb 26, 2019 at 1:50 PM Ganesh Sethuraman
 wrote:
>
> We are using Solr Cloud 7.2.1. Is there a leader aware python client (like
> SolrJ for Java), which can send the updates to the leader, and is it highly
> available?
> I see PySolr https://pypi.org/project/pysolr/ project, not able to find any
> documentation if it supports leader aware updates.
>
> Regards
> Ganesh


Re: Solr Reference Guide for version 7.7

2019-03-01 Thread Jason Gerlowski
Hi Edwin,

I volunteered to release the 7.7 ref-guide last week but decided to
wait until 7.7.1 came out to work on it.  (You probably know that
7.7.0 contained some serious bugs.  These would've required
non-trivial documentation effort in the ref-guide, and 7.7.1 already
had a release-manager and was coming soon, so it was simpler to wait.)

I'm back working on the 7.7 ref-guide today and hopefully we'll have
one out next week.  In the meantime, if you'd like to have the latest
documentation you can always check out the source code and build the
ref-guide locally ("ant clean default" from the solr/solr-ref-guide
directory; see the README in that same directory for more help).

Best,

Jason

On Thu, Feb 28, 2019 at 11:05 PM Zheng Lin Edwin Yeo
 wrote:
>
> Hi,
>
> Understand that Solr 7.7.1 has just been released, but Solr 7.7.0 has been
> released almost a month ago.
>
> However, from http://lucene.apache.org/solr/guide/, I still could not
> access the guide for version 7.7, the latest version is still 7.6.
>
> Is there any plans to release the guide for 7.7, or has the site been
> shifted to a new URL?
>
> Regards,
> Edwin


Re: Hide BasicAuth JVM param on SOLR admin UI

2019-03-07 Thread Jason Gerlowski
Solr has a configuration option that allows redacting particular
properties that appear in the Admin UI.  I _think_ this is the
functionality you're looking for.  For more information, Kevin Risden
has a great little writeup of it here:
https://risdenk.github.io/2018/11/27/apache-solr-hide-redact-sensitive-properties.html
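If it saves anyone a click: the mechanism described there is a system property holding a redaction regex. A hedged solr.in.sh fragment (the property name is taken from that writeup; verify it against your Solr version before relying on it):

```bash
# Redact any JVM system property whose name matches this pattern in the
# Admin UI listing.  The default pattern covers ".*password.*"; widening it
# also catches -Dbasicauth style args.  (Property name per the linked
# writeup -- double-check it for your version.)
SOLR_OPTS="$SOLR_OPTS -Dsolr.redaction.system.pattern=.*(password|basicauth).*"
```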

Hope that helps,

Jason

On Wed, Mar 6, 2019 at 9:27 PM Aroop Ganguly  wrote:
>
> try changing the passwords using the auth api 
> https://lucene.apache.org/solr/guide/6_6/basic-authentication-plugin.html#BasicAuthenticationPlugin-AddaUserorEditaPassword
>  
> 
>
> From that point onwards your credentials will be encrypted on the admin UI.
> I do not think your -Dbasicauth password will change, but your actual password 
> would be different and base64 encoded.
>
>
> > On Mar 6, 2019, at 12:22 AM, el mas capo  wrote:
> >
> > Hi everyone,
> > I am trying to configure Cloud Solr (7.7.0) with Basic Authentication. All 
> > seems to work nicely, but when I enter the Web UI I can see the Basic 
> > Auth password configured in solr.in.sh in clear text:
> > -Dbasicauth=solr:SolrRocks
> > Can this behaviour be avoided?
> > Thank you for your attention.
> >
>


Re: Solrj, Json Facets, (Date) stats facets

2019-03-11 Thread Jason Gerlowski
Hi Andrea,

It looks like you've stumbled on a bug in NestableJsonFacet.  I
clearly wasn't thinking about Date stats when I first wrote it; it
looks like it doesn't detect/parse them correctly in the current
iteration.  I'll try to fix this in a subsequent release.  But in the
meantime, unfortunately your only option is to use the NamedList
structures directly to retrieve the stat value.
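To make the workaround concrete, here's a rough structural sketch. Plain Maps stand in for SolrJ's NamedList here (the real code would navigate `response.getResponse()`), since what matters is the shape of the nested data and the instanceof check that drops the Date:

```java
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

public class DateStatSketch {
    public static void main(String[] args) {
        // Shape of the "facets" section of the JSON response, with plain
        // Maps standing in for SolrJ's NamedList:
        Map<String, Object> facets = new LinkedHashMap<>();
        facets.put("count", 8L);
        facets.put("x", new Date(117394398700L)); // ~1973-09-20T17:33:18.700Z

        // NestableJsonFacet only keeps stat values that are instanceof
        // Number, which is why the Date stat "x" silently disappears from
        // the parsed faceting response:
        Object x = facets.get("x");
        System.out.println(x instanceof Number); // false -> dropped as a stat
        System.out.println(x instanceof Date);   // true  -> retrievable by hand
    }
}
```

So until the parsing is fixed, fetching `"facets" -> "x"` out of the raw response structure yourself is the way to get the Date back.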

Thanks for bringing it to our attention.

Best,

Jason

On Fri, Mar 8, 2019 at 4:42 AM Andrea Gazzarini  wrote:
>
> Good morning guys, I have a question about SolrJ and JSON facets.
>
> I'm using Solr 7.7.1 and I'm sending a request like this:
>
> json.facet={x:'max(iterationTimestamp)'}
>
> where "iterationTimestamp" is a solr.DatePointField. The JSON response
> correctly includes what I'm expecting:
>
>  "facets": {
>  "count": 8,
>  "x": "1973-09-20T17:33:18.700Z"
>  }
>
> but Solrj doesn't. Specifically, the jsonFacetingResponse contains only
> the domainCount attribute (8).
> Looking at the code I see that in NestableJsonFacet a stat is taken into
> account only if the corresponding value is an instance of Number (and x
> in the example above is a java.util.Date).
>
> Is that expected? Is there a way (other than dealing with nested
> NamedLists) for retrieving that value?
>
> Cheers,
> Andrea


Apache Solr Reference Guide 7.7 Released

2019-03-11 Thread Jason Gerlowski
The Lucene PMC is pleased to announce that the Solr Reference Guide
for 7.7 is now available.

This 1,431-page PDF is the definitive guide to using Apache Solr, the
search server built on Lucene.

The PDF Guide can be downloaded from:
https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/apache-solr-ref-guide-7.7.pdf.

It is also available online at https://lucene.apache.org/solr/guide/7_7.


Re: ClassCastException in SolrJ 7.6+

2019-03-11 Thread Jason Gerlowski
Hi Gerald,

That looks like it might be a bug in SolrJ's JSON faceting support.
Do you have a small code snippet that reproduces the problem?  That'll
help us confirm it's a bug, and get us started on fixing it.
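For anyone else hitting this in the meantime: the stack trace points at a raw cast of a numeric value that the response parser produces as a Long. A minimal illustration of the failure mode and the usual defensive fix (this is the general Java pattern, not the actual SolrJ patch):

```java
public class LongCastSketch {
    public static void main(String[] args) {
        // Numeric values in parsed responses often come back as Long:
        Object count = Long.valueOf(8);

        // Failure mode: a Long is not an Integer, so a direct cast throws.
        boolean threw = false;
        try {
            Integer bad = (Integer) count;
            System.out.println(bad); // never reached
        } catch (ClassCastException e) {
            threw = true;
        }
        System.out.println(threw); // true

        // Defensive fix: widen through Number instead of casting directly.
        long good = ((Number) count).longValue();
        System.out.println(good); // 8
    }
}
```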

Best,

Jason

On Mon, Mar 11, 2019 at 10:29 AM Gerald Bonfiglio  wrote:
>
> I'm seeing the following Exception using JSON Facet API in SolrJ 7.6, 7.7, 
> 7.7.1:
>
> Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to 
> java.lang.Integer
>   at 
> org.apache.solr.client.solrj.response.json.NestableJsonFacet.<init>(NestableJsonFacet.java:52)
>   at 
> org.apache.solr.client.solrj.response.QueryResponse.extractJsonFacetingInfo(QueryResponse.java:200)
>   at 
> org.apache.solr.client.solrj.response.QueryResponse.getJsonFacetingResponse(QueryResponse.java:571)
>
>
>
>
>
> [Nastel  Technologies]
>
> The information contained in this e-mail and in any attachment is 
> confidential and
> is intended solely for the use of the individual or entity to which it is 
> addressed.
> Access, copying, disclosure or use of such information by anyone else is 
> unauthorized.
> If you are not the intended recipient, please delete the e-mail and refrain 
> from use of such information.


Re: Solr collection indexed to pdf in hdfs throws error during solr restart

2019-03-14 Thread Jason Gerlowski
> When I restart Solr

How exactly are you restarting Solr?  Are you running a "bin/solr
restart"?  Or is Solr already shut down and you're just starting it
back up with a "bin/solr start "?  Depending on how Solr
was shut down, you might be running into a bit of a known issue with
Solr's HDFS support.  Solr creates lock files for each index, to
restrict who can write to that index in the interest of avoiding race
conditions and protecting against file corruption.  Often when Solr
crashes or is shut down abruptly (via a "kill -9") it doesn't have
time to clean up these lock files and it fails to start up the next
time because it is still locked out from touching that index.  This
might be what you're running in to.  In which case you could carefully
make sure that no Solr nodes are using the index in question, delete
the lock file manually out of HDFS, and try starting Solr again.

The advice above is what we usually tell people with write.lock issues
on HDFS...though some elements of the stack trace you provided make me
wonder whether you're seeing the same exact problem.  Your stack trace
has a NullPointerException, and a "Filesystem Closed" error (typically
seen when a Java object gets closed too early and may indicate a bug).
I'm not used to seeing either of these associated with the "standard"
write.lock issues.  What version of Solr are you seeing this on?

Best regards,

Jason

On Thu, Mar 14, 2019 at 5:28 AM VAIBHAV SHUKLA
shuklavaibha...@yahoo.in  wrote:
>
> When I restart Solr it throws the following error. Solr collection indexed to 
> pdf in hdfs throws error during solr restart.
>
>
>
> Error
>
> java.util.concurrent.ExecutionException: 
> org.apache.solr.common.SolrException: Unable to create core [PDFIndex]
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.solr.core.CoreContainer.lambda$load$6(CoreContainer.java:594)
> at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.solr.common.SolrException: Unable to create core 
> [PDFIndex]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:966)
> at 
> org.apache.solr.core.CoreContainer.lambda$load$5(CoreContainer.java:565)
> at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
> ... 5 more
> Caused by: org.apache.solr.common.SolrException: Index dir 
> 'hdfs://192.168.1.16:8020/PDFIndex/data/index/' of core 'PDFIndex' is already 
> locked. The most likely cause is another Solr server (or another solr core in 
> this server) also configured to use this directory; other possible causes may 
> be specific to lockType: hdfs
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:977)
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:830)
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:950)
> ... 7 more
> Caused by: org.apache.lucene.store.LockObtainFailedException: Index dir 
> 'hdfs://192.168.1.16:8020/PDFIndex/data/index/' of core 'PDFIndex' is already 
> locked. The most likely cause is another Solr server (or another solr core in 
> this server) also configured to use this directory; other possible causes may 
> be specific to lockType: hdfs
> at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:712)
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:923)
> ... 9 more
> 2018-12-22 07:55:13.431 ERROR 
> (OldIndexDirectoryCleanupThreadForCore-PDFIndex) [   x:PDFIndex] 
> o.a.s.c.HdfsDirectoryFactory Error checking for old index directories to 
> clean-up.
> java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2083)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2069)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:791)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileSystem.java:106)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:853)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:849)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.

Re: Upgrading solarj from 6.5.1 to 8.0.0

2019-03-21 Thread Jason Gerlowski
You should be able to set credentials on individual requests with the
SolrRequest.setBasicAuthCredentials() method.  That's the method
suggested by the latest Solr ref guide at least:
https://lucene.apache.org/solr/guide/7_7/basic-authentication-plugin.html#using-basic-auth-with-solrj

There might be a way to set the credentials on the client itself, but
I can't think of it at the moment.

Hope that helps,

Jason

On Thu, Mar 21, 2019 at 2:34 AM Lahiru Jayasekera
 wrote:
>
> Hi all,
> I need help implementing the following code in SolrJ 8.0.0.
>
> private SolrClient server, adminServer;
>
> this.adminServer = new HttpSolrClient(SolrClientUrl);
> this.server = new HttpSolrClient( SolrClientUrl + "/" + mapping.getCoreName() 
> );
> if (serverUserAuth) {
>   HttpClientUtil.setBasicAuth(
>   (DefaultHttpClient) ((HttpSolrClient) adminServer).getHttpClient(),
>   serverUsername, serverPassword);
>   HttpClientUtil.setBasicAuth(
>   (DefaultHttpClient) ((HttpSolrClient) server).getHttpClient(),
>   serverUsername, serverPassword);
> }
>
>
> I could get the SolrClient instances as follows:
>
> this.adminServer = new HttpSolrClient.Builder(SolrClientUrl).build();
> this.server = new HttpSolrClient.Builder( SolrClientUrl + "/" +
> mapping.getCoreName() ).build();
>
> But i can't find a way to implement basic authentication. I think that it
> can be done via SolrHttpClientBuilder.
> Can you please help me to solve this?
>
> Thanks and regards
> Lahiru
> --
> Lahiru Jayasekara
> Batch 15
> Faculty of Information Technology
> University of Moratuwa
> 0716492170


Re: Upgrading solarj from 6.5.1 to 8.0.0

2019-03-25 Thread Jason Gerlowski
Hi Lahiru,

I had a chance to refresh myself on how this works over the weekend.
There are two ways in SolrJ to talk to a Solr protected by basic-auth:

1. The SolrRequest.setBasicAuthCredentials() method I mentioned
before.  This can be painful though, and isn't even possible in all
usecases.
2. Configuring your client process with several System Properties.
First, set the property "solr.httpclient.builder.factory" to
"org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory"
to tell SolrJ that you want any SolrClient's setup to use basic-auth.
Once that is setup, you can specify your credentials in one of two
ways.  If you're OK with the auth credentials appearing in the command
line for your process, you can set the "basicauth" system property to
a value of the form "username:password".  A slightly more secure approach
is to have SolrJ read the credentials from a file.  You can choose
this approach by setting the "solr.httpclient.config" system property
and giving it the full path to an accessible properties file.  You
then need to create the properties file, specifying your username and
password using the "httpBasicAuthUser" and "httpBasicAuthPassword"
properties.
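A sketch of the file-based variant of approach (2). The two property-file keys (httpBasicAuthUser/httpBasicAuthPassword) and the factory class name are from SolrJ's PreemptiveBasicAuthClientBuilderFactory; the file path and credential values here are placeholders:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class BasicAuthConfigSketch {
    public static void main(String[] args) throws IOException {
        // 1. Write the credentials file.  In real use this would be a
        //    pre-existing, permission-restricted file; values are placeholders.
        Path conf = Files.createTempFile("solr-auth", ".properties");
        Properties creds = new Properties();
        creds.setProperty("httpBasicAuthUser", "solr-admin");   // placeholder
        creds.setProperty("httpBasicAuthPassword", "CHANGEME"); // placeholder
        try (Writer w = Files.newBufferedWriter(conf)) {
            creds.store(w, "SolrJ basic-auth credentials");
        }

        // 2. Point SolrJ at it.  Equivalent to passing
        //    -Dsolr.httpclient.builder.factory=... and
        //    -Dsolr.httpclient.config=... on the client's command line.
        System.setProperty("solr.httpclient.builder.factory",
            "org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory");
        System.setProperty("solr.httpclient.config", conf.toString());

        // Any SolrClient built after this point should send the credentials.
        Properties check = new Properties();
        try (Reader r = Files.newBufferedReader(conf)) {
            check.load(r);
        }
        System.out.println(check.getProperty("httpBasicAuthUser"));
    }
}
```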

Currently (2) is not documented in our Solr Ref Guide, though it
really should be since it's the most practical way to setup auth.

Hope that helps,

Jason

On Thu, Mar 21, 2019 at 1:25 PM Erick Erickson  wrote:
>
> One tangent just so you’re aware. You _must_ re-index from scratch. Lucene 8x 
> will refuse to open an index that was _ever_ touched by Solr 6.
>
> Best,
> Erick
>
> > On Mar 21, 2019, at 8:26 AM, Lahiru Jayasekera  
> > wrote:
> >
> > Hi Jason,
> > Thanks for the response. I saw the method of setting credentials based on
> > individual request.
> > But I need to set the credentials at solrclient level. If you remember the
> > way to do it please let me know.
> >
> > Thanks
> >
> > On Thu, Mar 21, 2019 at 8:26 PM Jason Gerlowski 
> > wrote:
> >
> >> You should be able to set credentials on individual requests with the
> >> SolrRequest.setBasicAuthCredentials() method.  That's the method
> >> suggested by the latest Solr ref guide at least:
> >>
> >> https://lucene.apache.org/solr/guide/7_7/basic-authentication-plugin.html#using-basic-auth-with-solrj
> >>
> >> There might be a way to set the credentials on the client itself, but
> >> I can't think of it at the moment.
> >>
> >> Hope that helps,
> >>
> >> Jason
> >>
> >> On Thu, Mar 21, 2019 at 2:34 AM Lahiru Jayasekera
> >>  wrote:
> >>>
> >>> Hi all,
> >>> I need help implementing the following code in SolrJ 8.0.0.
> >>>
> >>> private SolrClient server, adminServer;
> >>>
> >>> this.adminServer = new HttpSolrClient(SolrClientUrl);
> >>> this.server = new HttpSolrClient( SolrClientUrl + "/" +
> >> mapping.getCoreName() );
> >>> if (serverUserAuth) {
> >>>  HttpClientUtil.setBasicAuth(
> >>>  (DefaultHttpClient) ((HttpSolrClient) adminServer).getHttpClient(),
> >>>  serverUsername, serverPassword);
> >>>  HttpClientUtil.setBasicAuth(
> >>>  (DefaultHttpClient) ((HttpSolrClient) server).getHttpClient(),
> >>>  serverUsername, serverPassword);
> >>> }
> >>>
> >>>
> >>> I could get the SolrClient instances as follows:
> >>>
> >>> this.adminServer = new HttpSolrClient.Builder(SolrClientUrl).build();
> >>> this.server = new HttpSolrClient.Builder( SolrClientUrl + "/" +
> >>> mapping.getCoreName() ).build();
> >>>
> >>> But i can't find a way to implement basic authentication. I think that it
> >>> can be done via SolrHttpClientBuilder.
> >>> Can you please help me to solve this?
> >>>
> >>> Thanks and regards
> >>> Lahiru
> >>> --
> >>> Lahiru Jayasekara
> >>> Batch 15
> >>> Faculty of Information Technology
> >>> University of Moratuwa
> >>> 0716492170
> >>
> >
> >
> > --
> > Lahiru Jayasekara
> > Batch 15
> > Faculty of Information Technology
> > University of Moratuwa
> > 0716492170
>


security.json "all" predefined permission

2019-03-28 Thread Jason Gerlowski
Hi all,

Diving into the RuleBasedAuthorizationPlugin for the first time in
awhile, and found that the predefined permission "all" isn't behaving
the way I'd expect it to.  I'm trying to figure out whether it doesn't
work the way I think, whether I'm just making a dumb mistake, or
whether it's currently broken on master (and some 7x versions)

My intent is to create two users, one with readonly access, and an
admin user with access to all APIs.  I'm trying to achieve this with
the security.json below:

{
  "authentication": {
"blockUnknown": true,
"class": "solr.BasicAuthPlugin",
"credentials": {
  "readonly": "",
  "admin": ""}},
  "authorization": {
"class": "solr.RuleBasedAuthorizationPlugin",
"permissions": [
  {"name":"read","role": "*"},
  {"name":"schema-read", "role":"*"},
  {"name":"config-read", "role":"*"},
  {"name":"collection-admin-read", "role":"*"},
  {"name":"metrics-read", "role":"*"},
  {"name":"core-admin-read","role":"*"},
  {"name": "all", "role": "admin_role"}
],
"user-role": {
  "readonly": "readonly_role",
  "admin": "admin_role"
}}}

When I go to test this though, I'm surprised to find that the
"readonly" user is still able to access APIs that I would expect to be
locked down.  The "readonly" user can even update security permissions
with the curl command below!

curl -X POST -H 'Content-Type: application/json' -u
"readonly:readonlyPassword"
http://localhost:8983/solr/admin/authorization -d
@some_auth_json.json

My expectation was that the predefined "all" permission would act as a
catch all, and restrict all requests to "admin_role" that require
permissions I didn't explicitly give to my "readonly" user.  But it
doesn't seem to work that way.  Am I misunderstanding what the "all"
permission does, or is this a bug?

Thanks for any help or clarification.

Jason


Re: security.json "all" predefined permission

2019-03-29 Thread Jason Gerlowski
Thanks for the pointer Jan.

I spent much of yesterday experimenting with the ordering to make sure
that wasn't a factor and I was able to eventually rule it out with
some debug logging that showed that the requests were being allowed
because it couldn't find any governing permission rules. Apparently
RBAP fails "open"
(https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/security/RuleBasedAuthorizationPlugin.java#L208)

Anyway, I'm pretty convinced this is a bug.  Most handlers implement
the PermissionNameProvider interface, which has a method that spits
out the required permission for that request handler.  (e.g.
CoreAdminHandler.getPermissionName() returns either CORE_READ_PERM or
CORE_EDIT_PERM based on the request's query params).  When the
request-handler is-a PermissionNameProvider, we do string matching to
see whether we have permissions, but we don't check for the "all"
special case.  So RBAP checks for "all" if the handler wasn't a
PermissionNameProvider (causing SOLR-13344's Admin UI behavior), but
it doesn't check for all when the handler is a PermissionNameProvider
(causing the buggy behavior I described above).

We should definitely be checking for all when there is a
PermissionNameProvider, so I'll create a JIRA for this.
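To illustrate the "fails open" behavior in simplified form, here's a toy model of the rule lookup. This is not Solr's actual code (the names and structure are made up); it only shows why an ungoverned permission slips through when the "all" catch-all is never consulted:

```java
import java.util.Map;
import java.util.Set;

public class FailOpenSketch {
    // Toy rule table: permission name -> roles allowed ("*" = any user).
    // Note there is no entry for e.g. "security-edit".
    static final Map<String, Set<String>> RULES = Map.of(
        "read", Set.of("*"),
        "all",  Set.of("admin_role"));

    // Simplified decision: if no rule governs the requested permission,
    // allow the request -- i.e. fail open, mirroring the buggy path where
    // the "all" catch-all is never checked for PermissionNameProvider
    // handlers.
    static boolean isAllowed(String permission, Set<String> userRoles) {
        Set<String> allowed = RULES.get(permission);
        if (allowed == null) {
            return true; // fail open: no governing rule found
        }
        return allowed.contains("*")
            || userRoles.stream().anyMatch(allowed::contains);
    }

    public static void main(String[] args) {
        Set<String> readonly = Set.of("readonly_role");
        System.out.println(isAllowed("read", readonly));          // true, by rule
        System.out.println(isAllowed("security-edit", readonly)); // true -- fail open!
    }
}
```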

Best,

Jason

On Thu, Mar 28, 2019 at 6:11 PM Jan Høydahl  wrote:
>
> There was some other issues with the "all" permission as well lately, see 
> https://issues.apache.org/jira/browse/SOLR-13344 
> <https://issues.apache.org/jira/browse/SOLR-13344>
> Order matters in permissions, the first permission matching is used, but I 
> don't know how that would change anything here.
> One thing to try could be to start with an empty RuleBasedAuth and then use 
> the REST API to add all the permissions and roles,
> in that way you are sure that they are syntactically correct, and hopefully 
> you get some errors if you do something wrong?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > 28. mar. 2019 kl. 20:24 skrev Jason Gerlowski :
> >
> > Hi all,
> >
> > Diving into the RuleBasedAuthorizationPlugin for the first time in
> > awhile, and found that the predefined permission "all" isn't behaving
> > the way I'd expect it to.  I'm trying to figure out whether it doesn't
> > work the way I think, whether I'm just making a dumb mistake, or
> > whether it's currently broken on master (and some 7x versions)
> >
> > My intent is to create two users, one with readonly access, and an
> > admin user with access to all APIs.  I'm trying to achieve this with
> > the security.json below:
> >
> > {
> >  "authentication": {
> >"blockUnknown": true,
> >"class": "solr.BasicAuthPlugin",
> >"credentials": {
> >  "readonly": "",
> >  "admin": ""}},
> >  "authorization": {
> >"class": "solr.RuleBasedAuthorizationPlugin",
> >"permissions": [
> >  {"name":"read","role": "*"},
> >  {"name":"schema-read", "role":"*"},
> >  {"name":"config-read", "role":"*"},
> >  {"name":"collection-admin-read", "role":"*"},
> >  {"name":"metrics-read", "role":"*"},
> >  {"name":"core-admin-read","role":"*"},
> >  {"name": "all", "role": "admin_role"}
> >],
> >"user-role": {
> >  "readonly": "readonly_role",
> >  "admin": "admin_role"
> >}}}
> >
> > When I go to test this though, I'm surprised to find that the
> > "readonly" user is still able to access APIs that I would expect to be
> > locked down.  The "readonly" user can even update security permissions
> > with the curl command below!
> >
> > curl -X POST -H 'Content-Type: application/json' -u
> > "readonly:readonlyPassword"
> > http://localhost:8983/solr/admin/authorization -d
> > @some_auth_json.json
> >
> > My expectation was that the predefined "all" permission would act as a
> > catch all, and restrict all requests to "admin_role" that require
> > permissions I didn't explicitly give to my "readonly" user.  But it
> > doesn't seem to work that way.  Am I misunderstanding what the "all"
> > permission does, or is this a bug?
> >
> > Thanks for any help or clarification.
> >
> > Jason
>


Re: Documentation for Apache Solr 8.0.0?

2019-04-01 Thread Jason Gerlowski
The Solr Reference Guide (of which the online documentation is a part)
gets built and released separately from the Solr distribution itself.
The Solr community tries to keep the code and documentation releases
as close together as we can, but the releases require work and are
done on a volunteer basis.  No one has volunteered for the 8.0.0
reference-guide release yet, but I suspect a volunteer will come
forward soon.

In the meantime though, there is documentation for Solr 8.0.0
available.  Solr's documentation is included alongside the code.  You
can checkout Solr and build the documentation yourself by moving to
"solr/solr-ref-guide" and running the command "ant clean default" from
that directory.  This will build the same HTML pages you're used to
seeing at lucene.apache.org/solr/guide, and you can open the local
copies in your browser and browse them as you normally would.

Alternatively, the Solr mirror on Github does its best to preview the
documentation.  It doesn't display perfectly, but it might be helpful
for tiding you over until the official documentation is available, if
you're unwilling or unable to build the documentation site locally:
https://github.com/apache/lucene-solr/blob/branch_8_0/solr/solr-ref-guide/src/index.adoc

Hope that helps,

Jason

On Mon, Apr 1, 2019 at 7:34 AM Yoann Moulin  wrote:
>
> Hello,
>
> I’m looking for the documentation for the latest release of Solr (8.0), but it 
> looks like it’s not online yet.
>
> https://lucene.apache.org/solr/news.html
>
> http://lucene.apache.org/solr/guide/
>
> Do you know when it will be available?
>
> Best regards.
>
> --
> Yoann Moulin
> EPFL IC-IT


Re: bin/post command not working when run from crontab

2019-04-14 Thread Jason Gerlowski
Hi Carsten,

I think this is probably worth a jira.  I'm not familiar enough with
bin/post to say definitively whether the behavior you mention is a
bug, or whether it's "expected" in some odd sense.  But there's enough
uncertainty that I think it's worth recording there.

Best,

Jason

On Fri, Apr 12, 2019 at 5:52 AM Carsten Agger  wrote:
>
> Hi all
>
> I posted the question below some time back, concerning the unusual
> behaviour of bin/post if there is no stdin.
>
> There has been no comments to that, and maybe bin/post is quaint in that
> regard - I ended up changing my application to POST directly on the Web
> endpoint instead.
>
> But I do have one question, though: Should this be considered a bug, and
> should I report it as such? Unfortunately I don't have the time to
> prepare a proper fix myself.
>
> Best
> Carsten
>
> On 3/27/19 7:55 AM, Carsten Agger wrote:
> > I'm working with a script where I want to send a command to delete all
> > elements in an index; notably,
> >
> >
> > /opt/solr/bin/post -c <collection> -d "<delete><query>*:*</query></delete>"
> >
> >
> > When run interactively, this works fine.
> >
> > However, when run automatically as a cron job, it gives this interesting
> > output:
> >
> >
> > Unrecognized argument:   "<delete><query>*:*</query></delete>"
> >
> > If this was intended to be a data file, it does not exist relative to /root
> >
> > The culprit seems to be these lines, 143-148:
> >
> >  if [[ ! -t 0 ]]; then
> >MODE="stdin"
> >  else
> ># when no stdin exists and -d specified, the rest of the 
> > arguments
> ># are assumed to be strings to post as-is
> >MODE="args"
> >
> > This code seems to be doing the opposite of what the comment says - it
> > sets MODE="stdin" if stdin is NOT a terminal, but if it IS (i.e., there
> > IS an stdin) it assumes the rest of the args can be posted as-is.
> >
> > On the other hand, if the condition is reversed, my command will fail
> > interactively but not when run as a cron job. Both options are, of
> > course, unsatisfactory.
> >
> > It /will/ actually work in both cases, if instead the command to delete
> > the contents of the index is written as:
> >
> > echo "<delete><query>*:*</query></delete>" | /opt/solr/bin/post -c 
> > departments -d
> >
> >
> > I've seen this bug in SOLR 7.5.0 and 7.7.1. Should I report it as a bug
> > or is there an easy explanation?
> >
> >
> > Best
> >
> > Carsten Agger
> >
> >
> --
> Carsten Agger
>
> Chief Technologist
> Magenta ApS
> Skt. Johannes Allé 2
> 8000 Århus C
>
> Tlf  +45 5060 1476
> http://www.magenta-aps.dk
> carst...@magenta-aps.dk
>


Re: Documentation Slop (DisMax parser)

2018-01-18 Thread Jason Gerlowski
Hi James,

1. Good catch, and thanks for reporting it.
2. The improved wording you proposed above matches my (limited)
understanding.  Others might see something wrong that I missed, but I
think it's definitely an improvement over the current wording.
3. If you'd like, you can start the change yourself!  The
reference-guide documentation used to be much more "locked-down", but
now it lives in Asciidoc format alongside the Solr code.  Doc
bugs/improvements are handled through JIRA issues the same as any
other bugs would be.  If you're interested in opening a JIRA for this
and proposing your wording, you can get started using the instructions
here: https://wiki.apache.org/solr/HowToContribute.  Of course, if you
don't have the time or are uninterested in moving this along, I've got
a few minutes to upload a patch to JIRA on your behalf (though it
can't actually get merged without attention from a committer).

Best,

Jason

On Thu, Jan 18, 2018 at 6:29 AM, James  wrote:
> Hi:
>
>
>
> There seems to be an error in the documentation about the slop parameter ps
> used by the eDisMax parser. It reads:
>
>
>
>
>
> "This means that if the terms "foo" and "bar" appear in the document with
> less than 10 terms between each
>
> other, the phrase will match."
>
>
>
>
>
> Counterexample:
>
> "Foo one two three four five fix seven eight nine bar" will not match with
> ps=10
>
>
>
> It seems that it must be "less than 9".
>
>
>
>
>
> However, when more query terms are used it gets complicated when one tries
> to count words in between.
>
>
>
>
>
> Easier to understand (and correct according to my testing) would be
> something like:
>
>
>
> "This means that if the terms "foo" and "bar" appear in the document within
> a group of 10 or fewer terms, the phrase will match. For example, a doc that
> says:
>
> *Foo* term1 term2 term3 *bar*
>
> will match the phrase query. A document that says
>
> *Foo* term1 term2 term3 term4 term5 term6 term7 term8 term9 *bar*
>
> will not (because the search terms are within a group of 11 terms).
>
> Note: If any search term is a MUST-NOT term, the phrase slop query will
> never match.
>
> "
>
>
>
>
>
> Anybody willing to review and change to documentation?
>
>
>
> Thanks,
>
> James
>
>
>
>
>


Re: Long GC Pauses

2018-01-31 Thread Jason Gerlowski
Hi Maulin,

To clarify, when you said "...allocated 40 GB RAM to each shard." above,
I'm going to assume you meant "to each node" instead.  If you actually did
mean "to each shard" above, please correct me and anyone who chimes in
afterward.

Firstly, it's really hard to even take guesses about potential causes or
remediations without more details about your load characteristics
(average/peak QPS, index size, average document size, etc.).  If no one
gives any satisfactory advice, please consider uploading additional details
to help us help you.

Secondly, I don't know anything about the load characteristics you're
putting on your Solr cluster, but I'm curious whether you've experimented
with lower RAM settings.  Generally speaking, the more RAM you have, the
longer your GC pauses are likely to be (even with the tuning that various
GC settings provide).  If you can get away with giving the Solr process
less RAM, you should see your GC pauses shrink.  Was 40GB chosen after some
trial-and-error experimentation, or is it something you could investigate?

For a bit more overview on this, see this slightly outdated (but still
useful) wiki page: https://wiki.apache.org/solr/SolrPerformanceProblems#RAM

Hope that helps, even if just to disqualify some potential causes/solutions
to close in on a real fix.

Best,

Jason

On Wed, Jan 31, 2018 at 8:17 AM, Maulin Rathod  wrote:

> Hi,
>
> We are using Solr Cloud 6.1. We have around 20 collections on 4 nodes (we
> have 2 shards and each shard has 2 replicas). We have allocated 40 GB RAM
> to each shard.
>
> Intermittently we see long GC pauses (60 sec to 200 sec) due to which
> Solr stops responding and hence collections go into recovery mode. It
> takes a minimum of 5-10 minutes (sometimes more, and we have to restart
> the Solr node) to recover all collections. We are using the default GC
> settings (CMS) as per solr.cmd.
>
> We tried the G1 GC to see if it helps, but we still see long GC
> pauses (60 sec to 200 sec) and also found that memory usage is higher in
> the case of G1 GC.
>
> What could be the reason for the long GC pauses and how can we fix them?
> Insufficient memory, a problem with the GC settings, or something else? Any
> suggestion would be greatly appreciated.
>
> In our analysis, we also found some inefficient queries (which use * many
> times in the query) in the Solr logs. Could this be the reason for the high
> memory usage?
>
> Slow Query
> --
>
> INFO  (qtp1239731077-498778) [c:documents s:shard1 r:core_node1
> x:documents] o.a.s.c.S.Request [documents]  webapp=/solr path=/select
> params={df=summary&distrib=false&fl=id&shards.purpose=4&
> start=0&fsv=true&sort=description+asc,id+desc&fq=&shard.url=
> s1.asite.com:8983/solr/documents|s1r1.asite.com:
> 8983/solr/documents&rows=250&version=2&q=((id:(
> REV78364_24705418+REV78364_24471492+REV78364_24471429+
> REV78364_24470771+REV78364_24470271+))+OR+summary:((HPC*+
> AND+*+AND+*+AND+OH1150*+AND+*+AND+*+AND+U0*+AND+*+AND+*+AND+
> HGS*+AND+*+AND+*+AND+MDL*+AND+*+AND+*+AND+100067*+AND+*+AND+
> -*+AND+Reinforcement*+AND+*+AND+Mode*)+))++AND++(title:((*
> HPC\+\-\+OH1150\+\-\+U0\+\-\+HGS\+\-\+MDL\+\-\+100067\+-\+
> Reinforcement\+Mode*)+))+AND+project_id:(-2+78243+78365+
> 78364)+AND+is_active:true+AND+((isLatest:(true)+AND+
> isFolderActive:true+AND+isXref:false+AND+-document_
> type_id:(3+7)+AND+((is_public:true+OR+distribution_list:
> 4858120+OR+folderadmin_list:4858120+OR+author_user_id:
> 4858120)+AND+((defaultAccess:(true)+OR+allowedUsers:(
> 4858120)+OR+allowedRoles:(6342201+172408+6336860)+OR+
> combinationUsers:(4858120))+AND+-blockedUsers:(4858120
> +OR+(isLatestRevPrivate:(true)+AND+allowedUsersForPvtRev:(
> 4858120)+AND+-folderadmin_list:(4858120)))&shards.tolerant=true&NOW=
> 1516786982952&isShard=true&wt=javabin} hits=0 status=0 QTime=83309
>
>
>
>
> Regards,
>
> Maulin
>
> [CC Award Winners!]
>
>


Re: solr read timeout

2018-02-15 Thread Jason Gerlowski
Hi Prateek,

Depending on the SolrServer/SolrClient implementation your application
is using, you can make use of the "setSoTimeout" method, which
controls the socket (read) timeout in milliseconds.  e.g.
http://lucene.apache.org/solr/4_8_1/solr-solrj/org/apache/solr/client/solrj/impl/HttpSolrServer.html#setSoTimeout(int)
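
As a rough sketch against the SolrJ 4.8.1 API (the URL and timeout values below are placeholders, not recommendations):

```java
// Sketch: configure connect and read timeouts on the 4.8.1 client.
HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
server.setConnectionTimeout(5000); // time allowed to establish the connection, in ms
server.setSoTimeout(30000);        // socket (read) timeout, in ms
```

Pick a read timeout comfortably above your slowest expected query, or slow-but-successful requests will be cut off client-side.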

Best,

Jason

On Thu, Feb 15, 2018 at 9:58 AM, Prateek Jain J
 wrote:
>
> Hi All,
>
> I am using Solr 4.8.1 in one of our applications and sometimes it gives a read 
> timeout error. SolrJ is used on the client side. How can I increase this 
> default read timeout?
>
>
> Regards,
> Prateek Jain
>


Re: Solrj : ConcurrentUpdateSolrClient based on QueueSize and Time

2018-02-21 Thread Jason Gerlowski
My apologies Santosh.  I added that comment a few releases back based
on a misunderstanding I've only recently been disabused of.  I will
correct it.

Anyway, Shawn's explanation above is correct.  The queueSize parameter
doesn't control batching, as he clarified.  Sorry for the trouble.
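
For reference, a minimal sketch of building the client (SolrJ 7.x API; the URL and sizes are illustrative):

```java
// queueSize caps the internal request buffer; it is NOT a batch size.
ConcurrentUpdateSolrClient client = new ConcurrentUpdateSolrClient.Builder(
        "http://localhost:8983/solr/collection1")
    .withQueueSize(100)   // capacity of the internal queue
    .withThreadCount(4)   // background threads draining the queue
    .build();
// doc: a SolrInputDocument built elsewhere
client.add(doc); // returns almost immediately; the update is sent in the background
```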

Best,

Jason

On Wed, Feb 21, 2018 at 8:50 PM, Santosh Narayan
 wrote:
> Thanks for the explanation Shawn. Very helpful. I think I got misled by the
> JavaDoc text for
> *ConcurrentUpdateSolrClient.Builder.withQueueSize*
> /**
>  * The number of documents to batch together before sending to Solr. If
> not set, this defaults to 10.
>  */
> public Builder withQueueSize(int queueSize) {
>   if (queueSize <= 0) {
> throw new IllegalArgumentException("queueSize must be a positive
> integer.");
>   }
>   this.queueSize = queueSize;
>   return this;
> }
>
>
>
> On Thu, Feb 22, 2018 at 9:41 AM, Shawn Heisey  wrote:
>
>> On 2/21/2018 7:41 AM, Santosh Narayan wrote:
>> > May be it is my understanding of the documentation. As per the
>> > JavaDoc, ConcurrentUpdateSolrClient
>> > buffers all added documents and writes them into open HTTP connections.
>> >
>> > So I thought that this class would buffer documents in the client side
>> > itself till the QueueSize is reached and then send all the cached
>> documents
>> > together in one HTTP request. Is this not the case?
>>
>> That's not how it's designed.
>>
>> What ConcurrentUpdateSolrClient does differently than HttpSolrClient or
>> CloudSolrClient is return control immediately to your program when you
>> send an update, and begin processing that update in the background.  If
>> you send a LOT of updates very quickly, then the queue will get larger,
>> and will typically be processed in parallel by multiple threads.  The
>> client won't wait for the queue to fill.  Processing of the first update
>> you send should begin right after you add it.
>>
>> Something to consider:  Because control is returned to your program
>> immediately, and the response is always a success, your program will
>> never be informed about any problems with your adds when you use the
>> concurrent client.  The concurrent client is a great choice for initial
>> bulk indexing, because it offers multi-threaded indexing without any
>> need to handle the threads yourself.  But you don't get any kind of
>> error handling.
>>
>> Thanks,
>> Shawn
>>
>>


Re: configure jetty to use both http1.1 and H2

2018-02-23 Thread Jason Gerlowski
Hi Jeff,

I haven't tested your exact use case regarding H/2, but the "bin/solr"
startup script has a special "-j" options that can be used to pass
arbitrary flags to the underlying Jetty server.  If you have options
that work with vanilla Jetty, they _should_ work when passed through
the "bin/solr" interface as well.  Check out "bin/solr start -help"
for more info.

If it doesn't work out, please let us know, and post the commands you
tried, output, etc.

Best,

Jason

On Fri, Feb 23, 2018 at 11:56 AM, Jeff Dyke  wrote:
> Hi, I've been googling around for a while and can't seem to find an answer
> to this.  Is it possible to have the embedded jetty listen to H/2 as well
> as HTTP/1.1? Mainly I'd like to use this to access it on a private subnet
> on AWS through HAProxy which is set up to prefer H/2.
>
> With base jetty its as simple as passing arguments to start.jar, but can't
> find how to solve it with solr and the embedded jetty.
>
> Thanks,
> Jeff


Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-20 Thread Jason Gerlowski
> I can take a stab at this if someone can point me how to update the 
> documentation.


Hey SG,

Please do, that'd be awesome.

Thanks to some work done by Cassandra Targett a release or two ago,
the Solr Ref Guide documentation now lives in the same codebase as the
Solr/Lucene code itself, and the process for updating it is the same
as suggesting a change to the code:


1. Open a JIRA issue detailing the improvement you'd like to make
2. Find the relevant ref guide pages to update, making the changes
you're proposing.
3. Upload a patch to your JIRA and ask for someone to take a look.
(You can tag me on this issue if you'd like).


Some more specific links you might find helpful:

- JIRA: https://issues.apache.org/jira/projects/SOLR/issues
- Pointers on JIRA conventions, creating patches:
https://wiki.apache.org/solr/HowToContribute
- Root directory for the Solr Ref-Guide code:
https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide
- https://lucene.apache.org/solr/guide/7_2/pagination-of-results.html

Best,

Jason

On Wed, Mar 14, 2018 at 2:53 PM, Erick Erickson  wrote:
> I'm pretty sure you can use Streaming Expressions to get all the rows
> back from a sharded collection without chewing up lots of memory.
>
> Try:
> search(collection,
>  q="id:*",
>  fl="id",
>  sort="id asc",
> qt="/export")
>
> on a sharded SolrCloud installation, I believe you'll get all the rows back.
>
> NOTE:
> 1> Some while ago you couldn't _stop_ the stream part way through.
> down in the SolrJ world you could read from a stream for a while and
> call close on it but that would just spin in the background until it
> reached EOF. Search the JIRA list if you need (can't find the JIRA
> right now, 6.6 IIRC is OK and, of course, 7.3).
>
> This shouldn't chew up memory since the streams are sorted, so what
> you get in the response is the ordered set of tuples.
>
> Some of the join streams _do_ have to hold all the results in memory,
> so look at the docs if you wind up using those.
>
>
> Best,
> Erick
>
> On Wed, Mar 14, 2018 at 9:20 AM, S G  wrote:
>> Thanks everybody. This is lot of good information.
>> And we should try to update this in the documentation too to help users
>> make the right choice.
>> I can take a stab at this if someone can point me how to update the
>> documentation.
>>
>> Thanks
>> SG
>>
>>
>> On Tue, Mar 13, 2018 at 2:04 PM, Chris Hostetter 
>> wrote:
>>
>>>
>>> : > 3) Lastly, it is not clear the role of export handler. It seems that
>>> the
>>> : > export handler would also have to do exactly the same kind of thing as
>>> : > start=0 and rows=1000,000. And that again means bad performance.
>>>
>>> : <3> First, streaming requests can only return docValues="true"
>>> : fields.Second, most streaming operations require sorting on something
>>> : besides score. Within those constraints, streaming will be _much_
>>> : faster and more efficient than cursorMark. Without tuning I saw 200K
>>> : rows/second returned for streaming, the bottleneck will be the speed
>>> : that the client can read from the network. First of all you only
>>> : execute one query rather than one query per N rows. Second, in the
>>> : cursorMark case, to return a document you and assuming that any field
>>> : you return is docValues=false
>>>
>>> Just to clarify, there is a big difference between the /export handler
>>> and "streaming expressions"
>>>
>>> Unless something has changed drasticly in the past few releases, the
>>> /export handler does *NOT* support exporting a full *collection* in solr
>>> cloud -- it only operates on an individual core (aka: shard/replica).
>>>
>>> Streaming expressions is a feature that does work in Cloud mode, and can
>>> make calls to the /export handler on a replica of each shard in order to
>>> process the data of an entire collection -- but when doing so it has to
>>> aggregate the *ALL* the results from every shard in memory on the
>>> coordinating node -- meaning that (in addition to the docvalues caveat)
>>> streaming expressions requires you to "spend" a lot of RAM usage on one
>>> node as a trade-off for spending more time & multiple requests to get the
>>> same data from cursorMark...
>>>
>>> https://lucene.apache.org/solr/guide/exporting-result-sets.html
>>> https://lucene.apache.org/solr/guide/streaming-expressions.html
>>>
>>> An additional perk of cursorMark that may be relevant to the OP is that
>>> you can "stop" tailing a cursor at any time (ie: if you're post processing
>>> the results client side and decide you have "enough" results) but a similar
>>> feature isn't available (AFAICT) from streaming expressions...
>>>
>>> https://lucene.apache.org/solr/guide/pagination-of-results.html#tailing-a-cursor
>>>
>>>
>>> -Hoss
>>> http://www.lucidworks.com/
>>>


Re: Solrj Analytics component

2018-03-20 Thread Jason Gerlowski
Hi Asmaa,

As far as I know, there aren't any SolrJ classes built expressly for
Analytics component requests like what exists for the Collection Admin
APIs, etc. 
(https://lucene.apache.org/solr/7_2_0/solr-solrj/org/apache/solr/client/solrj/request/CollectionAdminRequest.html).
But it should still be possible to package your request into a
SolrRequest via some of the setters on that object, and parse the
response out of the returned NamedList.
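
For illustration, something along these lines (an untested sketch; "client" is an already-built SolrClient, and the "analytics" parameter name and "analytics_response" key are placeholders — check the Analytics component docs for the exact request and response format):

```java
// Sketch: send the analytics request as a raw query parameter and read the
// generic NamedList response instead of a typed SolrJ object.
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("q", "*:*");
params.set("analytics", analyticsRequestJson); // placeholder: the analytics request body
QueryRequest request = new QueryRequest(params);
NamedList<Object> response = client.request(request, "myCollection");
Object analytics = response.get("analytics_response"); // placeholder key; inspect the response
```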

It isn't pretty, but it _should_ be possible.  Was there a more
specific aspect of building the request that you were getting hung up
on?


Best of luck,

Jason

On Fri, Mar 16, 2018 at 4:38 PM, Asmaa Shoala  wrote:
> Hello,
>
> I want to use the analytics 
> component (https://lucene.apache.org/solr/guide/7_2/analytics.html#analytic-pivot-facets)
> in java code but I didn't find any guide on the internet.
>
> Can you please help me?
>
> Thanks,
>
> Asmaa Ramzy Shoala
>
> novomind Egypt LLC
> _
>
> 7 Abou Rafea Street, Moustafa Kamel, Alexandria, Egypt
>
> Mobile +20 1227281143
> email asmaa.sho...@nm-eg.com · Skype 
> asmaa.shoala_nmeg
>


Re: Solr failed to start after configuring Kerberos authentication

2018-06-15 Thread Jason Gerlowski
Hi,

Sorry to reply to this so late.  Hopefully you've long since figured out
the issue.  But if not...

1. Just to clarify, are you seeing the error message above when Solr tries
to talk to ZooKeeper?  Or does that error message appear in your ZK logs,
or from a ZK-client you're using to test connections to your
kerberized-ZK?  You may have done this already, but I would recommend
making sure that ZooKeeper is fully kerberized before introducing Solr into
the mix.

2. To me, the key piece of that error message is: "Server not found in
Kerberos database".  That makes is sound like the hostname (or IP) one of
your machines is using doesn't match anything the KDC knows about.
Normally this is a DNS issue.  Or if you used raw IPs when setting up your
configuration, some of them might have changed.  You can find a little more
information here:
https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/errors.html.
(I can't recommend that guide enough btw.  It doesn't cover Solr
explicitly, but is great for an overview on Kerberos setup and debugging.)

3. For anyone on the list to help you much beyond that, you might have to
add more information.  What do the logs tell you when you enable Kerberos
debug logging (-Dsun.security.krb5.debug=true)?  What startup parameters
are you using with Solr?  Have you tested the Zookeeper Kerberization in
isolation from Solr (i.e. with zkCli.sh)?  What do your JAAS config files
look like?

As I said above, hopefully you've long since found your problem and this
might be helpful for someone else down the road.  But if you're still
working on this, feel free to attach more information and maybe we can
figure it out.

Best,

Jason

On Thu, May 24, 2018 at 2:44 PM, adfel70  wrote:

> Hi,
> We are trying to configure Kerberos auth for Solr 6.5.1.
> We went over the steps as described in Solr’s ref guide, but after
> restart we are getting the following error:
>
> org.apache.zookeeper.client.ZookeeperSaslClient; An error:
> (java.security.PrivilegedActionException: javax.security.sasl.
> SaslException:
> GSS initiate failed [Caused by GSSException: No valid credentials provided
> (Mechanism level: Server not found in Kerberos database (7))] occurred when
> evaluating Zookeeper Quorum Member’s received SASL token. Zookeeper Client
> will go to AUTH_FAILED state.
>
> We tested both of our keytab files (Zookeeper’s and Solr’s) using kinit and
> everything looks fine.
>
> Our Zookeeper is not configured with Kerberos yet and the ‘ruok’ command
> responds with ‘imok’ as expected.
>
> When examining Zookeeper’s logs we see the following:
> Successfully logged in.
> TGT refresh thread started.
> TGT valid starting at:  Thu May 21:39:10 ...
> TGT expires:   Fri May 25 07:39:44 ...
> TGT refresh sleeping until: Fri May 25 05:55:44 ...
>
> Any idea what we can do?
> Thanks.
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Connection Problem with CloudSolrClient.Builder().build When passing a Zookeeper Addresses and RootParam

2018-06-18 Thread Jason Gerlowski
Hi,

Yes, Andy has the right idea.  If no zk-chroot is being used,
"Optional.empty()" is the correct value to specify, not null.

This API is a bit trappy (or at least unintuitive), and we're in the
process of hashing out some doc improvements (and/or API changes).  If
you're curious or would otherwise like to weigh in, check out SOLR-12309.

Best,

Jason

On Mon, Jun 18, 2018 at 1:09 PM, Andy C  wrote:

> I am using the following (Solr 7.3.1) successfully:
>
> import java.util.Optional;
>
>  Optional<String> chrootOption = null;
>  if (StringUtils.isNotBlank(_zkChroot))
>  {
> chrootOption = Optional.of(_zkChroot);
>  }
>  else
>  {
> chrootOption = Optional.empty();
>  }
>  CloudSolrClient client = new CloudSolrClient.Builder(_zkHostList,
> chrootOption).build();
>
> Adapted from code I found somewhere (unit test?). Intent is to support the
> option of configuring a chroot or not (stored in "_zkChroot")
>
> - Andy -
>
> On Mon, Jun 18, 2018 at 12:53 PM, THADC  >
> wrote:
>
> > Hello,
> >
> > I am using solr 7.3 and zookeeper 3.4.10. I have custom client code that
> is
> > supposed to connect the a zookeeper cluster. For the sake of clarity, the
> > main code focus:
> >
> >
> > private synchronized void initSolrClient()
> > {
> > List<String> zookeeperList = new ArrayList<>();
> >
> > zookeeperList.add("http://100.12.119.10:2281");
> > zookeeperList.add("http://100.12.119.10:2282");
> > zookeeperList.add("http://100.12.119.10:2283");
> >
> > String collectionName = "myCollection"
> >
> > log.debug("in initSolrClient(), collectionName: " +
> > collectionName);
> >
> > try {
> > solrClient = new CloudSolrClient.Builder(
> zookeeperList,
> > null).build();
> >
> > } catch (Exception e) {
> > log.info("Exception creating solr client object.
> > ");
> > e.printStackTrace();
> > }
> > solrClient.setDefaultCollection(collectionName);
> > }
> >
> > Before executing, I test that all three zoo nodes are running
> > (./bin/zkServer.sh status zoo.cfg, ./bin/zkServer.sh status zoo2.cfg,
> > ./bin/zkServer.sh status zoo3.cfg). The status shows the quorum is
> > up and running, with one nodes as the leader and the other two as
> > followers.
> >
> > When I execute my java client to connect to the zookeeper cluster, I get
> :
> >
> > java.lang.NullPointerException
> > at
> > org.apache.solr.client.solrj.impl.CloudSolrClient$Builder.<
> > init>(CloudSolrClient.java:1387)
> >
> >
> > I am assuming it has a problem with my null value for zkChroot, but not
> > certain. Th API says zkChroot is the path to the root ZooKeeper node
> > containing Solr data. May be empty if Solr-data is located at the
> ZooKeeper
> > root.
> >
> > I am confused on what exactly should go here, and when it can be null. I
> > cannot find any coding examples.
> >
> > Any help greatly appreciated.
> >
> >
> >
> >
> > --
> > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> >
>


Re: CURL DELETE BLOB do not working in solr 7.3 cloud

2018-06-21 Thread Jason Gerlowski
Hi Maxence,

Yes, unfortunately that's the wrong API to delete an item from the
Blob Store.  Items in the blob store are deleted like any other Solr
document (i.e. either delete-by-id, or delete-by-query).  This is
mentioned quite obliquely in the Solr Ref Guide here:
https://lucene.apache.org/solr/guide/7_3/blob-store-api.html. (CTRL-F
for "delete").  We should really clarify that text a bit...

Anyway, to give you a concrete idea, you could delete that document
with a command like:

curl -X POST -H 'Content-Type: application/json'
'http://srv-formation-solr3:8983/solr/.system/update?commit=true' --data-binary '{
"delete": "CityaUpdateProcessorJar/14" }'

Hope that helps,

Jason



On Wed, May 30, 2018 at 11:14 AM, msaunier  wrote:
> Hello,
>
>
>
> I want to delete a file in the blob but this command not work:
>
> curl -X "DELETE"
> http://srv-formation-solr3:8983/solr/.system/blob/CityaUpdateProcessorJar/14
>
>
>
> This command return just the file informations:
>
> {
>
>   "responseHeader":{
>
> "zkConnected":true,
>
> "status":0,
>
> "QTime":1},
>
>   "response":{"numFound":1,"start":0,"docs":[
>
>   {
>
> "id":"CityaUpdateProcessorJar/14",
>
> "md5":"45aeda5a01607fb668cec26a45cac9e6",
>
> "blobName":"CityaUpdateProcessorJar",
>
> "version":14,
>
> "timestamp":"2018-05-30T12:59:40.419Z",
>
> "size":22483}]
>
>   }}
>
>
>
> Is my command bad?
>
> Thanks,
>
> Maxence,
>


Re: Sole Default query parser

2018-06-22 Thread Jason Gerlowski
Hi Kamal,

Sorry for the late reply.  If you're still unsure, the "lucene" query
parser is the default one.  The first ref-guide link you posted refers
to it almost ubiquitously as the "Standard Query Parser", but it's the
same thing as the lucene query parser.  (The page does say this, but
it's easy to miss "Solr’s default Query Parser is also known as the
lucene parser")

Best,

Jason
On Wed, Jun 6, 2018 at 5:08 AM Kamal Kishore Aggarwal
 wrote:
>
> Hi Guys,
>
> What is the default query parser (QP) for solr.
>
> While I was reading about this, I came across two links which look ambiguous 
> to me. It's not clear to me whether Standard is the default QP, Lucene is 
> the default QP, or they are the same. Below are the links which are 
> confusing me.
>
> https://lucene.apache.org/solr/guide/6_6/the-standard-query-parser.html
>
> https://lucene.apache.org/solr/guide/6_6/common-query-parameters.html
>
> Please suggest. Thanks in advance.
>
>
> Regards
> Kamal Kishore


Re: Solr Default query parser

2018-06-26 Thread Jason Gerlowski
The "Standard Query Parser" _is_ the lucene query parser.  They're the
same parser.  As Shawn pointed out above, they're also the default, so
if you don't specify any defType, they will be used.  Though if you
want to be explicit and specify it anyway, the value is defType=lucene
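
For example (a sketch; "client" here is an already-built SolrClient):

```java
// Explicitly select the default (lucene) query parser via SolrJ.
SolrQuery query = new SolrQuery("title:foo");
query.set("defType", "lucene"); // equivalent to omitting defType entirely
QueryResponse rsp = client.query(query);
```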

Jason
On Mon, Jun 25, 2018 at 1:05 PM Kamal Kishore Aggarwal
 wrote:
>
> Hi Shawn,
>
> Thanks for the reply.
>
> If "lucene" is the default query parser, then how can we specify Standard
> Query Parser(QP) in the query.
>
> Dismax QP can be specified by defType=dismax and Extended Dismax QP by
> defType=edismax; what about the declaration for the Standard QP?
>
> Regards
> Kamal
>
> On Wed, Jun 6, 2018 at 9:41 PM, Shawn Heisey  wrote:
>
> > On 6/6/2018 9:52 AM, Kamal Kishore Aggarwal wrote:
> > >> What is the default query parser (QP) for solr.
> > >>
> > >> While I was reading about this, I came across two links which looks
> > >> ambiguous to me. It's not clear to me whether Standard is the default
> > QP or
> > >> Lucene is the default QP or they are same. Below is the screenshot and
> > >> links which are confusing me.
> >
> > The default query parser in Solr has the name "lucene".  This query
> > parser, which is part of Solr, deals with Lucene query syntax.
> >
> > The most recent documentation states this clearly right after the table
> > of contents:
> >
> > https://lucene.apache.org/solr/guide/7_3/the-standard-query-parser.html
> >
> > It is highly unlikely that the 6.6 documentation will receive any
> > changes, unless serious errors are found in it.  The omission of this
> > piece of information will not be seen as a serious error.
> >
> > Thanks,
> > Shawn
> >
> >


Re: Retrieving json.facet from a search

2018-06-29 Thread Jason Gerlowski
You might also have luck using the "NoOpResponseParser"

https://opensourceconnections.com/blog/2015/01/08/using-solr-cloud-for-robustness-but-returning-json-format/
https://lucene.apache.org/solr/7_0_0/solr-solrj/org/apache/solr/client/solrj/impl/NoOpResponseParser.html

(Disclaimer: Didn't try this out, but it looks like what you want).
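
Roughly, the usage would look like this (again, an untested sketch against the Solr 7.x SolrJ API; "client" and "params" — which would include your json.facet — are assumed to exist):

```java
// Ask SolrJ to skip response parsing and hand back the raw JSON body.
QueryRequest req = new QueryRequest(params);
req.setResponseParser(new NoOpResponseParser("json"));
NamedList<Object> resp = client.request(req);
String rawJson = (String) resp.get("response"); // raw body; extract "facets" from it yourself
```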
On Thu, Jun 28, 2018 at 2:41 PM Yonik Seeley  wrote:
>
> There isn't typed support, but you can use the generic support like so:
>
> .getResponse().get("facets")
>
> -Yonik
>
> On Thu, Jun 28, 2018 at 2:31 PM, Webster Homer  wrote:
> > I have a fairly large existing code base for querying Solr. It is
> > architected where common code calls solr and returns a solrj QueryResponse
> > object.
> >
> > I'm currently using Solr 7.2 the code interacts with solr using the Solrj
> > client api
> >
> > I have a need that would be very easily met by using the json.facet api.
> > The problem is that I don't see how to get the json.facet out of a
> > QueryResponse object.
> >
> > There doesn't seem to be a lot of discussion on line about this.
> > Is there a way to get the Json object out of the QueryResponse?
> >
> > --
> >
> >
> > This message and any attachment are confidential and may be
> > privileged or
> > otherwise protected from disclosure. If you are not the intended
> > recipient,
> > you must not copy this message or attachment or disclose the
> > contents to
> > any other person. If you have received this transmission in error,
> > please
> > notify the sender immediately and delete the message and any attachment
> >
> > from your system. Merck KGaA, Darmstadt, Germany and any of its
> > subsidiaries do
> > not accept liability for any omissions or errors in this
> > message which may
> > arise as a result of E-Mail-transmission or for damages
> > resulting from any
> > unauthorized changes of the content of this message and
> > any attachment thereto.
> > Merck KGaA, Darmstadt, Germany and any of its
> > subsidiaries do not guarantee
> > that this message is free of viruses and does
> > not accept liability for any
> > damages caused by any virus transmitted
> > therewith.
> >
> >
> >
> > Click http://www.emdgroup.com/disclaimer
> >  to access the
> > German, French, Spanish
> > and Portuguese versions of this disclaimer.


Re: SolrJ Kerberos Client API

2018-06-29 Thread Jason Gerlowski
Hi Tushar,

You're right; the docs are a little out of date there.
Krb5HttpClientConfigurer underwent some refactoring recently and came
out with a different name: Krb5HttpClientBuilder.

The ref-guide should update the snippet you were referencing to
something more like:

System.setProperty("java.security.auth.login.config",
"/home/foo/jaas-client.conf");
HttpClientUtil.setHttpClientBuilder(new Krb5HttpClientBuilder());

There might be other small changes too.

Best,

Jason
On Thu, Jun 28, 2018 at 9:05 PM Tushar Inamdar  wrote:
>
> Hello,
>
> We are looking to move from SolrJ client v5.5.x to the latest v7.4.x.
>
> The documentation on wiring kerberos with the client API here
> 
> seems out-of-date. The HttpClientUtil class doesn't have a method
> setConfigurer(). Also Krb5HttpClientConfigurer class is missing from the
> SolrJ library. This mechanism used to work with v5.5.4, but doesn't work
> with any 7.x.
>
> Am I missing something or is the documentation indeed out-of-date?
>
> I am interested in the conventional jaas/keytab based access (not
> delegation token).
>
> Thanks,
> Tushar.


Re: Solr7.3.1 Installation

2018-07-11 Thread Jason Gerlowski
(I think Erick made a slight typo above: to disable "bad apple" tests,
use the flag "-Dtests.badapples=false")
On Wed, Jul 11, 2018 at 11:14 AM Erick Erickson  wrote:
>
> Note that the native test runs have the known-flaky tests _enabled_ by
> default, run tests with
>
> -Dtests.badapples=true
>
> to disable them.
>
> Second possibility is to look at the tests that failed and if there is
> an annotation
> @BadApple
> or
> @AwaitsFix
> ignore the failure if you can get the tests to pass when running individually.
>
> As Shawn says, this is a known issue that we're working on, but the
> technical debt is such that it'll
> be a long-term issue to fix.
>
> Best,
> Erick
>
>
>
> On Wed, Jul 11, 2018 at 7:13 AM, Shawn Heisey  wrote:
> > On 7/10/2018 11:20 PM, tapan1707 wrote:
> >>
> >> We are trying to install solr-7.3.1 into our existing system (We have also
> >> made some changes by adding one custom query parser).
> >>
> >> I am having some build issues and it would be really helpful if someone
> >> can
> >> help.
> >>
> >> While running ant test(in the process of building the solr package), it
> >> terminates because of failed tests.
> >
> >
> > This is a known problem.  Solr's tests are not in a good state.
> > Sometimes they pass, sometimes they fail.  Since there are so many tests and
> > a fair number of them do fail intermittently, this creates a situation where
> > on most test runs, there is at least one test failure.  Run the tests enough
> > times, and eventually they will all pass ... but this usually takes many
> > runs.
> >
> > Looking at the commands you're using in your script:  After a user has run
> > the "ant ivy-bootstrap" command once, ivy is downloaded into the user's home
> > directory and does not need to be downloaded again.  Only the "ant package"
> > command (run in the "solr" subdirectory) is actually needed to build Solr.
> > The rest of the commands are not needed.
> >
> > As Emir said, you don't need to build Solr at all, even when using custom
> > plugins.  You can download and use the binary package.
> >
> > There is effort underway to solve the problem with Solr tests. The initial
> > phase of that effort is to disable the tests that fail most frequently.  The
> > second overlapping phase of the effort is to actually fix those tests so
> > that they don't fail - either by fixing bugs in the tests themselves, or by
> > fixing real bugs in Solr.
> >
> >> Also, does the ant version have any effect on the build?
> >
> >
> > Ant 1.8 and 1.9 should work.  Versions 1.10.0, 1.10.1, as well as 1.10.3 and
> > later should be fine, but 1.10.2 has a bug that results in the lucene-solr
> > build failing:
> >
> > https://issues.apache.org/jira/browse/LUCENE-8189
> >
> >> At last, at present, we are using solr-6.4.2 which has zookeeper-3.4.6
> >> dependency but for solr-7, the zookeeper dependency has been upgraded to
> >> 3.4.10, so my question is: to what extent might this affect our system
> >> performance? Can we use zookeeper-3.4.6 with solr-7?
> >> (same with the jetty version)
> >
> >
> > You should be able to use any ZK 3.4.x server version with any version of
> > Solr.  Most versions of Solr should also work with 3.5.x (still in beta)
> > servers.  Early 4.x versions shipped with ZK 3.3.x, and the ZK project does
> > not guarantee compatibility between 3.3.x and 3.5.x.
> >
> > I can't guarantee that you won't run into bugs, but ZK is generally a very
> > stable piece of software.  Each new release of ZK includes a very large list
> > of bugfixes.  I have no idea what implications there are for performance.
> > You would need to ask a ZK support resource that question.  The latest
> > stable release that is compatible with your software is the recommended
> > version.  Currently that is 3.4.12.  The 3.5.x releases are in beta.
> >
> > Thanks,
> > Shawn
> >


Re: Creating a collection in Solr standalone mode using solrj

2018-07-20 Thread Jason Gerlowski
Hi Arunan,

Solr runs in one of two main modes: "Cloud" mode or "Standalone" mode.
Collections can only be created in Cloud mode.  Standalone mode
doesn't allow creation of collections; it uses cores instead.  From
your error message above, it looks like the problem is that you're
trying to create a collection in "standalone" mode, which doesn't
support that.

SolrJ has methods to create both cores and collections, you just have
to have Solr running in the right mode:
- Collection creation:
https://lucene.apache.org/solr/7_4_0/solr-solrj/org/apache/solr/client/solrj/request/CollectionAdminRequest.Create.html
- Core creation:
https://lucene.apache.org/solr/7_4_0/solr-solrj/org/apache/solr/client/solrj/request/CoreAdminRequest.Create.html
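
For instance, in Cloud mode a collection can be created with something like this (a sketch; the collection name, configset, and shard/replica counts are placeholders):

```java
// Create "myCollection" with 2 shards and 1 replica from the "_default" configset.
CollectionAdminRequest.Create create =
    CollectionAdminRequest.createCollection("myCollection", "_default", 2, 1);
create.process(cloudClient); // cloudClient: a CloudSolrClient pointed at your cluster
```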

You'll have to decide whether you want to run Solr in cloud or
standalone mode, and adjust your core/collection creation accordingly.

Best,

Jason
On Fri, Jul 20, 2018 at 2:09 AM Arunan Sugunakumar
 wrote:
>
> Hi,
>
> I would like to know whether it is possible to create a collection in Solr
> through SolrJ. I tried to create and it throws me an error saying that
> "Solr instance is not running in SolrCloud mode.
> "
> I am trying to upgrade a system to use Solr which used the Lucene library in
> the past. In Lucene, everything is controlled via code and the user does not
> have to worry about creating collections. I am trying to replicate this
> experience in Solr.
>
> Thanks in Advance,
> Arunan


Re: 4 days and no solution - please help on Solr

2018-08-10 Thread Jason Gerlowski
Hi Ravion,

(Note: I'm not sure what Solr version you're using.  My answer below
assumes Solr 7 APIs.  These APIs don't change often, but you might
find them under slightly different names in your version of Solr.)

SolrJ provides 2 ways (that I know of) to provide basic auth credentials.

The first (and IMO simplest) way is to use the setBasicAuthCredentials
method on each individual SolrRequest.  You can see what this looks
like in the example below:

final SolrClient client = new
CloudSolrClient.Builder(solrURLs).withHttpClient(myHttpClient).build();
client.setDefaultCollection("collection1");
SolrQuery req = new SolrQuery("*:*");
req.setBasicAuthCredentials("yourUsername", "yourPassword");
client.query(req);

SolrJ also has a PreemptiveBasicAuthClientBuilderFactory, which reads
the username/password from Java system properties, and is used to
configure the HttpClient that SolrJ creates internally for sending
requests.  I find this second method a little more complex, and it
looks like you're providing your own HttpClient anyways, so for both
those reasons I'd recommend sticking with the first approach (at least
while you're getting things up and running).
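For completeness, a hedged sketch of that second approach: the property names below are taken from the ref guide's basic-auth documentation as I understand it, and the credentials and jar name are placeholders:

```shell
# Tell SolrJ to build its internal HttpClient with preemptive basic auth,
# reading the credentials from the "basicauth" system property.
java \
  -Dsolr.httpclient.builder.factory=org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory \
  -Dbasicauth=yourUsername:yourPassword \
  -jar your-solrj-app.jar
```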

Hope that helps.

Best,

Jason

On Thu, Aug 9, 2018 at 5:47 PM ☼ R Nair  wrote:
>
> Dear all,
>
> I have tried my best to do it - searched all Google. But I am
> unsuccessful. Kindly help.
>
> We have a Solr environment. It's secured with userid and password.
>
> I used
> CloudSolrClient.Builder(solrURLs).withHttpClient(mycloseablehttpclient)
> method to access it. The URL is of the form
> http://userid:password@passionbytes.com/solr. I set defaultCollectionName later.
> In mycloseablehttpclient, I set Basic Authentication with
> CredentialProvider and gave url, port, userid and password.
> I have changed HTTPCLIENT to 4.4.1 version, even tried 4.5.3.
>
> Still, I get the JSON response from server, saying the URL did not return
> the state information from SOLR. It says HTTP 401 , Authentication Required.
>
> This is fourth day on this problem. Any help is appreciated. I have done
> whatever is available through documentation and/or Google.
>
> Best,
> Ravion


Re: 4 days and no solution - please help on Solr

2018-08-10 Thread Jason Gerlowski
I would also recommend removing the username/password from your Solr
base URL.  You might be able to get things working that way, but it's
definitely less common, and it wouldn't surprise me if some parts of
SolrJ mishandle a URL in that format.  Though that's just a hunch on
my part.
On Fri, Aug 10, 2018 at 10:09 AM Jason Gerlowski  wrote:
>
> Hi Ravion,
>
> (Note: I'm not sure what Solr version you're using.  My answer below
> assumes Solr 7 APIs.  These APIs don't change often, but you might
> find them under slightly different names in your version of Solr.)
>
> SolrJ provides 2 ways (that I know of) to provide basic auth credentials.
>
> The first (and IMO simplest) way is to use the setBasicAuthCredentials
> method on each individual SolrRequest.  You can see what this looks
> like in the example below:
>
> final SolrClient client = new
> CloudSolrCLient.Builder(solrURLs).withHttpClient(myHttpClient).build();
> client.setDefaultCollection("collection1");
> SolrQuery req = new SolrQuery("*:*");
> req.setBasicAuthCredentials("yourUsername", "yourPassword);
> client.query(req);
>
> SolrJ also has a PreemptiveBasicAuthClientBuilderFactory, which reads
> the username/password from Java system properties, and is used to
> configure the HttpClient that SolrJ creates internally for sending
> requests.  I find this second method a little more complex, and it
> looks like you're providing your own HttpClient anyways, so for both
> those reasons I'd recommend sticking with the first approach (at least
> while you're getting things up and running).
>
> Hope that helps.
>
> Best,
>
> Jason
>
> On Thu, Aug 9, 2018 at 5:47 PM ☼ R Nair  wrote:
> >
> > Dear all,
> >
> > I have tried my best to do it - searched all Google. But I am
> > unsuccessful. Kindly help.
> >
> > We have a solo environment. Its secured with userid and password.
> >
> > I used
> > CloudSolrClient.Builder(solrURLs).withHttpClient(mycloseablehttpclient)
> > method to access it. The url is of the form http:/userid:password@/
> > passionbytes.com/solr. I set defaultCollectionName later.
> > In mycloseablehttpclient, I set Basic Authentication with
> > CredentialProvider and gave url, port, userid and password.
> > I have changed HTTPCLIENT to 4.4.1 version, even tried 4.5.3.
> >
> > Still, I get the JSON response from server, saying the URL did not return
> > the state information from SOLR. It says HTTP 401 , Authentication Required.
> >
> > This is fourth day on this problem. Any help is appreciated. I have done
> > whatever is available through documentation and/or Google.
> >
> > Best,
> > Ravion


Re: 4 days and no solution - please help on Solr

2018-08-10 Thread Jason Gerlowski
I'd tried to type my previous SolrJ example snippet from memory.  That
didn't work out so great.  I've corrected it below:

final List<String> zkUrls = new ArrayList<>();
zkUrls.add("localhost:9983");
final SolrClient client = new CloudSolrClient.Builder(zkUrls,
Optional.empty()).build();

final Map<String, String> queryParamMap = new HashMap<>();
queryParamMap.put("q", "*:*");
final QueryRequest query = new QueryRequest(new MapSolrParams(queryParamMap));
query.setBasicAuthCredentials("solr", "solrRocks");

query.process(client, "techproducts"); // or, client.request(query)
On Fri, Aug 10, 2018 at 10:12 AM Jason Gerlowski  wrote:
>
> I would also recommend removing the username/password from your Solr
> base URL.  You might be able to get things working that way, but it's
> definitely less common, and it wouldn't surprise me if some parts of
> SolrJ mishandle a URL in that format.  Though that's just a hunch on
> my part.
> On Fri, Aug 10, 2018 at 10:09 AM Jason Gerlowski  
> wrote:
> >
> > Hi Ravion,
> >
> > (Note: I'm not sure what Solr version you're using.  My answer below
> > assumes Solr 7 APIs.  These APIs don't change often, but you might
> > find them under slightly different names in your version of Solr.)
> >
> > SolrJ provides 2 ways (that I know of) to provide basic auth credentials.
> >
> > The first (and IMO simplest) way is to use the setBasicAuthCredentials
> > method on each individual SolrRequest.  You can see what this looks
> > like in the example below:
> >
> > final SolrClient client = new
> > CloudSolrCLient.Builder(solrURLs).withHttpClient(myHttpClient).build();
> > client.setDefaultCollection("collection1");
> > SolrQuery req = new SolrQuery("*:*");
> > req.setBasicAuthCredentials("yourUsername", "yourPassword);
> > client.query(req);
> >
> > SolrJ also has a PreemptiveBasicAuthClientBuilderFactory, which reads
> > the username/password from Java system properties, and is used to
> > configure the HttpClient that SolrJ creates internally for sending
> > requests.  I find this second method a little more complex, and it
> > looks like you're providing your own HttpClient anyways, so for both
> > those reasons I'd recommend sticking with the first approach (at least
> > while you're getting things up and running).
> >
> > Hope that helps.
> >
> > Best,
> >
> > Jason
> >
> > On Thu, Aug 9, 2018 at 5:47 PM ☼ R Nair  wrote:
> > >
> > > Dear all,
> > >
> > > I have tried my best to do it - searched all Google. But I am
> > > unsuccessful. Kindly help.
> > >
> > > We have a solo environment. Its secured with userid and password.
> > >
> > > I used
> > > CloudSolrClient.Builder(solrURLs).withHttpClient(mycloseablehttpclient)
> > > method to access it. The url is of the form http:/userid:password@/
> > > passionbytes.com/solr. I set defaultCollectionName later.
> > > In mycloseablehttpclient, I set Basic Authentication with
> > > CredentialProvider and gave url, port, userid and password.
> > > I have changed HTTPCLIENT to 4.4.1 version, even tried 4.5.3.
> > >
> > > Still, I get the JSON response from server, saying the URL did not return
> > > the state information from SOLR. It says HTTP 401 , Authentication 
> > > Required.
> > >
> > > This is fourth day on this problem. Any help is appreciated. I have done
> > > whatever is available through documentation and/or Google.
> > >
> > > Best,
> > > Ravion


Re: 4 days and no solution - please help on Solr

2018-08-10 Thread Jason Gerlowski
The "setBasicAuthCredentials" method works on all SolrRequest
implementations.  There's a corresponding SolrRequest object for most
common Solr APIs.  As you mentioned, I used QueryRequest above, but
the same approach works for any SolrRequest object.

The specific one for indexing is "UpdateRequest".  Here's a short example below:

final List<SolrInputDocument> docsToIndex = new ArrayList<>();
// ... prepare your docs for indexing
final UpdateRequest update = new UpdateRequest();
update.add(docsToIndex);
update.setBasicAuthCredentials("solr", "solrRocks");
update.process(client, "techproducts");
On Fri, Aug 10, 2018 at 12:47 PM ☼ R Nair  wrote:
>
> Hi Jason,
>
> Thanks for replying.
>
> I am adding a document, not querying. I am using 7.3 apis. Adding a
> document is done via solrclient.add(). How to set authentication in
> this case? Seems I can't use SolrRequest.
>
> Thx, bye
> RAVION
>
> On Fri, Aug 10, 2018, 10:46 AM Jason Gerlowski 
> wrote:
>
> > I'd tried to type my previous SolrJ example snippet from memory.  That
> > didn't work out so great.  I've corrected it below:
> >
> > final List<String> zkUrls = new ArrayList<>();
> > zkUrls.add("localhost:9983");
> > final SolrClient client = new CloudSolrClient.Builder(zkUrls,
> > Optional.empty()).build();
> >
> > final Map<String, String> queryParamMap = new HashMap<>();
> > queryParamMap.put("q", "*:*");
> > final QueryRequest query = new QueryRequest(new
> > MapSolrParams(queryParamMap));
> > query.setBasicAuthCredentials("solr", "solrRocks");
> >
> > query.process(client, "techproducts"); // or, client.request(query)
> > On Fri, Aug 10, 2018 at 10:12 AM Jason Gerlowski 
> > wrote:
> > >
> > > I would also recommend removing the username/password from your Solr
> > > base URL.  You might be able to get things working that way, but it's
> > > definitely less common, and it wouldn't surprise me if some parts of
> > > SolrJ mishandle a URL in that format.  Though that's just a hunch on
> > > my part.
> > > On Fri, Aug 10, 2018 at 10:09 AM Jason Gerlowski 
> > wrote:
> > > >
> > > > Hi Ravion,
> > > >
> > > > (Note: I'm not sure what Solr version you're using.  My answer below
> > > > assumes Solr 7 APIs.  These APIs don't change often, but you might
> > > > find them under slightly different names in your version of Solr.)
> > > >
> > > > SolrJ provides 2 ways (that I know of) to provide basic auth
> > credentials.
> > > >
> > > > The first (and IMO simplest) way is to use the setBasicAuthCredentials
> > > > method on each individual SolrRequest.  You can see what this looks
> > > > like in the example below:
> > > >
> > > > final SolrClient client = new
> > > > CloudSolrCLient.Builder(solrURLs).withHttpClient(myHttpClient).build();
> > > > client.setDefaultCollection("collection1");
> > > > SolrQuery req = new SolrQuery("*:*");
> > > > req.setBasicAuthCredentials("yourUsername", "yourPassword);
> > > > client.query(req);
> > > >
> > > > SolrJ also has a PreemptiveBasicAuthClientBuilderFactory, which reads
> > > > the username/password from Java system properties, and is used to
> > > > configure the HttpClient that SolrJ creates internally for sending
> > > > requests.  I find this second method a little more complex, and it
> > > > looks like you're providing your own HttpClient anyways, so for both
> > > > those reasons I'd recommend sticking with the first approach (at least
> > > > while you're getting things up and running).
> > > >
> > > > Hope that helps.
> > > >
> > > > Best,
> > > >
> > > > Jason
> > > >
> > > > On Thu, Aug 9, 2018 at 5:47 PM ☼ R Nair 
> > wrote:
> > > > >
> > > > > Dear all,
> > > > >
> > > > > I have tried my best to do it - searched all Google. But I am
> > > > > unsuccessful. Kindly help.
> > > > >
> > > > > We have a solo environment. Its secured with userid and password.
> > > > >
> > > > > I used
> > > > >
> > CloudSolrClient.Builder(solrURLs).withHttpClient(mycloseablehttpclient)
> > > > > method to access it. The url is of the form http:/userid:password@/
> > > > > passionbytes.com/solr. I set defaultCollectionName later.
> > > > > In mycloseablehttpclient, I set Basic Authentication with
> > > > > CredentialProvider and gave url, port, userid and password.
> > > > > I have changed HTTPCLIENT to 4.4.1 version, even tried 4.5.3.
> > > > >
> > > > > Still, I get the JSON response from server, saying the URL did not
> > return
> > > > > the state information from SOLR. It says HTTP 401 , Authentication
> > Required.
> > > > >
> > > > > This is fourth day on this problem. Any help is appreciated. I have
> > done
> > > > > whatever is available through documentation and/or Google.
> > > > >
> > > > > Best,
> > > > > Ravion
> >


Re: 4 days and no solution - please help on Solr

2018-08-11 Thread Jason Gerlowski
You're right that "Update" is a little misleading as a name.

Solr uses that term across the board to refer to new or updated docs.
The "add-documents" API is /solr/collection_name/update and is
implemented by "UpdateRequestHandlers".  You can configure Solr to
massage documents before indexing with an "UpdateRequestProcessorChain".
etc.

So the name is misleading, but at least it's consistent.
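As a concrete illustration of that naming, adding a brand-new document still goes through /update. The collection name and fields below are placeholders:

```shell
# "Update" a document that doesn't exist yet -- i.e., add it.
curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/mycollection/update?commit=true' \
  -d '[{"id": "1", "name": "brand new doc"}]'
```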

Best,

Jason
On Fri, Aug 10, 2018 at 10:52 PM ☼ R Nair  wrote:
>
> Thanks Christoper and Jason. Problem solved. What you mentioned works.
>
> Thanks a million. Have a good weekend.
>
> Best,
> Ravion
>
> On Fri, Aug 10, 2018 at 3:31 PM Christopher Schultz <
> ch...@christopherschultz.net> wrote:
>
> > Ravion,
> >
> > What's wrong with "update request"? Updating a document that does not
> > exist... will add it.
> >
> > -chris
> >
> > On 8/10/18 3:01 PM, ☼ R Nair wrote:
> > > Do you feel that this is only partially complete?
> > >
> > > Best, Ravion
> > >
> > > On Fri, Aug 10, 2018, 1:37 PM ☼ R Nair 
> > wrote:
> > >
> > >> I saw this. Please provide for add. My issue is with add. There is no
> > >> "AddRequesg". So how to do that, thanks
> > >>
> > >> Best Ravion
> > >>
> > >> On Fri, Aug 10, 2018, 12:58 PM Jason Gerlowski 
> > >> wrote:
> > >>
> > >>> The "setBasicAuthCredentials" method works on all SolrRequest
> > >>> implementations.  There's a corresponding SolrRequest object for most
> > >>> common Solr APIs.  As you mentioned, I used QueryRequest above, but
> > >>> the same approach works for any SolrRequest object.
> > >>>
> > >>> The specific one for indexing is "UpdateRequest".  Here's a short
> > example
> > >>> below:
> > >>>
> > >>> final List docsToIndex = new ArrayList<>();
> > >>> ...Prepare your docs for indexing
> > >>> final UpdateRequest update = new UpdateRequest();
> > >>> update.add(docsToIndex);
> > >>> update.setBasicAuthCredentials("solr", "solrRocks");
> > >>> update.process(client, "techproducts");
> > >>> On Fri, Aug 10, 2018 at 12:47 PM ☼ R Nair 
> > >>> wrote:
> > >>>>
> > >>>> Hi Jason,
> > >>>>
> > >>>> Thanks for replying.
> > >>>>
> > >>>> I am adding a document, not querying. I am using 7.3 apis. Adding a
> > >>>> document is done via solrclient.add(). How to set authentication
> > in
> > >>>> this case? Seems I can't use SolrRequest.
> > >>>>
> > >>>> Thx, bye
> > >>>> RAVION
> > >>>>
> > >>>> On Fri, Aug 10, 2018, 10:46 AM Jason Gerlowski  > >
> > >>>> wrote:
> > >>>>
> > >>>>> I'd tried to type my previous SolrJ example snippet from memory.
> > That
> > >>>>> didn't work out so great.  I've corrected it below:
> > >>>>>
> > >>>>> final List zkUrls = new ArrayList<>();
> > >>>>> zkUrls.add("localhost:9983");
> > >>>>> final SolrClient client = new CloudSolrClient.Builder(zkUrls,
> > >>>>> Optional.empty()).build();
> > >>>>>
> > >>>>> final Map<String, String> queryParamMap = new HashMap<String, String>();
> > >>>>> queryParamMap.put("q", "*:*");
> > >>>>> final QueryRequest query = new QueryRequest(new
> > >>>>> MapSolrParams(queryParamMap));
> > >>>>> query.setBasicAuthCredentials("solr", "solrRocks");
> > >>>>>
> > >>>>> query.process(client, "techproducts"); // or, client.request(query)
> > >>>>> On Fri, Aug 10, 2018 at 10:12 AM Jason Gerlowski <
> > >>> gerlowsk...@gmail.com>
> > >>>>> wrote:
> > >>>>>>
> > >>>>>> I would also recommend removing the username/password from your Solr
> > >>>>>> base URL.  You might be able to get things working that way, but
> > >>> it's
> > >>>>>> definitely 

Re: Solr cloud production set up

2020-01-28 Thread Jason Gerlowski
Hi Rajdeep,

Unfortunately it's near impossible for anyone here to tell you what
parameters to tweak.  People might take guesses based on their
individual past experience, but ultimately those are just guesses.

There are just too many variables affecting Solr performance for
anyone to have a good guess without access to the cluster itself and
the time and will to dig into it.

Are there GC params that need tweaking?  Very possible, but you'll
have to look into your gc logs to see how much time is being spent in
gc.  Are there query params you could be changing?  Very possible, but
you'll have to identify the types of queries you're submitting and see
whether the ref-guide offers any information on how to tweak
performance for those particular qparsers, facets, etc.  Is the number
of facets the reason for slow queries?  Very possible, but you'll have
to turn faceting off or run debug=timing and see how what that tells
you about the QTime's.
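That debug=timing check is just an extra request parameter; the collection name here is a placeholder:

```shell
# Per-component timing (query, facet, highlight, ...) is reported in
# the "debug" section of the response.
curl 'http://localhost:8983/solr/mycollection/select?q=*:*&debug=timing'
```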

Tuning Solr performance is a tough, time-consuming process.  I wish
there was an easier answer for you, but there's not.

Best,

Jason

On Mon, Jan 20, 2020 at 12:06 PM Rajdeep Sahoo
 wrote:
>
> Please suggest anyone
>
> On Sun, 19 Jan, 2020, 9:43 AM Rajdeep Sahoo, 
> wrote:
>
> > Apart from reducing no of facets in the query, is there any other query
> > params or gc params or heap space or anything else that we need to tweak
> > for improving search response time.
> >
> > On Sun, 19 Jan, 2020, 3:15 AM Erick Erickson, 
> > wrote:
> >
> >> Add &debug=timing to the query and it’ll show you the time each component
> >> takes.
> >>
> >> > On Jan 18, 2020, at 1:50 PM, Rajdeep Sahoo 
> >> wrote:
> >> >
> >> > Thanks for the suggestion,
> >> >
> >> > Is there any way to get the info which operation or which query params
> >> are
> >> > increasing the response time.
> >> >
> >> >
> >> > On Sat, 18 Jan, 2020, 11:59 PM Dave, 
> >> wrote:
> >> >
> >> >> If you’re not getting values, don’t ask for the facet. Facets are
> >> >> expensive as hell, maybe you should think more about your query’s than
> >> your
> >> >> infrastructure, solr cloud won’t help you at all especially if your
> >> asking
> >> >> for things you don’t need
> >> >>
> >> >>> On Jan 18, 2020, at 1:25 PM, Rajdeep Sahoo <
> >> rajdeepsahoo2...@gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> We have assigned 16 gb out of 24gb for heap .
> >> >>> No other process is running on that node.
> >> >>>
> >> >>> 200 facets fields are there in the query but we will not be getting
> >> the
> >> >>> values for each facets for every search.
> >> >>> There can be max of 50-60 facets for which we will be getting values.
> >> >>>
> >> >>> We are using caching,is it not going to help.
> >> >>>
> >> >>>
> >> >>>
> >>  On Sat, 18 Jan, 2020, 11:36 PM Shawn Heisey, 
> >> >> wrote:
> >> 
> >> > On 1/18/2020 10:09 AM, Rajdeep Sahoo wrote:
> >> > We are having 2.3 million documents and size is 2.5 gb.
> >> >  10 core cpu and 24 gb ram . 16 slave nodes.
> >> >
> >> >  Still some of the queries are taking 50 sec at solr end.
> >> > As we are using solr 4.6 .
> >> >  Other thing is we are having 200 (avg) facet fields  in a query.
> >> > And 30 searchable fields.
> >> > Is there any way to identify why it is taking 50 sec for a query.
> >> >Multiple concurrent requests are there.
> >> 
> >>  Searching 30 fields and computing 200 facets is never going to be
> >> super
> >>  fast.  Switching to cloud will not help, and might make it slower.
> >> 
> >>  Your index is pretty small to a lot of us.  There are people running
> >>  indexes with billions of documents that take terabytes of disk space.
> >> 
> >>  As Walter mentioned, computing 200 facets is going to require a fair
> >>  amount of heap memory.  One *possible* problem here is that the Solr
> >>  heap size is too small, so a lot of GC is required.  How much of the
> >>  24GB have you assigned to the heap?  Is there any software other than
> >>  Solr running on these nodes?
> >> 
> >>  Thanks,
> >>  Shawn
> >> 
> >> >>
> >>
> >>


Re: Solr fact response strange behaviour

2020-01-29 Thread Jason Gerlowski
Hey Adi,

There was a separate JIRA for this on the SolrJ objects it sounds like
you're using: SOLR-13780.  That JIRA was fixed, apparently in 8.3, so
I'm surprised you're still seeing the issue.  If you include the full
stacktrace and a snippet of code to reproduce, I'm curious to take a
look.

That won't help you in the short term though.  For that, yes, you'll
have to use ((Number)count).longValue() in the interim.
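To make that workaround concrete, here's a tiny self-contained sketch of why going through Number is safe regardless of whether the response handed back an Integer or a Long (the values are made up):

```java
public class FacetCountDemo {
    public static void main(String[] args) {
        // Depending on mode/version, SolrJ may surface a facet count
        // as an Integer or as a Long.
        Object cloudCount = Integer.valueOf(42);
        Object standaloneCount = Long.valueOf(42L);

        // (Long) cloudCount would throw ClassCastException, since
        // Integer is not a Long.  Both types extend Number, though,
        // so longValue() handles either shape.
        long a = ((Number) cloudCount).longValue();
        long b = ((Number) standaloneCount).longValue();

        System.out.println(a + " " + b);
    }
}
```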

Best,

Jason

On Tue, Jan 28, 2020 at 2:20 AM Kaminski, Adi  wrote:
>
> Thanks Mikhail  !
>
> In the issue comments that you have shared, it seems that Yonik S doesn't agree
> it's a defect... so it will probably remain open for a while.
>
>
>
> So meanwhile, is it recommended to perform the cast
> ((Number)count).longValue() in our relevant logic?
>
>
>
> Thanks,
> Adi
>
>
>
> -Original Message-
> From: Mikhail Khludnev 
> Sent: Tuesday, January 28, 2020 9:14 AM
> To: solr-user 
> Subject: Re: Solr fact response strange behaviour
>
>
>
> https://issues.apache.org/jira/browse/SOLR-11775
>
>
>
> On Tue, Jan 28, 2020 at 10:00 AM Kaminski, Adi 
> mailto:adi.kamin...@verint.com>>
>
> wrote:
>
>
>
> > Is it existing issue and tracked for future fix consideration ?
>
> >
>
> > What's the suggestion as W/A until fix - to cast every related
>
> > response with ((Number)count).longValue() ?
>
> >
>
> > -Original Message-
>
> > From: Mikhail Khludnev mailto:m...@apache.org>>
>
> > Sent: Tuesday, January 28, 2020 8:53 AM
>
> > To: solr-user 
> > mailto:solr-user@lucene.apache.org>>
>
> > Subject: Re: Solr fact response strange behaviour
>
> >
>
> > I suppose there's an issue, which no one ever took a look.
>
> >
>
> > https://lucene.472066.n3.nabble.com/JSON-facets-count-a-long-or-an-integer-in-cloud-and-non-cloud-modes-td4265291.html
>
> >
>
> >
>
> > On Mon, Jan 27, 2020 at 11:47 PM Kaminski, Adi
>
> > mailto:adi.kamin...@verint.com>>
>
> > wrote:
>
> >
>
> > > SolrJ client is used of SolrCloud of Solr 8.3 version for JSON
>
> > > Facets requests...any idea why not consistent ?
>
> > >
>
> > > Sent from Workspace ONE Boxer
>
> > >
>
> > > On Jan 27, 2020 22:13, Mikhail Khludnev 
> > > mailto:m...@apache.org>> wrote:
>
> > > Hello,
>
> > > It might be different between SolrCloud and standalone mode. No data
>
> > > enough to make a conclusion.
>
> > >
>
> > > On Mon, Jan 27, 2020 at 5:40 PM Rudenko, Artur
>
> > > mailto:artur.rude...@verint.com>>
>
> > > wrote:
>
> > >
>
> > > > I'm trying to parse facet response, but sometimes the count
>
> > > > returns as Long type and sometimes as Integer type(on different
>
> > > > environments), The error is:
>
> > > > "java.lang.ClassCastException: java.lang.Integer cannot be cast to
>
> > > > java.lang.Long"
>
> > > >
>
> > > > Can you please explain why this happenes? Why it not consistent?
>
> > > >
>
> > > > I know the workaround to use Number class and longValue method but
>
> > > > I want to to the root cause before using this workaround
>
> > > >
>
> > > > Artur Rudenko
>
> > > >
>
> > > >
>
> > > >
>
> > > > This electronic message may contain proprietary and confidential
>
> > > > information of Verint Systems Inc., its affiliates and/or subsidiaries.
>
> > > The
>
> > > > information is intended to be for the use of the individual(s) or
>
> > > > entity(ies) named above. If you are not the intended recipient (or
>
> > > > authorized to receive this e-mail for the intended recipient), you
>
> > > > may
>
> > > not
>
> > > > use, copy, disclose or distribute to anyone this message or any
>
> > > information
>
> > > > contained in this message. If you have received this electronic
>
> > > > message
>
> > > in
>
> > > > error, please notify us by replying to this e-mail.
>
> > > >
>
> > >
>
> > >
>
> > > --
>
> > > Sincerely yours
>
> > > Mikhail Khludnev
>
> > >
>
> > >
>
>
> > >
>
> >
>
> >
>
> > --
>
> > Sincerely yours
>
> > Mikhail Khludnev
>
> >
>
> >
>

Re: Solr fact response strange behaviour

2020-01-29 Thread Jason Gerlowski
a:1415)
>  [tomcat-embed-core-9.0.17.jar:9.0.17]
> at 
> org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
>  [tomcat-embed-core-9.0.17.jar:9.0.17]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_201]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_201]
> at 
> org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
>  [tomcat-embed-core-9.0.17.jar:9.0.17]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
>
> -Original Message-
> From: Jason Gerlowski 
> Sent: Wednesday, January 29, 2020 5:40 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr fact response strange behaviour
>
> Hey Adi,
>
> There was a separate JIRA for this on the SolrJ objects it sounds like you're 
> using: SOLR-13780.  That JIRA was fixed, apparently in 8.3, so I'm surprised 
> you're still seeing the issue.  If you include the full stacktrace and a 
> snippet of code to reproduce, I'm curious to take a look.
>
> That won't help you in the short term though.  For that, yes, you'll have to 
> use ((Number)count).longValue() in the interim.
>
> Best,
>
> Jason
>
> On Tue, Jan 28, 2020 at 2:20 AM Kaminski, Adi  wrote:
> >
> > Thanks Mikhail  !
> >
> > In issue comments that you have shared it seems that Yonik S doesn't agree 
> > it's a defect...so probably will remain opened for a while.
> >
> >
> >
> > So meanwhile, is it recommended to perform casting 
> > ((Number)count).longValue()  to our relevant logic ?
> >
> >
> >
> > Thanks,
> > Adi
> >
> >
> >
> > -Original Message-
> > From: Mikhail Khludnev 
> > Sent: Tuesday, January 28, 2020 9:14 AM
> > To: solr-user 
> > Subject: Re: Solr fact response strange behaviour
> >
> >
> >
> > https://issues.apache.org/jira/browse/SOLR-11775
> >
> >
> >
> > On Tue, Jan 28, 2020 at 10:00 AM Kaminski, Adi
> > mailto:adi.kamin...@verint.com>>
> >
> > wrote:
> >
> >
> >
> > > Is it existing issue and tracked for future fix consideration ?
> >
> > >
> >
> > > What's the suggestion as W/A until fix - to cast every related
> >
> > > response with ((Number)count).longValue() ?
> >
> > >
> >
> > > -Original Message-
> >
> > > From: Mikhail Khludnev mailto:m...@apache.org>>
> >
> > > Sent: Tuesday, January 28, 2020 8:53 AM
> >
> > > To: solr-user
> > > mailto:solr-user@lucene.apache.org>>
> >
> > > Subject: Re: Solr fact response strange behaviour
> >
> > >
> >
> > > I suppose there's an issue, which no one ever took a look.
> >
> > >
> >
> > > https://lucene.472066.n3.nabble.com/JSON-facets-count-a-long-or-an-integer-in-cloud-and-non-cloud-modes-td4265291.html
> >
> > >
> >
> > >
> >
> > > On Mon, Jan 27, 2020 at 11:47 PM Kaminski, Adi
> >
> > > mailto:adi.kamin...@verint.com>>
> >
> > > wrote:
> >
> > >
> >
> > > > SolrJ client is used of SolrCloud of Solr 8.3 version for JSON
> >
> > > > Facets requests...any idea why not consistent ?
> >
> > > >
> >
> > > > Sent from Workspace ONE Boxer
> >
> > > >
> >
> > > > On Jan 27, 2020 22:13, Mikhail Khludnev 
> > > > mailto:m...@apache.org>> wrote:
> >
> > > > Hello,
> >
> > > > It might be different between SolrCloud and standalone mode. No
> > > > data
> >
> > > > enough to make a conclusion.
> >
> > > >
> >
> > > > On Mon, Jan 27, 2020 at 5:40 PM Rudenko, Artur
> >
> > > > mailto:artur.rude...@verint.com>>
> >
> > > > wrote:
> >
> > > >
> >
> > > > > I'm trying to parse facet response, but sometimes the count
> >
> > > > > returns as Long type and sometimes as Integer type(on different
> >
> > > > > environments), The error is:
> >
> > > > > "java.lang.ClassCastException: java.lang.Integer cannot be cast
> > > > > to
> >
> > > > > java.lang.Long"
> >
> > > > >
> >
> > > > > Can you please 

Re: Replica type affinity

2020-02-03 Thread Jason Gerlowski
This is a bit of a guess - I haven't used this functionality before.
But to a novice the "tag" Rule Condition for "Rule Based Replica
Placement" sounds similar to the requirements you mentioned above.

https://lucene.apache.org/solr/guide/8_3/rule-based-replica-placement.html#rule-conditions
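If that route looks promising, here's a heavily hedged sketch of what it might look like. The tag name, values, and rule syntax below are my assumptions to verify against the guide, and I'm not sure rules can distinguish replica types (PULL vs NRT) at all:

```shell
# Start each ephemeral-disk node with an identifying system property
# (the property name "disktype" is illustrative):
bin/solr start -cloud -Ddisktype=ephemeral

# Then create the collection with a rule keyed on that property via the
# ImplicitSnitch, so its replicas land only on tagged nodes:
curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=1&rule=sysprop.disktype:ephemeral&snitch=class:ImplicitSnitch'
```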

Good luck,

Jason

On Thu, Jan 30, 2020 at 1:00 PM Karl Stoney
 wrote:
>
> Hey,
> Thanks for the reply, but I'm trying to have something fully automated and
> dynamic.  For context, I run Solr on Kubernetes, and at the moment it works
> beautifully with autoscaling (I can scale up the Kubernetes deployment and
> Solr adds replicas and removes them).
>
> I'm trying to add a new type of node though, backed by very fast but 
> ephemeral disks and the idea was to have only PULL replicas running on those 
> nodes automatically and NRT on the persistent disk instances.
>
> Might be a pipe dream but I'm striving for no manual configuration.
> 
> From: Edward Ribeiro 
> Sent: 30 January 2020 16:56
> To: solr-user@lucene.apache.org 
> Subject: Re: Replica type affinity
>
> Hi Karl,
>
> During collection creation you can specify the `createNodeSet` parameter as
> specified by the Solr Reference Guide snippet below:
>
> "createNodeSet
> Allows defining the nodes to spread the new collection across. The format
> is a comma-separated list of node_names, such as
> localhost:8983_solr,localhost:8984_solr,localhost:8985_solr.
> If not provided, the CREATE operation will create shard-replicas spread
> across all live Solr nodes.
> Alternatively, use the special value of EMPTY to initially create no
> shard-replica within the new collection and then later use the ADDREPLICA
> operation to add shard-replicas when and where required."
>
>
> There's also the Collections API, where you can use the node parameter of
> ADDREPLICA to specify the node that a replica should be created on.
> See:
> https://lucene.apache.org/solr/guide/6_6/collections-api.html#CollectionsAPI-Input.9
> Other commands that can be useful are REPLACENODE and MOVEREPLICA.
>
> Edward
>
>
> On Thu, Jan 30, 2020 at 1:00 PM Karl Stoney
>  wrote:
>
> > Hey everyone,
> > Does anyone know of a way to have solr replicas assigned to specific nodes
> > by some sort of identifying value (in solrcloud).
> >
> > In summary I’m trying to have some read-only replicas only ever be
> > assigned to nodes named “solr-ephemeral-x” and my NRT and masters assigned
> > to “solr-index”.
> >
> > Kind of like rack affinity in elasticsearch!
> >
> > This e-mail is sent on behalf of Auto Trader Group Plc, Registered Office:
> > 1 Tony Wilson Place, Manchester, Lancashire, M15 4FN (Registered in England
> > No. 9439967). This email and any files transmitted with it are confidential
> > and may be legally privileged, and intended solely for the use of the
> > individual or entity to whom they are addressed. If you have received this
> > email in error please notify the sender. This email message has been swept
> > for the presence of computer viruses.
> >


Re: Checking in on Solr Progress

2020-03-02 Thread Jason Gerlowski
Very low-tech and manual, but worth mentioning...

If there's a particularly large core that's doing a full recovery, and
you have access to the disk itself, you can navigate to the relevant
directory for that core and run something like "watch -n 10 ls -lah"
or "watch -n 10 du -sh ." to see how the data transfer is going.
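
If you'd rather script that than eyeball `watch` output, here's a rough Python equivalent of the `du` polling. The demo runs on a throwaway directory; in practice you'd point `core_dir` at the real core's data directory (that path is deployment-specific) and call the function in a loop.

```python
import os
import tempfile

def dir_size_bytes(path):
    """Total size of all files under path -- roughly what `du` reports."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

# Demo: fake a core directory with one 4 KB "segment" file.
core_dir = tempfile.mkdtemp()
with open(os.path.join(core_dir, "_0.fdt"), "wb") as f:
    f.write(b"\0" * 4096)
print(dir_size_bytes(core_dir))  # -> 4096
```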

On Fri, Feb 7, 2020 at 11:16 AM Walter Underwood  wrote:
>
> I wrote some Python that checks CLUSTERSTATUS and reports replica status to 
> Telegraf. Great for charts and alerts, but it only shows status, not progress.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
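
A minimal sketch of that CLUSTERSTATUS idea: walk the response's cluster/collections/shards/replicas nesting and count replica states. The payload below is a hand-made stand-in, not a real response (real ones carry many more fields), but the nesting matches the API.

```python
# Hypothetical, trimmed CLUSTERSTATUS payload, as returned by
# GET /solr/admin/collections?action=CLUSTERSTATUS
payload = {
    "cluster": {
        "collections": {
            "gettingstarted": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {"state": "active"},
                            "core_node2": {"state": "recovering"},
                        }
                    }
                }
            }
        }
    }
}

def replica_states(payload):
    """Count replicas by state across all collections and shards."""
    counts = {}
    for coll in payload["cluster"]["collections"].values():
        for shard in coll["shards"].values():
            for replica in shard["replicas"].values():
                state = replica["state"]
                counts[state] = counts.get(state, 0) + 1
    return counts

print(replica_states(payload))  # -> {'active': 1, 'recovering': 1}
```

As Walter notes, this only tells you a replica is recovering, not how far along it is.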
>
> > On Feb 7, 2020, at 7:58 AM, Erick Erickson  wrote:
> >
> > I was wondering about using metrics myself. I confess I didn’t look to see 
> > what was already there either ;)
> >
> > Actually, using metrics might be easiest all told, but I also confess I 
> > have no clue what it takes to build a new metric in. Nor how to use the 
> > same (?) collection process for the 5 situations I outlined, and those just 
> > off the top of my head.
> >
> > It’s particularly frustrating when diagnosing these not knowing whether the 
> > “recovering” state is going to resolve itself sometime or not. I’ve seen 
> > Solr replicas stuck in that state forever….
> >
> > Andrzej could certainly shed some light on that question.
> >
> > All ideas welcome of course!
> >
> >> On Feb 7, 2020, at 10:40 AM, Jan Høydahl  wrote:
> >>
> >> Could we expose some high level recovery info as part of metrics api? Then 
> >> people could track number of cores recovering, recovery time, recovery 
> >> phase, number of recoveries failed etc, and also build alerts on top of 
> >> that.
> >>
> >> Jan Høydahl
> >>
> >>> 6. feb. 2020 kl. 19:42 skrev Erick Erickson :
> >>>
> >>> There’s actually a crying need for this, but there’s nothing that’s 
> >>> there yet, basically you have to look at the log files and try to figure 
> >>> it out.
> >>>
> >>> Actually I think this would be a great thing to work on, but it’d be 
> >>> pretty much all new. If you’d like, you can create a Solr Improvement 
> >>> Proposal here: 
> >>> https://cwiki.apache.org/confluence/display/SOLR/SIP+Template to flesh 
> >>> out what this would look like.
> >>>
> >>> A couple of thoughts off the top of my head:
> >>>
> >>> I really think what would be most useful would be a collections API 
> >>> command, something like “RECOVERYSTATUS”, or maybe extend CLUSTERSTATUS. 
> >>> Currently a replica can be stuck in recovery and never get out. There are 
> >>> several scenarios that’d have to be considered:
> >>>
> >>> 1> normal startup. The replica briefly goes from down->recovering->active 
> >>> which should be quite brief.
> >>> 1a> Waiting for a leader to be elected before continuing
> >>>
> >>> 2> “peer sync” where another replica is replaying documents from the tlog.
> >>>
> >>> 3> situations where the replica is replaying documents from its own tlog. 
> >>> This can be very, very, very long too.
> >>>
> >>> 4> full sync where it’s copying the entire index from a leader.
> >>>
> >>> 5> knickers in a knot, it’s given up even trying to recover.
> >>>
> >>> In either case, you’d want to report “all ok” if nothing was in recovery, 
> >>> “just the ones having trouble” and “everything because I want to look”.
> >>>
> >>> But like I said, there’s nothing really built into the system to 
> >>> accomplish this now that I know of.
> >>>
> >>> Best,
> >>> Erick
> >>>
>  On Feb 6, 2020, at 12:15 PM, dj-manning  
>  wrote:
> 
>  Erick Erickson wrote
> > When you say “look”, where are you looking from? Http requests? SolrJ? 
> > The
> > admin UI?
> 
>  I'm open to looking form anywhere  - http request, or the admin UI, or
>  following a log if possible.
> 
>  My objective for this ask would be to human interactively follow/watch
>  solr's recovery progress - if that's even possible.
> 
>  Stretch goal would be to autonomously report on recovery progress.
> 
>  The question stems from seeing recovery in log or the admin UI, then
>  wondering what progress is.
> 
>  Appreciation.
> 
> 
> 
> 
>  --
>  Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> >>>
> >
>


Re: Request Tracking in Solr

2020-04-01 Thread Jason Gerlowski
Hi Prakhar,

Newer versions of Solr offer an "Audit Logging" plugin for use cases
similar to yours.
https://lucene.apache.org/solr/guide/8_1/audit-logging.html

I don't think that's available as far back as 5.2.1 though.  Just
thought I'd mention it in case upgrading is an option.
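
For anyone on a newer version, a sketch of what the relevant security.json section might look like. The plugin class name comes from the Audit Logging ref-guide page above; the authentication block is just a placeholder for whatever auth is already configured, and event-type filtering is omitted.

```python
import json

# Hypothetical minimal security.json enabling the audit-logging plugin.
security = {
    "authentication": {"class": "solr.BasicAuthPlugin"},      # placeholder
    "auditlogging": {"class": "solr.SolrLogAuditLoggerPlugin"},
}
print(json.dumps(security, indent=2))
```

With this in place, audited events land in the Solr logs, which could then be counted per user.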

Best,

Jason

On Wed, Apr 1, 2020 at 2:29 AM Prakhar Kumar
 wrote:
>
> Hello Folks,
>
> I'm looking for a way to track requests in Solr from a
> particular user/client. Suppose, I've created a user, say *Client1*, using
> the basic authentication/authorization plugin. Now I want to get a count of
> the number of requests/queries made by *Client1* on the Solr server.
> Looking forward to some valuable suggestions.
>
> P.S. We are using Solr 5.2.1
>
> --
> Kind Regards,
> Prakhar Kumar
> Sr. Enterprise Software Engineer
>
> *HotWax Systems*
> *Enterprise open source experts*
> cell: +91-89628-81820
> office: 0731-409-3684
> http://www.hotwaxsystems.com


Re: SolrJ connection leak with SolrCloud and Jetty Gzip compression enabled

2020-04-22 Thread Jason Gerlowski
Hi Samuel,

Thanks for the very detailed description of the problem here.  Very
thorough!  I don't think you're missing anything obvious, please file the
jira tickets if you haven't already.

Best,

Jason

On Mon, Apr 13, 2020 at 6:12 PM Samuel Garcia Martinez <
samuel...@inditex.com> wrote:

> Reading again the last two paragraphs I realized that, those two
> specially, are very poorly worded (grammar 😓). I tried to rephrase them
> and correct some of the errors below.
>
> Here I can see three different problems:
>
> * HttpSolrCall should not use HttpServletResponse#setCharacterEncoding to
> set the Content-Encoding header. This is obviously a mistake.
> * HttpSolrClient, specifically the HttpClientUtil, should be modified so
> that if the Content-Encoding header lies about the actual content, the
> connection is not leaked forever. It should still surface the exception though.
> * HttpSolrClient should allow clients to customize HttpClient's
> connectionRequestTimeout, preventing the application to be blocked forever
> waiting for a connection to be available. This way, the application could
> respond to requests that won’t use Solr instead of rejecting any incoming
> requests because all threads are blocked forever for a connection that
> won’t be available ever.
>
> I think the two first points are bugs that should be fixed.  The third one
> is a feature improvement to me.
>
> Unless I missed something, I'll file the two bugs and provide a patch for
> them. The same goes for the feature improvement.
>
>
>
> Get Outlook for iOS
>
>
>
> En el caso de haber recibido este mensaje por error, le rogamos que nos lo
> comunique por esta misma vía, proceda a su eliminación y se abstenga de
> utilizarlo en modo alguno.
> If you receive this message by error, please notify the sender by return
> e-mail and delete it. Its use is forbidden.
>
>
>
> 
> From: Samuel Garcia Martinez 
> Sent: Monday, April 13, 2020 10:08:36 PM
> To: solr-user@lucene.apache.org 
> Subject: SolrJ connection leak with SolrCloud and Jetty Gzip compression
> enabled
>
> Hi!
>
> Today, I've seen a weird issue in production workloads when the gzip
> compression was enabled. After some minutes, the client app ran out of
> connections and stopped responding.
>
> The cluster setup is pretty simple:
> Solr version: 7.7.2
> Solr cloud enabled
> Cluster topology: 6 nodes, 1 single collection, 10 shards and 3 replicas.
> 1 HTTP LB using Round Robin over all nodes
> All cluster nodes have gzip enabled for all paths, all HTTP verbs and all
> MIME types.
> Solr client: HttpSolrClient targeting the HTTP LB
>
> Problem description: when the Solr node that receives the request has to
> forward the request to a Solr Node that actually can perform the query, the
> response headers are added incorrectly to the client response, causing the
> SolrJ client to fail and to never release the connection back to the pool.
>
> To simplify the case, let's try to start from the following repro scenario:
>
>   *   Start one node with cloud mode and port 8983
>   *   Create one single collection (1 shard, 1 replica)
>   *   Start another node with port 8984 and the previously started zk (-z
> localhost:9983)
>   *   Start a java application and query the cluster using the node on
> port 8984 (the one that doesn't host the collection)
>
> So, the steps occur like:
>
>   *   The application queries node:8984 with compression enabled
> ("Accept-Encoding: gzip") and wt=javabin
>   *   Node:8984 can't perform the query and creates an HTTP request behind
> the scenes to node:8983
>   *   Node:8983 returns a gzipped response with "Content-Encoding: gzip"
> and "Content-Type: application/octet-stream"
>   *   Node:8984 adds the "Content-Encoding: gzip" header as character
> stream to the response (it should be forwarded as "Content-Encoding"
> header, not character encoding)
>   *   HttpSolrClient receives a "Content-Type:
> application/octet-stream;charset=gzip", causing an exception.
>   *   HttpSolrClient tries to quietly close the connection, but since the
> stream is broken, the Utils.consumeFully fails to actually consume the
> entity (it throws another exception in GzipDecompressingEntity#getContent()
> with "not in GZIP format")
>
> The exception thrown by HttpSolrClient is:
> java.nio.charset.UnsupportedCharsetException: gzip
>at java.nio.charset.Charset.forName(Charset.java:531)
>at
> org.apache.http.entity.ContentType.create(ContentType.java:271)
>at
> org.apache.http.entity.ContentType.create(ContentType.java:261)
>at
> org.apache.http.entity.ContentType.parse(ContentType.java:319)
>at
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:591)
>at
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
>at
> org.apache.solr.client.solrj.impl.HttpSolrCli
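
The header mixup described above can be illustrated in a few lines of Python (a simulation of the bug's effect, not Solr code): "gzip" belongs in a Content-Encoding header, and folding it into the Content-Type's charset parameter produces a value no client can interpret as a character set.

```python
import gzip

# Simulated upstream (node:8983) response headers.
upstream = {
    "Content-Type": "application/octet-stream",
    "Content-Encoding": "gzip",
}

def forward_buggy(h):
    # What the bug effectively did: fold the encoding into the charset.
    return {"Content-Type": h["Content-Type"] + ";charset=" + h["Content-Encoding"]}

def forward_fixed(h):
    # Keep Content-Encoding as its own header.
    return dict(h)

print(forward_buggy(upstream))   # charset=gzip -> UnsupportedCharsetException in SolrJ
print(forward_fixed(upstream))

# The follow-on failure: once the stream is mishandled, gunzipping
# non-gzip bytes fails too, much like Utils.consumeFully did
# ("not in GZIP format"), so the connection is never cleanly released.
try:
    gzip.decompress(b"definitely not gzip")
except OSError as e:
    print("cleanup also fails:", e)
```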

Re: Rule-Based Auth - update not working

2020-05-17 Thread Jason Gerlowski
Hi Isabelle,

Two things to keep in mind with Solr's Rule-Based Authorization.

1. Each request is controlled by the first permission that matches
the request.
2. With the permissions you have present, Solr will check them in
descending list order.  (This isn't always true - collection-specific
and path-specific permissions are given precedence, so you don't need
to consider that.)

As you can imagine given the rules above - permission order is very
important.  In your case the "all" rule will match pretty much all
requests, which explains why an "indexing" user can't actually index.
Generally speaking, it's best to put the most specific rules first,
with the broader ones coming later.
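
The effect of that ordering can be simulated in a few lines. This is a toy model only: real Solr matching is richer, and collection/path-specific permissions jump the queue, which is ignored here.

```python
# Toy model: Solr checks permissions in list order and the FIRST match
# decides.  "all" matches every request, so any permission listed after
# it is unreachable.
def first_match(permissions, request_kind):
    for perm in permissions:
        if perm["name"] in ("all", request_kind):
            return perm
    return None

perms_bad = [{"name": "all",    "role": "admin"},      # matches everything
             {"name": "update", "role": "indexing"}]   # never reached
perms_good = [{"name": "update", "role": "indexing"},  # specific rule first
              {"name": "all",    "role": "admin"}]

print(first_match(perms_bad,  "update")["role"])   # -> admin (so "indexer" gets 403)
print(first_match(perms_good, "update")["role"])   # -> indexing
```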

For more information, see the "Permission Ordering and Resolution"
section in the page you linked to in your request.

Good luck, hope that helps.

Jason

On Tue, May 12, 2020 at 12:34 PM Isabelle Giguere
 wrote:
>
> Hi;
>
> I'm using Solr 8.5.0.
>
> I'm having trouble setting up some permissions using the rule-based 
> authorization plugin: 
> https://lucene.apache.org/solr/guide/8_5/rule-based-authorization-plugin.html
>
> I have 3 users: "admin", "search", and "indexer".
>
> I have set permissions and user roles:
> "permissions": [  {  "name": "all", "role": "admin", "index": 1  },
>   { "name": "admin-luke", "collection": "*", "role": "luke", "index": 2, 
> "path": "/admin/luke"  },
>   { "name": "read", "role": "searching", "index": 3  },
>   {  "name": "update", "role": "indexing", "index": 4 }],
> "user-role": {  "admin": "admin",
>   "search": ["searching","luke"],
>   "indexer": "indexing"   }  }
> Attached: full output of GET /admin/authorization
>
> So why can't user "indexer" add anything in a collection ?  I always get HTTP 
> 403 Forbidden.
> Using Postman, I click the checkbox to show the password, so I'm sure I typed 
> the right one.
>
> Note that user "search" can't use the /select handler either, even though 
> the "read" permission should allow it.   This user can, however, use the Luke 
> handler, as the custom permission allows.
>
> User "admin" can use any API.  So at least the predefined permission "all" 
> does work.
>
> Note that the collections were created before enabling authentication and 
> authorization.  Could that be the cause of the permission issues ?
>
> Thanks;
>
> Isabelle Giguère
> Computational Linguist & Java Developer
> Linguiste informaticienne & développeur java
>
>


Re: Rule-Based Auth - update not working

2020-05-17 Thread Jason Gerlowski
One slight correction: I missed that you actually do have a
path/collection-specific permission in your list there.  So Solr will
check the permissions in descending list-order for most requests - the
exception being /luke requests when the /luke permission filters to
the top and is checked first.

We should really change this resolution order to be something more commonsense.

Jason

On Sun, May 17, 2020 at 2:52 PM Jason Gerlowski  wrote:
>
> Hi Isabelle,
>
> Two things to keep in mind with Solr's Rule-Based Authorization.
>
> 1. Each request is controlled by the first permission that matches
> the request.
> 2. With the permissions you have present, Solr will check them in
> descending list order.  (This isn't always true - collection-specific
> and path-specific permissions are given precedence, so you don't need
> to consider that.)
>
> As you can imagine given the rules above - permission order is very
> important.  In your case the "all" rule will match pretty much all
> requests, which explains why an "indexing" user can't actually index.
> Generally speaking, it's best to put the most specific rules first,
> with the broader ones coming later.
>
> For more information, see the "Permission Ordering and Resolution"
> section in the page you linked to in your request.
>
> Good luck, hope that helps.
>
> Jason
>
> On Tue, May 12, 2020 at 12:34 PM Isabelle Giguere
>  wrote:
> >
> > Hi;
> >
> > I'm using Solr 8.5.0.
> >
> > I'm having trouble setting up some permissions using the rule-based 
> > authorization plugin: 
> > https://lucene.apache.org/solr/guide/8_5/rule-based-authorization-plugin.html
> >
> > I have 3 users: "admin", "search", and "indexer".
> >
> > I have set permissions and user roles:
> > "permissions": [  {  "name": "all", "role": "admin", "index": 1  },
> >   { "name": "admin-luke", "collection": "*", "role": "luke", "index": 
> > 2, "path": "/admin/luke"  },
> >   { "name": "read", "role": "searching", "index": 3  },
> >   {  "name": "update", "role": "indexing", "index": 4 }],
> > "user-role": {  "admin": "admin",
> >   "search": ["searching","luke"],
> >   "indexer": "indexing"   }  }
> > Attached: full output of GET /admin/authorization
> >
> > So why can't user "indexer" add anything in a collection ?  I always get 
> > HTTP 403 Forbidden.
> > Using Postman, I click the checkbox to show the password, so I'm sure I 
> > typed the right one.
> >
> > Note that user "search" can't use the /select handler either, even though 
> > the "read" permission should allow it.   This user can, however, use the Luke 
> > handler, as the custom permission allows.
> >
> > User "admin" can use any API.  So at least the predefined permission "all" 
> > does work.
> >
> > Note that the collections were created before enabling authentication and 
> > authorization.  Could that be the cause of the permission issues ?
> >
> > Thanks;
> >
> > Isabelle Giguère
> > Computational Linguist & Java Developer
> > Linguiste informaticienne & développeur java
> >
> >


Re: Query takes more time in Solr 8.5.1 compare to 6.1.0 version

2020-05-21 Thread Jason Gerlowski
Hi Jay,

I can't speak to why you're seeing a performance change between 6.x
and 8.x.  What I can suggest though is an alternative way of
formulating the query: you might get different performance if you run
your query using Solr's "terms" query parser:
https://lucene.apache.org/solr/guide/8_5/other-parsers.html#terms-query-parser
 It's not guaranteed to help, but there's a chance it'll work for you.
And knowing whether or not it helps might point others here towards
the cause of your slowdown.
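
As a rough illustration of the reformulation (field name, collection, and ids are all hypothetical): instead of a huge boolean query with one clause per id, the terms parser takes a single comma-separated list and skips per-clause scoring.

```python
from urllib.parse import urlencode

ids = [str(i) for i in range(1, 3001)]

# ~3000 scored clauses vs. one unscored terms filter
boolean_q = " OR ".join("id:%s" % i for i in ids)
terms_q = "{!terms f=id}" + ",".join(ids)

# Typically applied as a filter query alongside the main query.
params = {"q": "*:*", "fq": terms_q, "rows": 10}
url = "http://localhost:8983/solr/mycoll/select?" + urlencode(params)
print(url[:80] + "...")
```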

Even if "terms" performs better for you, it's probably worth
understanding what's going on here of course.

Are all other queries running comparably?

Jason

On Thu, May 21, 2020 at 10:25 AM jay harkhani  wrote:
>
> Hello,
>
> Please refer below details.
>
> >Did you create Solrconfig.xml for the collection from scratch after 
> >upgrading and reindexing?
> Yes, We have created collection from scratch and also re-indexing.
>
> >Was it based on the latest template?
> Yes, It was as per latest template.
>
> >What happens if you reexecute the query?
> No visible difference; only a minor change in milliseconds.
>
> >Are there other processes/containers running on the same VM?
> No
>
> >How much heap and how much total memory you have?
> My heap and total memory are the same as with Solr 6.1.0: 5 GB heap and 25 GB 
> total memory. As far as I can tell there is no issue related to memory.
>
> >Maybe also you need to increase the corresponding caches in the config.
> We are not using cache in both version.
>
> Both version have same configuration.
>
> Regards,
> Jay Harkhani.
>
> 
> From: Jörn Franke 
> Sent: Thursday, May 21, 2020 7:05 PM
> To: solr-user@lucene.apache.org 
> Subject: Re: Query takes more time in Solr 8.5.1 compare to 6.1.0 version
>
> Did you create Solrconfig.xml for the collection from scratch after upgrading 
> and reindexing? Was it based on the latest template?
> If not then please try this. Maybe also you need to increase the 
> corresponding caches in the config.
>
> What happens if you reexecute the query?
>
> Are there other processes/containers running on the same VM?
>
> How much heap and how much total memory you have? You should only have a 
> minor fraction of the memory as heap and most of it „free“ (this means it is 
> used for file caches).
>
>
>
> > Am 21.05.2020 um 15:24 schrieb vishal patel :
> >
> > Any one is looking this issue?
> > I got same issue.
> >
> > Regards,
> > Vishal Patel
> >
> >
> >
> > 
> > From: jay harkhani 
> > Sent: Wednesday, May 20, 2020 7:39 PM
> > To: solr-user@lucene.apache.org 
> > Subject: Query takes more time in Solr 8.5.1 compare to 6.1.0 version
> >
> > Hello,
> >
> > Currently I upgraded Solr from version 6.1.0 to 8.5.1 and came across one 
> > issue. A query which has many ids (around 3000) and grouping applied 
> > takes more time to execute. In Solr 6.1.0 it takes 677ms and in Solr 8.5.1 
> > it takes 26090ms. While taking readings we had the same Solr schema and same 
> > no. of records in both Solr versions.
> >
> > Please refer below details for query, logs and thead dump (generate from 
> > Solr Admin while execute query).
> >
> > Query : 
> > https://drive.google.com/file/d/1bavCqwHfJxoKHFzdOEt-mSG8N0fCHE-w/view
> >
> > Logs and Thread dump stack trace
> > Solr 8.5.1 : 
> > https://drive.google.com/file/d/149IgaMdLomTjkngKHrwd80OSEa1eJbBF/view
> > Solr 6.1.0 : 
> > https://drive.google.com/file/d/13v1u__fM8nHfyvA0Mnj30IhdffW6xhwQ/view
> >
> > To analyse further we found that if we remove the grouping field or 
> > reduce the no. of ids in the query, it executes fast. Did anything change in 
> > 8.5.1 compared to 6.1.0? In 6.1.0, even for a large no. of ids along with 
> > grouping, it works faster.
> >
> > Can someone please help to isolate this issue.
> >
> > Regards,
> > Jay Harkhani.


Re: SolrCloud upgrade concern

2020-05-27 Thread Jason Gerlowski
Hi Arnold,

From what I saw in the community, CDCR saw an initial burst of
development around when it was contributed, but hasn't seen much
attention or improvement since.  So while it's been around for a few
years, I'm not sure it's improved much in terms of stability or
compatibility with other Solr features.

Some of the bigger ticket issues still open around CDCR:
- SOLR-11959 no support for basic-auth
- SOLR-12842 infinite retry of failed update-requests (leads to
sync/recovery problems)
- SOLR-12057 no real support for NRT/TLOG/PULL replicas
- SOLR-10679 no support for collection aliases

These are in addition to other more architectural issues: CDCR can be
a bottleneck on clusters with high ingestion rates, CDCR uses
full-index-replication more than traditional indexing setups, which
can cause issues with modern index sizes, etc.

So, unfortunately, no real good news in terms of CDCR maturing much in
recent releases.  Joel Bernstein actually filed a JIRA recently suggesting
its removal entirely, though I don't think it's gone anywhere.

That said, I gather from what you said that you're already using CDCR
successfully with Master-Slave.  If none of these pitfalls are biting
you in your current Master-Slave setup, you might not be bothered by
them any more in SolrCloud.  Most of the problems with CDCR are
applicable in master-slave as well as SolrCloud.  I wouldn't recommend
CDCR if you were starting from scratch, and I still recommend you
consider other options.  But since you're already using it with some
success, it might be an orthogonal concern to your potential migration
to SolrCloud.

Best of luck deciding!

Jason

On Fri, May 22, 2020 at 7:06 PM gnandre  wrote:
>
> Thanks for this reply, Jason.
>
> I am mostly worried about the CDCR feature. I am relying heavily on it.
> Although I am planning to use Solr 8.3, it has been a long time since CDCR
> was first introduced. I wonder what the state of CDCR is in 8.3. Is it
> stable now?
>
> On Wed, Jan 22, 2020, 8:01 AM Jason Gerlowski  wrote:
>
> > Hi Arnold,
> >
> > The stability and complexity issues Mark highlighted in his post
> > aren't just imagined - there are real, sometimes serious, bugs in
> > SolrCloud features.  But at the same time there are many many stable
> > deployments out there where SolrCloud is a real success story for
> > users.  Small example, I work at a company (Lucidworks) where our main
> > product (Fusion) is built heavily on top of SolrCloud and we see it
> > deployed successfully every day.
> >
> > In no way am I trying to minimize Mark's concerns (or David's).  There
> > are stability bugs.  But the extent to which those need affect you
> > depends a lot on what your deployment looks like.  How many nodes?
> > How many collections?  How tightly are you trying to squeeze your
> > hardware?  Is your network flaky?  Are you looking to use any of
> > SolrCloud's newer, less stable features like CDCR, etc.?
> >
> > Is SolrCloud better for you than Master/Slave?  It depends on what
> > you're hoping to gain by a move to SolrCloud, and on your answers to
> > some of the questions above.  I would be leery of following any
> > recommendations that are made without regard for your reason for
> > switching or your deployment details.  Those things are always the
> > biggest driver in terms of success.
> >
> > Good luck making your decision!
> >
> > Best,
> >
> > Jason
> >


Re: CDCR behaviour

2020-06-05 Thread Jason Gerlowski
Hi Daniel,

Just a heads up that attachments and images are stripped pretty
aggressively by the mailing list - none of your images made it through.
You might have more success linking to the images in Dropbox or some other
online storage medium.

Best,

Jason

On Thu, Jun 4, 2020 at 10:55 AM Gell-Holleron, Daniel <
daniel.gell-holle...@gb.unisys.com> wrote:

> Hi,
>
>
>
> Looks for some advice, sent a few questions on CDCR the last couple of
> days.
>
>
>
> I just want to see if this is expected behavior from Solr or not?
>
>
>
> When a document is added to Site A, it is then supposed to replicate
> across, however in the statistics page I see the following:
>
>
>
> Site A
>
>
>
>
> Site B
>
>
>
>
>
> When I perform a search on Site B through the Solr admin page, I do get
> results (which I find strange). The only way to get the num docs values to
> match is to restart Solr; I then get the below:
>
>
>
>
>
> I just want to know whether this behavior is expected or is a bug? My
> expectation is that the data will always be current between the two sites.
>
>
>
> Thanks,
>
> Daniel
>
>
>


Re: HTTP 401 when searching on alias in secured Solr

2020-06-16 Thread Jason Gerlowski
Just wanted to close the loop here: Isabelle filed SOLR-14569 for this
and eventually reported there that the problem seems specific to her
custom configuration which specifies a seemingly innocuous
 in solrconfig.xml.

See that jira for more detailed explanation (and hopefully a
resolution coming soon).

On Wed, Jun 10, 2020 at 4:01 PM Jan Høydahl  wrote:
>
> Please share your security.json file
>
> Jan Høydahl
>
> > 10. jun. 2020 kl. 21:53 skrev Isabelle Giguere 
> > :
> >
> > Hi;
> >
> > I'm using Solr 8.5.0.  I have uploaded security.json to Zookeeper.  I can 
> > log in the Solr Admin UI.  I can create collections and aliases, and I can 
> > index documents in Solr.
> >
> > Collections : test1, test2
> > Alias: test (combines test1, test2)
> >
> > Indexed document "solr-word.pdf" in collection test1
> >
> > Searching on a collection works:
> > http://localhost:8983/solr/test1/select?q=*:*&wt=xml
> > 
> >
> > But searching on an alias results in HTTP 401
> > http://localhost:8983/solr/test/select?q=*:*&wt=xml
> >
> > Error from server at null: Expected mime type application/octet-stream but 
> > got text/html. Error 401 Authentication failed, Response code: 401.
> > HTTP ERROR 401 Authentication failed, Response code: 401
> > URI: /solr/test1_shard1_replica_n1/select
> > STATUS: 401
> > MESSAGE: Authentication failed, Response code: 401
> > SERVLET: default
> > 
> >
> > Even if https://issues.apache.org/jira/browse/SOLR-13510 is fixed in Solr 
> > 8.5.0, I did try to start Solr with -Dsolr.http1=true, and I set 
> > "forwardCredentials":true in security.json.
> >
> > Nothing works.  I just cannot use aliases when Solr is secured.
> >
> > Can anyone confirm if this may be a configuration issue, or if this could 
> > possibly be a bug ?
> >
> > Thank you;
> >
> > Isabelle Giguère
> > Computational Linguist & Java Developer
> > Linguiste informaticienne & développeur java
> >
> >


Re: Can't fetch table from cassandra through jdbc connection

2020-06-16 Thread Jason Gerlowski
The way I read the stack trace you provided, it looks like DIH is
running the query "select test_field from test_keyspace.test_table
limit 10", but the Cassandra jdbc driver is reporting that Cassandra
doesn't support some aspect of that query.  If I'm reading that right,
this seems like a question for the Cassandra folks who wrote that jdbc
driver instead of the Solr folks here.  Though maybe there's someone
here who happens to know.

The only thing I'd suggest to get more DIH logging would be to raise
the log levels for DIH classes, but from what you said above it sounds
like you already did that for the root logger and it didn't give you
anything that helped solve the issue.  So I'm stumped.
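
One way to target just the DIH loggers rather than the root logger is the admin logging endpoint (the same mechanism the Admin UI's Logging > Level screen drives). Host, port, and the chosen level here are assumptions; this only builds the request URL.

```python
from urllib.parse import urlencode

# Raise logging for the DIH package only, instead of flooding the log
# by setting the root logger to ALL.
params = {"set": "org.apache.solr.handler.dataimport:TRACE"}
url = "http://localhost:8983/solr/admin/info/logging?" + urlencode(params)
print(url)
```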

Good luck,

Jason

On Tue, Jun 16, 2020 at 6:05 AM Ирина Камалова  wrote:
>
> Could you please tell me how I can expand the log trace here?
> (If I try to do it through the Solr admin UI and set the root logger to ALL,
> it doesn't help me.)
>
>
> Best regards,
> Irina Kamalova
>
>
> On Mon, 15 Jun 2020 at 10:12, Ирина Камалова 
> wrote:
>
> > I’m using Solr 7.7.3 and latest Cassandra jdbc driver 1.3.5
> >
> > I get a *SQLFeatureNotSupportedException*
> >
> >
> > I see this error and have no idea what’s wrong (it’s not verbose enough: is
> > the table name or a field wrong, did a type fail to map, or does the driver
> > not support the query?)
> >
> >
> > Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: 
> > org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to 
> > execute query: select test_field from test_keyspace.test_table limit 10; 
> > Processing Document # 1
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:271)
> > at 
> > org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:424)
> > at 
> > org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:483)
> > at 
> > org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:466)
> > at java.lang.Thread.run(Thread.java:748)
> > Caused by: java.lang.RuntimeException: 
> > org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to 
> > execute query: select test_field from test_keyspace.test_table limit 10; 
> > Processing Document # 1
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:417)
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:233)
> > ... 4 more
> > Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
> > Unable to execute query: select test_field from test_keyspace.test_table 
> > limit 10; Processing Document # 1
> > at 
> > org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:327)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource.createResultSetIterator(JdbcDataSource.java:288)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:283)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:52)
> > at 
> > org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
> > at 
> > org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
> > at 
> > org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:267)
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
> > ... 6 more
> > Caused by: java.sql.SQLFeatureNotSupportedException
> > at 
> > com.dbschema.CassandraConnection.createStatement(CassandraConnection.java:75)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.createStatement(JdbcDataSource.java:342)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:318)
> > ... 14 more
> >
> >
> >
> >
> > Best regards,
> > Irina Kamalova
> >


Re: [EXTERNAL] Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-18 Thread Jason Gerlowski
+1 to rename master/slave, and +1 to choosing terminology distinct
from what's used for SolrCloud.  I could be happy with several of the
proposed options.  Since a good few have been proposed though, maybe
an eventual vote thread is the most organized way to aggregate the
opinions here.

I'm less positive about the prospect of changing the name of our
primary git branch.  Most projects that contributors might come from,
most tutorials out there to learn git, most tools built on top of git
- the majority are going to assume "master" as the main branch.  I
appreciate the change that Github is trying to effect in changing the
default for new projects, but it'll be a long time before that
competes with the huge bulk of projects, documentation, etc. out there
using "master".  Our contributors are smart and I'm sure they'd figure
it out if we used "main" or something else instead, but having a
non-standard git setup would be one more "papercut" in understanding
how to contribute to a project that already makes that harder than it
should.

Jason


On Thu, Jun 18, 2020 at 7:33 AM Demian Katz  wrote:
>
> Regarding people having a problem with the word "master" -- GitHub is 
> changing the default branch name away from "master," even in isolation from a 
> "slave" pairing... so the terminology seems to be falling out of favor in all 
> contexts. See:
>
> https://www.cnet.com/news/microsofts-github-is-removing-coding-terms-like-master-and-slave/
>
> I'm not here to start a debate about the semantics of that, just to provide 
> evidence that in some communities, the term "master" is causing concern all 
> by itself. If we're going to make the change anyway, it might be best to get 
> it over with and pick the most appropriate terminology we can agree upon, 
> rather than trying to minimize the amount of change. It's going to be 
> backward breaking anyway, so we might as well do it all now rather than risk 
> having to go through two separate breaking changes at different points in 
> time.
>
> - Demian
>
> -Original Message-
> From: Noble Paul 
> Sent: Thursday, June 18, 2020 1:51 AM
> To: solr-user@lucene.apache.org
> Subject: [EXTERNAL] Re: Getting rid of Master/Slave nomenclature in Solr
>
> Looking at the code I see a 692 occurrences of the word "slave".
> Mostly variable names and ref guide docs.
>
> The word "slave" is present in the responses as well. Any change in the 
> request param/response payload is backward incompatible.
>
> I have no objection to changing the names in ref guide and other internal 
> variables. Going ahead with backward incompatible changes is painful. If 
> somebody has the appetite to take it up, it's OK
>
> If we must change, master/follower can be a good enough option.
>
> master (noun): A man in charge of an organization or group.
> master(adj) : having or showing very great skill or proficiency.
> master(verb): acquire complete knowledge or skill in (a subject, technique, 
> or art).
> master (verb): gain control of; overcome.
>
> I hope nobody has a problem with the term "master"
>
> On Thu, Jun 18, 2020 at 3:19 PM Ilan Ginzburg  wrote:
> >
> > Would master/follower work?
> >
> > Half the rename work while still getting rid of the slavery connotation...
> >
> >
> > On Thu 18 Jun 2020 at 07:13, Walter Underwood  wrote:
> >
> > > > On Jun 17, 2020, at 4:00 PM, Shawn Heisey  wrote:
> > > >
> > > > It has been interesting watching this discussion play out on
> > > > multiple
> > > open source mailing lists.  On other projects, I have seen a VERY
> > > high level of resistance to these changes, which I find disturbing
> > > and surprising.
> > >
> > > Yes, it is nice to see everyone just pitch in and do it on this list.
> > >
> > > wunder
> > > Walter Underwood
> > > wun...@wunderwood.org
> > > http://observer.wunderwood.org/  (my blog)
> > >
> > >
>
>
>
> --
> -
> Noble Paul


Re: Index files on Windows fileshare

2020-06-25 Thread Jason Gerlowski
Hi Fiz,

Since you're just looking for a POC solution, I think Solr's
"bin/post" tool would probably help you achieve your first
requirement.

But I don't think "bin/post" gives you much control over the fields
that get indexed - if you need the file path to be stored, you might
be better off writing a small crawler in Java and using SolrJ to do
the indexing.
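[Editor's note: a minimal sketch of the "small crawler" half of the suggestion above — it only walks a directory tree and collects absolute file paths. The SolrJ indexing calls are shown as comments because they require a running Solr instance; the field and collection names in them are hypothetical. For the bin/post route, the equivalent would be roughly `bin/post -c mycollection C:\path\to\share`.]

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class FileShareCrawler {

    // Recursively collect the absolute path of every regular file under root.
    static List<String> collectPaths(Path root) throws IOException {
        List<String> paths = new ArrayList<>();
        try (Stream<Path> walk = Files.walk(root)) {  // close the stream when done
            walk.filter(Files::isRegularFile)
                .forEach(p -> paths.add(p.toAbsolutePath().toString()));
        }
        return paths;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) return;  // expects a directory to crawl
        for (String path : collectPaths(Paths.get(args[0]))) {
            System.out.println(path);
            // With SolrJ on the classpath, indexing would look roughly like:
            //   SolrInputDocument doc = new SolrInputDocument();
            //   doc.addField("id", path);
            //   doc.addField("file_path_s", path);   // hypothetical field name
            //   solrClient.add("mycollection", doc); // hypothetical collection
        }
    }
}
```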

Good luck!

Jason

On Fri, Jun 19, 2020 at 9:34 AM Fiz N  wrote:
>
> Hello Solr experts,
>
> I am using standalone version of SOLR 8.5 on Windows machine.
>
> 1)  I want to index all types of files under different directory in the
> file share.
>
> 2) I need to index  absolute path of the files and store it solr field. I
> need that info so that end user can click and open the file(Pop-up)
>
> Could you please tell me how to go about this?
> This is for POC purpose once we finalize the solution we would be further
> going ahead with stable approach.
>
> Thanks
> Fiz Nadian.


Re: Restored collection cluster status rendering some values as Long (as opposed to String for other collections)

2020-06-25 Thread Jason Gerlowski
Hi Aliaksandr

This sounds like a bug to me - I can't think of any reason why this
would be intentional behavior.  Maybe I'm missing something and this
is "expected", but if so someone will come along and correct me.

Can you file a JIRA ticket with this information in it?

Jason

On Wed, Jun 24, 2020 at 10:03 AM Aliaksandr Asiptsou
 wrote:
>
> Sorry I forgot to mention: we use Solr 8.3.1
>
> Best regards,
> Aliaksandr Asiptsou
> From: Aliaksandr Asiptsou
> Sent: Wednesday, June 24, 2020 12:44 AM
> To: solr-user@lucene.apache.org
> Subject: Restored collection cluster status rendering some values as Long (as 
> opposed to String for other collections)
>
> Hello Solr experts,
>
> Our team noticed the below behavior:
>
> 1. A collection is restored from a backup, and a replication factor is 
> specified within the restore command:
>
> /solr/admin/collections?action=RESTORE&name=backup_name&location=/backups/solr&collection=collection_name&collection.configName=config_name&replicationFactor=1&maxShardsPerNode=1
>
> 2. Collection restored successfully, but looking into cluster status we see 
> several values are rendered as Long for this particular collection:
>
> /solr/admin/collections?action=clusterstatus&wt=xml
>
> 0
> 1
> 1
> false
> 1
> 0
> 138
>
> Whereas for all the other collections pullReplicas, replicationFactor, 
> nrtReplicas and tlogReplicas are Strings.
>
> Please advise whether it is known and expected or it needs to be fixed (if 
> so, is there a Jira ticket already for this or should we create one)?
>
> Best regards,
> Aliaksandr Asiptsou


Re: SOLR and Zookeeper compatibility

2020-07-22 Thread Jason Gerlowski
Hi Mithun,

AFAIK, Solr 7.5.0 comes with ZooKeeper 3.4.11.  At least, those are
the jar versions I see when I unpack a Solr 7.5.0 distribution.  Where
are you seeing 1.3.11?  There is no 1.3.11 ZooKeeper release as far as
I'm aware.  There must be some confusion here.

Generally speaking, since 3.4.11 is the version the community
primarily was testing with at the time of Solr 7.5.0's release, that's
also probably the safest version to use.  That said, users do
frequently choose other ZooKeeper versions within the same release
line (3.4.x) for one reason or another and don't report many issues
doing so.  A few of the exceptions are tracked in our JIRA portal and
you can get more info by searching there.

Best,

Jason


On Mon, Jul 13, 2020 at 5:24 AM Mithun Seal  wrote:
>
> Hi Team,
>
> Could you please help me with below compatibility question.
>
> 1. We are trying to install zookeeper externally along with SOLR 7.5.0. As
> noted, SOLR 7.5.0 comes with Zookeeper 1.3.11. Can I install Zookeeper
> 1.3.10 with SOLR 7.5.0. Zookeeper 1.3.10 will be compatible with SOLR 7.5.0?
>
> 2. What is the suggested version of Zookeeper should be used with SOLR
> 7.5.0?
>
>
> Thanks,
> Mithun


Re: bin/solr auth enable

2020-07-31 Thread Jason Gerlowski
Hi David,

I tried this out locally but couldn't reproduce. The command you
provided above works just fine for me.

Can you tell us a bit about your environment?  Do you have the full
stack trace of the NPE handy?

Best,

Jason

On Fri, Jul 24, 2020 at 8:01 PM David Glick  wrote:
>
> When I issue “bin/solr auth enable -prompt true -blockUnknown true”, I get a 
> Null Pointer Exception.  I’m using the 8.5.1 release.  Am I doing something 
> wrong?
>
> Thanks.
>
> Sent from my iPhone


Re: Survey on ManagedResources feature

2020-08-11 Thread Jason Gerlowski
Hey Noble,

Can you explain what you mean when you say it's not secured?  Just for
those of us who haven't been following the discussion so far?  On the
surface of things users taking advantage of our RuleBasedAuth plugin
can secure this API like they can any other HTTP API.  Or are you
talking about some other security aspect here?

Jason

On Tue, Aug 11, 2020 at 9:55 AM Noble Paul  wrote:
>
> Hi all,
> The end-point for Managed resources is not secured. So it needs to be
> fixed/eliminated.
>
> I would like to know what is the level of adoption for that feature
> and if it is a critical feature for users.
>
> Another possibility is to offer a replacement for the feature using a
> different API
>
> Your feedback will help us decide on what a potential solution should be
>
> --
> -
> Noble Paul


Re: Slow query response from SOLR 5.4.1

2020-08-11 Thread Jason Gerlowski
Hey Abhijit,

The information you provided isn't really enough for anyone else on
the mailing list to debug the problem.  If you'd like help, please
provide some more information.

Good places to start would be: what is the query, what does Solr tell
you when you add a "debug=timing" parameter to your request, what does
your Solr setup look like (num nodes, shards, replicas, other
collections/cores, QPS).  It's hard to say upfront what piece of info
will be the one that helps you get an answer to your question -
performance problems have a lot of varied causes.  But providing
_some_ of these things or other related details might help you get the
answer you're looking for.
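[Editor's note: the "debug=timing" suggestion above is just an extra request parameter — with SolrJ it would be roughly solrQuery.set("debug", "timing"). As a self-contained illustration of what the resulting request looks like, here is a plain-Java sketch that builds such a /select URL; the host and collection names are hypothetical.]

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class DebugTimingUrl {

    // Build a /select request URL with debug=timing appended
    // (hypothetical host and collection names).
    static String buildSelectUrl(String baseUrl, String collection, String q) {
        String encoded = URLEncoder.encode(q, StandardCharsets.UTF_8);
        return baseUrl + "/solr/" + collection + "/select?q=" + encoded + "&debug=timing";
    }

    public static void main(String[] args) {
        System.out.println(buildSelectUrl("http://localhost:8983", "mycollection", "*:*"));
    }
}
```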

Alternately, if you've figured out the issue already post the answer
on this thread - help anyone with a similar issue in the future.

Jason

On Tue, Aug 4, 2020 at 4:11 PM Abhijit Pawar  wrote:
>
> Hello,
>
> I am seeing a performance issue in querying in one of the SOLR servers -
> instance version 5.4.1.
> Total number of documents indexed are 20K plus.
> Data returned for this particular query is just as less as 22 documents
> however it takes almost 2 minutes to get the results back.
>
> Is there a way to improve on performance of query - in general the query
> response time is slow..
>
> I have most of the fields which are stored and indexed both.I can take off
> some fields which are just needed to be indexed however those are not many
> fields.
>
> Can I do something solrconfig.xml in terms of cache or something else?
>
> Any suggestions?
>
> Thanks!!


Re: Incorrect Insecure Settings Check in CoreContainer

2020-08-11 Thread Jason Gerlowski
Yikes, yeah it's hard to argue with that.

I'm a little confused because I remember testing this, but maybe it
snuck in at the last minute?  In any case, I'll reopen that jira to
fix the check there.

Sorry guys.

Jason
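[Editor's note: the inverted guard quoted below from Mark's report can be demonstrated in isolation. A self-contained sketch, using a plain-Java stand-in for Commons-Lang's StringUtils.isNotEmpty — the shipped condition fires exactly when the HTTPS port property is set (SSL on), the opposite of the warning's intent.]

```java
public class SslWarningCheck {

    // Plain-Java equivalent of Commons-Lang's StringUtils.isNotEmpty.
    static boolean isNotEmpty(String s) {
        return s != null && !s.isEmpty();
    }

    // The condition as shipped: warns when solr.jetty.https.port IS set,
    // i.e. exactly when SSL is on -- the opposite of the intent.
    static boolean buggyWarns(boolean authEnabled, String httpsPortProperty) {
        return authEnabled && isNotEmpty(httpsPortProperty);
    }

    // The intended condition: warn only when auth is on but SSL is off
    // (property unset or empty).
    static boolean fixedWarns(boolean authEnabled, String httpsPortProperty) {
        return authEnabled && !isNotEmpty(httpsPortProperty);
    }
}
```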


On Wed, Aug 5, 2020 at 9:22 AM Jan Høydahl  wrote:
>
> This seems to have been introduced in 
> https://issues.apache.org/jira/browse/SOLR-13972 in 8.4
> That test seems to be inverted for sure.
>
> Jason?
>
> Jan
>
> > 5. aug. 2020 kl. 13:15 skrev Mark Todd1 :
> >
> >
> > I've configured SolrCloud (8.5) with both SSL and Authentication which is 
> > working correctly. However, I get the following warning in the logs
> >
> > Solr authentication is enabled, but SSL is off. Consider enabling SSL to 
> > protect user credentials and data with encryption
> >
> > Looking at the source code for SolrCloud there appears to be a bug
> > if (authenticationPlugin !=null && 
> > StringUtils.isNotEmpty(System.getProperty("solr.jetty.https.port"))) {
> >
> > log.warn("Solr authentication is enabled, but SSL is off.  Consider 
> > enabling SSL to protect user credentials and data with encryption.");
> >
> > }
> >
> > Rather than checking for an empty system property (which would indicate SSL 
> > is off) it's checking for a populated one, which is what you get when SSL is 
> > on.
> >
> > Should I raise this as a Jira bug?
> >
> > Mark Todd
> >
> > Unless stated otherwise above:
> > IBM United Kingdom Limited - Registered in England and Wales with number 
> > 741598.
> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> >
>


Re: Incorrect Insecure Settings Check in CoreContainer

2020-08-13 Thread Jason Gerlowski
Hey Mark,

I've fixed it for 8.7 as a part of this ticket here:
https://issues.apache.org/jira/browse/SOLR-14748. Thanks for reporting
this.

Jason

On Tue, Aug 11, 2020 at 3:19 PM Jason Gerlowski  wrote:
>
> Yikes, yeah it's hard to argue with that.
>
> I'm a little confused because I remember testing this, but maybe it
> snuck in at the last minute?  In any case, I'll reopen that jira to
> fix the check there.
>
> Sorry guys.
>
> Jason
>
>
> On Wed, Aug 5, 2020 at 9:22 AM Jan Høydahl  wrote:
> >
> > This seems to have been introduced in 
> > https://issues.apache.org/jira/browse/SOLR-13972 in 8.4
> > That test seems to be inverted for sure.
> >
> > Jason?
> >
> > Jan
> >
> > > 5. aug. 2020 kl. 13:15 skrev Mark Todd1 :
> > >
> > >
> > > I've configured SolrCloud (8.5) with both SSL and Authentication which is 
> > > working correctly. However, I get the following warning in the logs
> > >
> > > Solr authentication is enabled, but SSL is off. Consider enabling SSL to 
> > > protect user credentials and data with encryption
> > >
> > > Looking at the source code for SolrCloud there appears to be a bug
> > > if (authenticationPlugin !=null && 
> > > StringUtils.isNotEmpty(System.getProperty("solr.jetty.https.port"))) {
> > >
> > > log.warn("Solr authentication is enabled, but SSL is off.  Consider 
> > > enabling SSL to protect user credentials and data with encryption.");
> > >
> > > }
> > >
> > > Rather than checking for an empty system property (which would indicate 
> > > SSL is off) it's checking for a populated one, which is what you get when 
> > > SSL is on.
> > >
> > > Should I raise this as a Jira bug?
> > >
> > > Mark Todd
> > >
> > > Unless stated otherwise above:
> > > IBM United Kingdom Limited - Registered in England and Wales with number 
> > > 741598.
> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> > >
> >

