Recommended Java Distribution

2020-03-19 Thread Kayak28
Hello, Solr Community:

My customer would like to use Amazon Corretto JDK instead of OpenJDK.

I wonder whether it is OK to say "yes, you can use it," or whether I should
not recommend it at all.

Is anyone in the community using Amazon Corretto for Solr?

Have you ever had any problems with it?

I would really appreciate it if you could share any experience.


-- 

Sincerely,
Kaya
github: https://github.com/28kayak


How to get boosted field and values?

2020-03-19 Thread Taisuke Miyazaki
I'm using Solr 7.5.0.
I want to get the boosted fields and values for each document.

e.g.
documents:
  id: 1, features: [1]
  id: 2, features: [1,2]
  id: 3, features: [1,2,3]

query:
  bq: features:2^1.0 AND features:3^1.0

I expect results like below.
boosted:
  - id: 2
    - field: features, value: 2
  - id: 3
    - field: features, value: 2
    - field: features, value: 3

One idea is to set the boost values like bit flags, but I don't think it is
a good approach because I would have to send the query twice.

bit-flag:
  bq: features:2^2.0 AND features:3^4.0
  docs:
    - id: 1, score: 1.0 (0b001)
    - id: 2, score: 3.0 (0b011) # has feature:2 (2nd bit is 1)
    - id: 3, score: 7.0 (0b111) # has feature:2 and feature:3 (2nd and 3rd bits are 1)
By checking the score value, I can tell which fields were boosted.
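
The decoding step of the bit-flag idea could be sketched as follows. This is only an illustration: the FLAGS table and the function name are hypothetical, and it assumes each document's score is exactly a base of 1 plus the sum of the power-of-two boost weights, which real Solr scoring generally does not guarantee.

```python
# Hypothetical mapping from power-of-two boost weight to the boosted clause,
# using the weights 2.0 and 4.0 from the bit-flag example.
FLAGS = {2: ("features", 2), 4: ("features", 3)}


def decode_boosts(score):
    """Return the (field, value) pairs whose flag bit is set in the score.

    Assumption: the score is an exact sum of flag weights plus a base of 1.
    """
    bits = int(round(score))
    return [clause for weight, clause in sorted(FLAGS.items()) if bits & weight]


for doc_id, score in [(1, 1.0), (2, 3.0), (3, 7.0)]:
    print(doc_id, decode_boosts(score))
```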

Is there a better way?


Re: Recommended Java Distribution

2020-03-19 Thread Jan Høydahl
Our official statement is here

https://lucene.apache.org/solr/guide/8_4/solr-system-requirements.html#sources-for-java

I have no experience with Corretto in production, but Amazon uses it heavily 
for all their Java workloads in the cloud. I believe it is based on OpenJDK but 
with Amazon’s own patches.
I would not hesitate to make such a decision. But perhaps people with 
first-hand experience can share what they found?

Jan

> On 19 Mar 2020, at 11:13, Kayak28 wrote:
> 
> Hello, Solr Community:
> 
> My customer would like to use Amazon Corretto JDK instead of OpenJDK.
> 
> I wonder if it is ok to say, "yes, you can use" or I should not recommend
> it at all.
> 
> Is anyone in the Community using Amazon Corretto for your Solr?
> 
> Have you ever had any problems with that?
> 
> If you share any experience, I would be really appreciated.
> 
> 
> -- 
> 
> Sincerely,
> Kaya
> github: https://github.com/28kayak



Solr query Slow in Solr 6.1.0

2020-03-19 Thread vishal patel
I am using Solr 6.1.0. We have 2 shards, each with one replica. Our index 
is very large.
I found that the position of a field in the query affects performance.

If I run the query below, I get a slow response:

(doc_ref:((*KON\-N2*) )) AND (title:((*cdrl*) )) AND project_id:(2104616) AND 
is_active:true AND ((isLatest:(true) AND isFolderActive:true AND isXref:false 
AND -document_type_id:(3 7) AND ((is_public:true OR distribution_list:1 OR 
folderadmin_list:1 OR author_user_id:1) AND (((allowedUsers:(1) OR 
allowedRoles:( 6440215 6368478) OR combinationUsers:(1)) AND 
-blockedUsers:(1)) OR (defaultAccess:(true) AND -blockedUsers:(1) AND 
-blockedRoles:( 6440215 6368478) OR (isLatestRevPrivate:(true) AND 
allowedUsersForPvtRev:(1) AND -folderadmin_list:(1)))

If I move the (doc_ref:((*KON\-N2*) )) AND (title:((*cdrl*) )) part to the end, 
I get a faster response than with the query above:

project_id:(2104616) AND is_active:true AND ((isLatest:(true) AND 
isFolderActive:true AND isXref:false AND -document_type_id:(3 7) AND 
((is_public:true OR distribution_list:1 OR folderadmin_list:1 OR 
author_user_id:1) AND (((allowedUsers:(1) OR allowedRoles:( 6440215 
6368478) OR combinationUsers:(1)) AND -blockedUsers:(1)) OR 
(defaultAccess:(true) AND -blockedUsers:(1) AND -blockedRoles:( 6440215 
6368478) OR (isLatestRevPrivate:(true) AND allowedUsersForPvtRev:(1) 
AND -folderadmin_list:(1))) AND (doc_ref:((*KON\-N2*) )) AND 
(title:((*cdrl*) ))

Is this possible? How does Solr execute this query? Does the order of fields 
matter for performance?
I would like to understand how Solr executes a query step by step, as I would 
for a database query, so that I can arrange the fields for better performance.

Regards,
Vishal



Re: How do *you* restrict access to Solr?

2020-03-19 Thread Mark H. Wood
On Mon, Mar 16, 2020 at 11:43:10AM -0400, Ryan W wrote:
> On Mon, Mar 16, 2020 at 11:40 AM Walter Underwood wrote:
> 
> > Also, even if you prevent access to the admin UI, a request to /update can
> > delete
> > all the content. It is really easy. This Gist shows how.
> >
> > https://gist.github.com/nz/673027/313f70681daa985ea13ba33a385753aef951a0f3
> 
> 
> 
> This seems important.  In other words, my work isn't necessarily done if
> I've secured the graphical UI.  I can't just visit the admin UI page to see
> if my efforts are successful.

It is VERY IMPORTANT.  You are correct.  The Admin. GUI is just a
convenience layer over extensive REST APIs.  You need to secure access
to the APIs, not just the admin. application that runs on top of them.

If all use is from the local host, then running Solr only on the
loopback address will keep outsiders from connecting to any part of
it.

If other internal hosts need access, then I would run Solr only on an
RFC1918 (non-routed) address, and set up the Solr host's firewall to
grant access to Solr's port (8983 by default) only from permitted hosts.

  https://tools.ietf.org/html/rfc1918
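
As a small illustration of what counts as an RFC1918 address, the check can be sketched with Python's standard ipaddress module. The function name and sample addresses are made up for the example, and a check like this only classifies addresses; it does not replace the firewall rules described above.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the address falls inside one of the RFC 1918 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


# Sample addresses chosen only for illustration.
for addr in ["10.0.0.5", "172.16.4.1", "192.168.1.20", "192.0.2.1", "127.0.0.1"]:
    print(addr, is_rfc1918(addr))
```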

Who/what needs access to Solr?  Do you need to grant different levels
of access to specific groups of users?  Then you need something like
Role-Based Access Control.  This is true even if access is only
internal or even just from the same host.  Address-based controls only
divide the universe between those who can do nothing to your Solr and
those who can do *everything* to your Solr.

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu




Re: Solr query Slow in Solr 6.1.0

2020-03-19 Thread Erick Erickson
First of all, if you’re really using pre-and-postfix wildcards and those 
asterisks are not just bold formatting, those are very expensive operations. 
I’d suggest you investigate alternatives (like ngramming) or other alternate 
ways of analyzing your input (both at indexing and query time) before trying to 
understand the internals of query processing. If for no other reason than 
there’s no guarantee that the internal processing will remain the same between 
versions of Solr.
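
To make the ngramming suggestion concrete, here is a rough sketch of the idea in plain Python. It is not Solr code: in Solr the grams would be produced by an n-gram tokenizer or filter in the field's analysis chain, and the function name, gram sizes, and sample value below are assumptions chosen for illustration.

```python
def ngrams(term, min_len=2, max_len=3):
    """Return every substring of term whose length is in [min_len, max_len]."""
    term = term.lower()
    return [
        term[i:i + n]
        for n in range(min_len, max_len + 1)
        for i in range(len(term) - n + 1)
    ]


# Index time: store all grams of the field value (sample value is made up).
indexed_grams = set(ngrams("xcdrl-7", 2, 4))

# Query time: an infix search for "cdrl" becomes an exact lookup of one gram,
# instead of a scan over every term in the index.
print("cdrl" in indexed_grams)
```

The trade-off is a larger index in exchange for cheaper queries, since the substring work is done once at indexing time.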

Second, I suspect that you’re seeing spurious speedups. There’s a lot of work 
that has to be done to gather terms like *cdrl*, so you may be getting caching. 
Unless you run a rigorous test (i.e. use many different queries with similar 
form), your timing is unreliable.

Third, try adding &debug=query to some of these and looking at the output, 
that’ll give you some clue how the queries are parsed.
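
A request with debug=query could be built along these lines. The base URL, the collection name "mycollection", and the shortened query are placeholders, not values from this thread's setup.

```python
from urllib.parse import urlencode


def debug_query_url(base_url, collection, q):
    """Build a /select URL that asks Solr to return query-parsing debug info."""
    params = {"q": q, "rows": 0, "debug": "query", "wt": "json"}
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Placeholder host, collection, and query for illustration only.
url = debug_query_url(
    "http://localhost:8983/solr",
    "mycollection",
    "project_id:(2104616) AND title:(*cdrl*)",
)
print(url)
```

Comparing the parsed query in the debug section of the response for the two field orderings shows whether they parse to the same query.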

Best,
Erick

> On Mar 19, 2020, at 9:08 AM, vishal patel wrote:
> 
> I am using solr 6.1.0. We have 2 shards and each has one replica. Our index 
> size is very large.
> I find out that position of field in query will impact of performance.
> 
> If I made below query I got slow response
> 
> (doc_ref:((*KON\-N2*) )) AND (title:((*cdrl*) )) AND project_id:(2104616) AND 
> is_active:true AND ((isLatest:(true) AND isFolderActive:true AND isXref:false 
> AND -document_type_id:(3 7) AND ((is_public:true OR distribution_list:1 
> OR folderadmin_list:1 OR author_user_id:1) AND 
> (((allowedUsers:(1) OR allowedRoles:( 6440215 6368478) OR 
> combinationUsers:(1)) AND -blockedUsers:(1)) OR (defaultAccess:(true) 
> AND -blockedUsers:(1) AND -blockedRoles:( 6440215 6368478) OR 
> (isLatestRevPrivate:(true) AND allowedUsersForPvtRev:(1) AND 
> -folderadmin_list:(1)))
> 
> If I changed (doc_ref:((*KON\-N2*) )) AND (title:((*cdrl*) )) part in last 
> then got fast response compare to above.
> 
> project_id:(2104616) AND is_active:true AND ((isLatest:(true) AND 
> isFolderActive:true AND isXref:false AND -document_type_id:(3 7) AND 
> ((is_public:true OR distribution_list:1 OR folderadmin_list:1 OR 
> author_user_id:1) AND (((allowedUsers:(1) OR allowedRoles:( 
> 6440215 6368478) OR combinationUsers:(1)) AND -blockedUsers:(1)) OR 
> (defaultAccess:(true) AND -blockedUsers:(1) AND -blockedRoles:( 
> 6440215 6368478) OR (isLatestRevPrivate:(true) AND 
> allowedUsersForPvtRev:(1) AND -folderadmin_list:(1))) AND 
> (doc_ref:((*KON\-N2*) )) AND (title:((*cdrl*) ))
> 
> Is it possible? How does Solr execute this query? field sequence is matter 
> for performance?
> I want to know the step by step Solr query execution same like database query 
> because I will arrange field for better performance.
> 
> Regards,
> Vishal
> 
> Sent from Outlook



Re: Recommended Java Distribution

2020-03-19 Thread Eric Buss
Hi Kaya,

We have been using Amazon Corretto for Solr for the past 6 months without 
issue. We did not notice any difference from running on OpenJDK prior to that.

Cheers
Eric

On 2020-03-19, 6:04 AM, "Jan Høydahl"  wrote:

Our official statement is here


https://lucene.apache.org/solr/guide/8_4/solr-system-requirements.html#sources-for-java

I have no experience with Corretto in production, but Amazon uses it 
heavily for all their Java workloads in the cloud. I believe it is based on 
OpenJDK but with Amazon’s own patches.
I would not hesitate to make such a decision. But perhaps people with 
first-hand experience can share what they found?

Jan

> On 19 Mar 2020, at 11:13, Kayak28 wrote:
> 
> Hello, Solr Community:
> 
> My customer would like to use Amazon Corretto JDK instead of OpenJDK.
> 
> I wonder if it is ok to say, "yes, you can use" or I should not recommend
> it at all.
> 
> Is anyone in the Community using Amazon Corretto for your Solr?
> 
> Have you ever had any problems with that?
> 
> If you share any experience, I would be really appreciated.
> 
> 
> -- 
> 
> Sincerely,
> Kaya
> github: https://github.com/28kayak





Core name mismatch in Solr admin panel 8.3

2020-03-19 Thread vishal patel
I am upgrading from Solr 6.1 to 8.3. I am creating a collection using the API call below:
http://10.31.32.29:8983/solr/admin/collections?_=1578288589068&action=CREATE&autoAddReplicas=false&collection.configName=catalogue&maxShardsPerNode=2&name=catalogue&numShards=2&replicationFactor=1&router.name=compositeId&wt=json&createNodeSet=10.31.32.29:8983_solr,15.21.12.21:8983_solr&createNodeSet.shuffle=false&property.name=catalogue

My solr\catalogue_shard1_replica_n1\core.properties is below:

numShards=2
collection.configName=catalogue
name=catalogue
replicaType=NRT
shard=shard1
collection=catalogue
coreNodeName=core_node3


When I looked in the Admin panel, the core name was shown as 
catalogue_shard1_replica_n1. Why does the core name differ between 
core.properties and the Admin panel?
The names matched only after I called the RENAME API: 
10.31.32.29:8983/solr/admin/cores?action=RENAME&core=catalogue_shard1_replica_n1&other=catalogue

Actually, I want the core and the collection to have the same name. Can I do 
that at collection creation time?