how to add multiple value for a filter query in Solrj

2020-03-24 Thread Szűcs Roland
Hi All,

I use Solr 8.4.1 and the latest solrj client.
There is a field, let's call it filterName, which can have 3 different
values. If I use the admin UI, I write the following into fq:
filterName:"value1" filterName:"value2" and it works as expected.
If I use the solrJ SolrQuery.addFilterQuery method and call it twice like:
addFilterQuery(filterName+":\""+value1+"\"");
addFilterQuery(filterName+":\""+value2+"\"");
I get no documents back.

Can somebody tell me what syntax is appropriate with solrj to add filter
queries one by one if there is one filter field but multiple values?

Thanks,

Roland


RE: how to add multiple value for a filter query in Solrj

2020-03-24 Thread Raboah, Avi
You can do something like this if we are talking about the same filter field name.

addFilterQuery(String.format("%s:(%s %s)", filterName, value1, value2));
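Some background on why the single call helps: each fq parameter is applied as its own filter and the results are intersected, so two separate addFilterQuery calls only match documents that carry both values. A minimal sketch of building the combined filter string with the JDK only follows. The class and method names (`FilterQueries`, `anyOf`, `quote`) are hypothetical helpers, not part of SolrJ, and the explicit `OR` is an assumption made so the result does not depend on the default operator.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of SolrJ: builds one filter string for several values.
final class FilterQueries {

    /** Wraps a value in double quotes, escaping backslashes and embedded quotes. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds field:("v1" OR "v2" ...) so that a document matching any value passes. */
    static String anyOf(String field, List<String> values) {
        return values.stream()
                .map(FilterQueries::quote)
                .collect(Collectors.joining(" OR ", field + ":(", ")"));
    }

    public static void main(String[] args) {
        String fq = anyOf("filterName", List.of("value1", "value2"));
        System.out.println(fq);
        // With SolrJ the string would be added once: solrQuery.addFilterQuery(fq);
    }
}
```

Quoting each value keeps values that contain spaces together as one phrase, which the unquoted `%s:(%s %s)` form would not do.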




Re: how to add multiple value for a filter query in Solrj

2020-03-24 Thread Szűcs Roland
Thanks Avi, it worked.



edge ngram/find as you type sorting

2020-03-24 Thread matthew sporleder
I have added an edge ngram field to my index and get decent results
with partial words but the results appear randomly sorted and all
contain the same score.  Ideally I would like to sort by shortest
ngram match within my other qualifiers.

Is there a canonical solution to this?

Thanks,
Matt

p.s. I mostly followed
https://lucidworks.com/post/auto-suggest-from-popular-queries-using-edgengrams/

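schema bits:

[The schema XML was stripped by the list archive. Surviving attribute
fragments: positionIncrementGap="100", maxGramSize="25",
multiValued="false" on one field, and omitNorms="false"
omitTermFreqAndPositions="true" multiValued="true" on another.]
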

Re: edge ngram/find as you type sorting

2020-03-24 Thread Erick Erickson
Sort by the full field. You’ll need to copy to a field with keywordTokenizer 
and lowercaseFilter (string_ci? assuming it’s not really a “string” type).

Best,
Erick




Re: how to add multiple value for a filter query in Solrj

2020-03-24 Thread Erick Erickson
Your original formulation of the filter query has two problems:

1> you included a “+” in the value. My guess is that you misinterpreted the 
 URL you got back from the browser in the admin UI where a “+” is a 
 URL-encoded space. You’ll also see a bunch of %XX in the URL which are
 other encodings.

2> you include double quotes, which can change things to be phrase queries.

Looking at the debug version would have helped you pinpoint these.
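
To make the URL-encoding part of point 1 concrete, here is a small sketch of what a raw fq string looks like once it is URL-encoded. It uses only the JDK; the class name `FqEncodingDemo` is made up for illustration and nothing here is a SolrJ call.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustration only: shows how a raw fq value appears inside a URL.
final class FqEncodingDemo {

    /** URL-encodes a raw parameter value the way a browser form submission would. */
    static String encode(String raw) {
        return URLEncoder.encode(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The raw filter as typed into the admin UI fq box.
        String fq = "filterName:\"value1\" filterName:\"value2\"";
        System.out.println(encode(fq));
    }
}
```

The encoded form is `filterName%3A%22value1%22+filterName%3A%22value2%22`: the `+` stands for the space, `%3A` for the colon and `%22` for the double quote. In the Java snippet from the original post, by contrast, the `+` signs are string concatenation operators, so they never become part of the value. As far as I know SolrJ encodes parameters itself, so the string passed to addFilterQuery should be the raw, unencoded form.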





[ANNOUNCE] Apache Solr 8.5.0 released

2020-03-24 Thread Alan Woodward
## 24 March 2020, Apache Solr™ 8.5.0 available

The Lucene PMC is pleased to announce the release of Apache Solr 8.5.0.

Solr is the popular, blazing fast, open source NoSQL search platform from the 
Apache Lucene project. Its major features include powerful full-text search, 
hit highlighting, faceted search, dynamic clustering, database integration, 
rich document handling, and geospatial search. Solr is highly scalable, 
providing fault tolerant distributed search and indexing, and powers the search 
and navigation features of many of the world's largest internet sites.

Solr 8.5.0 is available for immediate download at:

  

### Solr 8.5.0 Release Highlights:

 * A new queries property of the JSON Request API lets you declare queries in 
Query DSL format and refer to them by name.
 * A new command line tool bin/postlogs allows you to index Solr logs into a 
Solr collection. This is helpful for log analysis and troubleshooting. 
Documentation is not yet integrated into the Solr Reference Guide, but is 
available in a branch via GitHub: 
https://github.com/apache/lucene-solr/blob/visual-guide/solr/solr-ref-guide/src/logs.adoc.
 * A new stream decorator delete() is available to help solve some issues with 
traditional delete-by-query, which can be expensive in large indexes.
 * Solr now has the ability to run with a Java Security Manager enabled.

Please read CHANGES.txt for a full list of changes:

  

Solr 8.5.0 also includes improvements and bugfixes in the corresponding Apache 
Lucene release:

  



Re: [ANNOUNCE] Apache Solr 8.5.0 released

2020-03-24 Thread Hasan Diwan
Congrats! -- H




Re: edge ngram/find as you type sorting

2020-03-24 Thread matthew sporleder
Oh maybe a schema bug!

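my string_ci:
[fieldType XML stripped by the list archive; surviving attributes:
sortMissingLast="true" omitNorms="true"]

going to try this instead:
[fieldType XML stripped by the list archive; surviving attributes:
sortMissingLast="true" omitNorms="true"]

Then I can probably kill the lowercasefilter on edgeytext:
[XML stripped by the list archive]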





Re: edge ngram/find as you type sorting

2020-03-24 Thread Erick Erickson
Won’t work. String types are totally unanalyzed. Your string_ci fieldType is 
what I was looking for.

No, you shouldn’t kill the lowercasefilter unless you want all of your searches 
to be case-sensitive.

So you should try:

q=edgy_text:whatever&sort=string_ci asc

Please use the admin>>pick_core>>analysis page when thinking about changing 
your schema, it’ll answer a _lot_ of these questions immediately.
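
As a rough illustration of why sorting on a lowercased, untokenized copy of the field helps with find-as-you-type results, the sketch below emulates that sort key in plain Java. It is not Solr code: `KeywordLowercaseSort` and its methods are hypothetical, and the comparison only approximates what a keyword tokenizer plus lowercase filter would index for ASCII text.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Emulation only: mimics sorting by a field analyzed as one lowercased token.
final class KeywordLowercaseSort {

    /** The whole value as a single lowercased token, used as the sort key. */
    static String sortKey(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    /** Returns a copy of the values ordered ascending by the emulated sort key. */
    static List<String> sortAscending(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.comparing(KeywordLowercaseSort::sortKey));
        return sorted;
    }

    public static void main(String[] args) {
        // All three would match a typed prefix such as "sol" on the edge ngram field.
        System.out.println(sortAscending(List.of("Solution", "solr cloud", "Solr")));
    }
}
```

This prints `[Solr, solr cloud, Solution]`: case no longer affects the order, and a value that is a prefix of another sorts before it, which is close to the "shortest match first" behaviour asked for.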

Best,
Erick




Re: edge ngram/find as you type sorting

2020-03-24 Thread matthew sporleder
Okay I appreciate you responding.

Switching "slug" from "string_ci" to class="solr.StrField" produced
about the same results, which makes sense to me now :)

The previous definition of string_ci was:
[fieldType XML stripped by the list archive]

So lowercase + KeywordTokenizerFactory;

I am trying again with omitNorms=false  to see if I can get the more
"exact" matches to score better this time around.




FW: SOLR version 8 bug???

2020-03-24 Thread Staley, Phil R - DCF
I just updated to SOLR 8.5.0 on one of our test servers and I continue to get 
the same issue/bug I described below.  Below my description of the problem I 
have also included the log message detail from Drupal.

This is the third time I have submitted this item.  For the time being we will 
continue to run SOLR 7.7.2 in production.

Thanks,

Phil Staley
Webmaster
Wisconsin Dept. of Children and Families
608 422-6569
phil.sta...@wisconsin.gov

We recently upgraded our Drupal 8 sites to SOLR 8.3.1.  We are now getting 
reports of certain patterns of search terms resulting in an error that reads, 
“The website encountered an unexpected error. Please try again later.”

Below is a list of example terms that always result in this error and a similar 
list that works fine.  The problem pattern seems to be a search term that 
contains 2 or 3 characters followed by a space, followed by additional text.

To confirm that the problem is version 8 of SOLR, I have updated our local and 
UAT sites with the latest Drupal updates, which included an update to the 
Search API Solr module, and tested the terms below under SOLR 7.7.2, 8.3.1, and 
8.4.1.  Under version 7.7.2 everything works fine. Under either version 8 
release, the problem returns.

Thoughts?

 Search terms that result in error

• w-2 agency directory

• agency w-2 directory

• w-2 agency

• w-2 directory

• w2 agency directory

• w2 agency

• w2 directory

 Search terms that do not result in error

• w-22 agency directory

• agency directory w-2

• agency w-2directory

• agencyw-2 directory

• w-2

• w2

• agency directory

• agency

• directory

• -2 agency directory

• 2 agency directory

• w-2agency directory

• w2agency directory

Drupal\search_api_solr\SearchApiSolrException: An error occurred while trying to search with Solr: { "error":{ "msg":"0", "trace":"java.lang.ArrayIndexOutOfBoundsException: 0
  at org.apache.lucene.util.QueryBuilder.newSynonymQuery(QueryBuilder.java:701)
  at org.apache.solr.parser.SolrQueryParserBase.newSynonymQuery(SolrQueryParserBase.java:636)
  at org.apache.lucene.util.QueryBuilder.analyzeGraphBoolean(QueryBuilder.java:581)
  at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:343)
  at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:263)
  at org.apache.solr.parser.SolrQueryParserBase.newFieldQuery(SolrQueryParserBase.java:527)
  at org.apache.solr.parser.QueryParser.newFieldQuery(QueryParser.java:62)
  at org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.java:1141)
  at org.apache.solr.parser.QueryParser.MultiTerm(QueryParser.java:593)
  at org.apache.solr.parser.QueryParser.Query(QueryParser.java:142)
  at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:282)
  at org.apache.solr.parser.QueryParser.Query(QueryParser.java:162)
  at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:282)
  at org.apache.solr.parser.QueryParser.Query(QueryParser.java:162)
  at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:282)
  at org.apache.solr.parser.QueryParser.Query(QueryParser.java:162)
  at org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:131)
  at org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:263)
  at org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:49)
  at org.apache.solr.search.QParser.getQuery(QParser.java:174)
  at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:161)
  at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:302)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:211)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:2596)
  at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:802)
  at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:579)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:420)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:352)
  at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596)
  at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545)
  at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
  at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590)
  at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
  at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
  at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1607)
  at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
  at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297)
  at org.eclipse.jetty.server.h

Searching individual pages in solr

2020-03-24 Thread Dustin Lebsock
Hi!

I'm looking for some guidance on engineering a solution for searching 
individual pages of PDF documents. I currently have a SolrCloud setup that uses 
an external tika server to extract text data from PDFs. I'd like to be able to 
search individual pages for search results and for the overall documents 
themselves (such as titles that link to external repo). I'm having trouble 
coming up with a clean solution.

I ran across a discussion on stackoverflow about this found here:
https://stackoverflow.com/a/50160163

I can't really see the pros and cons of indexing a single document with 
multiple fields for each page versus indexing each page separately and using 
group queries. What does the Solr community recommend?

Thank you for all the help!

Dustin Lebsock


Re: FW: SOLR version 8 bug???

2020-03-24 Thread Charlie Hull

Hi Phil,

The error you mention “The website encountered an unexpected error. 
Please try again later.” isn't being generated by Solr but by Drupal. We 
can't tell from the error text you're providing what the Drupal Solr 
plugin is actually sending to Solr as a query I'm afraid: if you could 
figure that out, then you could try running it on Solr itself using the 
standard Solr API and see if you could reproduce the problem. It's also 
certainly possible that your Drupal system is sending something crazy to 
Solr which causes the error. You might look into raising the Solr 
logging level (see 
https://lucene.apache.org/solr/guide/8_5/configuring-logging.html)


Best

Charlie


Re: Searching individual pages in solr

2020-03-24 Thread Erick Erickson
Well, given the structure of an inverted index, how would you have a clue what 
page the hit was on? You could conceivably index enough data with payloads and 
the like, but that’d cause a lot more bloat than just indexing each page.

Using grouping would allow you to show, say, the top three pages from the books 
with the highest score on an individual page basis.

But there are complications (aren’t there always?). Consider a page with one 
sentence. Indexed as an individual document, it might score quite high even if 
it is not the best choice. Or any embedded illustrations, what do you do with 
those? Index the caption as a part of the text? Ignore the caption? Etc.

I’d certainly start with a doc-per-page. Not quite sure what I’d do with the 
title and such, but that depends on your use-case.
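
A minimal sketch of the doc-per-page idea, using only the JDK: each extracted page becomes its own document that carries the parent document's id and title. All names here (`PageDocs`, `toPageDocs`, and the field names `id`, `doc_id`, `title`, `page`, `text`) are assumptions made for illustration; the real schema and the way the Tika output is split into pages depend on your setup.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns one PDF's extracted page texts into per-page documents.
final class PageDocs {

    /** Builds one field map per page; page numbers start at 1. */
    static List<Map<String, Object>> toPageDocs(String docId, String title, List<String> pageTexts) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (int i = 0; i < pageTexts.size(); i++) {
            int page = i + 1;
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("id", docId + "_p" + page); // unique key per page
            doc.put("doc_id", docId);           // shared by all pages of the same PDF
            doc.put("title", title);            // repeated so every page hit can show it
            doc.put("page", page);
            doc.put("text", pageTexts.get(i));
            docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs =
                toPageDocs("report42", "Annual Report", List.of("first page text", "second page text"));
        docs.forEach(System.out::println);
    }
}
```

With documents shaped like this, a grouped query along the lines of `group=true&group.field=doc_id&group.limit=3` would return up to three best-scoring pages per PDF, which matches the grouping behaviour described above. Repeating the title on every page is one possible answer to the "title and such" question, at the cost of some duplication.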

Best,
Erick




Best practices to return total records vs total filtered records in query?

2020-03-24 Thread Todd Stevenson
I have a use case that I would think is a common one but I cannot find any help 
with this use case.

I am wanting to do a query that returns a list of records that I will display 
in an html table in an app.  This table only displays n records of the complete 
data set, but is able to page through the data set.   This use case is handled 
wonderfully by Solr, by specifying the offset and limit of the records in the 
data set and repeatedly rerunning the query.

Another facet of this use case is to be able to filter the records returned, 
typically by typing filter text in a search box.  This should filter the 
records and only display those that match the filter string.  This works fine 
in Solr with one exception.  I would like to be able to display with the table 
the total number of records in the data set (which I can) and the total number 
of filtered records in the data set (which I cannot).

Is there a way for Solr to return the total record count and the total filtered 
record count (possibly based on the q and fq queries) ?

The only way I can see to do this is to run the query twice, once with the 
filter string included and once without.  This seems to be terribly inefficient. 
Is there a better way?

Todd Stevenson
Care Transformation Application Developer
Intermountain Healthcare
3930 Parkway Blvd |Salt Lake City, UT 84120
Office: 801-442-5112 | Cell: 801-589-1115
todd.steven...@imail.org


Re: Best practices to return total records vs total filtered records in query?

2020-03-24 Thread Erick Erickson
I’m not sure I get the problem.

How do you “filter the records and only display those that match the filter 
string”? Do you attach an fq clause to the original query? If so, the return 
set _is_ the number of docs that match the filter (and the original query), and 
the numFound from the original query could be preserved on the app side. Or you 
could send a bogus parameter with the original count with the query and it 
should be echoed back in the results.

OTOH, if what you’re after is getting the total number of docs matched by 
_just_ the filter query including docs not matched by the original query, then 
no, there’s nothing in Solr to do that OOB, you’d have to send the query again.

If you were willing to write some Solr code, fqs are just a bitSet that you 
could return in the result set. I’m not at all sure how difficult that would be.

But before going there, let’s assume 1> you haven’t specified 
fq={!cached=false}… and 2> your entry in filterCache won’t be aged out by the 
time you get to asking about it. There’s very little work done if all the query 
has to do is check the filterCache. You might want to just try it and see what 
the QTimes are. The sequence would be:

1> q=original query
2> user types some stuff in the text box
3> q=original query&fq=stuff in the text box
4> q=*:*&fq=stuff in the text box&rows=0

q=*:* is a special short-circuit query that does very little work and by 
specifying rows=0 you’re not returning any docs. The numFound returned from <4> 
is the number of docs matched _only_ by the fq clause. It shouldn’t be nearly 
as expensive as you fear; I’d measure first before doing any Solr coding.
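
As a rough sketch of steps 3> and 4> above, the two requests differ only in q, start and rows. The snippet below builds the parameter strings with plain Java (no SolrJ); the class name and the example values `category:books` and `title:solr` are made up for illustration and are not from the thread.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: builds the parameter strings for steps 3> and 4>.
class FilterCountRequests {

    // URL-encode each key/value pair and join the pairs with '&'.
    static String toQueryString(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Step 3>: the original query restricted by the user's filter text.
    static String filteredPage(String q, String fq, int start, int rows) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", q);
        p.put("fq", fq);
        p.put("start", String.valueOf(start));
        p.put("rows", String.valueOf(rows));
        return toQueryString(p);
    }

    // Step 4>: match-all query with the same fq and rows=0; only numFound is of interest.
    static String filterOnlyCount(String fq) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");
        p.put("fq", fq);
        p.put("rows", "0");
        return toQueryString(p);
    }

    public static void main(String[] args) {
        // Example values are assumptions, not taken from the thread.
        System.out.println(filteredPage("category:books", "title:solr", 0, 10));
        System.out.println(filterOnlyCount("title:solr"));
    }
}
```

Passing the identical fq string to both requests is what lets the second one find the entry already in filterCache, as described above.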

Or all this is off base and I don’t understand the problem at all.

Best,
Erick


> On Mar 24, 2020, at 7:01 PM, Todd Stevenson  wrote:
> 
> I have a use case that I would think is a common one but I cannot find any 
> help with this use case.
>  
> I am wanting to do a query that returns a list of records that I will display 
> in an html table in an app.  This table only displays n records of the 
> complete data set, but is able to page through the data set.   This use case 
> is handled wonderfully by Solr, by specifying the offset and limit of the 
> records in the data set and repeatedly rerunning the query. 
>  
> Another facet of this use case is to be able to filter the records returned, 
> typically by typing filter text in a search box.  This should filter the 
> records and only display those that match the filter string.  This works fine 
> in Solr with one exception.  I would like to be able to display with the 
> table the total number of records in the data set (which I can) and the total 
> filtered records in the data set (which I cannot). 
>  
> Is there a way for Solr to return the total record count and the total 
> filtered record count (possibly based on the q and fq queries) ? 
>  
> The only way I can see to do this is to run the query twice,  once with the 
> filter string included and once without.  This seems to be terribly 
> inefficient.  Is there a better way?
>  
> Todd Stevenson
> Care Transformation Application Developer
> Intermountain Healthcare
> 3930 Parkway Blvd |Salt Lake City, UT 84120
> Office: 801-442-5112 | Cell: 801-589-1115
> todd.steven...@imail.org
> 



Re: How to get boosted field and values?

2020-03-24 Thread Yasufumi Mizoguchi
Hi,

I think the "debug" query parameter or the "explain" document transformer will
help you see which fields and query conditions are boosted.

https://lucene.apache.org/solr/guide/7_5/common-query-parameters.html
https://lucene.apache.org/solr/guide/7_5/transforming-result-documents.html
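
A minimal sketch of what the two options look like as request parameters, built with plain Java. The class name and the example query `features:2` are assumptions for illustration; `debug=results` and the `[explain style=nl]` transformer are the options described on the two pages linked above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: adds explanation-related parameters to a query.
class ExplainRequests {

    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // debug=results asks for a score explanation of each returned document
    // in the debug section of the response.
    static String withDebugResults(String q) {
        return "q=" + enc(q) + "&debug=results";
    }

    // The [explain] transformer attaches the explanation to each document
    // through the fl parameter; style=nl requests a structured (named list) form.
    static String withExplainTransformer(String q) {
        return "q=" + enc(q) + "&fl=" + enc("id,score,[explain style=nl]");
    }

    public static void main(String[] args) {
        // Example query is an assumption, not taken from the thread.
        System.out.println(withDebugResults("features:2"));
        System.out.println(withExplainTransformer("features:2"));
    }
}
```

Either way, the boosted clauses have to be read out of the explanation structure on the client side.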

Thanks,
Yasufumi

On Mon, Mar 23, 2020 at 10:27, Taisuke Miyazaki wrote:

> The blog looks like it's going to be useful from now on, so I'll take a
> look. Thank you.
>
> What I wanted, however, was a way to know what field was boosted as a
> result.
> But I couldn't find a way to do that, so instead I tried to get the field
> and value out of the resulting score by putting a binary bit on the
> field/value pair.
> It doesn't really matter to me whether you do it additively or
> multiplicatively, as it's good to know the field boosted as a result.
>
> Do you see what I mean?
>
>
> On Fri, Mar 20, 2020 at 18:56, Alessandro Benedetti wrote:
>
> > Hi Taisuke,
> > there are various ways of approaching boosting and scoring in Apache
> > Solr.
> > First of all you must decide if you are interested in multiplicative or
> > additive boost.
> > Multiplicative will multiply the score of your search result by a certain
> > factor while the additive will just add the factor to the final score.
> >
> > Using advanced query parsers such as dismax and edismax, you can use:
> > *boost* parameter - multiplicative - takes a function as input -
> > https://lucene.apache.org/solr/guide/6_6/the-extended-dismax-query-parser.html#TheExtendedDisMaxQueryParser-TheboostParameter
> > *bq* (boost query) - additive -
> > https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html#TheDisMaxQueryParser-Thebq_BoostQuery_Parameter
> > *bf* (boost function) - additive -
> > https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html#TheDisMaxQueryParser-Thebf_BoostFunctions_Parameter
> >
> > This blog post is old but should help :
> > https://nolanlawson.com/2012/06/02/comparing-boost-methods-in-solr/
> >
> > Then you can boost fields or even specific query clauses:
> >
> > 1)
> > https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html#TheDisMaxQueryParser-Theqf_QueryFields_Parameter
> >
> > 2) q= features:2^1.0 AND features:3^5.0
> >
> > 1.0 is the default: you are multiplying the score contribution of the
> > term by 1.0, so it has no effect.
> > features:3^5.0 means that the score contribution of a match for the term
> > '3' in the field 'features' will be multiplied by 5.0 (you can also see
> > that by enabling debug=results).
> >
> > Finally you can force the score contribution of a term to be a constant,
> > it's not recommended unless you are truly confident you don't need other
> > types of scoring:
> > q= features:2^=1.0 AND features:3^=5.0
> >
> > In this example your document id: 3 will have a score of 6.0 (1.0 + 5.0).
> >
> > Not sure if this answers your question; if not, feel free to elaborate
> > more.
> >
> > Cheers
> >
> > --
> > Alessandro Benedetti
> > Search Consultant, R&D Software Engineer, Director
> > www.sease.io
> >
> >
> > On Thu, 19 Mar 2020 at 11:18, Taisuke Miyazaki <miyazakitais...@lifull.com>
> > wrote:
> >
> > > I'm using Solr 7.5.0.
> > > I want to get the boosted fields and values per document.
> > >
> > > e.g.
> > > documents:
> > >   id: 1, features: [1]
> > >   id: 2, features: [1,2]
> > >   id: 3, features: [1,2,3]
> > >
> > > query:
> > >   bq: features:2^1.0 AND features:3^1.0
> > >
> > > I expect results like below.
> > > boosted:
> > >   - id: 2
> > >     - field: features, value: 2
> > >   - id: 3
> > >     - field: features, value: 2
> > >     - field: features, value: 3
> > >
> > > I have an idea to set the boost scores like bit flags, but I don't think
> > > it's good because I must send the query twice.
> > >
> > > bit-flag:
> > >   bq: features:2^2.0 AND features:3^4.0
> > >   docs:
> > >     - id: 1, score: 1.0 (0b001)
> > >     - id: 2, score: 3.0 (0b011) # has features:2 (2nd bit is 1)
> > >     - id: 3, score: 7.0 (0b111) # has features:2 and features:3 (2nd and
> > >       3rd bits are 1)
> > > Checking the score value then tells me which fields were boosted.
> > >
> > > Is there a better way?
> > >
> >
>
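
For reference, the bit-flag idea from the original question can be sketched on the client side as below. This is only a sketch under strong assumptions: every boosted clause contributes a constant score (the `^=` syntax mentioned in the reply), each weight is a distinct power of two, and the base query contributes exactly 1.0, so that the score is an exact sum. With normal relevance scoring these assumptions do not hold. The class name and the weight-to-clause mapping are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical decoder for scores built from power-of-two constant boosts.
class BoostFlagDecoder {

    // Assumed mapping from flag weight to the clause that carries it.
    static final Map<Integer, String> FLAGS = new LinkedHashMap<>();
    static {
        FLAGS.put(2, "features:2"); // 2nd bit
        FLAGS.put(4, "features:3"); // 3rd bit
    }

    // Returns the clauses whose flag bit is set in the (rounded) score.
    static List<String> decode(double score) {
        int bits = (int) Math.round(score);
        List<String> matched = new ArrayList<>();
        for (Map.Entry<Integer, String> e : FLAGS.entrySet()) {
            if ((bits & e.getKey()) != 0) {
                matched.add(e.getValue());
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Scores taken from the example in the question: 1.0, 3.0, 7.0.
        for (double score : new double[] {1.0, 3.0, 7.0}) {
            System.out.println(score + " -> " + decode(score));
        }
    }
}
```

The explain output suggested at the top of this reply avoids these assumptions, since it reports the contributing clauses directly.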