Hi Kudret,
What is your configuration for your /highlight requestHandler in
solrconfig.xml?
And also, what query did you use when you got the above output?
Regards,
Edwin
On Fri, 28 Sep 2018 at 07:33, Kudrettin Güleryüz
wrote:
> Hi,
>
> For some queries, response object returns matches withou
Community members help each other out when you behave with decency. This
man definitely doesn't know how to.
[image: Screen Shot 2018-09-28 at 1.07.11 AM.png]
I want to make sure he gets recognized IF he ever reaches out to the mailing
list:
https://lnkd.in/fWkfDCv
Malaviya National Institute of T
Hi,
For some queries, response object returns matches without any highlight
information. Solr node doesn't report any errors in Solr log.
The query term is g12312; the number of matches is 7, but only 4 of them get
highlight snippets. Any suggestions?
"highlighting":{
".../sources/test.cpp":{
"body
Yeah, I think your plan sounds fine.
Do you have a specific use case for diversity of results? I've been
wondering if diversity of results would provide better perceived relevance.
Joel Bernstein
http://joelsolr.blogspot.com/
On Thu, Sep 27, 2018 at 1:39 PM Diego Ceccarelli (BLOOMBERG/ LONDON)
Please post the Streaming Expression that you are using.
Joel Bernstein
http://joelsolr.blogspot.com/
On Thu, Sep 27, 2018 at 6:52 PM RAUNAK AGRAWAL
wrote:
> Hi Guys,
>
> Just to give you context, we were using JSON Facets for doing analytical
> queries in solr but they were slower. Hence we
If you have: <useFilterForSortedQuery>true</useFilterForSortedQuery>
Then you will get errors with the export handler.
After you set this to false, do the errors go away?
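If that is the setting in play, it lives under the <query> section of solrconfig.xml; a minimal sketch, assuming the stock layout:

<config>
  <query>
    <!-- reported to cause NPEs in the export handler when true; see SOLR-8291 -->
    <useFilterForSortedQuery>false</useFilterForSortedQuery>
  </query>
</config>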
Joel Bernstein
http://joelsolr.blogspot.com/
On Thu, Sep 27, 2018 at 6:40 PM Gaini Rajeshwar <
raja.rajeshwar2...@gmail.com> wrote:
> Is it in any way related to the following?
Hi Guys,
Just to give you context, we were using JSON Facets for doing analytical
queries in Solr, but they were slower. Hence we migrated our application to
use Solr streaming facet queries.
But for the last few days, we have been observing that the streaming facet response
is slower than JSON facets. Also
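For reference, a sketch of the facet() streaming expression form being compared here; collection1 and the bucket/metric field names are placeholders, not the poster's schema:

facet(collection1,
      q="*:*",
      buckets="category",
      bucketSorts="count(*) desc",
      bucketSizeLimit=100,
      count(*),
      sum(price))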
Is it in any way related to the following?
https://jira.apache.org/jira/browse/SOLR-8291
I will test this by turning off
<useFilterForSortedQuery>true</useFilterForSortedQuery>
to check if it helps.
On Fri, Sep 28, 2018 at 3:58 AM Joel Bernstein wrote:
> Ok, that stack trace shows where the problem is. I'll investigate and report
> back.
>
> Joel Be
Ok, that stack trace shows where the problem is. I'll investigate and report
back.
Joel Bernstein
http://joelsolr.blogspot.com/
On Thu, Sep 27, 2018 at 4:01 PM Gaini Rajeshwar <
raja.rajeshwar2...@gmail.com> wrote:
> @Joel: On some of the shards, I am seeing the following error in the logs.
>
> null:
@Joel: On some of the shards, I am seeing the following error in the logs.
null:java.lang.NullPointerException
at org.apache.lucene.util.BitSetIterator.<init>(BitSetIterator.java:61)
at org.apache.solr.handler.ExportWriter.writeDocs(ExportWriter.java:243)
at org.apache.solr.handler
I've tried with version 7.3 also and am encountering the same issue.
On Thu, Sep 27, 2018 at 11:02 PM Joel Bernstein wrote:
> That is weird, I've not encountered this behavior. There have been some
> changes in 7.4 to the export handler, and I'm wondering if a bug was
> introduced. The stack trace you a
I don't think I've much to add that Steve hasn't already covered, but we've
also seen this "null doc" problem in one of our setups.
In one of our Solr Cloud instances in production where the /get handler is
hit very hard in bursts, the /get request will occasionally return "null"
for a document th
On 9/25/2018 2:14 PM, Hanjan, Harinder wrote:
Hello!
When starting a new topic on the mailing list, do not reply to an
existing message. Your thread is buried within a thread originally
titled "Extracting top level URL when indexing document".
https://home.apache.org/~hossman/#threadhijack
On 9/27/2018 11:48 AM, sgaron cse wrote:
So this is a Solr core where we keep configuration data, so it is almost
never written to. The statistics for the core say it's been last modified 4
hours ago, yet I got doc:null from the API an hour ago. Also, you don't
have to have a lot of data in th
I control everything except the data that's being indexed. So I can manipulate
the Solr query as needed.
I tried the facet.prefix option and initial testing shows promise.
q=*:*&facet=on&facet.field=Communities&f.Communities.facet.prefix=BANFF+TRAIL+-+BNF
Thanks much!
John,
I just want to make sure I understand correctly. Replace fq with facet.query?
So then the resultant query goes from:
q=*:*&facet=on&facet.field=Communities&fq=Communities:"BANFF TRAIL - BNF"
to:
q=*:*&facet=on&facet.field=Communities&facet.query="BANFF TRAIL - BNF"
If that's correct, th
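One detail to keep in mind with that form: facet.query takes a full query, so the field usually needs to be spelled out explicitly or the term will be searched against the default field. A hedged version of the query above:

q=*:*&facet=on&facet.field=Communities&facet.query=Communities:"BANFF TRAIL - BNF"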
So this is a Solr core where we keep configuration data, so it is almost
never written to. The statistics for the core say it's been last modified 4
hours ago, yet I got doc:null from the API an hour ago. Also, you don't
have to have a lot of data in the core. For example, this core has only
11
Yeah, I think Kmeans might be a way to implement the "top 3 stories that are
more distant", but you can also have a more naïve (and faster) strategy (see
the sketch below):
- send a threshold
- scan the documents in order of relevance score
- select the top documents that have diversity > threshold.
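A minimal sketch of that greedy strategy, assuming the documents arrive already sorted by relevance score and that some pairwise distance function between documents is available (the Distance interface here is hypothetical, not an existing Solr API):

import java.util.ArrayList;
import java.util.List;

public class DiversitySelector {

    /** Hypothetical pairwise distance between two documents (e.g. cosine distance). */
    public interface Distance<T> {
        double between(T a, T b);
    }

    /** Keep a relevance-ordered doc only if it is at least `threshold` away
     *  from every doc already selected; stop once k docs are kept. */
    public static <T> List<T> select(List<T> rankedDocs, Distance<T> distance,
                                     double threshold, int k) {
        List<T> selected = new ArrayList<>();
        for (T doc : rankedDocs) {                       // already sorted by score
            boolean diverseEnough = true;
            for (T kept : selected) {
                if (distance.between(doc, kept) < threshold) {
                    diverseEnough = false;               // too close to a kept doc
                    break;
                }
            }
            if (diverseEnough) {
                selected.add(doc);
                if (selected.size() == k) {
                    break;
                }
            }
        }
        return selected;
    }
}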
The export handler does not do distributed search. So if you have a
multi-shard collection you may have to use Streaming Expressions to get
exports from all shards.
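A sketch of that combination, assuming a collection named collection1 with a docValues id field; the qt="/export" parameter makes the search() expression pull the full result set from every shard:

curl --data-urlencode 'expr=search(collection1,
                                   q="*:*",
                                   fl="id",
                                   sort="id asc",
                                   qt="/export")' \
     http://localhost:8983/solr/collection1/stream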
Joel Bernstein
http://joelsolr.blogspot.com/
On Thu, Sep 27, 2018 at 4:32 AM Jan Høydahl wrote:
> Hi,
>
> Yes, you can choose wh
That is weird, I've not encountered this behavior. There have been some
changes in 7.4 to the export handler, and I'm wondering if a bug was
introduced. The stack trace you are posting is coming from the node where
the expression is being run. Can you check the logs from the shards to see
if there a
Steve:
Thanks. So theoretically I should be able to set up a cluster, index a
bunch of docs to it and then just hammer RTG calls against those IDs
and sometimes see a failure?
Hmmm, I guess a follow-up question is whether there's any indexing
gong on at all when this happens. Or, more specifically
Streaming expressions only return JSON. That simplified many aspects of the
implementation.
Joel Bernstein
http://joelsolr.blogspot.com/
On Thu, Sep 27, 2018 at 12:05 PM Dariusz Wojtas wrote:
> Hi,
>
> I am working with SOLR 7.4.0 and use streaming expressions.
> This works nicely, the result
I've thought about this problem a little bit. What I was considering was
using Kmeans clustering to cluster the top 50 docs, then pulling the top
scoring doc from each cluster as the top documents. This should be fast and
effective at getting diversity.
Joel Bernstein
http://joelsolr.blogspot.com
Hi,
I'm considering writing a component for diversifying the results. I know that
diversification can be achieved by using grouping, but I'm thinking about
something different and query biased.
The idea is to have something that gets applied after the normal retrieval and
selects the top k do
Hey Erick,
We're using Solr 7.3.1, which is not the latest but still not too far back.
No, the document has not been recently indexed; in fact, I can use the
/search API endpoint to find the document. But I need a fast way to find
documents that have not necessarily been indexed yet, so /search is o
You can hand-edit everything in the conf directory whether you're
using managed resources or not, with one caution:
You _must not_ use any of the REST APIs _after_ you push manual
changes but _before_ you reload your collection.
Here's the deal. When Solr starts up, it fetches the configuration
fi
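A hedged example of the safe sequence (config set name, local path, ZooKeeper address and collection name are all placeholders): push the hand-edited files back up, then reload before touching any REST API again:

# upload the edited conf directory back to ZooKeeper
bin/solr zk upconfig -z localhost:2181 -n myconfig -d /path/to/edited/conf

# reload so every replica picks up the new configuration
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection"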
What version of Solr are you running? Mostly that's for curiosity.
Is the doc that's not returned something you've recently indexed?
Here's a possible scenario:
You send the doc out to be indexed. The primary forwards the doc to
the followers. Before the follower has a chance to process (but not
c
bq. Can you tell me where I could get insight into the testing cycles
and results?
Hmmm, here's one example:
https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/22921/
But we're not really tracking JDK 11 separately at the moment. I'd
monitor the discussion at SOLR-12809 for the current stat
Also, to mention, if I give q="*:*" it works fine (which is kind of
weird).
On Thu, Sep 27, 2018 at 9:35 PM Gaini Rajeshwar <
raja.rajeshwar2...@gmail.com> wrote:
> Hi All,
>
> I am using Solr version 7.4. I am trying to export data using streaming
> expressions.
>
> Following is the simple c
Hi All,
I am using Solr version 7.4. I am trying to export data using streaming
expressions.
Following is the simple curl query that I am using. I am indexing my data
and testing the following query.
curl --data-urlencode 'expr=search(collection1,
>q="text:solar",
>fl="id",
>
Hi,
I am working with SOLR 7.4.0 and use streaming expressions.
This works nicely, the result is produced in JSON format.
But I need to have it in XML.
Simplest query to show the problem:
search(myCollection,
zkHost="localhost:9983",
qt="/select",
q="*:*",
fl="id",
sort="id desc")
Docs
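For reference, the /stream handler always answers with a JSON result-set of tuples terminated by an EOF tuple, roughly shaped like the sketch below (the ids and timing value are made up), so producing XML has to happen on the client side:

{"result-set":{"docs":[
  {"id":"doc1"},
  {"id":"doc2"},
  {"EOF":true,"RESPONSE_TIME":12}]}}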
Thanks Alex/Shawn,
Yeah, currently we are handling it by writing some custom code over the response
and calculating the assets, but we are losing the power of the default stats and
facet features when going with this approach.
Also, actually it's not duplicate data, but as per our current design the
data resides
If the duplicate data is only indexed, it is not actually duplicated. It is
only an index entry and the record ids where it shows.
Regards,
Alex
On Thu, Sep 27, 2018, 10:55 AM Balanathagiri Ayyasamypalanivel, <
bala.cit...@gmail.com> wrote:
> Hi Alex, thanks, we have that set up already in p
On 9/27/2018 8:53 AM, Balanathagiri Ayyasamypalanivel wrote:
Thanks Shawn for your prompt response.
Actually we have to filter at query time while calculating the score.
The challenge here is that we should not add the asset as a static field
at index time. The asset needs to be calculate
Hi Alex, thanks, we have that set up already in place; we are thinking of
optimizing more and redesigning the data to avoid this duplication.
Regards,
Bala.
On Thu, Sep 27, 2018, 10:31 AM Alexandre Rafalovitch
wrote:
> Well, my feeling is that you are going in the wrong direction. And that
> maybe you
Thanks Shawn for your prompt response.
Actually we have to filter at query time while calculating the score.
The challenge here is that we should not add the asset as a static field
at index time. The asset needs to be calculated at query time with
some filters.
Regards,
Bala.
On Thu,
There is another thing to consider as well ...
When a node goes offline and then back on, unless ZooKeeper has been
configured properly, the ensemble may have trouble responding to the
cluster.
Jim Keeney
President, FitterWeb
E: nextves...@gmail.com
M: 703-568-5887
*FitterWeb Consulting*
*Are y
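A minimal zoo.cfg sketch of a properly formed three-node ensemble (hostnames, ports and paths are placeholders); each member needs the full server list and a matching myid file:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888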
On 9/26/2018 12:46 PM, Balanathagiri Ayyasamypalanivel wrote:
But the only drawback here is that we have to parse the JSON to do the sum of the
values; is there any other way to handle this scenario?
Solr cannot do that for you. You could put this in your indexing
software -- add up the numbers and p
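A minimal SolrJ sketch of that index-time approach (the collection name and the asset_values/total_assets field names are hypothetical): sum the values before the document is sent, so stats and facets can use the pre-computed field:

import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class AssetIndexer {
    public static void index(String id, List<Double> assetValues) throws Exception {
        try (SolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycollection").build()) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", id);
            for (Double v : assetValues) {
                doc.addField("asset_values", v);   // keep the individual values
            }
            double total = assetValues.stream().mapToDouble(Double::doubleValue).sum();
            doc.addField("total_assets", total);   // pre-computed sum for stats/facets
            client.add(doc);
            client.commit();
        }
    }
}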
Well, my feeling is that you are going in the wrong direction, and that
maybe you need to focus more on separating your (non-Solr) storage
representation and your (Solr) search-oriented representation.
E.g. if your issue is storage, maybe you can focus on a stored=false,
indexed=true approach.
R
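A hedged schema sketch of that separation (field names and types are placeholders): the search-oriented copy is indexed but not stored, while the full value stays in the system of record:

<field name="body_search" type="text_general" indexed="true" stored="false"/>
<field name="id" type="string" indexed="true" stored="true" required="true"/>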
Any suggestions?
Regards,
Bala.
On Wed, Sep 26, 2018, 2:46 PM Balanathagiri Ayyasamypalanivel <
bala.cit...@gmail.com> wrote:
> Hi,
>
> Thanks for the reply, actually we are planning to optimize the huge volume
> of data.
>
> For example, in our current system we have as below, so we can do facet
On 9/27/2018 8:00 AM, Shawn Heisey wrote:
On 9/27/2018 7:24 AM, Kimber, Mike wrote:
I'm trying to determine if there is any health check available to
detect the above and then, if the issue happens, an automated
mechanism in SolrCloud to restart the instance. Or is this something
we have
Hi,
You can escape all the characters by using \ .
Ex :
\&
\-
But it will not work for the "&" special character alone if you try it directly in
the browser.
It will work when you use the Solr APIs in code.
Regards,
Bala.
On Thu, Sep 27, 2018, 6:52 AM Shawn Heisey wrote:
> On 9/26/2018 10:39 PM, Rathor, Piy
On 9/27/2018 7:24 AM, Kimber, Mike wrote:
I'm trying to determine if there is any health check available to detect the
above and then, if the issue happens, an automated mechanism in SolrCloud to
restart the instance. Or is this something we have to code ourselves?
As shipped by the pro
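Two checks that ship with Solr, as a hedged starting point (the collection name and ZooKeeper address are placeholders); neither restarts anything on its own, so the restart logic would still be custom:

# per-collection view of live replicas via the bundled script
bin/solr healthcheck -c mycollection -z localhost:2181

# simple per-core liveness probe (the ping handler may need to be defined in solrconfig.xml)
curl "http://localhost:8983/solr/mycollection/admin/ping"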
On 9/26/2018 2:39 PM, Terry Steichen wrote:
Let me try to clarify a bit - I'm just using bin/post to index the files
in a directory. That indexing process produces a lengthy screen display
of files that were indexed. (I realize this isn't production-quality,
but I'm not ready for production jus
Erick,
Apologies, I should have been more specific. "Failed Solr node" means:
1. SolrCloud instance has crashed
2. SolrCloud Instance is up but not responding
3. SolrCloud Cluster is not responding
I'm trying to determine if there is any health check available to detect the
above and then, if
On 9/26/2018 10:39 PM, Rathor, Piyush (US - Philadelphia) wrote:
We are facing some issues in search with special characters. Can you please
help with the query if the search is done using the following characters:
• “&”
• AND
• (
• )
There are two ways.
Hi Piyush,
This sounds like an encoding problem.
Can you try q=Tata%20%26%20Sons ?
I believe for '&' you can use %26 in your query. (refer to
https://meyerweb.com , encode your queries and try them if they work as
expected)
You can also try debug=true to see what query is actually sent.
I a
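A sketch of the same query sent from the command line, where --data-urlencode takes care of the '&' (collection and field name are placeholders):

curl "http://localhost:8983/solr/mycollection/select" \
  --data-urlencode 'q=company_name:"Tata & Sons"' \
  --data-urlencode 'debug=true'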
Hi,
Yes, you can choose which to use; it should give you about the same result. If you
already work with the Solr search API it would be easiest for you to
consume /export, as you don't need to learn the new syntax and parse the Tuple
response. However, if you need to do stuff with the docs as
G'day,
We're running Solr 5.5.5 to build a search application for a repository of
MS-Office docs and PDFs.
Our schema includes a multivalued field that holds the IDs of objects embedded
in our documents - there can be 100s sometimes 1000s of such objects per
document.
We have a custom query p
Shawn Heisey wrote on 26.9.2018 at 21.16:
On 9/26/2018 9:35 AM, Jeff Courtade wrote:
My concern with using g1 is solely based on finding this.
Does anyone have any information on this?
https://wiki.apache.org/lucene-java/JavaBugs#Oracle_Java_.2F_Sun_Java_.2F_OpenJDK_Bugs
I have never had
Hi,
I have a requirement to fetch all data from a collection. One way is to use
a streaming expression and the other way is to use export.
The streaming expression documentation says *streaming functions are designed
to work with entire result sets rather than the top N results like normal
search. This is
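For comparison, a minimal /export request (the collection name is a placeholder); /export requires an explicit sort and an fl limited to docValues fields:

curl "http://localhost:8983/solr/mycollection/export" \
  --data-urlencode 'q=*:*' \
  --data-urlencode 'fl=id' \
  --data-urlencode 'sort=id asc'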