Thanks for the suggestion, Ere. It looks like they are actually enabled:
in the schema file the field is only marked as indexed and stored (field
name="_id" type="string" multiValued="false" indexed="true" required="true"
stored="true"), but the admin UI shows DocValues as enabled, so I guess
this is the default. Is the solution to add docValues="false" in the schema?
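One way to double-check what the admin UI reports is to ask the Schema API for the field with its inherited defaults included; a field can pick up docValues from its fieldType even when the field definition itself does not mention it. The following is only a sketch: the base URL and the collection name "mycollection" are placeholders, and the sample response is hand-written to resemble the API's JSON output rather than copied from a real server.

```python
from urllib.parse import quote


def field_schema_url(base_url, collection, field):
    """Build a Schema API URL that returns one field, including defaults
    inherited from its fieldType (showDefaults=true)."""
    return (f"{base_url}/{quote(collection)}/schema/fields/"
            f"{quote(field)}?showDefaults=true")


def docvalues_enabled(schema_response):
    """Read the effective docValues flag from a parsed Schema API response."""
    return bool(schema_response.get("field", {}).get("docValues", False))


# Hypothetical response body, hand-written for illustration only.
sample = {"field": {"name": "_id", "type": "string", "indexed": True,
                    "stored": True, "docValues": True}}

# Placeholder host and collection name; adjust to the actual setup.
url = field_schema_url("http://localhost:8983/solr", "mycollection", "_id")
print(url)
print(docvalues_enabled(sample))
```

Fetching the printed URL (for example with curl) and passing the parsed JSON to docvalues_enabled would show the effective setting for the field.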
On 12.11.18 10:43, Ere Maijala wrote:
Sofiya,
Do you have docValues enabled for the id field? Apparently that can
make a significant difference. I'm failing to find the relevant
references right now, but just something worth checking out.
Regards,
Ere
Sofiya Strochyk wrote on 6.11.2018 at 16.38:
Hi Toke,
Sorry for the late reply. The query I wrote here is edited to hide
production details, but I can post additional info if this helps.
I have tested all of the suggested changes, but none of these seem to make
a noticeable difference (usually response time and other metrics
fluctuate over time, and the changes caused by different parameters
are smaller than the fluctuations). What this probably means is that
the heaviest task is retrieving IDs by query and not fields by ID.
I've also checked QTime logged for these types of operations, and it
is much higher for "get IDs by query" than for "get fields by IDs
list". What could be done about this?
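For reference, the QTime comparison between the two kinds of shard requests could be scripted roughly as below. This is a sketch under assumptions: it presumes request log lines that contain params={...} and QTime=N, that shard requests carry isShard=true, and that the "get fields by IDs" requests can be recognised by an ids= parameter. The sample lines and their numbers are made up for illustration and are not taken from the production logs.

```python
import re
from statistics import mean

QTIME_RE = re.compile(r"\bQTime=(\d+)")
IDS_PARAM_RE = re.compile(r"[{&]ids=")


def phase_of(line):
    """Classify a shard request: the second phase carries an ids= parameter."""
    return "fields_by_ids" if IDS_PARAM_RE.search(line) else "ids_by_query"


def mean_qtime_by_phase(lines):
    """Average QTime per phase, looking only at shard-level requests."""
    buckets = {}
    for line in lines:
        match = QTIME_RE.search(line)
        if match and "isShard=true" in line:
            buckets.setdefault(phase_of(line), []).append(int(match.group(1)))
    return {phase: mean(values) for phase, values in buckets.items()}


# Made-up log lines for illustration only.
sample_lines = [
    "path=/select params={q=foo&fl=_id,score&isShard=true&rows=24} hits=900 status=0 QTime=180",
    "path=/select params={q=foo&fl=_id,score&isShard=true&rows=24} hits=850 status=0 QTime=220",
    "path=/select params={q=foo&ids=a,b,c&isShard=true&fl=title} status=0 QTime=12",
    "path=/select params={q=foo&rows=24} hits=1750 status=0 QTime=260",
]

print(mean_qtime_by_phase(sample_lines))
```

The last sample line is the top-level request without isShard=true, so it is left out of both averages.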
On 05.11.18 14:43, Toke Eskildsen wrote:
So far no answer from Sofiya. That's fair enough: My suggestions might
have seemed random. Let me try to qualify them a bit.
What we have to work with is the redacted query
q=<q expression>&fl=<full list of fields>&start=0&sort=<sort expression>&fq=<fq expression>&rows=24&version=2.2&wt=json
and an earlier mention that sorting was complex.
My suggestions were to try
1) Only request simple sorting by score
If this improves performance substantially, we could try and see if
sorting could be made more efficient: Reducing complexity,
pre-calculating numbers etc.
2) Reduce rows to 0
3) Increase rows to 100
This measures one aspect of retrieval. If there is a big performance
difference between these two, we can further probe if the problem is
the number or size of fields - perhaps there is a ton of stored text,
perhaps there is a bunch of DocValued fields?
4) Set fl=id only
This is a variant of 2+3 to do a quick check if it is the resolving of
specific field values that is the problem. If using fl=id speeds up
substantially, the next step would be to add fields gradually until
(hopefully) there is a sharp performance decrease.
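The four variants above could be generated from the redacted query along these lines. This is a minimal sketch: the parameter values are the same placeholders as in the redacted query, the variant names are made up, and the id field name is a parameter because the actual schema may call it something other than "id".

```python
from urllib.parse import urlencode

# Placeholder values standing in for the redacted production query.
BASE_PARAMS = {
    "q": "<q expression>",
    "fl": "<full list of fields>",
    "start": 0,
    "sort": "<sort expression>",
    "fq": "<fq expression>",
    "rows": 24,
    "version": "2.2",
    "wt": "json",
}


def variants(base, id_field="id"):
    """Return the baseline plus the four test variants, keyed by name."""
    return {
        "baseline": dict(base),
        "1_sort_by_score": {**base, "sort": "score desc"},
        "2_rows_0": {**base, "rows": 0},
        "3_rows_100": {**base, "rows": 100},
        "4_fl_id_only": {**base, "fl": id_field},
    }


for name, params in variants(BASE_PARAMS).items():
    # Print the request path and query string for each variant.
    print(name, "/select?" + urlencode(params))
```

Each variant changes exactly one parameter relative to the baseline, so a timing difference can be attributed to that parameter.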
- Toke Eskildsen, Royal Danish Library
--
Email Signature
*Sofiia Strochyk*
s...@interlogic.com.ua <mailto:s...@interlogic.com.ua>
InterLogic
www.interlogic.com.ua <https://www.interlogic.com.ua>
Facebook icon <https://www.facebook.com/InterLogicOfficial> LinkedIn
icon <https://www.linkedin.com/company/interlogic>