use a
> field in queries or facets, it should be available to read in the result set.
> Fields that won't be searched or faceted can be (indexed=false stored=false
> docValues=true) right?
>
> --uyilmaz
>
>
> On Mon, 19 Oct 2020 14:14:27 -0400
> Michael Gib
all fields need to be docValues="true", because export handler and
streaming both require fields to have docValues, and even if I won't use a
field in queries or facets, it should be available to read in the result set.
Fields that won't be searched or faceted can be (indexe
:15 AM, Erick Erickson wrote:
>
> uyilmaz:
>
> Hmm, that _is_ confusing. And inaccurate.
>
> In this context, it should read something like
>
> The Text field should have indexed="true" docValues="false" if used for
> searching
> but not faceting
uyilmaz:
Hmm, that _is_ confusing. And inaccurate.
In this context, it should read something like
The Text field should have indexed="true" docValues="false" if used for
searching but not faceting and the String field should have indexed="false"
docValues="true"
As you've observed, it is indeed possible to facet on fields with
docValues=true, indexed=false; but in almost all cases you should
probably set indexed=true. 1. for distributed facet count refinement,
the "indexed" approach is used to look up counts by value; 2. assuming
you
Thanks! This also contributed to my confusion:
https://lucene.apache.org/solr/guide/8_4/faceting.html#field-value-faceting-parameters
"If you want Solr to perform both analysis (for searching) and faceting on the
full literal strings, use the copyField directive in your Schema to create two
ver
I think this is all explained quite well in the Ref Guide:
https://lucene.apache.org/solr/guide/8_6/docvalues.html
DocValues is a different way to index/store values. Faceting is a
primary use case where docValues are better than what 'indexed=true'
gives you.
Regards,
Alex.
On Mon, 19 Oct 20
Hey all,
From my little experiments, I see that (if I didn't make a stupid mistake) we
can facet on fields marked with both indexed=false and stored=false:
I'm surprised by this, I thought I would need to index it. Can you confirm this?
Regards
--
uyilmaz
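To repeat the experiment described above, a facet request could be assembled roughly like this. This is only a sketch: the core name `mycore` and the field name `category_dv` are hypothetical, and the host and port are simply the Solr defaults. `q`, `rows`, `facet`, `facet.field` and `wt` are the standard request parameters.

```python
from urllib.parse import urlencode

def facet_url(base_url, core, field, query="*:*"):
    """Build a select URL that facets on a single field and returns no rows."""
    params = {
        "q": query,
        "rows": 0,            # only the facet counts are of interest
        "facet": "true",
        "facet.field": field,
        "wt": "json",
    }
    return f"{base_url}/{core}/select?{urlencode(params)}"

# Hypothetical core and field names, used purely for illustration.
print(facet_url("http://localhost:8983/solr", "mycore", "category_dv"))
```

If the field is defined with docValues=true, the facet counts should come back even when indexed and stored are both false, which is what the experiment above reports.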
On Tue, 2018-11-20 at 21:17 -0700, Shawn Heisey wrote:
> Maybe the error condition should be related to a new schema
> property, something like allowQueryOnDocValues. This would default
> to true with current schema versions and false in the next schema
> version, which I think is 1.7. Then a use
On 11/20/2018 8:18 PM, Rahul Goswami wrote:
Erick and Toke,
Thank you for the replies. I am surprised there already isn’t a JIRA for
this. In my opinion, this should be an error condition on search or
alternatively should simply be giving zero results. That would be a defined
behavior as opposed
eting, grouping and sorting; but for
> > a field to be searchable it needs to be indexed=true.
>
> Erick explained the search thing, so I'll just note that faceting on a
> DocValues=true indexed=false field on a multi-shard index also has a
> performance penalty as the
I'll just note that faceting on a
DocValues=true indexed=false field on a multi-shard index also has a
performance penalty as the field will be slow-searched (using the
DocValues) in the secondary fine-counting phase.
- Toke Eskildsen, Royal Danish Library
ping and sorting; but for a field to be
> searchable it needs to be indexed=true. However I was dumbfounded today
> when I executed a successful search on a field with below configuration:
> docValues="true"/>
> However the searches don't always complete and often time ou
the searches don't always complete and often time out.
My question is...
Is searching on docValues=true and indexed=false fields supported? If yes,
in which cases?
What are the pitfalls (as I see that searches, although sometimes
successful are atrociously slow and quite often time out)?
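As a rough summary of what the replies in this thread say about these attribute combinations, the sketch below maps a field's flags to what it can be used for. The function and key names are made up for illustration, and the rules are a simplification of the thread, not a complete statement of Solr's behaviour (for example, returning docValues as stored values depends on `useDocValuesAsStored`).

```python
def field_capabilities(indexed, doc_values, stored):
    """Summarise what a field can be used for, following the replies in this thread."""
    return {
        # Efficient search needs the inverted index.
        "search_fast": indexed,
        # Search can still work through docValues alone, but it may be very slow.
        "search_slow_fallback": doc_values and not indexed,
        # Faceting, grouping and sorting can be served from docValues.
        "facet_sort_group": doc_values,
        # Values can be returned from stored fields, or from docValues
        # when useDocValuesAsStored applies.
        "retrievable": stored or doc_values,
    }

# The configuration asked about: docValues=true, indexed=false.
print(field_capabilities(indexed=False, doc_values=True, stored=False))
```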
Hi David,
good to know that sorting solved your problem.
I understand perfectly that given the urgency of your situation, having the
solution ready takes priority over continuing with the investigations.
I would recommend anyway to open a Jira issue in Apache Solr with all the
information gathered
Hi Erick & Alessandro,
I have solved my problem by re-ordering the data in the SQL query. I don't
know why it works but it does. I can consistently re-produce the problem
without changing anything else except the database table. As our Solr build is
scripted and we always build a new Solr s
snapshots
> as all of my test runs are completely scripted and build a new Solr server
> from scratch (both the virtual machine and the Solr software). I can diff
> the scripts between two runs to make sure I haven't accidentally changed
> anything, and I have done this.
>
>
I haven't accidentally changed anything,
and I have done this.
The only difference is that I added docValues=false to all of the fields that
are indexed=false and stored=true in the run that is smaller. I had tested
this previously with the data in the order that makes the index larger and
Well, I'm not entirely sure either ;)
What I'm seeing. And, BTW, I'm making a couple of assumptions here. In
the one listing, your biggest segment starts with _7l and in the other
it's _zd. The aggregate size is
2,815M for _7l and 705M for _zd. So multiplying the individual files
in _zd by 4 (p
Hi Erick,
Thinking some more about the differences between the two sort orders has
suggested another possibility. We also have a geo spatial field defined in the
index:
echo "$(date) Creating geoLocation field"
curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-fiel
Hi Erick,
Below is the file listing for when the index is loaded with the table ordered
in a way that produces the smaller index.
I have checked the console, and we have no deleted docs and we have the same
number of docs in the index as there are rows in the staging table that we load
from.
Hi Alessandro,
There are 14,061,990 records in the staging table and that is how many
documents that we end up with in Solr. I would be surprised if we have a
problem with the id, as we use the primary key of the table as the id in Solr
so it must be unique.
The primary key of the staging ta
It's a silly thing, but to confirm the direction that Erick is suggesting :
How many rows in the DB ?
If updates are happening on Solr ( causing the deletes), I would expect a
greater number of documents in the DB than in the Solr index.
Is the DB primary key ( if any) the same of the uniqueKey fie
Hi Emir,
We have no copy field definitions. To keep things simple, we have a one to one
mapping between the columns in our staging table and the fields in our Solr
index.
Regards,
David
David Howe
Java Domain Architect
Postal Systems
Level 16, 111 Bourke Street Melbourne VIC 3000
T 039106
Hi David,
I skimmed through the thread and don't see that this was already eliminated, so I will ask: can
you check if there are some copyField rules that are triggered when the new field
is added? You mentioned that ordering fixed the size of the index, but it might be
worth checking.
Emir
--
Monitoring - Log Managem
This isn't terribly useful without a similar dump of "the other" index
directory. The point is to compare the different extensions for some
segment where the sum of all the files in that segment is roughly
equal. So if you have a listing of the old index around, that would
help.
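The comparison suggested here can be sketched as a small helper that totals file sizes per extension for one segment. The listing below is hypothetical (names and sizes are invented, not taken from the attached dumps), and the helper assumes segment files are named either `<segment>.<ext>` or `<segment>_<suffix>.<ext>`.

```python
from collections import defaultdict

def sizes_by_extension(listing, segment):
    """Sum file sizes per extension for one segment, e.g. '_7l'."""
    totals = defaultdict(int)
    for name, size in listing:
        stem, _, ext = name.partition(".")
        # Match '_7l.fdt' as well as '_7l_Lucene50_0.doc', but not '_7lx.fdt'.
        if stem == segment or stem.startswith(segment + "_"):
            totals[ext] += size
    return dict(totals)

# Hypothetical listing of (file name, size in bytes).
old = [("_7l.fdt", 400), ("_7l.dvd", 900), ("_7l_Lucene50_0.doc", 300), ("_zd.fdt", 100)]
print(sizes_by_extension(old, "_7l"))
```

Running the same helper over the listing from the other index and comparing the two results per extension would show which file type accounts for the growth.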
bq: We don't have any
Hi Erick,
I have the full dump of the Solr index file sizes as well if that is of any
help. I have attached it below this message.
We don't have any deleted docs in our index, as we always build it from a brand
new virtual machine with a brand new installation of Solr.
The ordering is defini
David:
Rats, the cfs files make everything I'd hoped to understand with the
sizes ambiguous, since they conceal the underlying sizes of each other
extension. We can approach it a bit differently though. Take one
segment that's _not_ in cfs format where the total size of all files
making up that se
@Alessandro I will see if I can reproduce the same issue just by turning
off omitNorms on field type. I'll open another mail thread if required.
Thanks.
On Thu, Feb 15, 2018 at 6:12 AM, Howe, David
wrote:
>
> Hi Alessandro,
>
> Some interesting testing today that seems to have gotten me closer t
Hi Alessandro,
Some interesting testing today that seems to have gotten me closer to what the
issue is. When I run the version of the index that is working correctly
against my database table that has the extra field in it, the index suddenly
increases in size. This is even though the data i
@Pratik: you should have investigated. I understand that it solved your issue,
but if you needed norms, it doesn't make sense that they would cause your index to
grow by a factor of 30. You must have hit a nasty bug if it was just
the norms.
@Howe :
*Compound File* .cfs, .cfe An optional "virtua
Subject: RE: Index size increases disproportionately to size of added field
when indexed=false
I have set docValues=false on all of the string fields in our index that have
indexed=false and stored=true. This gave a small improvement in the index size
from 13.3GB to 12.82GB.
I have also tried
You are right, in my case this field type was applied to many text fields.
These include many copy fields and dynamic fields as well. In my case,
only specifying omitNorms=true for field type "text_general" fixed the
issue. I didn't do anything else, nor did I have any other bug.
On Wed, Feb 14, 2018 at 1
Hi pratik,
how is it possible that just the norms for a single field were causing such
a massive index size increase in your case?
In your case I think it was for a field type used by multiple fields, but
it's still suspicious in my opinion;
norms shouldn't be that big.
If I remember correctly in
nes-mmapdirectory-on-64bit.html
bq: In what situation would it make sense to have indexed=false and
docValues=true?
When you want to return _only_ fields that have docValues=true. If you
return fields with stored=true and docValues=false, Solr/Lucene has to
1> read the stored values from disk (minim
2018 at 8:48 PM, Howe, David
wrote:
>
> I have set docValues=false on all of the string fields in our index that
> have indexed=false and stored=true. This gave a small improvement in the
> index size from 13.3GB to 12.82GB.
>
> I have also tried running an optimize, which then r
I have set docValues=false on all of the string fields in our index that have
indexed=false and stored=true. This gave a small improvement in the index size
from 13.3GB to 12.82GB.
I have also tried running an optimize, which then reduced the index to 12.6GB.
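To put those numbers in proportion, the relative savings can be worked out as below (a trivial sketch; the helper name is made up). The figures are the ones quoted above and come to roughly 3.6% for the docValues change alone and roughly 5.3% once the optimize is included.

```python
def pct_reduction(before_gb, after_gb):
    """Percentage by which the index shrank."""
    return (before_gb - after_gb) / before_gb * 100

# docValues=false on the stored-only string fields
print(round(pct_reduction(13.3, 12.82), 1))
# the same change followed by an optimize
print(round(pct_reduction(13.3, 12.6), 1))
```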
Next step is to dump the sizes
Thanks Hoss. I will try setting docValues to false, as we only ever want to be
able to retrieve the value of this field.
Regards,
David
David Howe
Java Domain Architect
Postal Systems
Level 16, 111 Bourke Street Melbourne VIC 3000
T 0391067904
M 0424036591
E david.h...@auspost.com.au
W
that makes a difference. It
sounds like we shouldn't have it on anyway, as we only ever want to be able to
retrieve this field. In what situation would it make sense to have
indexed=false and docValues=true?
I will re-index and get a sizing for all of the different file extensions both
Hi Alessandro,
The docker image is like a disk image of the entire server, so it includes the
operating system, the Solr installation and the data. Because we run in the
cloud and our index isn't that big, this is an easy and fast way for us to
scale our Solr cluster without having to configu
'{
> : "add-field":{
> : "name":"buildingName",
> : "type":"string",
> : "stored":true,
> : "indexed":false
> : }
> : }' http://localhost:8983/solr/address/schema
H 'Content-type:application/json' --data-binary '{
: "add-field":{
: "name":"buildingName",
: "type":"string",
: "stored":true,
: "indexed":false
: }
: }' http://localhost:8983/solr/a
David:
Right, Optimize Is Evil. Well, actually in your case it's not. In your
specific case you can optimize every time you build your index and be
OK, gory details here:
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/
But that's just for background. The key
Hi David,
given the fact that you are actually building a new index from scratch, my
shot in the dark didn't hit any target.
When you say : "Once the import finishes we save the docker image in the
AWS docker repository. We then build our cluster using that image as the
base"
Do you mean just c
Hi Alessandro,
Thanks for responding. We rebuild the index every time starting from a fresh
installation of Solr. Because we are running at AWS, we have automated our
deployment so we start with the base docker image, configure Solr and then
import our data every time the data changes (it onl
I assume you re-index in full right ?
My shot in the dark is that this increment is temporary.
You re-index, so effectively delete and add all documents ( this means that
even if the new field is just stored, you re-build the entire index for all
the fields).
Create new segments and the old docs ar
'Content-type:application/json' --data-binary '{
"add-field":{
"name":"buildingName",
"type":"string",
"stored":true,
"indexed":false
}
}' http://localhost:8983/solr/a
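Since the thread goes on to try docValues=false on fields like this one, here is a sketch of how the same add-field body could be generated with docValues stated explicitly instead of inherited from the field type. The helper name and its defaults are assumptions for illustration; only the `buildingName` definition itself comes from the command above.

```python
import json

def add_field_payload(name, field_type="string", stored=True,
                      indexed=False, doc_values=False):
    """Build the JSON body for a Schema API add-field command."""
    return json.dumps({
        "add-field": {
            "name": name,
            "type": field_type,
            "stored": stored,
            "indexed": indexed,
            # Stated explicitly so the field type's default does not apply.
            "docValues": doc_values,
        }
    }, indent=2)

print(add_field_payload("buildingName"))
```

The resulting string can be passed to curl's `--data-binary` in place of the inline JSON.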
se elaborate on "full table scan search"
>
>Regards
>Ashish
>
>
>
>--
>View this message in context:
>http://lucene.472066.n3.nabble.com/Indexed-false-for-a-field-but-still-able-to-search-on-field-tp4352338p4352599.html
>Sent from the Solr - User mailing
Hi,
Thanks, got it: this issue is happening because of docValues=true.
Please elaborate on "full table scan search"
Regards
Ashish
--
View this message in context:
http://lucene.472066.n3.nabble.com/Indexed-false-for-a-field-but-still-able-to-search-on-field-tp4352338p4352599.html
Sen
s
>
> Renuka Srishti
>
>
>
> On Tue, Aug 29, 2017 at 1:06 AM, AshB wrote:
>
> > Hi,
> >
> > Yes docValues is true for fieldType
> >
> > > docValues="true"/>
, 2017 at 1:06 AM, AshB wrote:
> Hi,
>
> Yes docValues is true for fieldType
>
> docValues="true"/>
Hi,
Yes docValues is true for fieldType
--
View this message in context:
http://lucene.472066.n3.nabble.com/Indexed-false-for-a-field-but-still-able-to-search-on-field-tp4352338p4352442.html
Sent from the Solr - User mailing list archive at Nabble.com.
Solrversion:6.5.1
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Indexed-false-for-a-field-but-still-able-to-search-on-field-tp4352338.html
> Sent from the Solr - User mailing list archive at Nabble.com.
I was talking about Solr, and not talking about copy fields ext ( that
>> will
>> be indexed) .
>> If you don't index in solr, you can't search :)
>>
>
> Alessandro,
> I just wanted to emphasize, that it searches by docValues=true field, even
> it's attributed in
Keep calm santhosh, if you read your parsed query it's easy to see that
it's not searching against the cat field ( which was not indexed ) :
"parsedquery":"text:software",
So it's returning a document because that document has the text field
indexed .
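The check described above (reading the parsed query to see which field was actually searched) can be automated along these lines. It is only a sketch: the sample response is trimmed and hypothetical apart from the `parsedquery` value quoted above, and the regular expression only handles simple `field:term` clauses.

```python
import re

def fields_in_parsed_query(parsed_query):
    """Return the set of field names that appear as 'field:term' in a parsed query."""
    return set(re.findall(r"([A-Za-z_][\w.]*):", parsed_query))

def searched_fields(response):
    """Pull the parsed query out of a debugQuery=true response and list its fields."""
    return fields_in_parsed_query(response["debug"]["parsedquery"])

# Trimmed, hypothetical response; only the parsedquery value is from the thread.
sample = {"debug": {"parsedquery": "text:software"}}
print(searched_fields(sample))
```

If `cat` is missing from the result, the match came from another field (here `text`), not from the non-indexed one.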
To answer your question :
"I have question
yes .. that line was commented out... and I have restarted the server ..
after updating the schema.xml .. and document was deleted and added back ..
On 22 July 2015 at 20:31, Mikhail Khludnev
wrote:
> did you remove
>
> from schema.xml ?
> did you restart Solr or reload the core, after that
I have a question ... when we delete a document, won't all the index entries
generated for that document get deleted? And when we add a new
document, won't it get checked against schema.xml ...
On 22 July 2015 at 20:04, Jack Krupansky wrote:
> Is this a feature or a bug of Solr? Seriously, ma
did you remove
from schema.xml ?
did you restart Solr or reload the core, after that?
did you reindex that document, after all?
On Wed, Jul 22, 2015 at 5:55 PM, santhosh kumar
wrote:
> but field 'cat' is neither a 'copyField' nor a 'dynamicField'. :)
>
> On 22 July 2015 at 20:04, Mikhail
that will be an alternative option .. but what if it is an existing field we need
to change?
On 22 July 2015 at 20:03, Alexandre Rafalovitch wrote:
> I would just reindex into a new core from scratch first. I think the
> suggestion that perhaps the content was evolving and did not get
> reindexed full
but field 'cat' is neither a 'copyField' nor a 'dynamicField'. :)
On 22 July 2015 at 20:04, Mikhail Khludnev
wrote:
> it matches by text field
>
> On Wed, Jul 22, 2015 at 5:28 PM, santhosh kumar
> wrote:
>
> > Here is the query in debug mode ...
> >
> > {
> > "responseHeader": {
> > "s
Is this a feature or a bug of Solr? Seriously, maybe it is convenient to be
able to change the schema without reindexing documents that don't care
about the schema change, but it is definitely a support headache. I mean,
how many times have we had to ask that question on this list since the
current
It could be interesting to have a utility that will - for example -
compare schema definition with reverse-engineered definition from the
underlying indexes. That would catch the indexed leftovers. But it
probably still would not work for analyzer chain changes.
Solr Analyzers, Tokenizers, Fil
it matches by text field
On Wed, Jul 22, 2015 at 5:28 PM, santhosh kumar
wrote:
> Here is the query in debug mode ...
>
> {
> "responseHeader": {
> "status": 0,
> "QTime": 3,
> "params": {
> "debugQuery": "true",
> "indent": "true",
> "q": "software",
> "_":
some more info .. after updating schema.xml file I have restarted the Jetty
server and the document was deleted and added.
On 22 July 2015 at 19:58, santhosh kumar wrote:
> Here is the query in debug mode ...
>
> {
> "responseHeader": {
> "status": 0,
> "QTime": 3,
> "params": {
>
sorry .. I didn't get that .. I am using the default Solr admin for the query ...
http://localhost:8494/solr/myfirstcore/select?q=software&wt=json&indent=true&debugQuery=true
{
"responseHeader":{
"status":0,
"QTime":3,
"params":{
"debugQuery":"true",
"indent":"true",
"q":"soft
I would just reindex into a new core from scratch first. I think the
suggestion that perhaps the content was evolving and did not get
reindexed fully is the most likely cause.
Regards,
Alex.
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/
On 22
excuse me. Schema Browser, I mean. Also, which query do you see at
debugQuery=true?
On Wed, Jul 22, 2015 at 5:10 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:
> What do you see at Session browser for this field?
>
> On Wed, Jul 22, 2015 at 4:49 PM, santhosh kumar
> wrote:
>
>> Hi,
>>
Here is the query in debug mode ...
{
"responseHeader": {
"status": 0,
"QTime": 3,
"params": {
"debugQuery": "true",
"indent": "true",
"q": "software",
"_": "1437575140328",
"wt": "json"
}
},
"response": {
"numFound": 1,
"start": 0,
"
I find it so hard to believe you can search without the inverted index :)
Are you sure you didn't have some documents in your index with that field
indexed, before you made the change and set it as not indexed?
Changes in the schema don't apply to already indexed documents ( if you
don't go with a r
What do you see at Session browser for this field?
On Wed, Jul 22, 2015 at 4:49 PM, santhosh kumar
wrote:
> Hi,
>
> I have started practicing Solr recently and my understanding of the field
> type property "indexed=false" is that the field is not searchable.
>
> But when I execute the below query I
Hi,
I have started practicing Solr recently and my understanding of the field
type property "indexed=false" is that the field is not searchable.
But when I execute the query below I get results.
http://localhost:8494/solr/myfirstcore/select?q=cat%3Asoftware&wt=json&indent=true
configured in sc
Yes, surprisingly enough, if indexed=false, docValues=true — you can still
search. I’ve seen the code behind it; it’s interesting. Rob wrote it.
I’m not sure how scalable it is compared to the inverted index. I suspect
it wouldn't do well for a lot of distinct values but will be fine for a small
Thank you Shawn for the excellent use case. :)
On Wed, Jul 3, 2013 at 9:34 AM, Shawn Heisey wrote:
> On 7/3/2013 9:22 AM, Ali, Saqib wrote:
>
>> What would be the use case for such a field:
>>
>> > stored="false"/>
>>
>>
>> and
>>
>> > stored="false"/>
>>
>
> I have a field li
On 7/3/2013 9:22 AM, Ali, Saqib wrote:
What would be the use case for such a field:
and
I have a field like this in my schema. That field is used as one of the
source fields that get copied to my "catchall" field. I don't need the
field by itself, but I use it in conj
Or you may have dynamic field as stored but ignore some specific fields
that would otherwise be matching the dynamic field mask. Useful if you are
trying to get metadata but not content out of something.
This is based on specific field names matching before dynamic ones.
Regards,
Alex.
Person
.. you could still have update processors that look at the values of
> "ignored" fields and maybe assigns them to other, non-ignored fields.
>
> -- Jack Krupansky
>
> -Original Message- From: Ali, Saqib
> Sent: Wednesday, July 03, 2013 11:22 AM
> To: solr-user@lucene
i, Saqib
> Sent: Wednesday, July 03, 2013 11:22 AM
> To: solr-user@lucene.apache.org
> Subject: Use case indexed="false" stored="false" field
>
> Hello all,
>
>
> What would be the use case for such a field:
>
>
>
>
> and
>
>
>
>
> ?
>
>
> Thanks.
Krupansky
-Original Message-
From: Ali, Saqib
Sent: Wednesday, July 03, 2013 11:22 AM
To: solr-user@lucene.apache.org
Subject: Use case indexed="false" stored="false" field
Hello all,
What would be the use case for such a field:
stored="false"/>
and
?
Thanks.
Maybe to ignore?
You can set a dynamic Field to ignore as well.
On Wed, Jul 3, 2013 at 9:22 AM, Ali, Saqib wrote:
> Hello all,
>
>
> What would be the use case for such a field:
>
> stored="false"/>
>
>
> and
>
> stored="false"/>
>
>
> ?
>
>
> Thanks.
>
--
Bill Bell
billn
Hello all,
What would be the use case for such a field:
and
?
Thanks.
On Fri, Jul 31, 2009 at 3:19 PM, Yao Ge wrote:
> Having a large number of fields is not the same as having a large number of
> facets. Facets are something you would display to users as an aid for query
> refinement or navigation. There is no way for a user to use 3700 facets at
> the same time.
I
he number of properties that are available
> for
> faceting, the performance can be improved. To test this, we enabled only 6
> properties for faceting by setting indexed=true (in schema.xml) for only
> these properties. All other properties which are defined as dynamic
> properties ha
iting the number of facets will actually
>> improve the performance.
>>
>> An update on this. I did verify and looks like although I set
>> indexed=false
>> for most of the properties, I have not blocked them from participating in
>> the query. I now enabled only 7 pr
we will be enabling them. However, the primary
idea of
this exercise is to verify if limiting the number of facets will
actually
improve the performance.
An update on this. I did verify and looks like although I set
indexed=false
for most of the properties, I have not blocked them from
participat
although I set indexed=false
for most of the properties, I have not blocked them from participating in
the query. I now enabled only 7 properties for faceting. Now at any given
time only a maximum of 7 facets will participate in the query. Performance
has now improved from an erstwhile 60 seconds
On Jul 31, 2009, at 7:17 AM, Rahul R wrote:
Erik,
I understand that caching is going to improve performance. In fact we
did a
PSR run with caches enabled and we got awesome results. But these
wouldn't
be really representative because the PSR scripts will be doing the
same
searches again an
VM.
I think I need to go back and check if I am not using all the fields in the
query. I understand that setting indexed=false alone will not ensure that
all fields don't participate in the query.
Thanks a lot for your response.
Regards
Rahul
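The point made above, that setting indexed=false does not by itself keep a field out of the query, suggests limiting the facet fields on the request side. A minimal sketch of such a guard follows; the function name and the property names are hypothetical, and only `facet` and `facet.field` are actual request parameters.

```python
from urllib.parse import urlencode

def facet_params(requested, allowed):
    """Keep only facet fields on the allow-list and build the query string."""
    fields = [f for f in requested if f in allowed]
    params = [("q", "*:*"), ("facet", "true")]
    # One facet.field parameter per permitted field.
    params += [("facet.field", f) for f in fields]
    return urlencode(params)

# Hypothetical property names; 'weight' is dropped because it is not allowed.
print(facet_params(["color", "size", "weight"], {"color", "size"}))
```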
On Fri, Jul 31, 2009 at 3:33 PM, Erik Hatcher wrot
faceting, the performance can be improved. To test this, we enabled
only 6
properties for faceting by setting indexed=true (in schema.xml) for
only
these properties. All other properties which are defined as dynamic
properties had indexed=false.
These settings won't matter - what matte
of properties that are available for
faceting, the performance can be improved. To test this, we enabled only 6
properties for faceting by setting indexed=true (in schema.xml) for only
these properties. All other properties which are defined as dynamic
properties had indexed=false. The observations