date:20201231

[jira] [Commented] (LUCENE-9652) DataInput.readFloats to be used by Lucene90VectorReader

2020-12-31 Thread Adrien Grand (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256921#comment-17256921
 ] 

Adrien Grand commented on LUCENE-9652:
--

bq. I think we should only support little-endian floats from the beginning 
here. We're planning to move towards switching the whole IndexInput to that 
endianness, right?

+1

bq. but I think if we're still unreleased and in an experimental class that 
should be OK?

Since it's unreleased, we can indeed change the file format of VectorFormat 
however we want without adding backward-compatibility logic. In my opinion, the 
experimental flag is more about the API than the file format compatibility. 
Once a file format is released, we need to maintain backward compatibility, 
regardless of whether it's experimental or not.

bq. Also, we don't need a corresponding DataOutput.writeFloats to support the 
current usage for vectors, since there we rely on VectorValues to do the 
conversion, so I don't plan to implement that.

+1

Ultimately, it would be nice if we could avoid copying the data entirely by 
just interpreting the bytes from the DataInput as a float[] array. We can 
technically do it today by making the dot product helper take a DataInput and 
read floats one by one, but this has the downside of disabling 
auto-vectorization of the dot product. I recently started looking at the Vector 
API that is new in JDK 16, and it looks like it could help remove this copy 
since it allows [creating a FloatVector from a 
ByteBuffer|https://download.java.net/java/early_access/jdk17/docs/api/jdk.incubator.vector/jdk/incubator/vector/FloatVector.html#fromByteBuffer(jdk.incubator.vector.VectorSpecies,java.nio.ByteBuffer,int,java.nio.ByteOrder)].
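
To make the little-endian bulk read concrete, here is a minimal sketch using only java.nio. The class and method names are hypothetical and this is not Lucene's readFloats implementation; it still copies into a float[], so it only illustrates the decoding step, not the zero-copy goal described above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only; not part of Lucene.
final class LittleEndianFloats {
  /** Decodes count little-endian floats from src, starting at offset, into dst. */
  static void readFloats(byte[] src, int offset, float[] dst, int count) {
    ByteBuffer.wrap(src, offset, count * Float.BYTES)
        .order(ByteOrder.LITTLE_ENDIAN) // set the order before creating the float view
        .asFloatBuffer()
        .get(dst, 0, count);            // one bulk transfer instead of a per-float loop
  }
}
```

The bulk get into a plain float[] keeps the later dot product loop in a shape the JIT can auto-vectorize, which is the concern raised about reading floats one by one.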

> DataInput.readFloats to be used by Lucene90VectorReader
> ---
>
> Key: LUCENE-9652
> URL: https://issues.apache.org/jira/browse/LUCENE-9652
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Major
>
> Benchmarking shows a substantial performance gain can be realized by avoiding 
> the additional memory copy we must do today when converting from {{byte[]}} 
> read using {{IndexInput}} into {{float[]}} returned by 
> {{Lucene90VectorReader}}. We have a model for how to handle the various 
> alignments, and buffer underflow when a value spans buffers, in 
> {{readLELongs}}.
> I think we should only support little-endian floats from the beginning here. 
> We're planning to move towards switching the whole IndexInput to that 
> endianness, right?
> Lucene90VectorWriter relies on {{VectorValues.binaryValue()}} to return bytes 
> in the format expected by the reader, and its javadocs don't currently 
> specify their endianness. In fact the order has been the default supplied by 
> {{ByteBuffer.allocate(int)}}, which I now realize is big-endian, so this 
> issue also proposes to change the index format. That would mean a 
> backwards-incompatible index change, but I think if we're still unreleased 
> and in an experimental class that should be OK?
> Also, we don't need a corresponding {{DataOutput.writeFloats}} to support the 
> current usage for vectors, since there we rely on {{VectorValues}} to do the 
> conversion, so I don't plan to implement that.
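
The byte-order point in the description can be checked with plain java.nio. This is a small sketch with hypothetical class and method names; it only shows that ByteBuffer.allocate(int) encodes floats big-endian unless the order is changed explicitly.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only.
final class FloatEncodingOrder {
  /** Encodes one float with the buffer's default order, which is big-endian. */
  static byte[] encodeDefault(float value) {
    return ByteBuffer.allocate(Float.BYTES).putFloat(value).array();
  }

  /** Encodes one float after explicitly switching the buffer to little-endian. */
  static byte[] encodeLittleEndian(float value) {
    return ByteBuffer.allocate(Float.BYTES)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putFloat(value)
        .array();
  }
}
```

For 1.0f (bit pattern 0x3F800000) the default encoding starts with byte 0x3F, while the little-endian encoding ends with it, so a reader and writer that disagree on order will silently decode different values.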



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Resolved] (LUCENE-9442) Update dev-tools/scripts to use the Gradle build

2020-12-31 Thread Erick Erickson (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved LUCENE-9442.

Fix Version/s: master (9.0)
   Resolution: Fixed

There may be things bubbling up out of the cracks when we release 9.0 that 
we'll have to fix...

> Update dev-tools/scripts to use the Gradle build
> 
>
> Key: LUCENE-9442
> URL: https://issues.apache.org/jira/browse/LUCENE-9442
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/build, general/tools
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Blocker
> Fix For: master (9.0)
>
>
> Assigning to myself to track, but whoever actually picks this up should 
> reassign it. I don't _think_ there are any reasons that LUCENE-9433 needs to 
> be pushed before some ambitious person could work on this.




[jira] [Commented] (SOLR-12037) Reduce noise from flakey tests

2020-12-31 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-12037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256995#comment-17256995
 ] 

Erick Erickson commented on SOLR-12037:
---

I won't be working on JIRAs for the foreseeable future

> Reduce noise from flakey tests
> --
>
> Key: SOLR-12037
> URL: https://issues.apache.org/jira/browse/SOLR-12037
> Project: Solr
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 7.2
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> Recreating SOLR-12016. Please do NOT delete this without discussion. NOTE: 
> Uwe's build system modifications originally on 12016 have been incorporated 
> into SOLR-12028.
> Current situation concerns:
> > There is so much noise from flakey tests (particularly Solr tests) that 
> > they are difficult to use.
> > The number of tests that regularly fail is increasing
> > Failures are being ignored
> > The number of failing tests makes releasing more difficult.
> > The number of failing tests makes it harder to determine whether recent 
> > changes actually caused problems. Running the tests again until they 
> > succeed is used commonly at present, which is not robust.
> > e-mail notifications of failing tests are largely being ignored.
> Proposal:
> > Mark all currently "flakey" tests as BadApple or AwaitsFix
> > Run Jenkins jobs with BadApple (and/or AwaitsFix) enabled and disabled. 
> > Frequency TBD, depends partly on whether we can label emails from these 
> > runs for easy filtering of the two flavors.
> >> Label these runs with something suitable in the subject line (wish list)
> > Weekly reports on the tests labeled BadApple or AwaitsFix
> >> Perhaps this could be incorporated in the reports linked below (wish list)
> > Committers should enable BadApple (or AwaitsFix) regularly as a sanity 
> > check. Leave these as defaults.
> > We start getting much more aggressive about not allowing new flakey tests.
> NOTE: It's perfectly acceptable to have failing flakey tests as long as 
> someone is actively working on fixing them.
> Concerns with solution
> > Decreases test coverage
> > Decreases visibility of flakey tests, making fixing them less likely.
> > Some tools (see below) that report on bad tests will not see tests that are 
> > annotated with BadApple or AwaitsFix.
> > Running unit tests and reporting errors are being conflated
> To be decided:
> > Can we label e-mails with failing tests with something in the subject line 
> > identifying whether they were run with BadApple/AwaitsFix enabled or 
> > disabled? Can someone volunteer?
> > Is there any difference between BadApple and AwaitsFix? If not should we 
> > deprecate one? I propose we just use AwaitsFix and deprecate BadApple.
> > Can the automated reports (see below) be enhanced to also report tests 
> > labeled BadApple or AwaitsFix?
> Useful tools:
> > Steve Rowe's work on a Jenkins job to reproduce test failures (LUCENE-8106) 
> > Hoss has worked on aggregating all test failures from the 3 Jenkins systems 
> > (ASF, Policeman, and Steve's), downloading the test results & logs, and 
> > running some reports/stats on failures.
> >> http://fucit.org/solr-jenkins-reports/
> >> https://github.com/hossman/jenkins-reports/
> >> http://fucit.org/solr-jenkins-reports/failure-report.html
> I've assigned this JIRA to myself, but all volunteers welcome, especially 
> anything that changes the build system.
> I've decided to make this a SOLR jira on the theory that most of the 
> offending tests are in the Solr hive, any sub-tasks for touching the build 
> system can go under LUCENE if wanted.
> Also, I expect to add the annotation to some more tests for a few days as 
> infrequent failures occur. Once we have stability (defined by there being 
> little noise) that'll stop.
> 3 BadApple and 23 AwaitsFix annotations are currently in the code, linked to 
> these issues:
> HADOOP-9893
> LUCENE-3869
> LUCENE-5575
> LUCENE-5595
> LUCENE-5737
> LUCENE-6709
> LUCENE-7161
> SOLR-2715
> SOLR-6213
> SOLR-6443
> SOLR-6944
> SOLR-10071
> SOLR-10136
> SOLR-10734
> SOLR-11974
> Solr JIRAS about bad tests
> SOLR-2175
> SOLR-4147
> SOLR-5880
> SOLR-6423
> SOLR-6944
> SOLR-6961
> SOLR-6974
> SOLR-8122
> SOLR-8182
> SOLR-9869
> SOLR-10053
> SOLR-10070
> SOLR-10071
> SOLR-10139
> SOLR-10287
> SOLR-11911




[jira] [Assigned] (SOLR-12037) Reduce noise from flakey tests

2020-12-31 Thread Erick Erickson (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-12037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-12037:
-

Assignee: (was: Erick Erickson)

> Reduce noise from flakey tests
> --
>
> Key: SOLR-12037
> URL: https://issues.apache.org/jira/browse/SOLR-12037
> Project: Solr
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 7.2
>Reporter: Erick Erickson
>Priority: Major
>
> Recreating SOLR-12016. Please do NOT delete this without discussion. NOTE: 
> Uwe's build system modifications originally on 12016 have been incorporated 
> into SOLR-12028.
> Current situation concerns:
> > There is so much noise from flakey tests (particularly Solr tests) that 
> > they are difficult to use.
> > The number of tests that regularly fail is increasing
> > Failures are being ignored
> > The number of failing tests makes releasing more difficult.
> > The number of failing tests makes it harder to determine whether recent 
> > changes actually caused problems. Running the tests again until they 
> > succeed is used commonly at present, which is not robust.
> > e-mail notifications of failing tests are largely being ignored.
> Proposal:
> > Mark all currently "flakey" tests as BadApple or AwaitsFix
> > Run Jenkins jobs with BadApple (and/or AwaitsFix) enabled and disabled. 
> > Frequency TBD, depends partly on whether we can label emails from these 
> > runs for easy filtering of the two flavors.
> >> Label these runs with something suitable in the subject line (wish list)
> > Weekly reports on the tests labeled BadApple or AwaitsFix
> >> Perhaps this could be incorporated in the reports linked below (wish list)
> > Committers should enable BadApple (or AwaitsFix) regularly as a sanity 
> > check. Leave these as defaults.
> > We start getting much more aggressive about not allowing new flakey tests.
> NOTE: It's perfectly acceptable to have failing flakey tests as long as 
> someone is actively working on fixing them.
> Concerns with solution
> > Decreases test coverage
> > Decreases visibility of flakey tests, making fixing them less likely.
> > Some tools (see below) that report on bad tests will not see tests that are 
> > annotated with BadApple or AwaitsFix.
> > Running unit tests and reporting errors are being conflated
> To be decided:
> > Can we label e-mails with failing tests with something in the subject line 
> > identifying whether they were run with BadApple/AwaitsFix enabled or 
> > disabled? Can someone volunteer?
> > Is there any difference between BadApple and AwaitsFix? If not should we 
> > deprecate one? I propose we just use AwaitsFix and deprecate BadApple.
> > Can the automated reports (see below) be enhanced to also report tests 
> > labeled BadApple or AwaitsFix?
> Useful tools:
> > Steve Rowe's work on a Jenkins job to reproduce test failures (LUCENE-8106) 
> > Hoss has worked on aggregating all test failures from the 3 Jenkins systems 
> > (ASF, Policeman, and Steve's), downloading the test results & logs, and 
> > running some reports/stats on failures.
> >> http://fucit.org/solr-jenkins-reports/
> >> https://github.com/hossman/jenkins-reports/
> >> http://fucit.org/solr-jenkins-reports/failure-report.html
> I've assigned this JIRA to myself, but all volunteers welcome, especially 
> anything that changes the build system.
> I've decided to make this a SOLR jira on the theory that most of the 
> offending tests are in the Solr hive, any sub-tasks for touching the build 
> system can go under LUCENE if wanted.
> Also, I expect to add the annotation to some more tests for a few days as 
> infrequent failures occur. Once we have stability (defined by there being 
> little noise) that'll stop.
> 3 BadApple and 23 AwaitsFix annotations are currently in the code, linked to 
> these issues:
> HADOOP-9893
> LUCENE-3869
> LUCENE-5575
> LUCENE-5595
> LUCENE-5737
> LUCENE-6709
> LUCENE-7161
> SOLR-2715
> SOLR-6213
> SOLR-6443
> SOLR-6944
> SOLR-10071
> SOLR-10136
> SOLR-10734
> SOLR-11974
> Solr JIRAS about bad tests
> SOLR-2175
> SOLR-4147
> SOLR-5880
> SOLR-6423
> SOLR-6944
> SOLR-6961
> SOLR-6974
> SOLR-8122
> SOLR-8182
> SOLR-9869
> SOLR-10053
> SOLR-10070
> SOLR-10071
> SOLR-10139
> SOLR-10287
> SOLR-11911




[jira] [Assigned] (SOLR-13709) Race condition on core reload while core is still loading?

2020-12-31 Thread Erick Erickson (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-13709:
-

Assignee: (was: Erick Erickson)

> Race condition on core reload while core is still loading?
> --
>
> Key: SOLR-13709
> URL: https://issues.apache.org/jira/browse/SOLR-13709
> Project: Solr
>  Issue Type: Bug
>Reporter: Chris M. Hostetter
>Priority: Major
> Attachments: apache_Lucene-Solr-Tests-8.x_449.log.txt
>
>
> A recent jenkins failure from {{TestSolrCLIRunExample}} seems to suggest that 
> there may be a race condition when attempting to re-load a SolrCore while the 
> core is currently in the process of (re)loading that can leave the SolrCore 
> in an unusable state.
> Details to follow...




[jira] [Commented] (SOLR-13381) Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a PointField facet

2020-12-31 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256996#comment-17256996
 ] 

Erick Erickson commented on SOLR-13381:
---

I won't be working on JIRAs for the foreseeable future

> Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a 
> PointField facet
> --
>
> Key: SOLR-13381
> URL: https://issues.apache.org/jira/browse/SOLR-13381
> Project: Solr
>  Issue Type: Bug
>  Components: faceting
>Affects Versions: 7.0, 7.6, 7.7, 7.7.1
> Environment: solr, solrcloud
>Reporter: Zhu JiaJun
>Priority: Major
> Attachments: SOLR-13381.patch, SOLR-13381.patch
>
>
> Hey,
> I got an "Unexpected docvalues type SORTED_NUMERIC" exception when I perform 
> a group facet on an IntPointField. Debugging into the source code, the cause is 
> that internally the docvalue type for PointField is "NUMERIC" (single value) 
> or "SORTED_NUMERIC" (multi value), while the TermGroupFacetCollector class 
> requires that the facet field have a "SORTED" or "SORTED_SET" docvalue type: 
> [https://github.com/apache/lucene-solr/blob/2480b74887eff01f729d62a57b415d772f947c91/lucene/grouping/src/java/org/apache/lucene/search/grouping/TermGroupFacetCollector.java#L313]
>  
> When I change the schema so that all int fields use TrieIntField, the group 
> facet works, since internally the docvalue type for TrieField is SORTED 
> (single value) or SORTED_SET (multi value).
> Given that "TrieField" is deprecated in Solr 7, please help with this 
> grouping facet issue for PointField. I also commented on this issue in SOLR-7495.
>  
> In addition, every occurrence of "${solr.tests.IntegerFieldType}" in the unit 
> test files seems to use "TrieIntField"; if it is changed to "IntPointField", 
> some unit tests fail, for example: 
> [https://github.com/apache/lucene-solr/blob/3de0b3671998cc9bc723d10f1b31ce48cbd4fa64/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L417]
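
The mismatch described in the report can be modelled without Lucene on the classpath. This is only a sketch: the enum below is a hypothetical mirror of Lucene's DocValuesType constants, and the check and its message are assumptions based on the report, not the actual TermGroupFacetCollector code.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model of the check described in the report; not Lucene or Solr code.
final class GroupFacetDocValuesCheck {
  /** Stand-in for Lucene's DocValuesType constants. */
  enum DocValuesKind { NONE, NUMERIC, BINARY, SORTED, SORTED_NUMERIC, SORTED_SET }

  /** Types the report says term-based group faceting accepts. */
  static final Set<DocValuesKind> ACCEPTED =
      EnumSet.of(DocValuesKind.SORTED, DocValuesKind.SORTED_SET);

  /** Throws if the facet field's doc values type is not one of the accepted types. */
  static void requireGroupFacetable(DocValuesKind kind) {
    if (!ACCEPTED.contains(kind)) {
      throw new IllegalStateException("Unexpected docvalues type " + kind);
    }
  }
}
```

Under this model a field reporting SORTED_NUMERIC (multi-valued PointField, per the report) is rejected, while SORTED and SORTED_SET pass.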




[jira] [Assigned] (SOLR-14861) CoreContainer shutdown needs to be aware of other ongoing operations and wait until they're complete

2020-12-31 Thread Erick Erickson (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-14861:
-

Assignee: (was: Erick Erickson)

> CoreContainer shutdown needs to be aware of other ongoing operations and wait 
> until they're complete
> 
>
> Key: SOLR-14861
> URL: https://issues.apache.org/jira/browse/SOLR-14861
> Project: Solr
>  Issue Type: Bug
>Reporter: Erick Erickson
>Priority: Major
> Attachments: SOLR-14861.patch
>
>
> Noble and I are trying to get to the bottom of the TestBulkSchemaConcurrent 
> failures and found what looks like a glaring gap in how 
> CoreContainer.shutdown operates. I don't know the impact on production since 
> we're shutting down anyway, but I think this is responsible for the errors in 
> TestBulkSchemaConcurrent and likely behind others, especially any other test 
> that fails intermittently that involves core reloads, including and 
> especially any tests that exercise managed schema.
> We have clear evidence of this sequence:
> 1> some CoreContainer.reloads come in and get _partway_ through, in 
> particular past the test at the top where CoreContainer.reload() throws an 
> AlreadyClosed exception if (isShutdown).
> 2> Some CoreContainer.shutdown() threads get some processing time before the 
> reloads in <1> are finished.
> 3> the threads in <1> pick back up and go wonky. I suspect that there are a 
> number of different things that could be going wrong here depending on how 
> far through CoreContainer.shutdown() gets that pop out in different ways.
> Since it's my shift (Noble has to sleep sometime), I put some crude locking 
> in just to test the idea; incrementing an AtomicInteger on entry to 
> CoreContainer.reload then decrementing it at the end, and spinning in 
> CoreContainer.shutdown() until the AtomicInteger was back to zero. With that 
> in place, 100 runs and no errors whereas before I could never get even 10 
> runs to finish without an error. This is not a proper fix at all, and the way 
> it's currently running there are still possible race conditions, just much 
> smaller windows. And I suspect it risks spinning forever. But it's enough to 
> make me believe I finally understand what's happening.
> I also suspect that reload is more sensitive than most operations on a core 
> due to the fact that it runs for a long time, but I assume other operations 
> have the same potential. Shouldn't CoreContainer.shutdown() wait until no 
> other operations are in flight?
> On a quick scan of CoreContainer, there are actually few places where we even 
> check for isShutdown. I suspect the places where we do were added ad hoc, 
> found by trial and error when tests failed. We need a design rather than 
> hit-or-miss hacking.
> I think that isShutdown should be replaced with something more robust. What 
> that is IDK quite yet because I've been hammering at this long enough and I 
> need a break.
> This is consistent with another observation about this particular test. If 
> there's a sleep at the end, it doesn't fail; all the reloads get a chance to 
> finish before anything is shut down.
> An open question is how much this matters to production systems. In the testing 
> case, bunches of these reloads are issued then we immediately end the test 
> and start shutting things down. It needs to be fixed if we're going to cut 
> down on test failures though. Besides, it's just wrong ;)
> Assigning to myself to track. I'd be perfectly happy, now that Noble and I 
> have done the hard work, for someone to swoop in and take the credit for 
> fixing it ;)
> gradlew beast -Ptests.dups=10 --tests TestBulkSchemaConcurrent
> always fails for me on current code without my hack...
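
The crude in-flight counter described above can be sketched roughly as follows. This is a hypothetical stand-in class, not CoreContainer and not the attached patch. One deliberate variation, which is my assumption rather than what the description says: the operation registers itself before checking the shutdown flag, so shutdown cannot miss an operation that got past the check. Like the hack in the description, it polls with no timeout and can wait forever if an operation never finishes.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the container; not Solr's actual code.
final class InFlightGuard {
  private final AtomicInteger inFlight = new AtomicInteger();
  private final AtomicBoolean shutDown = new AtomicBoolean();

  /** Runs an operation (e.g. a reload) unless shutdown has already begun. */
  boolean runOperation(Runnable op) {
    inFlight.incrementAndGet();   // register first, then check the flag
    try {
      if (shutDown.get()) {
        return false;             // rejected: shutdown already started
      }
      op.run();
      return true;
    } finally {
      inFlight.decrementAndGet(); // always unregister, even if op throws
    }
  }

  /** Marks the guard as shut down, then waits for in-flight operations to drain. */
  void shutdown() {
    shutDown.set(true);
    while (inFlight.get() > 0) {
      Thread.yield();             // crude polling, no timeout
    }
  }
}
```

A real fix would more likely use a proper lifecycle state or a read/write lock with timeouts; the sketch only shows why waiting for the counter to reach zero closes most of the window.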




[jira] [Commented] (SOLR-14469) Removed deprecated code in solr/core (master only)

2020-12-31 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256998#comment-17256998
 ] 

Erick Erickson commented on SOLR-14469:
---

I won't be working on JIRAs for the foreseeable future

> Removed deprecated code in solr/core (master only)
> --
>
> Key: SOLR-14469
> URL: https://issues.apache.org/jira/browse/SOLR-14469
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Erick Erickson
>Priority: Major
>
> I'm currently working on getting all the warnings out of the code, so this is 
> something of a placeholder for a week or two.
> There will be sub-tasks, please create them when you start working on a 
> project.




[jira] [Commented] (SOLR-14861) CoreContainer shutdown needs to be aware of other ongoing operations and wait until they're complete

2020-12-31 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256999#comment-17256999
 ] 

Erick Erickson commented on SOLR-14861:
---

I won't be working on JIRAs for the foreseeable future

> CoreContainer shutdown needs to be aware of other ongoing operations and wait 
> until they're complete
> 
>
> Key: SOLR-14861
> URL: https://issues.apache.org/jira/browse/SOLR-14861
> Project: Solr
>  Issue Type: Bug
>Reporter: Erick Erickson
>Priority: Major
> Attachments: SOLR-14861.patch
>
>
> Noble and I are trying to get to the bottom of the TestBulkSchemaConcurrent 
> failures and found what looks like a glaring gap in how 
> CoreContainer.shutdown operates. I don't know the impact on production since 
> we're shutting down anyway, but I think this is responsible for the errors in 
> TestBulkSchemaConcurrent and likely behind others, especially any other test 
> that fails intermittently that involves core reloads, including and 
> especially any tests that exercise managed schema.
> We have clear evidence of this sequence:
> 1> some CoreContainer.reloads come in and get _partway_ through, in 
> particular past the test at the top where CoreContainer.reload() throws an 
> AlreadyClosed exception if (isShutdown).
> 2> Some CoreContainer.shutdown() threads get some processing time before the 
> reloads in <1> are finished.
> 3> the threads in <1> pick back up and go wonky. I suspect that there are a 
> number of different things that could be going wrong here depending on how 
> far through CoreContainer.shutdown() gets that pop out in different ways.
> Since it's my shift (Noble has to sleep sometime), I put some crude locking 
> in just to test the idea; incrementing an AtomicInteger on entry to 
> CoreContainer.reload then decrementing it at the end, and spinning in 
> CoreContainer.shutdown() until the AtomicInteger was back to zero. With that 
> in place, 100 runs and no errors whereas before I could never get even 10 
> runs to finish without an error. This is not a proper fix at all, and the way 
> it's currently running there are still possible race conditions, just much 
> smaller windows. And I suspect it risks spinning forever. But it's enough to 
> make me believe I finally understand what's happening.
> I also suspect that reload is more sensitive than most operations on a core 
> due to the fact that it runs for a long time, but I assume other operations 
> have the same potential. Shouldn't CoreContainer.shutdown() wait until no 
> other operations are in flight?
> On a quick scan of CoreContainer, there are actually few places where we even 
> check for isShutdown. I suspect the places where we do were added ad hoc, 
> found by trial and error when tests failed. We need a design rather than 
> hit-or-miss hacking.
> I think that isShutdown should be replaced with something more robust. What 
> that is IDK quite yet because I've been hammering at this long enough and I 
> need a break.
> This is consistent with another observation about this particular test. If 
> there's a sleep at the end, it doesn't fail; all the reloads get a chance to 
> finish before anything is shut down.
> An open question is how much this matters to production systems. In the testing 
> case, bunches of these reloads are issued then we immediately end the test 
> and start shutting things down. It needs to be fixed if we're going to cut 
> down on test failures though. Besides, it's just wrong ;)
> Assigning to myself to track. I'd be perfectly happy, now that Noble and I 
> have done the hard work, for someone to swoop in and take the credit for 
> fixing it ;)
> gradlew beast -Ptests.dups=10 --tests TestBulkSchemaConcurrent
> always fails for me on current code without my hack...




[jira] [Assigned] (SOLR-13381) Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a PointField facet

2020-12-31 Thread Erick Erickson (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-13381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-13381:
-

Assignee: (was: Erick Erickson)

> Unexpected docvalues type SORTED_NUMERIC Exception when grouping by a 
> PointField facet
> --
>
> Key: SOLR-13381
> URL: https://issues.apache.org/jira/browse/SOLR-13381
> Project: Solr
>  Issue Type: Bug
>  Components: faceting
>Affects Versions: 7.0, 7.6, 7.7, 7.7.1
> Environment: solr, solrcloud
>Reporter: Zhu JiaJun
>Priority: Major
> Attachments: SOLR-13381.patch, SOLR-13381.patch
>
>
> Hey,
> I got an "Unexpected docvalues type SORTED_NUMERIC" exception when I perform 
> a group facet on an IntPointField. Debugging into the source code, the cause is 
> that internally the docvalue type for PointField is "NUMERIC" (single value) 
> or "SORTED_NUMERIC" (multi value), while the TermGroupFacetCollector class 
> requires that the facet field have a "SORTED" or "SORTED_SET" docvalue type: 
> [https://github.com/apache/lucene-solr/blob/2480b74887eff01f729d62a57b415d772f947c91/lucene/grouping/src/java/org/apache/lucene/search/grouping/TermGroupFacetCollector.java#L313]
>  
> When I change the schema so that all int fields use TrieIntField, the group 
> facet works, since internally the docvalue type for TrieField is SORTED 
> (single value) or SORTED_SET (multi value).
> Given that "TrieField" is deprecated in Solr 7, please help with this 
> grouping facet issue for PointField. I also commented on this issue in SOLR-7495.
>  
> In addition, every occurrence of "${solr.tests.IntegerFieldType}" in the unit 
> test files seems to use "TrieIntField"; if it is changed to "IntPointField", 
> some unit tests fail, for example: 
> [https://github.com/apache/lucene-solr/blob/3de0b3671998cc9bc723d10f1b31ce48cbd4fa64/solr/core/src/test/org/apache/solr/request/SimpleFacetsTest.java#L417]




[jira] [Commented] (SOLR-13709) Race condition on core reload while core is still loading?

2020-12-31 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256997#comment-17256997
 ] 

Erick Erickson commented on SOLR-13709:
---

I won't be working on JIRAs for the foreseeable future

> Race condition on core reload while core is still loading?
> --
>
> Key: SOLR-13709
> URL: https://issues.apache.org/jira/browse/SOLR-13709
> Project: Solr
>  Issue Type: Bug
>Reporter: Chris M. Hostetter
>Priority: Major
> Attachments: apache_Lucene-Solr-Tests-8.x_449.log.txt
>
>
> A recent jenkins failure from {{TestSolrCLIRunExample}} seems to suggest that 
> there may be a race condition when attempting to re-load a SolrCore while the 
> core is currently in the process of (re)loading that can leave the SolrCore 
> in an unusable state.
> Details to follow...




[jira] [Assigned] (SOLR-14469) Removed deprecated code in solr/core (master only)

2020-12-31 Thread Erick Erickson (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-14469:
-

Assignee: (was: Erick Erickson)

> Removed deprecated code in solr/core (master only)
> --
>
> Key: SOLR-14469
> URL: https://issues.apache.org/jira/browse/SOLR-14469
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Erick Erickson
>Priority: Major
>
> I'm currently working on getting all the warnings out of the code, so this is 
> something of a placeholder for a week or two.
> There will be sub-tasks, please create them when you start working on a 
> project.




[jira] [Commented] (SOLR-14920) Format code automatically and enforce it in Solr

2020-12-31 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257001#comment-17257001
 ] 

Erick Erickson commented on SOLR-14920:
---

I may help out when we get to Solr, but someone else will have to shepherd this.

> Format code automatically and enforce it in Solr
> 
>
> Key: SOLR-14920
> URL: https://issues.apache.org/jira/browse/SOLR-14920
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Priority: Major
>  Labels: codestyle, formatting
>
> See the discussion at: LUCENE-9564.
> This is a placeholder for the present, I'm reluctant to do this to the Solr 
> code base until after:
>  * we have some Solr-specific consensus
>  * we have some clue what this means for the reference impl.
> Reconciling the reference impl will be difficult enough without a zillion 
> format changes to add to the confusion.
> So my proposal is
> 1> do this.
> 2> Postpone this until after the reference impl is merged.
> 3> do this in one single commit for reasons like being able to conveniently 
> have this separated out from git blame.
> Assigning to myself so it doesn't get lost, but anyone who wants to take it 
> over please feel free.




[jira] [Assigned] (SOLR-14920) Format code automatically and enforce it in Solr

2020-12-31 Thread Erick Erickson (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-14920:
-

Assignee: (was: Erick Erickson)


[jira] [Commented] (LUCENE-9652) DataInput.readFloats to be used by Lucene90VectorReader

2020-12-31 Thread Michael Sokolov (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257022#comment-17257022
 ] 

Michael Sokolov commented on LUCENE-9652:
-

bq. Ultimately, it would be nice if we could avoid copying the data entirely

Ooh it will be nice to take advantage of the Vector API. I hadn't realized it 
was part of JDK 16; something to look forward to.

I noticed that we did not implement {{ByteBuffersDataInput.readLELongs}}, so 
instead it falls back on the one-at-a-time implementation in {{DataInput}}. I'm 
curious what's the thinking there? I'm not sure how much use this 
{{ByteBuffersDirectory}} gets, but it would probably be nice to offer the bulk 
implementation there too, so I was thinking to add an implementation of 
{{readFloats}} to that, although it's not critical for standard use cases.
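The difference between the one-at-a-time fallback and a bulk read can be sketched roughly as follows. This is a minimal illustration over a plain `java.nio.ByteBuffer`, not the Lucene implementation: the class `FloatReadSketch` and both method names are hypothetical, and the real `readLELongs` additionally handles values that span buffers, which this sketch ignores.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch, not Lucene code: two ways to read little-endian floats.
final class FloatReadSketch {

  // Fallback style: assemble every float from four individual byte reads.
  static void readFloatsOneByOne(ByteBuffer in, float[] dst, int offset, int len) {
    for (int i = 0; i < len; i++) {
      int bits = (in.get() & 0xFF)
          | (in.get() & 0xFF) << 8
          | (in.get() & 0xFF) << 16
          | (in.get() & 0xFF) << 24;
      dst[offset + i] = Float.intBitsToFloat(bits);
    }
  }

  // Bulk style: view the remaining bytes as little-endian floats and copy them in one call.
  static void readFloatsBulk(ByteBuffer in, float[] dst, int offset, int len) {
    in.duplicate().order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer().get(dst, offset, len);
    // The float view has its own position, so advance the source buffer manually.
    in.position(in.position() + len * Float.BYTES);
  }
}
```

Both methods leave the source buffer positioned after the floats they consumed, so either could back the same caller.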

> DataInput.readFloats to be used by Lucene90VectorReader
> ---
>
> Key: LUCENE-9652
> URL: https://issues.apache.org/jira/browse/LUCENE-9652
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Major
>
> Benchmarking shows a substantial performance gain can be realized by avoiding 
> the additional memory copy we must do today when converting from {{byte[]}} 
> read using {{IndexInput}} into {{float[]}} returned by 
> {{Lucene90VectorReader}}. We have a model for how to handle the various 
> alignments, and buffer underflow when a value spans buffers, in 
> {{readLELongs}}.
> I think we should only support little-endian floats from the beginning here. 
> We're planning to move towards switching the whole IndexInput to that 
> endianness, right?
> Lucene90VectorWriter relies on {{VectorValues.binaryValue()}} to return bytes 
> in the format expected by the reader, and its javadocs don't currently 
> specify their endianness. In fact the order has been the default supplied by 
> {{ByteBuffer.allocate(int)}}, which I now realize is big-endian, so this 
> issue also proposes to change the index format. That would mean a 
> backwards-incompatible index change, but I think if we're still unreleased 
> and in an experimental class that should be OK?
> Also, we don't need a corresponding {{DataOutput.writeFloats}} to support the 
> current usage for vectors, since there we rely on {{VectorValues}} to do the 
> conversion, so I don't plan to implement that.
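To illustrate the endianness point in the description above: `ByteBuffer.allocate(int)` returns a big-endian buffer unless an order is set, so bytes written that way will not round-trip through a little-endian reader. The sketch below uses only `java.nio`; the class `EndiannessSketch` and its methods are hypothetical stand-ins, not the actual `VectorValues.binaryValue()` or `Lucene90VectorReader` code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch, not Lucene code: writer and reader must agree on byte order.
final class EndiannessSketch {

  // Encode a vector in the given byte order (stand-in for what a writer would produce).
  static byte[] encode(float[] vector, ByteOrder order) {
    ByteBuffer buf = ByteBuffer.allocate(vector.length * Float.BYTES).order(order);
    buf.asFloatBuffer().put(vector);
    return buf.array();
  }

  // Decode assuming little-endian floats, as proposed for the reader.
  static float[] decodeLittleEndian(byte[] bytes) {
    float[] out = new float[bytes.length / Float.BYTES];
    ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer().get(out);
    return out;
  }
}
```

Under these assumptions, a vector encoded with `ByteOrder.LITTLE_ENDIAN` decodes back unchanged, while one encoded with the default order does not, which is why the issue treats this as an index format change.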




[GitHub] [lucene-solr] cpoerschke merged pull request #2152: SOLR-14034: remove deprecated min_rf references

2020-12-31 Thread GitBox



cpoerschke merged pull request #2152:
URL: https://github.com/apache/lucene-solr/pull/2152


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (SOLR-14034) remove deprecated min_rf references

2020-12-31 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257038#comment-17257038
 ] 

ASF subversion and git services commented on SOLR-14034:


Commit 17adcc7aa499dd23500772717b075835182480b4 in lucene-solr's branch 
refs/heads/master from Tim Dillon
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=17adcc7 ]

SOLR-14034: remove deprecated min_rf references (#2152)



> remove deprecated min_rf references
> ---
>
> Key: SOLR-14034
> URL: https://issues.apache.org/jira/browse/SOLR-14034
> Project: Solr
>  Issue Type: Task
>Reporter: Christine Poerschke
>Priority: Blocker
>  Labels: newdev
> Fix For: master (9.0)
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> * {{min_rf}} support was added under SOLR-5468 in version 4.9 
> (https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.9.0/solr/solrj/src/java/org/apache/solr/client/solrj/request/UpdateRequest.java#L50)
>  and deprecated under SOLR-12767 in version 7.6 
> (https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.6.0/solr/solrj/src/java/org/apache/solr/client/solrj/request/UpdateRequest.java#L57-L61)
> * http://lucene.apache.org/solr/7_6_0/changes/Changes.html and 
> https://lucene.apache.org/solr/guide/8_0/major-changes-in-solr-8.html#solr-7-6
>  both clearly mention the deprecation
> This ticket is to fully remove {{min_rf}} references in code, tests and 
> documentation.




[jira] [Resolved] (SOLR-14034) remove deprecated min_rf references

2020-12-31 Thread Christine Poerschke (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke resolved SOLR-14034.

Resolution: Fixed

https://github.com/apache/lucene-solr/pull/2152 merged as per above, thanks!


[GitHub] [lucene-solr] msokolov opened a new pull request #2175: LUCENE-9652: DataInput.readFloats for use by Lucene90VectorReader

2020-12-31 Thread GitBox



msokolov opened a new pull request #2175:
URL: https://github.com/apache/lucene-solr/pull/2175


   This adds `DataInput.readFloats` and makes use of it in 
`Lucene90VectorReader`. The implementation and tests are essentially cloned 
from `readLELongs`. With these changes I observed definite speedups, although 
not as much as I expected from microbenchmarking. There's a fair amount of 
variability, but I see at least 20%, sometimes as much as 40% reduction in time 
for HnswGraph.search, as measured by KnnGraphTester.




[jira] [Commented] (SOLR-14788) Solr: The Next Big Thing

2020-12-31 Thread Mark Robert Miller (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257065#comment-17257065
 ] 

Mark Robert Miller commented on SOLR-14788:
---

Quick revisit to core tests run serially: 
https://www.dropbox.com/s/cmt00whigi2xq0x/serial-core-tests-revistited.mp4?dl=0

> Solr: The Next Big Thing
> 
>
> Key: SOLR-14788
> URL: https://issues.apache.org/jira/browse/SOLR-14788
> Project: Solr
>  Issue Type: Task
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Critical
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h3. 
> [!https://www.unicode.org/consortium/aacimg/1F46E.png!|https://www.unicode.org/consortium/adopted-characters.html#b1F46E]{color:#00875a}*The
>  Policeman is {color:#de350b}NOW{color} {color:#de350b}OFF{color} 
> duty!*{color}
> {quote}_{color:#de350b}*When The Policeman is on duty, sit back, relax, and 
> have some fun. Try to make some progress. Don't stress too much about the 
> impact of your changes or maintaining stability and performance and 
> correctness so much. Until the end of phase 1, I've got your back. I have a 
> variety of tools and contraptions I have been building over the years and I 
> will continue training them on this branch. I will review your changes and 
> peer out across the land and course correct where needed. As Mike D will be 
> thinking, "Sounds like a bottleneck Mark." And indeed it will be to some 
> extent. Which is why once stage one is completed, I will flip The Policeman 
> to off duty. When off duty, I'm always* *occasionally*{color} *down for some 
> vigilante justice, but I won't be walking the beat, all that stuff about sit 
> back and relax goes out the window.*_
> {quote}
>  
> I have stolen this title from Ishan or Noble and Ishan.
> This issue is meant to capture the work of a small team that is forming to 
> push Solr and SolrCloud to the next phase.
> I have kicked off the work with an effort to create a very fast and solid 
> base. That work is not 100% done, but it's ready to join the fight.
> Tim Potter has started giving me a tremendous hand in finishing up. Ishan and 
> Noble have already contributed support and testing and have plans for 
> additional work to shore up some of our current shortcomings.
> Others have expressed an interest in helping and hopefully they will pop up 
> here as well.
> Let's organize and discuss our efforts here and in various sub issues.




[jira] [Commented] (SOLR-14788) Solr: The Next Big Thing

2020-12-31 Thread Mark Robert Miller (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257069#comment-17257069
 ] 

Mark Robert Miller commented on SOLR-14788:
---

Sample of more production-like testing: 
https://www.dropbox.com/s/cfqmp5l9h4wnffb/some-production-type-testing.mp4?dl=0


[jira] [Commented] (LUCENE-5940) change index backwards compatibility policy.

2020-12-31 Thread Michael Sokolov (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257071#comment-17257071
 ] 

Michael Sokolov commented on LUCENE-5940:
-

I'll echo Erick's question, and close soon if there isn't any further comment. 
As I understand it, we did this. The current (de facto, at least) policy is to 
support the last major release with backcompat, no? I found this documented 
here: 
[https://cwiki.apache.org/confluence/display/LUCENE/BackwardsCompatibility] 
maybe it's elsewhere too?

> change index backwards compatibility policy.
> 
>
> Key: LUCENE-5940
> URL: https://issues.apache.org/jira/browse/LUCENE-5940
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
>Priority: Major
>
> Currently, our index backwards compatibility is unmanageable. The length of 
> time in which we must support old indexes is simply too long.
> The index back compat works like this: everyone wants it, but there are 
> frequently bugs, and when push comes to shove, it's not a very sexy thing to 
> work on/fix, so it's hard to get any help.
> Currently our back compat "promise" is just a broken promise, because we 
> cannot actually guarantee it for these reasons.
> I propose we scale back the length of time for which we must support old 
> indexes.




[jira] [Resolved] (LUCENE-5862) Old segments not deleted on merge

2020-12-31 Thread Michael Sokolov (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Sokolov resolved LUCENE-5862.
-
Resolution: Fixed

> Old segments not deleted on merge
> -
>
> Key: LUCENE-5862
> URL: https://issues.apache.org/jira/browse/LUCENE-5862
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 4.9
> Environment: Linux bigindy5 3.14.1-1.el6.elrepo.x86_64 #1 SMP Mon Apr 
> 14 19:29:19 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
> java version "1.7.0_55"
> Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)
> [root@bigindy5 s5_0]# cat /etc/redhat-release
> CentOS release 6.5 (Final)
>Reporter: Shawn Heisey
>Priority: Major
> Fix For: 6.0, 4.10
>
> Attachments: LUCENE-5862-infostreams.zip
>
>
> After a full rebuild with the dataimport handler on a Solr install upgraded 
> to Solr 4.9.0, I ended up with an index that was considerably larger than the 
> one it replaced (built by 4.7.2), 28GB instead of 20GB.  I also upgraded a 
> third-party component at the same time, to a version which has been tested 
> with Solr 4.9.0.  The config didn't change at all.  Optimizing the index did 
> not shrink it.
> At first I thought there must have been something different about the way the 
> new version worked, or possibly a change/bug in the third-party component.
> After looking deeper, I discovered that the optimization process had created 
> one segment that was 20GB in size, but there were also a number of other 
> segments on the disk, all of which were several hours older than the large 
> segment.  Another optimize created a new segment of 20GB, and the previous 
> segment of 20GB was deleted, but the older segments remained.




[jira] [Commented] (LUCENE-9563) Add .editorConfig

2020-12-31 Thread Michael Sokolov (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257075#comment-17257075
 ] 

Michael Sokolov commented on LUCENE-9563:
-

+1, I had already added {{.dir-locals.el}} which is the same idea for emacs. It 
has fairly limited style enforcement (at least how I use it), but this helped 
me stay sane sticking with 2-character indentation.

> Add .editorConfig
> -
>
> Key: LUCENE-9563
> URL: https://issues.apache.org/jira/browse/LUCENE-9563
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I propose adding a ".editorConfig" to the root of the project.  Many text 
> editors and IDEs support this file to declare code style settings such as 
> indentation and more.  In particular, IntelliJ supports this natively and 
> Eclipse has a plugin for it.
> https://editorconfig.org
> I furthermore propose I simply generate this as an export of my current 
> IntelliJ code style, which is a code style I've been using and was originally 
> imported from the Lucene's former IntelliJ config.




[jira] [Commented] (SOLR-14282) /get handler doesn't return copied fields

2020-12-31 Thread David Smiley (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257098#comment-17257098
 ] 

David Smiley commented on SOLR-14282:
-

I'm doing a bunch of work in RTG right now in order to fix a complicated bug, 
and I did a little JIRA searching and uncovered this issue.  Indeed, /get 
doesn't return copy-field targets by design, because it primarily exists for 
internal purposes that require an "input" document (thus no application of copy 
fields).  But the primary/original part of it is public (not the extras like 
getInputDocument), and I think it's confusing for it to behave differently from 
what a searcher would return.  I'd like to see the copy-field target filtering 
be an explicit option that you have to ask for.  Solr internal uses would do 
this, or it may be implied for "getInputDocument" and other internal 
functionality switches.

bq. Also - stored field values can come from not just the input document or a 
copyField, but also update processors.  IMO, for /get to return stored fields 
like /select would return is asking too much...

Yes this can happen but (A) that doesn't happen by default so the distinction 
is academic to many users, and (B) even then, the UpdateLog stores the final 
SolrInputDocument and thus it _will_ return whatever those URPs produce (thus 
looks like what /select returns).

> /get handler doesn't return copied fields
> -
>
> Key: SOLR-14282
> URL: https://issues.apache.org/jira/browse/SOLR-14282
> Project: Solr
>  Issue Type: Bug
>  Components: search, SolrJ
>Affects Versions: 8.4
> Environment: SOLR 8.4.0, SOLRJ, Oracle Java 8 
>Reporter: Andrei Minin
>Priority: Major
> Attachments: SOLR-14282-test-update.patch, copied_fields_test.zip, 
> managed-schema.xml
>
>
> We are using the /get handler to retrieve documents by id in our Java 
> application (SolrJ).
> I found that copied fields are missing in documents returned by the /get 
> handler, but the same documents returned by a query contain the copied (by 
> schema) fields.
> Attached documents:
>  # Integration test project archive
>  # Managed schema file for SOLR
> SOLR schema details:
>  # Unique field name "d_ida_s"
>  # Lowercase text type definition:
> {code:java}
>   positionIncrementGap="100">
>   
> 
> 
>   
> {code}
>  # Copy field instruction sample:
> {code:java}
>  stored="true" multiValued="false"/>
>  /> 
> {code}
> ConcurrenceUserNamea_s is string type field and ConcurrenceUserNameu_lca_s is 
> lower case text type field.
> Integration test uploads document to SOLR server and makes 2 requests: one 
> using /get rest point to fetch document by id and one using query  field name>:.
> The document returned by the /get REST endpoint doesn't have the copied 
> fields, while the document returned by the query contains them.




[jira] [Commented] (SOLR-13034) RealTimeGetComponent#toSolrDoc should be able to resolve LazyFields

2020-12-31 Thread David Smiley (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257100#comment-17257100
 ] 

David Smiley commented on SOLR-13034:
-

Is there a test showing this issue?  Or can you show how to provoke it in an 
existing test with a modification, or a new test?

> RealTimeGetComponent#toSolrDoc should be able to resolve LazyFields
> ---
>
> Key: SOLR-13034
> URL: https://issues.apache.org/jira/browse/SOLR-13034
> Project: Solr
>  Issue Type: Bug
>Reporter: mosh
>Assignee: mosh
>Priority: Major
>  Labels: RealTimeGet
>
> As I was working on SOLR-12638, I noticed RealTimeGetComponent#toSolrDoc 
> does not resolve lazy fields. 
>  This behavior is caused by the use of transformers which use 
> SolrDocumentFetcher, which caused exceptions to be thrown when said input 
> documents were written to the transaction log (TransactionLog:100).
>  IMO, these fields ought to be resolved by the 
> RealTimeGetComponent#toSolrDoc method, which takes a Document as an 
> input (which may contain LazyFields) and returns a SolrInputDocument 
> representation of said SolrDocument.




[jira] [Created] (SOLR-15063) Consolidate SolrDocument, SolrDocumentBase, SolrInputDocument

2020-12-31 Thread David Smiley (Jira)

David Smiley created SOLR-15063:
---

 Summary: Consolidate SolrDocument, SolrDocumentBase, 
SolrInputDocument
 Key: SOLR-15063
 URL: https://issues.apache.org/jira/browse/SOLR-15063
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: David Smiley


We've got a SolrDocumentBase abstraction implemented by SolrDocument (from 
search results), and SolrInputDocument (input docs for indexing).  The 
distinction was originally because the input side uniquely had ways of boosting 
particular field values into the index, but eventually Lucene dropped support 
for that.  Is there any other purpose?

I propose that we consolidate to one abstraction called SolrDocument.  It can 
also drop the child document methods, which were added before nested documents 
came to be stored in the field values rather than set aside.



