[GitHub] [lucene] wjp719 commented on pull request #786: LUCENE-10499: reduce unnecessary copy data overhead when growing array size
wjp719 commented on PR #786:
URL: https://github.com/apache/lucene/pull/786#issuecomment-1101194508

> Thanks, the change looks correct to me. I'm not a fan of the new method's name, but I don't have a better suggestion. I'll merge this change in a few days unless someone objects.

@jpountz Hi, there have been no further reviews for a few days; could you help merge this PR? Thanks.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
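[Editor's note] For context, the optimization discussed in this PR is about skipping the array copy when the caller is going to overwrite the whole buffer anyway. The sketch below is a minimal illustration of that idea, not the actual Lucene ArrayUtil code; the method and helper names (growNoCopy, oversize) are assumptions.

{code:java}
// Minimal sketch of the idea behind the PR (names are assumptions, not necessarily
// the exact Lucene API): when existing contents will be discarded, growing by
// allocating a fresh array avoids the Arrays.copyOf of stale bytes that a plain
// grow() performs.
import java.util.Arrays;

final class GrowSketch {

  /** Grow and preserve existing contents (the classic behaviour). */
  static byte[] grow(byte[] array, int minSize) {
    if (array.length >= minSize) {
      return array;
    }
    return Arrays.copyOf(array, oversize(minSize));
  }

  /** Grow without copying: old contents are discarded, so no copy overhead. */
  static byte[] growNoCopy(byte[] array, int minSize) {
    if (array.length >= minSize) {
      return array;
    }
    return new byte[oversize(minSize)];
  }

  /** Toy over-allocation policy (roughly +12.5%), standing in for a real oversize helper. */
  static int oversize(int minSize) {
    return minSize + (minSize >>> 3);
  }

  public static void main(String[] args) {
    byte[] buffer = new byte[16];
    // The next read will fill the buffer from scratch, so the old bytes are irrelevant:
    buffer = growNoCopy(buffer, 1024);
    System.out.println("new capacity: " + buffer.length);
  }
}
{code}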
[jira] [Created] (LUCENE-10519) ThreadLocal.remove under G1GC takes 100% CPU
Boicehuang created LUCENE-10519:
-----------------------------------
Summary: ThreadLocal.remove under G1GC takes 100% CPU
Key: LUCENE-10519
URL: https://issues.apache.org/jira/browse/LUCENE-10519
Project: Lucene - Core
Issue Type: Bug
Components: core/other
Affects Versions: 8.11.1, 8.10.1, 8.9
Environment: Elasticsearch v7.16.0, OpenJDK v11
Reporter: Boicehuang

h2. Problem

There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only clean up the current thread's entry right now; the entries of other threads are left to be reclaimed via WeakReferences. Under G1GC, the WeakReferences of other threads may not be reclaimed even after several rounds of mixed GC. The ThreadLocalMap may grow very large, and it can take an arbitrarily long amount of CPU time to iterate over the entries stored in it.

Hot thread of Elasticsearch:
{code:java}
::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{}
Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]'
  10/10 snapshots sharing following 34 elements
    java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627)
    java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509)
    java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308)
    java.lang.ThreadLocal.remove(ThreadLocal.java:224)
    java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349)
    java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881)
    org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49)
    org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356)
    org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272)
    org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812)
    org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:779)
    org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnReplica(IndexShard.java:750)
    org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:623)
    org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:577)
{code}

h2. Solution

This bug does not reproduce under CMS; under G1GC it reproduces consistently. In fact we don't need to store each entry twice, in both hardRefs and the ThreadLocal. Remove the ThreadLocal from CloseableThreadLocal so that we are no longer affected by this serious flaw of Java's built-in ThreadLocal. I would like to provide a patch to fix it if needed.

h2. Related issues

[https://github.com/elastic/elasticsearch/issues/56766]
[https://discuss.elastic.co/t/indexing-performance-degrading-over-time/40229/44]

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
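[Editor's note] For readers unfamiliar with the class under discussion, the sketch below condenses the pre-patch CloseableThreadLocal design the report describes: each value is held both in a hardRefs map and in a JDK ThreadLocal, and close() only removes the current thread's ThreadLocalMap entry. It is a simplified illustration written for this archive, not the actual org.apache.lucene.util.CloseableThreadLocal source.

{code:java}
// Condensed sketch of the design described above (simplified; see the real class for
// details). The value is stored twice: once in a WeakHashMap keyed by Thread ("hardRefs")
// and once in a ThreadLocal<WeakReference<T>>. close() only calls t.remove() on the
// closing thread, so entries in other threads' ThreadLocalMaps become stale and wait
// for WeakReference processing, which G1 may delay for a long time.
import java.io.Closeable;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

public class CloseableThreadLocalSketch<T> implements Closeable {

  private ThreadLocal<WeakReference<T>> t = new ThreadLocal<>(); // copy #1 (per-thread map)
  private Map<Thread, T> hardRefs = new WeakHashMap<>();         // copy #2 (hard refs by Thread)

  public T get() {
    if (t == null) {
      return null; // already closed
    }
    WeakReference<T> weakRef = t.get();
    return weakRef == null ? null : weakRef.get();
  }

  public void set(T object) {
    t.set(new WeakReference<>(object));
    synchronized (hardRefs) {
      hardRefs.put(Thread.currentThread(), object);
    }
  }

  @Override
  public void close() {
    hardRefs = null;   // drop the hard references so the values can be collected
    if (t != null) {
      t.remove();      // only cleans the *current* thread's ThreadLocalMap entry
    }
    t = null;          // other threads keep stale entries until WeakReferences are processed
  }
}
{code}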
[jira] [Updated] (LUCENE-10519) ThreadLocal.remove under G1GC takes 100% CPU
[ https://issues.apache.org/jira/browse/LUCENE-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boicehuang updated LUCENE-10519: Description: h2. Problem There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49) org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356) org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272) org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812) org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:779) org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnReplica(IndexShard.java:750) org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:623) org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:577) {code} h2. Solution This bug does not reproduce under CMS. It can be reproduced always. In fact we don't need to store entry twice in the hardRefs And ThreadLocals. Remove ThreadLocal from CloseableThreadLocal so that we would not be affected by the serious flaw of Java's built-in ThreadLocal. I would like to provide a patch to fix it if needed. h2. Related issues [https://github.com/elastic/elasticsearch/issues/56766] [https://discuss.elastic.co/t/indexing-performance-degrading-over-time/40229/44] was: h2. Problem There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be reclaimed even after several rounds of mix-GC. The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. 
Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49) org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356) org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272) org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812
[jira] [Updated] (LUCENE-10519) ThreadLocal.remove under G1GC takes 100% CPU
[ https://issues.apache.org/jira/browse/LUCENE-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boicehuang updated LUCENE-10519: Description: h2. Problem I found {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may cause leak under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49) org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356) org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272) org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812) org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:779) org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnReplica(IndexShard.java:750) org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:623) org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:577) {code} h2. Solution This bug does not reproduce under CMS. It can be reproduced always. In fact we don't need to store entry twice in the hardRefs And ThreadLocals. Remove ThreadLocal from CloseableThreadLocal so that we would not be affected by the serious flaw of Java's built-in ThreadLocal. I would like to provide a patch to fix it if needed. h2. Related issues [https://github.com/elastic/elasticsearch/issues/56766] [https://discuss.elastic.co/t/indexing-performance-degrading-over-time/40229/44] was: h2. Problem There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. 
Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49) org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356) org.elasticsearch.index.engine.Interna
[jira] [Updated] (LUCENE-10519) ThreadLocal.remove under G1GC takes 100% CPU
[ https://issues.apache.org/jira/browse/LUCENE-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boicehuang updated LUCENE-10519: Description: h2. Problem I found {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may cause leak under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49) org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356) org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272) org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812) org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:779) org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnReplica(IndexShard.java:750) org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:623) org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:577) {code} h2. Solution This bug does not reproduce under CMS. It can be reproduced always. In fact we don't need to store entry twice in the hardRefs And ThreadLocals. Remove ThreadLocal from CloseableThreadLocal so that we would not be affected by the serious flaw of Java's built-in ThreadLocal. I would like to provide a patch to fix it if needed. h2. Related issues [https://github.com/elastic/elasticsearch/issues/56766] [https://bugs.openjdk.java.net/browse/JDK-8182982] [https://discuss.elastic.co/t/indexing-performance-degrading-over-time/40229/44] was: h2. Problem I found {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may cause leak under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. 
The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock
[jira] [Updated] (LUCENE-10519) ThreadLocal.remove under G1GC takes 100% CPU
[ https://issues.apache.org/jira/browse/LUCENE-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boicehuang updated LUCENE-10519: Description: h2. Problem I found {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49) org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356) org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272) org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812) org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:779) org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnReplica(IndexShard.java:750) org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:623) org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:577) {code} h2. Solution This bug does not reproduce under CMS. It can be reproduced always. In fact we don't need to store entry twice in the hardRefs And ThreadLocals. Remove ThreadLocal from CloseableThreadLocal so that we would not be affected by the serious flaw of Java's built-in ThreadLocal. I would like to provide a patch to fix it if needed. h2. Related issues [https://github.com/elastic/elasticsearch/issues/56766] [https://bugs.openjdk.java.net/browse/JDK-8182982] [https://discuss.elastic.co/t/indexing-performance-degrading-over-time/40229/44] was: h2. Problem I found {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may cause leak under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. 
The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.Releasa
[jira] [Updated] (LUCENE-10519) ThreadLocal.remove under G1GC takes 100% CPU
[ https://issues.apache.org/jira/browse/LUCENE-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boicehuang updated LUCENE-10519: Description: h2. Problem I found {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49) org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356) org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272) org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812) org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:779) org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnReplica(IndexShard.java:750) org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:623) org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:577) {code} h2. Solution This bug does not reproduce under CMS. It can be reproduced under G1GC always. In fact we don't need to store entry twice in the hardRefs And ThreadLocals. Remove ThreadLocal from CloseableThreadLocal so that we would not be affected by the serious flaw of Java's built-in ThreadLocal. I would like to provide a patch to fix it if needed. h2. Related issues [https://github.com/elastic/elasticsearch/issues/56766] [https://bugs.openjdk.java.net/browse/JDK-8182982] [https://discuss.elastic.co/t/indexing-performance-degrading-over-time/40229/44] was: h2. Problem I found {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. 
The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.
[GitHub] [lucene] boicehuang opened a new pull request, #816: LUCENE-10519: ThreadLocal.remove under G1GC takes 100% CPU
boicehuang opened a new pull request, #816:
URL: https://github.com/apache/lucene/pull/816

See also: https://issues.apache.org/jira/browse/LUCENE-10519

Solution
---
We don't need to store each entry twice, in both hardRefs and the ThreadLocal. Remove the ThreadLocal from CloseableThreadLocal so that we are no longer affected by this serious flaw of Java's built-in ThreadLocal.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
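[Editor's note] The sketch below illustrates the direction this PR describes: keep the per-thread values only in a map keyed by Thread, so java.lang.ThreadLocal (and its ThreadLocalMap stale-entry expunging) is not involved at all. It is not the actual patch; the class name and the coarse synchronized locking are illustrative assumptions.

{code:java}
// Rough sketch of a ThreadLocal-free design (not the actual PR #816 code): values live
// only in a WeakHashMap keyed by Thread, so no java.lang.ThreadLocal entries are created
// and close() releases everything at once.
import java.io.Closeable;
import java.util.Map;
import java.util.WeakHashMap;

public class MapOnlyThreadLocalSketch<T> implements Closeable {

  // WeakHashMap keyed by Thread: once a thread dies, its key (and value) can be collected.
  private Map<Thread, T> values = new WeakHashMap<>();

  public synchronized T get() {
    return values == null ? null : values.get(Thread.currentThread());
  }

  public synchronized void set(T object) {
    if (values != null) {
      values.put(Thread.currentThread(), object);
    }
  }

  @Override
  public synchronized void close() {
    // Dropping the map releases every thread's value at once; no per-thread
    // ThreadLocal.remove() is needed, so the G1/ThreadLocalMap issue is avoided.
    values = null;
  }
}
{code}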
[jira] [Updated] (LUCENE-10519) ThreadLocal.remove under G1GC takes 100% CPU
[ https://issues.apache.org/jira/browse/LUCENE-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boicehuang updated LUCENE-10519: Description: h2. Problem I found {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49) org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356) org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272) org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812) org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:779) org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnReplica(IndexShard.java:750) org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:623) org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:577) {code} h2. Solution This bug does not reproduce under CMS. It can be reproduced under G1GC always. In fact we don't need to store entry twice in the hardRefs And ThreadLocals. Remove ThreadLocal from CloseableThreadLocal so that we would not be affected by the serious flaw of Java's built-in ThreadLocal. I would like to provide a patch to fix it if needed. h2. See also [https://github.com/elastic/elasticsearch/issues/56766] [https://bugs.openjdk.java.net/browse/JDK-8182982] [https://discuss.elastic.co/t/indexing-performance-degrading-over-time/40229/44] was: h2. Problem I found {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. 
The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concur
[jira] [Updated] (LUCENE-10519) ThreadLocal.remove under G1GC takes 100% CPU
[ https://issues.apache.org/jira/browse/LUCENE-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boicehuang updated LUCENE-10519: Description: h2. Problem I found {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49) org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356) org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272) org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812) org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:779) org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnReplica(IndexShard.java:750) org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:623) org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:577) {code} h2. Solution This bug does not reproduce under CMS. It can be reproduced under G1GC always. In fact we don't need to store entry twice in the hardRefs And ThreadLocals. Remove ThreadLocal from CloseableThreadLocal so that we would not be affected by the serious flaw of Java's built-in ThreadLocal. I would like to provide a patch to fix it if needed. h2. See also [https://github.com/elastic/elasticsearch/issues/56766] [https://bugs.openjdk.java.net/browse/JDK-8182982] [https://discuss.elastic.co/t/indexing-performance-degrading-over-time/40229/44] h2. Issues link https://issues.apache.org/jira/browse/LUCENE-10519 was: h2. Problem I found {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. 
Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(Reentran
[jira] [Updated] (LUCENE-10519) ThreadLocal.remove under G1GC takes 100% CPU
[ https://issues.apache.org/jira/browse/LUCENE-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boicehuang updated LUCENE-10519: Description: h2. Problem {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49) org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356) org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272) org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812) org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:779) org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnReplica(IndexShard.java:750) org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:623) org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:577) {code} h2. Solution This bug does not reproduce under CMS. It can be reproduced under G1GC always. In fact we don't need to store entry twice in the hardRefs And ThreadLocals. Remove ThreadLocal from CloseableThreadLocal so that we would not be affected by the serious flaw of Java's built-in ThreadLocal. I would like to provide a patch to fix it if needed. h2. See also [https://github.com/elastic/elasticsearch/issues/56766] [https://bugs.openjdk.java.net/browse/JDK-8182982] [https://discuss.elastic.co/t/indexing-performance-degrading-over-time/40229/44] was: h2. Problem I found {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. 
The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.Rel
[jira] [Updated] (LUCENE-10519) ThreadLocal.remove under G1GC takes 100% CPU
[ https://issues.apache.org/jira/browse/LUCENE-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boicehuang updated LUCENE-10519: Description: h2. Problem {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49) org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356) org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272) org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812) org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:779) org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnReplica(IndexShard.java:750) org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:623) org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:577) {code} h2. Solution This bug does not reproduce under CMS. It can be reproduced under G1GC always. In fact, *CloseableThreadLocal* doesn't need to store entry twice in the hardRefs And ThreadLocals. Remove ThreadLocal from CloseableThreadLocal so that we would not be affected by the serious flaw of Java's built-in ThreadLocal. I would like to provide a patch to fix it if needed. h2. See also [https://github.com/elastic/elasticsearch/issues/56766] [https://bugs.openjdk.java.net/browse/JDK-8182982] [https://discuss.elastic.co/t/indexing-performance-degrading-over-time/40229/44] was: h2. Problem {*}org.apache.lucene.util.CloseableThreadLocal{*}(which is using {*}ThreadLocal>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. 
The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util
[jira] [Updated] (LUCENE-10519) ThreadLocal.remove under G1GC takes 100% CPU
[ https://issues.apache.org/jira/browse/LUCENE-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boicehuang updated LUCENE-10519: Description: h2. Problem {*}org.apache.lucene.util.CloseableThreadLocal{*} (which is using {*}ThreadLocal<WeakReference<T>>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now; others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be reclaimed even after several rounds of mixed GC. The ThreadLocalMap may grow very large, and it can take an arbitrarily long amount of CPU time to iterate over the entries stored in it. Hot threads output from Elasticsearch:
{code:java}
::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{}
Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]'
10/10 snapshots sharing following 34 elements
java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627)
java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509)
java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308)
java.lang.ThreadLocal.remove(ThreadLocal.java:224)
java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426)
java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349)
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881)
org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49)
org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356)
org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272)
org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812)
org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:779)
org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnReplica(IndexShard.java:750)
org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:623)
org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:577)
{code}
h2. Solution This bug does not reproduce under CMS. It can always be reproduced under G1GC. In fact, *CloseableThreadLocal* doesn't need to store each entry twice, in both hardRefs and the ThreadLocal. If we remove ThreadLocal from CloseableThreadLocal, we will no longer be affected by this serious flaw of Java's built-in ThreadLocal. h2. See also [https://github.com/elastic/elasticsearch/issues/56766] [https://bugs.openjdk.java.net/browse/JDK-8182982] [https://discuss.elastic.co/t/indexing-performance-degrading-over-time/40229/44] was: h2. Problem {*}org.apache.lucene.util.CloseableThreadLocal{*} (which is using {*}ThreadLocal<WeakReference<T>>{*}) may still have a flaw under G1GC. There is a single ThreadLocalMap stored for each thread, which all ThreadLocals share, and that master map only periodically purges stale entries. When we close a CloseableThreadLocal, we only take care of the current thread right now, others will be taken care of via the WeakReferences. Under G1GC, the WeakReferences of other threads may not be recycled even after several rounds of mix-GC. 
The ThreadLocalMap may grow very large, it can take an arbitrarily long amount of CPU and time to iterate the things you had stored in it. Hot thread of elasticsearch: {code:java} ::: {x}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{}{} Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true: 105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[][bulk][T#31]' 10/10 snapshots sharing following 34 elements java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627) java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509) java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308) java.lang.ThreadLocal.remove(ThreadLocal.java:224) java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349) java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.ja
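To make the direction described in this issue easier to picture, here is a minimal, hedged sketch of a CloseableThreadLocal-style class that drops java.lang.ThreadLocal entirely and keeps values in one map keyed by weakly referenced threads. It is an illustration only, assuming a WeakHashMap-based design; the actual change is the one under review in PR #816 below, and names such as MapBackedThreadLocal are made up:

```java
import java.io.Closeable;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

/**
 * Hypothetical sketch of the idea above: per-thread values live in a single
 * map keyed by weakly referenced threads instead of going through
 * java.lang.ThreadLocal. Class and method names are illustrative, not the
 * actual Lucene patch.
 */
public class MapBackedThreadLocal<T> implements Closeable {

  // WeakHashMap keys (the threads) are weakly referenced, so entries for dead
  // threads can be reclaimed without any ThreadLocalMap purging.
  private volatile Map<Thread, T> perThreadValues =
      Collections.synchronizedMap(new WeakHashMap<>());

  /** Subclasses may override, as with CloseableThreadLocal#initialValue. */
  protected T initialValue() {
    return null;
  }

  public T get() {
    Map<Thread, T> values = perThreadValues;
    if (values == null) {
      throw new IllegalStateException("already closed");
    }
    T value = values.get(Thread.currentThread());
    if (value == null) {
      value = initialValue();
      if (value != null) {
        values.put(Thread.currentThread(), value);
      }
    }
    return value;
  }

  public void set(T value) {
    Map<Thread, T> values = perThreadValues;
    if (values == null) {
      throw new IllegalStateException("already closed");
    }
    values.put(Thread.currentThread(), value);
  }

  @Override
  public void close() {
    // Empty the map before dropping it (as suggested in the PR review below),
    // then null the field; no ThreadLocal.remove() and therefore no
    // expungeStaleEntry() scans on close.
    Map<Thread, T> values = perThreadValues;
    if (values != null) {
      values.clear();
    }
    perThreadValues = null;
  }
}
```

Because the map's keys are only weakly held, entries for threads that have died can be collected without any ThreadLocalMap purging, and close() releases everything in one step, which is exactly the part that shows up as expungeStaleEntry() in the hot-threads dump above.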
[jira] [Created] (LUCENE-10520) HTMLCharStripFilter fails on '>' or '<' characters in attribute values
Alex Alishevskikh created LUCENE-10520: -- Summary: HTMLCharStripFilter fails on '>' or '<' characters in attribute values Key: LUCENE-10520 URL: https://issues.apache.org/jira/browse/LUCENE-10520 Project: Lucene - Core Issue Type: Bug Components: modules/analysis Reporter: Alex Alishevskikh Attachments: HTMLStripCharFilterTest.java If HTML input contains attributes with '<' or '>' characters in their values, HTMLCharStripFilter produces unexpected results. See the attached unit test for example. These characters are valid in attribute values, as by the [HTML5 specification |https://html.spec.whatwg.org/#syntax-attribute-value]. The [W3C validator|https://validator.w3.org/nu/#textarea] does not have issues with the test HTML. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
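(The Lucene class in question is org.apache.lucene.analysis.charfilter.HTMLStripCharFilter, as the attached test's file name suggests.) A hypothetical, minimal reproduction along the lines of the attached unit test (which is not included here) might look like the sketch below; the input string and expected output are assumptions based on the description above, not the reporter's actual test:

```java
import java.io.Reader;
import java.io.StringReader;
import org.apache.lucene.analysis.charfilter.HTMLStripCharFilter;

public class HtmlStripAttributeValueDemo {
  public static void main(String[] args) throws Exception {
    // '>' inside an attribute value is valid HTML5, but it is the kind of
    // input the issue reports as confusing the char filter.
    String html = "<span title=\"a > b\">text</span>";
    try (Reader stripped = new HTMLStripCharFilter(new StringReader(html))) {
      StringBuilder out = new StringBuilder();
      char[] buf = new char[256];
      for (int len = stripped.read(buf); len != -1; len = stripped.read(buf)) {
        out.append(buf, 0, len);
      }
      // Ideally this prints only the element text; per the report, the
      // stripped output is unexpected when '<' or '>' appears in attributes.
      System.out.println(out);
    }
  }
}
```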
[jira] [Updated] (LUCENE-10520) HTMLCharStripFilter fails on '>' or '<' characters in attribute values
[ https://issues.apache.org/jira/browse/LUCENE-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Alishevskikh updated LUCENE-10520: --- Fix Version/s: 9.1 > HTMLCharStripFilter fails on '>' or '<' characters in attribute values > --- > > Key: LUCENE-10520 > URL: https://issues.apache.org/jira/browse/LUCENE-10520 > Project: Lucene - Core > Issue Type: Bug > Components: modules/analysis >Reporter: Alex Alishevskikh >Priority: Major > Labels: HTMLCharStripFilter > Fix For: 9.1 > > Attachments: HTMLStripCharFilterTest.java > > > If HTML input contains attributes with '<' or '>' characters in their values, > HTMLCharStripFilter produces unexpected results. > See the attached unit test for example. > These characters are valid in attribute values, as by the [HTML5 > specification |https://html.spec.whatwg.org/#syntax-attribute-value]. The > [W3C validator|https://validator.w3.org/nu/#textarea] does not have issues > with the test HTML. > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-10520) HTMLCharStripFilter fails on '>' or '<' characters in attribute values
[ https://issues.apache.org/jira/browse/LUCENE-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Alishevskikh updated LUCENE-10520: --- Affects Version/s: 9.1 > HTMLCharStripFilter fails on '>' or '<' characters in attribute values > --- > > Key: LUCENE-10520 > URL: https://issues.apache.org/jira/browse/LUCENE-10520 > Project: Lucene - Core > Issue Type: Bug > Components: modules/analysis >Affects Versions: 9.1 >Reporter: Alex Alishevskikh >Priority: Major > Labels: HTMLCharStripFilter > Fix For: 9.1 > > Attachments: HTMLStripCharFilterTest.java > > > If HTML input contains attributes with '<' or '>' characters in their values, > HTMLCharStripFilter produces unexpected results. > See the attached unit test for example. > These characters are valid in attribute values, as by the [HTML5 > specification |https://html.spec.whatwg.org/#syntax-attribute-value]. The > [W3C validator|https://validator.w3.org/nu/#textarea] does not have issues > with the test HTML. > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] mikemccand commented on pull request #815: Backport LUCENE-10482 Bug Fix: Don't use Instant.now() as prefix for the temp dir name
mikemccand commented on PR #815: URL: https://github.com/apache/lucene/pull/815#issuecomment-1101322811 I'll backport this to 9.x now -- sorry I should have done it last night too! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10482) Allow users to create their own DirectoryTaxonomyReaders with empty taxoArrays instead of letting the taxoEpoch decide
[ https://issues.apache.org/jira/browse/LUCENE-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523660#comment-17523660 ] ASF subversion and git services commented on LUCENE-10482: -- Commit 766c08e475ba31e2f5b7e1cf491cdacbe276ab67 in lucene's branch refs/heads/branch_9x from Gautam Worah [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=766c08e475b ] LUCENE-10482 Bug Fix: Don't use Instant.now() as prefix for the temp dir name (#814) * Don't use Instant.now() as prefix for the temp dir name * spotless > Allow users to create their own DirectoryTaxonomyReaders with empty > taxoArrays instead of letting the taxoEpoch decide > -- > > Key: LUCENE-10482 > URL: https://issues.apache.org/jira/browse/LUCENE-10482 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 9.1 >Reporter: Gautam Worah >Priority: Minor > Time Spent: 8h > Remaining Estimate: 0h > > I was experimenting with the taxonomy index and {{DirectoryTaxonomyReaders}} > in my day job where we were trying to replace the index underneath a reader > asynchronously and then call the {{doOpenIfChanged}} call on it. > It turns out that the taxonomy index uses its own index based counter (the > {{{}taxonomyIndexEpoch{}}}) to determine if the index was opened in write > mode after the last time it was written and if not, it directly tries to > reuse the previous {{taxoArrays}} it had created. This logic fails in a > scenario where both the old and new index were opened just once but the index > itself is completely different in both the cases. > In such a case, it would be good to give the user the flexibility to inform > the DTR to recreate its {{{}taxoArrays{}}}, {{ordinalCache}} and > {{{}categoryCache{}}} (not refreshing these arrays causes it to fail in > various ways). Luckily, such a constructor already exists! But it is private > today! The idea here is to allow subclasses of DTR to use this constructor. > Curious to see what other folks think about this idea. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
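For readers unfamiliar with the refresh pattern being discussed, the public API usage looks roughly like the sketch below (the directory path and variable names are illustrative). The issue is about the corner case where the taxonomy index underneath has been wholesale replaced, yet the refreshed reader reuses its old taxoArrays because the epoch check does not notice:

```java
import java.nio.file.Paths;
import org.apache.lucene.facet.taxonomy.TaxonomyReader;
import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class TaxonomyRefreshSketch {
  public static void main(String[] args) throws Exception {
    Directory taxoDir = FSDirectory.open(Paths.get("/tmp/taxo")); // illustrative path
    DirectoryTaxonomyReader reader = new DirectoryTaxonomyReader(taxoDir);

    // ... later, after the taxonomy index has been rewritten or swapped:
    DirectoryTaxonomyReader newReader = TaxonomyReader.openIfChanged(reader);
    if (newReader != null) {
      reader.close();
      reader = newReader; // per the issue, caches like taxoArrays may be carried over internally
    }
    reader.close();
    taxoDir.close();
  }
}
```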
[GitHub] [lucene] mikemccand commented on pull request #808: LUCENE-10513: Run `gradlew tidy` first
mikemccand commented on PR #808: URL: https://github.com/apache/lucene/pull/808#issuecomment-1101417096 > I followed the commits and I think it shouldn't be the main concern here to edit `CONTRIBUTING.md`? I'm not going to be against or hold this. Maybe I'll open another follow-up PR for it; please go ahead. +1 to make it a priority to keep `CONTRIBUTING.md` succinct. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] mikemccand commented on pull request #808: LUCENE-10513: Run `gradlew tidy` first
mikemccand commented on PR #808: URL: https://github.com/apache/lucene/pull/808#issuecomment-1101418928 > I think I'm too accustomed to the development workflow on this project, so I'm not able to figure out what is the minimum information that should be shown there to introduce new contributors without unnecessary pain or asking "newbie" questions to committers - feedback is welcome. This happens to all of us :) We lose our fresh eyes / beginner's mind / [Shoshin](https://en.wikipedia.org/wiki/Shoshin). It is a sharp tool that quickly dulls, unfortunately. We old-timers must constantly strive to aggressively encourage the new people that grow the periphery of our community so we can learn from their experiences and make it better for the next new members. We are too API-blind or missing-stair to do this ourselves. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10315) Speed up BKD leaf block ids codec by a 512 ints ForUtil
[ https://issues.apache.org/jira/browse/LUCENE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523700#comment-17523700 ] Feng Guo commented on LUCENE-10315: --- Thanks [~ivera]! +1 to remove the int24 forutil implementation. I have updated the branch: https://github.com/apache/lucene/pull/797 > Speed up BKD leaf block ids codec by a 512 ints ForUtil > --- > > Key: LUCENE-10315 > URL: https://issues.apache.org/jira/browse/LUCENE-10315 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Feng Guo >Assignee: Feng Guo >Priority: Major > Attachments: addall.svg, cpu_profile_baseline.html, > cpu_profile_path.html > > Time Spent: 6.5h > Remaining Estimate: 0h > > Elasticsearch (which is based on Lucene) can automatically infer types for > users with its dynamic mapping feature. When users index some low cardinality > fields, such as gender / age / status... they often use some numbers to > represent the values, while ES will infer these fields as {{{}long{}}}, and > ES uses BKD as the index of {{long}} fields. When the data volume grows, > building the result set of low-cardinality fields will make the CPU usage and > load very high. > This is a flame graph we obtained from the production environment: > [^addall.svg] > It can be seen that almost all CPU is used in addAll. When we reindex > {{long}} to {{{}keyword{}}}, the cluster load and search latency are greatly > reduced (we spent weeks of time reindexing all indices...). I know that ES > recommends using {{keyword}} for term/terms query and {{long}} for range > query in the document, but there are always some users who didn't realize > this and keep their habit of using SQL databases, or dynamic mapping > automatically selects the type for them. All in all, users won't realize that > there would be such a big difference in performance between {{long}} and > {{keyword}} fields in low cardinality fields. So from my point of view it > will make sense if we can make BKD work better for the low/medium > cardinality fields. > As far as I can see, for low cardinality fields, there are two advantages of > {{keyword}} over {{{}long{}}}: > 1. {{ForUtil}} used in {{keyword}} postings is much more efficient than BKD's > delta VInt, because of its batch reading (readLongs) and SIMD decoding. > 2. When the query term count is less than 16, {{TermsInSetQuery}} can lazily > materialize its result set, and when another small result clause > intersects with this low cardinality condition, the low cardinality field can > avoid reading all docIds into memory. > This issue targets solving the first point. The basic idea is to > use a 512 ints {{ForUtil}} for BKD ids codec. I benchmarked this optimization > by mocking some random {{LongPoint}} and querying them with > {{PointInSetQuery}}. 
> *Benchmark Result*
> |doc count|field cardinality|query point|baseline QPS|candidate QPS|diff percentage|
> |1|32|1|51.44|148.26|188.22%|
> |1|32|2|26.8|101.88|280.15%|
> |1|32|4|14.04|53.52|281.20%|
> |1|32|8|7.04|28.54|305.40%|
> |1|32|16|3.54|14.61|312.71%|
> |1|128|1|110.56|350.26|216.81%|
> |1|128|8|16.6|89.81|441.02%|
> |1|128|16|8.45|48.07|468.88%|
> |1|128|32|4.2|25.35|503.57%|
> |1|128|64|2.13|13.02|511.27%|
> |1|1024|1|536.19|843.88|57.38%|
> |1|1024|8|109.71|251.89|129.60%|
> |1|1024|32|33.24|104.11|213.21%|
> |1|1024|128|8.87|30.47|243.52%|
> |1|1024|512|2.24|8.3|270.54%|
> |1|8192|1|.33|5000|50.00%|
> |1|8192|32|139.47|214.59|53.86%|
> |1|8192|128|54.59|109.23|100.09%|
> |1|8192|512|15.61|36.15|131.58%|
> |1|8192|2048|4.11|11.14|171.05%|
> |1|1048576|1|2597.4|3030.3|16.67%|
> |1|1048576|32|314.96|371.75|18.03%|
> |1|1048576|128|99.7|116.28|16.63%|
> |1|1048576|512|30.5|37.15|21.80%|
> |1|1048576|2048|10.38|12.3|18.50%|
> |1|8388608|1|2564.1|3174.6|23.81%|
> |1|8388608|32|196.27|238.95|21.75%|
> |1|8388608|128|55.36|68.03|22.89%|
> |1|8388608|512|15.58|19.24|23.49%|
> |1|8388608|2048|4.56|5.71|25.22%|
> The indices size is reduced for low cardinality fields and flat for high cardinality fields.
> {code:java}
> 113M index_1_doc_32_cardinality_baseline
> 114M index_1_doc_32_cardinality_candidate
> 140M index_1_doc_128_cardinality_baseline
> 133M index_1_doc_128_cardinality_candidate
> 193M index_1_doc_1024_cardinality_baseline
> 174M index_1_doc_1024_cardinality_candidate
> 241M index_1_doc_8192_cardinality_baseline
> 233M index_1_doc_8192_cardinality_candidat
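As a rough, self-contained illustration of the first point above (block decoding of deltas vs. per-value delta vInt), the toy sketch below packs 512 sorted doc-id deltas at a fixed bit width and decodes the whole block in one tight loop. It is only a didactic stand-in under simplified assumptions; Lucene's real ForUtil uses SIMD-friendly long packing and per-block bit widths, and every name here is made up:

```java
import java.util.Arrays;

public class BlockDecodeSketch {

  // Pack deltas of sorted ids into longs using a fixed number of bits per value.
  static long[] pack(int[] ids, int bitsPerValue) {
    long[] packed = new long[(ids.length * bitsPerValue + 63) / 64];
    int prev = 0;
    long bitPos = 0;
    for (int id : ids) {
      long delta = id - prev;
      prev = id;
      int word = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      packed[word] |= delta << shift;
      if (shift + bitsPerValue > 64) {
        packed[word + 1] |= delta >>> (64 - shift); // value spills into the next long
      }
      bitPos += bitsPerValue;
    }
    return packed;
  }

  // Decode the whole block in one loop: no per-byte continuation-bit branches as in vInt.
  static int[] unpack(long[] packed, int count, int bitsPerValue) {
    int[] ids = new int[count];
    long mask = (1L << bitsPerValue) - 1;
    int prev = 0;
    long bitPos = 0;
    for (int i = 0; i < count; i++) {
      int word = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      long delta = packed[word] >>> shift;
      if (shift + bitsPerValue > 64) {
        delta |= packed[word + 1] << (64 - shift);
      }
      prev += (int) (delta & mask);
      ids[i] = prev;
      bitPos += bitsPerValue;
    }
    return ids;
  }

  public static void main(String[] args) {
    int[] ids = new int[512];
    for (int i = 0; i < ids.length; i++) {
      ids[i] = i * 3 + (i % 2); // sorted ids with small deltas (2 or 4)
    }
    long[] packed = pack(ids, 3); // 3 bits per delta: 24 longs for 512 values
    int[] decoded = unpack(packed, ids.length, 3);
    System.out.println(Arrays.equals(ids, decoded) + ", longs used: " + packed.length);
  }
}
```

Here 512 deltas fit in 24 longs (192 bytes) and decode without per-value branching, which is the kind of win the benchmark above is measuring against BKD's current delta vInt.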
[GitHub] [lucene] mikemccand commented on pull request #807: LUCENE-10512: Grammar: Remove incidents of "the the" in comments.
mikemccand commented on PR #807: URL: https://github.com/apache/lucene/pull/807#issuecomment-1101434176 > bq. Since the Spotless check seems to be fail-fast, maybe we should fix the exception message to just suggest ./gradlew tidy instead? > > Gradle runs tasks in parallel so it's not really "fail fast". It's "abort anything not yet started because built will fail". And if multiple things fail, gradle will report all of them (as a list of problems). As any tool, it takes some getting used to - I think these messages are quite fine (and 'tidy' is in fact a non-standard invention of mine... but it's a four letter word so I couldn't resist). Well then I stand by my original fail-fast label LOL ;) Concurrency of the tasks is no excuse. It is a complex build implementation detail, which you obviously deeply understand!! (thank you! I know migrating to `gradle` was an insane amount of work), that a new person should not need to know, or even know they don't know. From the fresh eyes user's standpoint, it still means that `./gradlew` is not showing all the things wrong with your change. Clearly it confused fresh-eyes here -- look at the commit message above [It keeps finding new things ... what's up with this?](https://github.com/apache/lucene/pull/807/commits/79b3f722392c2bdccadc442ac6a2c09fc9727c58). If we insist on our new users knowing these complex build implementation details then the least we could do is fix the exception message to say `just run ./gradlew check` or `./gradlew tidy`. Otherwise this trap remains for the next fresh eyes? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir opened a new pull request, #817: improve spotless error to suggest running 'gradlew tidy'
rmuir opened a new pull request, #817: URL: https://github.com/apache/lucene/pull/817 The current error isn't helpful as it suggests a per-module command. If the user has modified multiple modules, they will be running gradle commands to try to fix each one of them, when it would be easier to just run 'gradlew tidy' a single time and fix everything.  -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir commented on pull request #807: LUCENE-10512: Grammar: Remove incidents of "the the" in comments.
rmuir commented on PR #807: URL: https://github.com/apache/lucene/pull/807#issuecomment-1101498697 > bq. Since the Spotless check seems to be fail-fast, maybe we should fix the exception message to just suggest ./gradlew tidy instead? I have a hacky patch: https://github.com/apache/lucene/pull/817 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] mikemccand commented on pull request #817: improve spotless error to suggest running 'gradlew tidy'
mikemccand commented on PR #817: URL: https://github.com/apache/lucene/pull/817#issuecomment-1101500256 +1 this is awesome @rmuir -- the exception message ought to be crystal clear about how to resolve the problems, so that new users (who closely read the suggestions in exception messages) know how to proceed. As the message stands today, you can run the command it suggests (style checking just the one fail-fast module, e.g. `./gradlew :lucene:core:spotlessApply`); it succeeds, you open a PR thinking everything is great, and then it fails the top-level `./gradlew check` in the GitHub checks. It's trappy for new users. This is what happened in [this incremental commit for this fresh-eyes PR](https://github.com/apache/lucene/pull/807/commits/79b3f722392c2bdccadc442ac6a2c09fc9727c58) with sad/fun commit message `It keeps finding new things ... what's up with this?`. Thank you!! We need to learn from the struggles of every new fresh-eyes developer in our community and fix the potholes for the next new developer. Our community grows only at its periphery. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #817: improve spotless error to suggest running 'gradlew tidy'
uschindler commented on PR #817: URL: https://github.com/apache/lucene/pull/817#issuecomment-1101504853 Hihi. Cool fix. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] gautamworah96 commented on pull request #815: Backport LUCENE-10482 Bug Fix: Don't use Instant.now() as prefix for the temp dir name
gautamworah96 commented on PR #815: URL: https://github.com/apache/lucene/pull/815#issuecomment-1101580017 Closing this PR since the change was manually pushed by @mikemccand in https://github.com/apache/lucene/commit/766c08e475ba31e2f5b7e1cf491cdacbe276ab67 (`branch_9x`) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] gautamworah96 closed pull request #815: Backport LUCENE-10482 Bug Fix: Don't use Instant.now() as prefix for the temp dir name
gautamworah96 closed pull request #815: Backport LUCENE-10482 Bug Fix: Don't use Instant.now() as prefix for the temp dir name URL: https://github.com/apache/lucene/pull/815 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #815: Backport LUCENE-10482 Bug Fix: Don't use Instant.now() as prefix for the temp dir name
uschindler commented on PR #815: URL: https://github.com/apache/lucene/pull/815#issuecomment-1101589969 We never create PRs for backports if it is just a simple cherry-picking action. No need for that, as this is mostly 2 command-line actions or 2 mouse clicks. I generally do this in one go. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir commented on a diff in pull request #816: LUCENE-10519: ThreadLocal.remove under G1GC takes 100% CPU
rmuir commented on code in PR #816: URL: https://github.com/apache/lucene/pull/816#discussion_r852294751
## lucene/core/src/java/org/apache/lucene/util/CloseableThreadLocal.java:
## @@ -123,12 +121,27 @@ public void close() {
 // Clear the hard refs; then, the only remaining refs to
 // all values we were storing are weak (unless somewhere
 // else is still using them) and so GC may reclaim them:
-hardRefs = null;
-// Take care of the current thread right now; others will be
-// taken care of via the WeakReferences.
-if (t != null) {
-  t.remove();
+perThreadValues = null;

Review Comment:
Can we be a bit more friendly, instead of just nulling the reference in close? We can empty the map. Something like (being friendly to closeable):
```
var values = perThreadValues;
if (values != null) {
  values.clear();
}
perThreadValues = null;
```
-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] Yuti-G commented on pull request #779: LUCENE-10488: Optimize Facets#getTopDims in IntTaxonomyFacets
Yuti-G commented on PR #779: URL: https://github.com/apache/lucene/pull/779#issuecomment-1101733483 Hi @gautamworah96, thank you so much! I have re-run the benchmark with the up-to-date mainline, and please see the results: ``` TaskQPS baseline StdDevQPS my_modified_version StdDev Pct diff p-value HighTermTitleBDVSort 102.88 (19.0%) 96.21 (13.6%) -6.5% ( -32% - 32%) 0.214 HighTermMonthSort 121.33 (19.8%) 113.96 (14.0%) -6.1% ( -33% - 34%) 0.263 Wildcard 235.05 (9.0%) 222.60 (10.0%) -5.3% ( -22% - 14%) 0.077 Prefix3 138.51 (11.3%) 132.26 (12.2%) -4.5% ( -25% - 21%) 0.225 TermDTSort 83.76 (16.7%) 80.34 (14.5%) -4.1% ( -30% - 32%) 0.409 BrowseRandomLabelSSDVFacets6.23 (10.3%)6.14 (7.9%) -1.5% ( -17% - 18%) 0.598 BrowseDayOfYearSSDVFacets9.97 (19.8%)9.88 (20.4%) -0.9% ( -34% - 49%) 0.888 MedPhrase 72.98 (3.9%) 72.42 (3.7%) -0.8% ( -8% -7%) 0.521 MedTermDayTaxoFacets 19.40 (4.7%) 19.26 (5.0%) -0.7% ( -9% -9%) 0.627 PKLookup 137.96 (3.6%) 136.99 (3.4%) -0.7% ( -7% -6%) 0.521 MedSpanNear 13.70 (2.8%) 13.60 (3.5%) -0.7% ( -6% -5%) 0.483 HighSpanNear4.41 (3.3%)4.38 (4.5%) -0.7% ( -8% -7%) 0.581 LowSpanNear 12.33 (3.0%) 12.25 (4.2%) -0.7% ( -7% -6%) 0.553 MedSloppyPhrase 45.31 (1.2%) 45.07 (2.8%) -0.5% ( -4% -3%) 0.426 AndHighMedDayTaxoFacets 98.70 (2.0%) 98.17 (1.7%) -0.5% ( -4% -3%) 0.351 OrHighMedDayTaxoFacets9.51 (4.2%)9.47 (3.3%) -0.4% ( -7% -7%) 0.714 HighPhrase 230.28 (2.5%) 229.28 (2.8%) -0.4% ( -5% -5%) 0.610 HighTermDayOfYearSort 30.38 (21.9%) 30.29 (22.8%) -0.3% ( -36% - 56%) 0.963 LowPhrase 239.16 (3.9%) 238.51 (4.1%) -0.3% ( -7% -8%) 0.831 AndHighHigh 50.68 (4.8%) 50.56 (3.6%) -0.2% ( -8% -8%) 0.858 HighSloppyPhrase6.20 (4.1%)6.19 (4.7%) -0.2% ( -8% -8%) 0.878 LowTerm 1323.68 (4.6%) 1321.10 (5.7%) -0.2% ( -10% - 10%) 0.906 AndHighLow 752.80 (2.8%) 751.45 (3.5%) -0.2% ( -6% -6%) 0.859 AndHighHighDayTaxoFacets 25.28 (2.0%) 25.23 (2.8%) -0.2% ( -4% -4%) 0.827 LowSloppyPhrase 28.05 (1.6%) 28.01 (2.7%) -0.1% ( -4% -4%) 0.838 Fuzzy1 66.15 (1.8%) 66.09 (1.6%) -0.1% ( -3% -3%) 0.875 IntNRQ 445.57 (1.3%) 445.54 (2.0%) -0.0% ( -3% -3%) 0.989 OrHighLow 496.07 (2.0%) 496.07 (2.5%) -0.0% ( -4% -4%) 1.000 OrHighNotLow 827.15 (4.2%) 827.69 (3.7%)0.1% ( -7% -8%) 0.958 AndHighMed 189.81 (3.4%) 190.17 (3.4%)0.2% ( -6% -7%) 0.857 HighIntervalsOrdered6.60 (3.4%)6.61 (3.6%)0.2% ( -6% -7%) 0.835 OrNotHighLow 811.18 (2.9%) 813.56 (2.1%)0.3% ( -4% -5%) 0.718 MedIntervalsOrdered 14.97 (2.2%) 15.02 (2.9%)0.3% ( -4% -5%) 0.699 LowIntervalsOrdered6.50 (2.0%)6.52 (2.5%)0.3% ( -4% -5%) 0.648 OrHighNotHigh 609.78 (4.4%) 612.30 (3.6%)0.4% ( -7% -8%) 0.744 Fuzzy2 56.06 (1.6%) 56.34 (1.6%)0.5% ( -2% -3%) 0.327 OrNotHighMed 623.85 (4.5%) 627.41 (4.1%)0.6% ( -7% -9%) 0.675 BrowseDayOfYearTaxoFacets 18.77 (19.2%) 18.88 (16.5%)0.6% ( -29% - 44%) 0.916 OrHighMed 153.63 (3.6%) 154.59 (3.2%)0.6% ( -5% -7%) 0.562 OrHighHigh 32.59 (4.1%) 32.83 (4.1%)0.8% ( -7% -9%) 0.560 Respell 47.
[jira] [Created] (LUCENE-10521) Tests in windows are failing for the new testAlwaysRefreshDirectoryTaxonomyReader test
Gautam Worah created LUCENE-10521: - Summary: Tests in windows are failing for the new testAlwaysRefreshDirectoryTaxonomyReader test Key: LUCENE-10521 URL: https://issues.apache.org/jira/browse/LUCENE-10521 Project: Lucene - Core Issue Type: Bug Components: modules/facet Environment: Windows 10 Reporter: Gautam Worah Build: [https://jenkins.thetaphi.de/job/Lucene-main-Windows/10725/] is failing. Specifically, the loop which checks if any files still remain to be deleted is not ending. We have added an exception to the main test class to not run the test on WindowsFS (not sure if this is related). ``` SEVERE: 1 thread leaked from SUITE scope at org.apache.lucene.facet.taxonomy.directory.TestAlwaysRefreshDirectoryTaxonomyReader: 1) Thread[id=19, name=TEST-TestAlwaysRefreshDirectoryTaxonomyReader.testAlwaysRefreshDirectoryTaxonomyReader-seed#[F46E42CB7F2B6959], state=RUNNABLE, group=TGRP-TestAlwaysRefreshDirectoryTaxonomyReader] at java.base@18/sun.nio.fs.WindowsNativeDispatcher.GetFileAttributesEx0(Native Method) at java.base@18/sun.nio.fs.WindowsNativeDispatcher.GetFileAttributesEx(WindowsNativeDispatcher.java:390) at java.base@18/sun.nio.fs.WindowsFileAttributes.get(WindowsFileAttributes.java:307) at java.base@18/sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:251) at java.base@18/sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:105) at app/org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.FilterFileSystemProvider.delete(FilterFileSystemProvider.java:130) at app/org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.FilterFileSystemProvider.delete(FilterFileSystemProvider.java:130) at app/org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.FilterFileSystemProvider.delete(FilterFileSystemProvider.java:130) at app/org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.FilterFileSystemProvider.delete(FilterFileSystemProvider.java:130) at app/org.apache.lucene.test_framework@10.0.0-SNAPSHOT/org.apache.lucene.tests.mockfile.FilterFileSystemProvider.delete(FilterFileSystemProvider.java:130) at java.base@18/java.nio.file.Files.delete(Files.java:1152) at app/org.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.store.FSDirectory.privateDeleteFile(FSDirectory.java:344) at app/org.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.store.FSDirectory.deletePendingFiles(FSDirectory.java:325) at app/org.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.store.FSDirectory.getPendingDeletions(FSDirectory.java:410) at app/org.apache.lucene.core@10.0.0-SNAPSHOT/org.apache.lucene.store.FilterDirectory.getPendingDeletions(FilterDirectory.java:121) at app//org.apache.lucene.facet.taxonomy.directory.TestAlwaysRefreshDirectoryTaxonomyReader.testAlwaysRefreshDirectoryTaxonomyReader(TestAlwaysRefreshDirectoryTaxonomyReader.java:97) ``` -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
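From the stack trace, the test appears to be spinning on Directory#getPendingDeletions() while Windows keeps the files pinned. Purely as a hedged illustration of that loop (not the actual fix; the helper name and timeout are assumptions), a bounded variant could look like:

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import org.apache.lucene.store.Directory;

final class PendingDeletionsWait {
  /** Waits until dir reports no pending deletions, or gives up after timeoutMillis. */
  static boolean awaitPendingDeletions(Directory dir, long timeoutMillis)
      throws IOException, InterruptedException {
    final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (dir.getPendingDeletions().isEmpty() == false) {
      if (System.nanoTime() - deadline > 0) {
        // Still pending (e.g. Windows holding the files open); let the caller decide.
        return false;
      }
      Thread.sleep(100);
    }
    return true;
  }
}
```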
[jira] [Commented] (LUCENE-10482) Allow users to create their own DirectoryTaxonomyReaders with empty taxoArrays instead of letting the taxoEpoch decide
[ https://issues.apache.org/jira/browse/LUCENE-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523902#comment-17523902 ] ASF subversion and git services commented on LUCENE-10482: -- Commit c38870585542dd86c051c8978e944e39a386f8ec in lucene's branch refs/heads/main from Michael McCandless [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=c3887058554 ] LUCENE-10482: Ignore this test for now > Allow users to create their own DirectoryTaxonomyReaders with empty > taxoArrays instead of letting the taxoEpoch decide > -- > > Key: LUCENE-10482 > URL: https://issues.apache.org/jira/browse/LUCENE-10482 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 9.1 >Reporter: Gautam Worah >Priority: Minor > Time Spent: 8.5h > Remaining Estimate: 0h > > I was experimenting with the taxonomy index and {{DirectoryTaxonomyReaders}} > in my day job where we were trying to replace the index underneath a reader > asynchronously and then call the {{doOpenIfChanged}} call on it. > It turns out that the taxonomy index uses its own index based counter (the > {{{}taxonomyIndexEpoch{}}}) to determine if the index was opened in write > mode after the last time it was written and if not, it directly tries to > reuse the previous {{taxoArrays}} it had created. This logic fails in a > scenario where both the old and new index were opened just once but the index > itself is completely different in both the cases. > In such a case, it would be good to give the user the flexibility to inform > the DTR to recreate its {{{}taxoArrays{}}}, {{ordinalCache}} and > {{{}categoryCache{}}} (not refreshing these arrays causes it to fail in > various ways). Luckily, such a constructor already exists! But it is private > today! The idea here is to allow subclasses of DTR to use this constructor. > Curious to see what other folks think about this idea. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10482) Allow users to create their own DirectoryTaxonomyReaders with empty taxoArrays instead of letting the taxoEpoch decide
[ https://issues.apache.org/jira/browse/LUCENE-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523903#comment-17523903 ] ASF subversion and git services commented on LUCENE-10482: -- Commit 39a8c7d1369fc1d6f4232bc34cc87e8fe9cd925e in lucene's branch refs/heads/branch_9x from Michael McCandless [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=39a8c7d1369 ] LUCENE-10482: Ignore this test for now > Allow users to create their own DirectoryTaxonomyReaders with empty > taxoArrays instead of letting the taxoEpoch decide > -- > > Key: LUCENE-10482 > URL: https://issues.apache.org/jira/browse/LUCENE-10482 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 9.1 >Reporter: Gautam Worah >Priority: Minor > Time Spent: 8.5h > Remaining Estimate: 0h > > I was experimenting with the taxonomy index and {{DirectoryTaxonomyReaders}} > in my day job where we were trying to replace the index underneath a reader > asynchronously and then call the {{doOpenIfChanged}} call on it. > It turns out that the taxonomy index uses its own index based counter (the > {{{}taxonomyIndexEpoch{}}}) to determine if the index was opened in write > mode after the last time it was written and if not, it directly tries to > reuse the previous {{taxoArrays}} it had created. This logic fails in a > scenario where both the old and new index were opened just once but the index > itself is completely different in both the cases. > In such a case, it would be good to give the user the flexibility to inform > the DTR to recreate its {{{}taxoArrays{}}}, {{ordinalCache}} and > {{{}categoryCache{}}} (not refreshing these arrays causes it to fail in > various ways). Luckily, such a constructor already exists! But it is private > today! The idea here is to allow subclasses of DTR to use this constructor. > Curious to see what other folks think about this idea. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10518) FieldInfos consistency check can refuse to open Lucene 8 index
[ https://issues.apache.org/jira/browse/LUCENE-10518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523918#comment-17523918 ] Nhat Nguyen commented on LUCENE-10518: -- [~mayya] Thank you for your response. I understand the concern. I think the current consistency check is not good enough to enable these rewrite optimizations. We can open an inconsistent index (created in Lucene8) as read-only, then searches with that reader can return incorrect results. Or we can open that inconsistent index after force-merge. > FieldInfos consistency check can refuse to open Lucene 8 index > -- > > Key: LUCENE-10518 > URL: https://issues.apache.org/jira/browse/LUCENE-10518 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: 8.10.1 >Reporter: Nhat Nguyen >Priority: Major > > A field-infos consistency check introduced in Lucene 9 (LUCENE-9334) can > refuse to open a Lucene 8 index. Lucene 8 can create a partial FieldInfo if > hitting a non-aborting exception (for example [term is too > long|https://github.com/apache/lucene-solr/blob/6a6484ba396927727b16e5061384d3cd80d616b2/lucene/core/src/java/org/apache/lucene/index/DefaultIndexingChain.java#L944]) > during processing fields of a document. We don't have this problem in Lucene > 9 as we process fields in two phases with the [first > phase|https://github.com/apache/lucene/blob/10ebc099c846c7d96f4ff5f9b7853df850fa8442/lucene/core/src/java/org/apache/lucene/index/IndexingChain.java#L589-L614] > processing only FieldInfos. > The issue can be reproduced with this snippet. > {code:java} > public void testWriteIndexOn8x() throws Exception { > FieldType KeywordField = new FieldType(); > KeywordField.setTokenized(false); > KeywordField.setOmitNorms(true); > KeywordField.setIndexOptions(IndexOptions.DOCS); > KeywordField.freeze(); > try (Directory dir = newDirectory()) { > IndexWriterConfig config = new IndexWriterConfig(); > config.setCommitOnClose(false); > config.setMergePolicy(NoMergePolicy.INSTANCE); > try (IndexWriter writer = new IndexWriter(dir, config)) { > // first segment > writer.addDocument(new Document()); // an empty doc > Document d1 = new Document(); > byte[] chars = new byte[IndexWriter.MAX_STORED_STRING_LENGTH + 1]; > Arrays.fill(chars, (byte) 'a'); > d1.add(new Field("field", new BytesRef(chars), KeywordField)); > d1.add(new BinaryDocValuesField("field", new BytesRef(chars))); > expectThrows(IllegalArgumentException.class, () -> > writer.addDocument(d1)); > writer.flush(); > // second segment > Document d2 = new Document(); > d2.add(new Field("field", new BytesRef("hello world"), KeywordField)); > d2.add(new SortedDocValuesField("field", new BytesRef("hello world"))); > writer.addDocument(d2); > writer.flush(); > writer.commit(); > // Check for doc values types consistency > Map docValuesTypes = new HashMap<>(); > try(DirectoryReader reader = DirectoryReader.open(dir)){ > for (LeafReaderContext leaf : reader.leaves()) { > for (FieldInfo fi : leaf.reader().getFieldInfos()) { > DocValuesType current = docValuesTypes.putIfAbsent(fi.name, > fi.getDocValuesType()); > if (current != null && current != fi.getDocValuesType()) { > fail("cannot change DocValues type from " + current + " to " + > fi.getDocValuesType() + " for field \"" + fi.name + "\""); > } > } > } > } > } > } > } > {code} > I would like to propose to: > - Backport the two-phase fields processing from Lucene9 to Lucene8. The patch > should be small and contained. 
> - Introduce an option in Lucene9 to skip checking field-infos consistency > (i.e., behave like Lucene 8 when the option is enabled). > /cc [~mayya] and [~jpountz] -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
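For readers skimming the flattened snippet above, the consistency check it performs boils down to the standalone helper below; the class and method names are illustrative, not an existing Lucene API:

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.DocValuesType;
import org.apache.lucene.index.FieldInfo;
import org.apache.lucene.index.LeafReaderContext;

final class DocValuesConsistencyCheck {
  /** Fails if any field changes its DocValuesType from one segment to another. */
  static void assertConsistent(DirectoryReader reader) {
    Map<String, DocValuesType> seen = new HashMap<>();
    for (LeafReaderContext leaf : reader.leaves()) {
      for (FieldInfo fi : leaf.reader().getFieldInfos()) {
        DocValuesType previous = seen.putIfAbsent(fi.name, fi.getDocValuesType());
        if (previous != null && previous != fi.getDocValuesType()) {
          throw new IllegalStateException(
              "field \"" + fi.name + "\" changes DocValuesType from "
                  + previous + " to " + fi.getDocValuesType());
        }
      }
    }
  }
}
```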
[jira] [Created] (LUCENE-10522) issue with pattern capture group token filter
Dishant Sharma created LUCENE-10522: --- Summary: issue with pattern capture group token filter Key: LUCENE-10522 URL: https://issues.apache.org/jira/browse/LUCENE-10522 Project: Lucene - Core Issue Type: Task Reporter: Dishant Sharma -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-10522) issue with pattern capture group token filter
[ https://issues.apache.org/jira/browse/LUCENE-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dishant Sharma updated LUCENE-10522: Description: |The default pattern capture token filter in elastic search gives the same start and end offset for each generated token: the start and end offset as that of the input string. Is there any way by which I can change the start and end offset of an input string to the positions at which they are found in the input string? The issue that I'm currently facing is that in case of highlighting, it highlights enter string instead of the match. I am getting all the tokens using the regexes that I have created but the only issue is that all the tokens have the same start and end offsets as that of the input string. I am using the pattern token filter alongwith the whitespace tokenizer. Suppose I have a text: "Website url is [https://www.google.com/]"; Then, the desired tokens are: Website, url, is, [https://www.google.com/], https, www, google, com, https:, https:/, https://, /www, .google, .com, [www|http://www/]., google., com/, www [google.com|http://google.com/] etc. I am getting all these tokens through my regexes the only issue is with the offsets. Suppose the start and end offsets of the entire url "[https://www.google.com/]"; are 0 and 23, so it is giving 0 and 23 for all the generated tokens. But, as per my use case, I'm using the highlighting functionality where I have to use it to highlight all the generated tokens inside the text. But, the issue here is that I instead of highlighting only the match inside the text, it is highlighting the entire input text.| > issue with pattern capture group token filter > - > > Key: LUCENE-10522 > URL: https://issues.apache.org/jira/browse/LUCENE-10522 > Project: Lucene - Core > Issue Type: Task >Reporter: Dishant Sharma >Priority: Critical > > |The default pattern capture token filter in elastic search gives the same > start and end offset for each generated token: the start and end offset as > that of the input string. Is there any way by which I can change the start > and end offset of an input string to the positions at which they are found in > the input string? The issue that I'm currently facing is that in case of > highlighting, it highlights enter string instead of the match. > I am getting all the tokens using the regexes that I have created but the > only issue is that all the tokens have the same start and end offsets as that > of the input string. > I am using the pattern token filter alongwith the whitespace tokenizer. > Suppose I have a text: > "Website url is [https://www.google.com/]"; > Then, the desired tokens are: > Website, url, is, [https://www.google.com/], https, www, google, com, https:, > https:/, https://, /www, .google, .com, [www|http://www/]., google., com/, > www [google.com|http://google.com/] etc. > I am getting all these tokens through my regexes the only issue is with the > offsets. Suppose the start and end offsets of the entire url > "[https://www.google.com/]"; are 0 and 23, so it is giving 0 and 23 for all > the generated tokens. > But, as per my use case, I'm using the highlighting functionality where I > have to use it to highlight all the generated tokens inside the text. 
But, > the issue here is that I instead of highlighting only the match inside the > text, it is highlighting the entire input text.| -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
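At the Lucene level, the filter behind this report is org.apache.lucene.analysis.pattern.PatternCaptureGroupTokenFilter. A minimal sketch of the reported offset behaviour is below; the regex, analyzer wiring, and field name are illustrative assumptions, not the reporter's actual url-filter-* definitions:

```java
import java.util.regex.Pattern;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.pattern.PatternCaptureGroupTokenFilter;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;

public class CaptureGroupOffsetsDemo {
  public static void main(String[] args) throws Exception {
    Analyzer analyzer =
        new Analyzer() {
          @Override
          protected TokenStreamComponents createComponents(String fieldName) {
            Tokenizer source = new WhitespaceTokenizer();
            TokenStream sink =
                new PatternCaptureGroupTokenFilter(
                    source, true, Pattern.compile("(https?)://(www)\\.([^./]+)\\.([a-z]+)"));
            return new TokenStreamComponents(source, sink);
          }
        };
    try (TokenStream ts =
        analyzer.tokenStream("body", "Website url is https://www.google.com/")) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      OffsetAttribute offsets = ts.addAttribute(OffsetAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        // Capture-group tokens ("https", "www", ...) report the offsets of the
        // whole whitespace token they were derived from.
        System.out.println(term + " [" + offsets.startOffset() + "," + offsets.endOffset() + ")");
      }
      ts.end();
    }
    analyzer.close();
  }
}
```

That shared start/end offset for every emitted group is consistent with the highlighting behaviour described in the report, where the whole URL (or the whole input) gets highlighted instead of the matched fragment.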
[jira] [Updated] (LUCENE-10522) issue with pattern capture group token filter
[ https://issues.apache.org/jira/browse/LUCENE-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dishant Sharma updated LUCENE-10522: Description: |The default pattern capture token filter in elastic search gives the same start and end offset for each generated token: the start and end offset as that of the input string. Is there any way by which I can change the start and end offset of an input string to the positions at which they are found in the input string? The issue that I'm currently facing is that in case of highlighting, it highlights enter string instead of the match.| I am not using any mapping but, I am using the below index settings as per my use case: {{ "settings" : \{ "analysis" : { "analyzer" : { "special_analyzer" : { "tokenizer" : "whitespace", "filter" : [ "url-filter-1", "url-filter-2", "url-filter-3", "url-filter-4", "url-filter-5", "url-filter-6", "url-filter-7", "url-filter-8", "url-filter-9", "url-filter-10", "url-filter-11", "unique" ] } } } }}} I am getting all the tokens using the regexes that I have created but the only issue is that all the tokens have the same start and end offsets as that of the input string. I am using the pattern token filter alongwith the whitespace tokenizer. Suppose I have a text: "Website url is [https://www.google.com/]"; Then, the desired tokens are: Website, url, is, [https://www.google.com/], https, www, google, com, https:, https:/, [https://|https:], /www, .google, .com, [www|http://www/]., google., com/, www [google.com|http://google.com/] etc. I am getting all these tokens through my regexes the only issue is with the offsets. Suppose the start and end offsets of the entire url "[https://www.google.com/]"; are 0 and 23, so it is giving 0 and 23 for all the generated tokens. But, as per my use case, I'm using the highlighting functionality where I have to use it to highlight all the generated tokens inside the text. But, the issue here is that I instead of highlighting only the match inside the text, it is highlighting the entire input text.| was: |The default pattern capture token filter in elastic search gives the same start and end offset for each generated token: the start and end offset as that of the input string. Is there any way by which I can change the start and end offset of an input string to the positions at which they are found in the input string? The issue that I'm currently facing is that in case of highlighting, it highlights enter string instead of the match. I am getting all the tokens using the regexes that I have created but the only issue is that all the tokens have the same start and end offsets as that of the input string. I am using the pattern token filter alongwith the whitespace tokenizer. Suppose I have a text: "Website url is [https://www.google.com/]"; Then, the desired tokens are: Website, url, is, [https://www.google.com/], https, www, google, com, https:, https:/, https://, /www, .google, .com, [www|http://www/]., google., com/, www [google.com|http://google.com/] etc. I am getting all these tokens through my regexes the only issue is with the offsets. Suppose the start and end offsets of the entire url "[https://www.google.com/]"; are 0 and 23, so it is giving 0 and 23 for all the generated tokens. But, as per my use case, I'm using the highlighting functionality where I have to use it to highlight all the generated tokens inside the text. 
But, the issue here is that I instead of highlighting only the match inside the text, it is highlighting the entire input text.| > issue with pattern capture group token filter > - > > Key: LUCENE-10522 > URL: https://issues.apache.org/jira/browse/LUCENE-10522 > Project: Lucene - Core > Issue Type: Task >Reporter: Dishant Sharma >Priority: Critical > > |The default pattern capture token filter in elastic search gives the same > start and end offset for each generated token: the start and end offset as > that of the input string. Is there any way by which I can change the start > and end offset of an input string to the positions at which they are found in > the input string? The issue that I'm currently facing is that in case of > highlighting, it highlights enter string instead of the match.| > > I am not using any mapping but, I am using the below index settings as per my > use case: > {{ "settings" : \{ > "analysis" : { > "analyzer" : { > "special_analyzer" : { >"tokenizer" : "whitespace", >"filter" : [ "url-filter-1", "url-filter-2", "url-filter-3", > "url-filter-4", "url-filter-5", "url-filter-6", "url-filter-7", > "url-filter-8", "url-filter-9", "url-filter-10", "url-