[GitHub] [lucene-solr] dweiss merged pull request #2068: LUCENE-8982: Separate out native code to another module to allow cpp build with gradle
dweiss merged pull request #2068: URL: https://github.com/apache/lucene-solr/pull/2068
[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232609#comment-17232609 ]

ASF subversion and git services commented on LUCENE-8982:
----------------------------------------------------------

Commit ebc87a8a27f3b3bd89ea7c38c8b701d94e50788d in lucene-solr's branch refs/heads/master from zacharymorn
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ebc87a8 ]

LUCENE-8982: Separate out native code to another module to allow cpp build with gradle (#2068)

> Make NativeUnixDirectory pure java now that direct IO is possible
> ------------------------------------------------------------------
>
>                 Key: LUCENE-8982
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8982
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/misc
>            Reporter: Michael McCandless
>            Priority: Major
>          Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> {{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO to write newly merged segments. Direct IO bypasses the kernel's buffer cache and write cache, making merge writes "invisible" to the kernel, though the reads for merging the N segments are still going through the kernel.
> But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the {{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in pure java code, so we should now fix {{NativeUnixDirectory}} to not use JNI anymore.
> We should also run some more realistic benchmarks seeing if this option really helps nodes that are doing concurrent indexing (merging) and searching.
[GitHub] [lucene-solr] dweiss commented on pull request #2068: LUCENE-8982: Separate out native code to another module to allow cpp build with gradle
dweiss commented on pull request #2068: URL: https://github.com/apache/lucene-solr/pull/2068#issuecomment-727827305

I've added a CHANGES entry and committed it in, thanks @zacharymorn!
[jira] [Assigned] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss reassigned LUCENE-8982:
-----------------------------------

    Assignee: Dawid Weiss
[jira] [Commented] (SOLR-14998) any Collections Handler actions should be logged at debug level
[ https://issues.apache.org/jira/browse/SOLR-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232611#comment-17232611 ]

Nazerke Seidan commented on SOLR-14998:
---------------------------------------

PR: https://github.com/apache/lucene-solr/pull/2079

> any Collections Handler actions should be logged at debug level
> ----------------------------------------------------------------
>
>                 Key: SOLR-14998
>                 URL: https://issues.apache.org/jira/browse/SOLR-14998
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Nazerke Seidan
>            Priority: Minor
>
> CLUSTERSTATUS is logged in CollectionsHandler at INFO level but the cluster status is already logged in HttpSolrCall at INFO. In CollectionsHandler INFO level should be set to DEBUG to avoid a lot of noise.
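For readers skimming the thread, the proposed change amounts to demoting one log call. The sketch below is illustrative only (the class and message text are placeholders, not the actual patch in the linked PR): the per-action line moves from INFO to DEBUG, since HttpSolrCall already records the request at INFO.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class ClusterStatusLogging {
  private static final Logger log = LoggerFactory.getLogger(ClusterStatusLogging.class);

  // Hypothetical handler method: the second, redundant line is only emitted at DEBUG.
  void logInvokedAction(String action, String params) {
    if (log.isDebugEnabled()) {
      log.debug("Invoked Collection Action: {} with params {}", action, params);
    }
  }
}
{code}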
[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232612#comment-17232612 ]

Dawid Weiss commented on LUCENE-8982:
-------------------------------------

I've committed in the extracted native library part. I still think it'd be good to see if we can get the same performance with just plain java (and new direct flags).
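For reference, the "new direct flags" mentioned above refer to opening a file for direct IO from plain Java. A minimal sketch follows, assuming a JDK (10+) that exposes {{com.sun.nio.file.ExtendedOpenOption.DIRECT}} and {{FileStore.getBlockSize()}}; this is not the Lucene patch, and the file name is made up:

{code:java}
import com.sun.nio.file.ExtendedOpenOption;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class DirectWriteSketch {
  public static void main(String[] args) throws IOException {
    Path dir = Paths.get(".");                       // made-up output location
    Path file = dir.resolve("merged-segment.tmp");   // made-up file name

    // O_DIRECT requires the buffer address, the file position and the IO size
    // to be aligned to the file system's block size.
    int blockSize = Math.toIntExact(Files.getFileStore(dir).getBlockSize());

    try (FileChannel out = FileChannel.open(file,
        StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE,
        ExtendedOpenOption.DIRECT)) {                // the pure-Java replacement for the JNI O_DIRECT wrapper
      ByteBuffer block = ByteBuffer.allocateDirect(2 * blockSize)
          .alignedSlice(blockSize);                  // memory-aligned scratch buffer
      // ... fill the buffer with segment bytes, zero-padding the final block ...
      block.limit(blockSize);
      out.write(block);                              // bypasses the OS page cache
    }
  }
}
{code}

With this available, the misc module no longer needs to ship or build a platform-specific library just to pass the flag.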
[jira] [Created] (LUCENE-9611) Remove deprecated PACKED_SINGLE_BLOCK from PackedInts
Ignacio Vera created LUCENE-9611:
------------------------------------

             Summary: Remove deprecated PACKED_SINGLE_BLOCK from PackedInts
                 Key: LUCENE-9611
                 URL: https://issues.apache.org/jira/browse/LUCENE-9611
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Ignacio Vera

In LUCENE-7521, the PACKED_SINGLE_BLOCK format was deprecated. I propose to remove it entirely for Lucene 9.0.
[jira] [Commented] (LUCENE-9611) Remove deprecated PACKED_SINGLE_BLOCK from PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232657#comment-17232657 ]

Ignacio Vera commented on LUCENE-9611:
--------------------------------------

Ok, no, the change was done in 9.0 so it cannot be removed until Lucene 10.
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #2065: SOLR-14977 : ContainerPlugins should be configurable
noblepaul commented on a change in pull request #2065: URL: https://github.com/apache/lucene-solr/pull/2065#discussion_r524089049

## File path: solr/core/src/java/org/apache/solr/api/ContainerPluginsRegistry.java

## @@ -114,6 +118,16 @@ public synchronized ApiInfo getPlugin(String name) {
     return currentPlugins.get(name);
   }
+  static class PluginMetaHolder {
+    private final Map original;
+    private final PluginMeta meta;

Review comment: It can be a specific sub-class of `PluginMeta`, say `EndPointPluginMeta`. However, that change does not belong here.
[jira] [Created] (SOLR-15002) Upgrade HttpClient to 4.5.13
Andras Salamon created SOLR-15002:
-------------------------------------

             Summary: Upgrade HttpClient to 4.5.13
                 Key: SOLR-15002
                 URL: https://issues.apache.org/jira/browse/SOLR-15002
             Project: Solr
          Issue Type: Task
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Andras Salamon

Upgrade HttpClient to 4.5.13 and HttpCore to 4.4.13.
[GitHub] [lucene-solr] asalamon74 opened a new pull request #2082: SOLR-15002: Upgrade HttpClient to 4.5.13
asalamon74 opened a new pull request #2082: URL: https://github.com/apache/lucene-solr/pull/2082

# Description

Upgrade HttpClient to 4.5.13 and HttpCore to 4.4.13.

# Solution

Upgrade HttpClient to 4.5.13 and HttpCore to 4.4.13.

# Tests

Unit tests.

# Checklist

Please review the following and check all that apply:

- [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [x] I have created a Jira issue and added the issue ID to my pull request title.
- [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [x] I have developed this patch against the `master` branch.
- [x] I have run `./gradlew check`.
- [ ] I have added tests for my changes.
- [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
[GitHub] [lucene-solr] sigram commented on a change in pull request #2065: SOLR-14977 : ContainerPlugins should be configurable
sigram commented on a change in pull request #2065: URL: https://github.com/apache/lucene-solr/pull/2065#discussion_r524211640

## File path: solr/core/src/java/org/apache/solr/api/ContainerPluginsRegistry.java

## @@ -114,6 +118,16 @@ public synchronized ApiInfo getPlugin(String name) {
     return currentPlugins.get(name);
   }
+  static class PluginMetaHolder {
+    private final Map original;
+    private final PluginMeta meta;

Review comment: Let's make it a sub-class. (I would argue that it belongs in this PR because it's a part of the configuration mechanism for the plugins that this PR defines. Since we are adding a flexible config bean it doesn't make sense to still keep the old primitive field. And you can always ADD fields in subclasses but removing them is much harder... ;) )
[jira] [Commented] (LUCENE-9611) Remove deprecated PACKED_SINGLE_BLOCK from PackedInts
[ https://issues.apache.org/jira/browse/LUCENE-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232740#comment-17232740 ]

Adrien Grand commented on LUCENE-9611:
--------------------------------------

It looks like we only use it for numbers of bits per value 1, 2 and 4 when encoding postings. Maybe one way to move forward with this would be to drop PACKED_SINGLE_BLOCK from PackedInts and introduce special decoding logic in Lucene50PostingsFormat's ForUtil for these numbers of bits per value.
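For context on what would need replacing: in the single-block layout each 64-bit long holds 64 / bitsPerValue values and no value ever crosses a long boundary, so for 1, 2 and 4 bits per value decoding is just a shift and a mask (at the cost of wasted bits whenever bitsPerValue does not divide 64). A rough illustration of that layout, fixed at 4 bits per value; this is not Lucene's {{Packed64SingleBlock}} code:

{code:java}
public class SingleBlockSketch {
  static final int BITS_PER_VALUE = 4;
  static final int VALUES_PER_BLOCK = 64 / BITS_PER_VALUE; // 16 values per long
  static final long MASK = (1L << BITS_PER_VALUE) - 1;

  // Read the index-th value: locate its long, then shift and mask.
  static long get(long[] blocks, int index) {
    long block = blocks[index / VALUES_PER_BLOCK];
    int shift = (index % VALUES_PER_BLOCK) * BITS_PER_VALUE;
    return (block >>> shift) & MASK;
  }

  // Write the index-th value in place, clearing its slot first.
  static void set(long[] blocks, int index, long value) {
    int b = index / VALUES_PER_BLOCK;
    int shift = (index % VALUES_PER_BLOCK) * BITS_PER_VALUE;
    blocks[b] = (blocks[b] & ~(MASK << shift)) | ((value & MASK) << shift);
  }
}
{code}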
[GitHub] [lucene-solr] noblepaul merged pull request #2065: SOLR-14977 : ContainerPlugins should be configurable
noblepaul merged pull request #2065: URL: https://github.com/apache/lucene-solr/pull/2065
[jira] [Commented] (SOLR-14977) Container plugins need a way to be configured
[ https://issues.apache.org/jira/browse/SOLR-14977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232752#comment-17232752 ]

ASF subversion and git services commented on SOLR-14977:
---------------------------------------------------------

Commit 73d5e7ae77d8953cb9be35a7cbcebe3a516dd04a in lucene-solr's branch refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=73d5e7a ]

SOLR-14977: ContainerPlugins should be configurable (#2065)

> Container plugins need a way to be configured
> ----------------------------------------------
>
>                 Key: SOLR-14977
>                 URL: https://issues.apache.org/jira/browse/SOLR-14977
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: Plugin system
>            Reporter: Andrzej Bialecki
>            Priority: Major
>         Attachments: SOLR-14977.patch
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Container plugins are defined in {{/clusterprops.json:/plugin}} using a simple {{PluginMeta}} bean. This is sufficient for implementations that don't need any configuration except for the {{pathPrefix}}, but insufficient for anything else that needs more configuration parameters.
> An example would be a {{CollectionsRepairEventListener}} plugin proposed in PR-1962, which needs parameters such as the list of collections, {{waitFor}}, maximum operations allowed, etc. to properly function.
> This issue proposes to extend the {{PluginMeta}} bean to allow a {{Map}} of configuration parameters.
> There is an interface that we could potentially use ({{MapInitializedPlugin}}) but it works only with {{String}} values. This is not optimal because it requires additional type-safety validation from the consumers. The existing {{PluginInfo}} / {{PluginInfoInitialized}} interface is too complex for this purpose.
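To make the shape of the change concrete, here is a self-contained sketch of the pattern the issue describes: a typed config bean carried next to the plugin metadata and handed to the plugin instance. The interface and class names below are illustrative stand-ins, not the exact Solr APIs introduced by the commit:

{code:java}
import java.util.List;

// Stand-in for the hook a configurable container plugin would implement.
interface ConfigurablePluginSketch<C> {
  void configure(C config);   // receives the bean bound from the plugin's "config" map
}

// Typed configuration instead of Map<String, String>: no string parsing by the consumer.
class CollectionsRepairConfigSketch {
  public List<String> collections;
  public int waitForSeconds = 30;
  public int maxOperations = 100;
}

class CollectionsRepairEventListenerSketch
    implements ConfigurablePluginSketch<CollectionsRepairConfigSketch> {
  private CollectionsRepairConfigSketch config;

  @Override
  public void configure(CollectionsRepairConfigSketch config) {
    this.config = config;     // type-safe access to collections, waitFor, max operations, etc.
  }
}
{code}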
[jira] [Commented] (SOLR-15000) Solr based enterprise level, one-stop search center products with high performance, high reliability and high scalability
[ https://issues.apache.org/jira/browse/SOLR-15000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232763#comment-17232763 ]

David Eric Pugh commented on SOLR-15000:
----------------------------------------

I checked out the GitHub repo, and I see you've cut two releases (1.0 and 1.0.1) and that over the past month you have had activity. I always look at the https://github.com/qlangtech/tis-solr/pulse/monthly page on GitHub when evaluating whether an open source project is one I want to invest in adopting! I think your idea about many users just looking for a product is very true! I've gone ahead and added your project to my watchlist for releases ;-). Good luck!

I think this is a ticket that should be closed as a won't-fix, in favour of new, more targeted tickets?

> Solr based enterprise level, one-stop search center products with high performance, high reliability and high scalability
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-15000
>                 URL: https://issues.apache.org/jira/browse/SOLR-15000
>             Project: Solr
>          Issue Type: Wish
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: Admin UI
>            Reporter: bai sui
>            Priority: Minor
>         Attachments: add-collection-step-2-expert.png, add-collection-step-2.png
>
> h2. Summary
> I have developed an enterprise application based on Solr, named TIS. With TIS you can quickly build an enterprise search service. TIS includes three components:
> - offline index building platform
> The data is exported from an ER database (MySQL, SQL Server and so on) through full table scanning, and then the wide table is constructed by a local MR tool, or the wide table is constructed directly by Spark.
> - incremental real-time channel
> Changes are transmitted to Kafka, and real-time stream calculation is carried out by Flink and submitted to the search engine, to ensure that the data in the search engine and the database are consistent in near real time.
> - search engine
> Currently based on Solr 8.
> TIS integrates these components seamlessly and brings users a one-stop, out-of-the-box experience.
> h2. My question
> I want to feed back my code to the community, but TIS focuses on enterprise application search, just as Elasticsearch focuses on visual analysis of time series data. Because Solr is a general search product, *I don't think TIS can be merged directly into Solr. Is it possible for TIS to be a new incubation project under Apache?*
> h2. TIS main features
> - The schema and solrconfig storage are separated from ZK and stored in MySQL. A version management function is provided, and users can roll back to a historical version of the configuration.
> !add-collection-step-2-expert.png|width=500!
> !add-collection-step-2.png|width=500!
> Schema editing can be switched between a visual editing mode and an advanced expert mode.
> - Define wide table rules based on the selected data tables.
> - An offline index building component is provided. Outside the collection, the data is built into Lucene segment files. The segment files are then copied back to the local disk where the Solr core is located, and the new index takes effect after the core is reloaded.
[GitHub] [lucene-solr] cpoerschke commented on pull request #1571: SOLR-14560: Interleaving for Learning To Rank
cpoerschke commented on pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#issuecomment-728089734

> ... I assume we merge to master squashing and then cherry-pick the commit to some other branches? ...

Yes, squash-and-merge to the master branch and then cherry-pick to branch_8x, from which branch_8_8 would be cut in due course as part of the 8.8.0 release process.

> ... Do you think we need to target a major release? Or we could add it in the upcoming minors? ...

Good question.

* From end users' perspective there are no breaking changes which would point towards targeting the 9.x major rather than 8.8+ minor releases.
* From the code API perspective, there are some APIs that changed, but from what I can see -- https://github.com/cpoerschke/lucene-solr/commits/feature/SOLR-14560-cpoerschke-2 has associated commits -- we could provide backwards compatible deprecated niceties to avoid breaking builds for anyone who perhaps had created their own transformer or plugin class that referenced the changed APIs. Having said that, perhaps for `solr/contrib` code the expectations of not breaking APIs in a given 8.x series are different from `solr/core` code.
* But yes, with or without backcompat niceties, I think we can target the 8.8.0 minor rather than the 9.0 major release.
[jira] [Created] (SOLR-15003) SolrCloud Snapshot metadata inconsistent after core replication
Istvan Farkas created SOLR-15003:
------------------------------------

             Summary: SolrCloud Snapshot metadata inconsistent after core replication
                 Key: SOLR-15003
                 URL: https://issues.apache.org/jira/browse/SOLR-15003
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: SolrCloud
    Affects Versions: 7.4.1
            Reporter: Istvan Farkas
         Attachments: snap2-before.txt, snap2-failed.png, state.json

After a replica does a full recovery, the old index directory is deleted, however the snapshot metadata is not updated in ZooKeeper. This means that the affected core will have snapshots which point to a non-existing index directory.

Steps to reproduce (I used Solr 7.4.0 to test this but it likely affects newer versions too):

(1) Create any collection in SolrCloud with more than 1 replica per shard. The state.json of the testcollection2 I used: [^state.json]

(2) Start adding documents to the collection. After some documents are added, create a snapshot. In this example I called it snap2.

{code}
INFO (qtp577405636-12140)-c:testcollection2 o.a.s.s.HttpSolrCall: [admin] webapp=null path=/admin/collections params={name=testcollection2&action=CREATESNAPSHOT&collection=testcollection2&commitName=snap2&wt=javabin&version=2} status=0 QTime=280
{code}

The snapshot is created successfully for both cores; the metadata in ZooKeeper looks like this: [^snap2-before.txt]

For core_node4 the index directory is /solr/testcollection2/core_node4/data/index.

(3) Shut down one of the Solr servers. Here I use the server hosting core_node4.

(4) Continue adding documents to the collection (add at least 100 documents to ensure that the replica which is shut down will go to a full replication recovery on the next start). Start the server:

{code}
INFO (recoveryExecutor-4-thread-1-processing-n:snapshot-test-2.example.com:8985_solr x:testcollection2_shard1_replica_n2 s:shard1 c:testcollection2 r:core_node4)-c:testcollection2-s:shard1-r:core_node4-x:testcollection2_shard1_replica_n2-o.a.s.u.PeerSyncWithLeader: PeerSync: core=testcollection2_shard1_replica_n2 url=https://snapshot-test-2.example.com:8985/solr Received 99 versions from https://snapshot-test-3.example.com:8985/solr/testcollection2_shard1_replica_n1/
INFO (recoveryExecutor-4-thread-1-processing-n:snapshot-test-2.example.com:8985_solr x:testcollection2_shard1_replica_n2 s:shard1 c:testcollection2 r:core_node4)-c:testcollection2-s:shard1-r:core_node4-x:testcollection2_shard1_replica_n2-o.a.s.u.PeerSync: PeerSync: core=testcollection2_shard1_replica_n2 url=https://snapshot-test-2.example.com:8985/solr Our versions are too old. ourHighThreshold=1683104801277083650 otherLowThreshold=1683139494002294784 ourHighest=1683104801293860865 otherHighest=1683139494085132289
INFO (recoveryExecutor-4-thread-1-processing-n:snapshot-test-2.example.com:8985_solr x:testcollection2_shard1_replica_n2 s:shard1 c:testcollection2 r:core_node4)-c:testcollection2-s:shard1-r:core_node4-x:testcollection2_shard1_replica_n2-o.a.s.u.PeerSyncWithLeader: PeerSync: core=testcollection2_shard1_replica_n2 url=https://snapshot-test-2.example.com:8985/solr DONE. sync failed
INFO (recoveryExecutor-4-thread-1-processing-n:snapshot-test-2.example.com:8985_solr x:testcollection2_shard1_replica_n2 s:shard1 c:testcollection2 r:core_node4)-c:testcollection2-s:shard1-r:core_node4-x:testcollection2_shard1_replica_n2-o.a.s.c.RecoveryStrategy: PeerSync Recovery was not successful - trying replication.
INFO (recoveryExecutor-4-thread-1-processing-n:snapshot-test-2.example.com:8985_solr x:testcollection2_shard1_replica_n2 s:shard1 c:testcollection2 r:core_node4)-c:testcollection2-s:shard1-r:core_node4-x:testcollection2_shard1_replica_n2-o.a.s.c.RecoveryStrategy: Starting Replication Recovery.
{code}

(5) After the replication is finished, index.properties points to the new directory index.20201112075340480:

{code}
hdfs dfs -cat /solr/testcollection2/core_node4/data/index.properties
#index.properties
#Thu Nov 12 07:58:52 GMT+00:00 2020
index=index.20201112075340480
{code}

And the original index directory for core_node4 has been deleted:

{code}
hdfs dfs -du -h /solr/testcollection2/core_node4/data
0        0       /solr/testcollection2/core_node4/data/index
4.5 G    8.9 G   /solr/testcollection2/core_node4/data/index.20201112075340480
84       168     /solr/testcollection2/core_node4/data/index.properties
215      430     /solr/testcollection2/core_node4/data/replication.properties
401      802     /solr/testcollection2/core_node4/data/snapshot_metadata
9.2 M    18.5 M  /solr/testcollection2/core_node4/data/tlog
{code}

The snapshot metadata in ZooKeeper is exactly the same as in step (2), so snap2 still points to the index directory /solr/testcollection2/core_node4/data/index, which is empty by this time.

(6) Try de
[jira] [Commented] (LUCENE-9378) Configurable compression for BinaryDocValues
[ https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232812#comment-17232812 ]

Adrien Grand commented on LUCENE-9378:
--------------------------------------

This introduced some slowdowns on the nightly benchmarks, e.g. http://people.apache.org/~mikemccand/lucenebench/BrowseMonthTaxoFacets.html. It would be nice if the strategy for BEST_SPEED performed better on linear scans.

> Configurable compression for BinaryDocValues
> ---------------------------------------------
>
>                 Key: LUCENE-9378
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9378
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Viral Gandhi
>            Priority: Major
>             Fix For: 8.8
>
>         Attachments: hotspots-v76x.png, hotspots-v76x.png, hotspots-v76x.png, hotspots-v76x.png, hotspots-v76x.png, hotspots-v77x.png, hotspots-v77x.png, hotspots-v77x.png, hotspots-v77x.png, image-2020-06-12-22-17-30-339.png, image-2020-06-12-22-17-53-961.png, image-2020-06-12-22-18-24-527.png, image-2020-06-12-22-18-48-919.png, snapshot-v77x.nps, snapshot-v77x.nps, snapshot-v77x.nps, snapshots-v76x.nps, snapshots-v76x.nps, snapshots-v76x.nps
>
>          Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Lucene 8.5.1 includes a change to always [compress BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This caused a (~30%) reduction in our red-line QPS (throughput).
> We think users should be given some way to opt in to this compression feature instead of it always being enabled, which can have a substantial query time cost as we saw during our upgrade. [~mikemccand] suggested one possible approach: introducing a *mode* in Lucene80DocValuesFormat (COMPRESSED and UNCOMPRESSED) and allowing users to create a custom Codec subclassing the default Codec and pick the format they want.
> The idea is similar to Lucene50StoredFieldsFormat, which has two modes, Mode.BEST_SPEED and Mode.BEST_COMPRESSION.
> Here's a related issue for adding a benchmark covering BINARY doc values query-time performance - https://github.com/mikemccand/luceneutil/issues/61
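For anyone wanting to opt into the faster mode along the lines sketched in the description, the usual route is a thin {{FilterCodec}} wrapper that swaps in a doc values format configured for speed. The sketch below follows that approach; the Mode-taking constructor mirrors what this issue describes, and its exact signature may differ from what shipped:

{code:java}
import org.apache.lucene.codecs.Codec;
import org.apache.lucene.codecs.DocValuesFormat;
import org.apache.lucene.codecs.FilterCodec;
import org.apache.lucene.codecs.lucene80.Lucene80DocValuesFormat;

// Delegates everything to the default codec except the doc values format.
public class BestSpeedDVCodec extends FilterCodec {
  private final DocValuesFormat docValuesFormat =
      new Lucene80DocValuesFormat(Lucene80DocValuesFormat.Mode.BEST_SPEED); // mode per this issue

  public BestSpeedDVCodec() {
    super("BestSpeedDVCodec", Codec.getDefault());
  }

  @Override
  public DocValuesFormat docValuesFormat() {
    return docValuesFormat;
  }
}
{code}

A codec like this would also need to be registered through SPI (META-INF/services/org.apache.lucene.codecs.Codec) so segments written with it can be opened later, and selected at write time via IndexWriterConfig.setCodec(...).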
[jira] [Updated] (SOLR-14970) elevation does not work without elevate.xml config
[ https://issues.apache.org/jira/browse/SOLR-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bernd Wahlen updated SOLR-14970:
--------------------------------

    Summary: elevation does not work without elevate.xml config  (was: elevation does not workout elevate.xml config)

> elevation does not work without elevate.xml config
> ---------------------------------------------------
>
>                 Key: SOLR-14970
>                 URL: https://issues.apache.org/jira/browse/SOLR-14970
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>    Affects Versions: 8.6.3
>            Reporter: Bernd Wahlen
>            Priority: Minor
>
> When I remove the elevate.xml line from solrconfig.xml plus the file, elevation is not working and no error is logged.
> We put the ids directly in the query and we are not using the default fields or ids, so the xml is completely useless, but required to let the elevation component work. Example query:
> {code:java}
> http://staging.qeep.net:8983/solr/profile_v2/elevate?q=%2Bapp_sns%3A%20qeep&sort=random_4239%20desc,%20id%20desc&elevateIds=361018,361343&forceElevation=true
> {code}
> {code:java}
> string
> elevate.xml
> elevated
> explicit
> elevator
> {code}
[jira] [Commented] (SOLR-14970) elevation does not work without elevate.xml config
[ https://issues.apache.org/jira/browse/SOLR-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232848#comment-17232848 ]

Bernd Wahlen commented on SOLR-14970:
-------------------------------------

Solution sounds good, I'll try to understand the code...
[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232870#comment-17232870 ]

Dawid Weiss commented on LUCENE-8982:
-------------------------------------

Yeah... like I suspected - CI does complain about missing toolchains.

https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/28653/consoleText

{code}
* What went wrong:
Execution failed for task ':lucene:misc:native:compileDebugLinuxCpp'.
> No tool chain is available to build C++ for host operating system 'Linux' architecture 'x86-64':
  - Tool chain 'visualCpp' (Visual Studio):
    - Visual Studio is not available on this operating system.
  - Tool chain 'gcc' (GNU GCC):
    - Could not find C++ compiler 'g++' in system path.
  - Tool chain 'clang' (Clang):
    - Could not find C++ compiler 'clang++' in system path.
{code}

I think we'll switch native builds to be optionally enabled (rather than disabled)?
[jira] [Updated] (SOLR-15004) Unit tests for the replica placement API
[ https://issues.apache.org/jira/browse/SOLR-15004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki updated SOLR-15004:
------------------------------------

    Description: This is a follow-up to SOLR-14613.
[jira] [Created] (SOLR-15004) Unit tests for the replica placement API
Andrzej Bialecki created SOLR-15004:
---------------------------------------

             Summary: Unit tests for the replica placement API
                 Key: SOLR-15004
                 URL: https://issues.apache.org/jira/browse/SOLR-15004
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
          Components: AutoScaling
            Reporter: Andrzej Bialecki
            Assignee: Andrzej Bialecki
[jira] [Updated] (SOLR-15004) Unit tests for the replica placement API
[ https://issues.apache.org/jira/browse/SOLR-15004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki updated SOLR-15004:
------------------------------------

    Description: This is a follow-up to SOLR-14613. Both the APIs and the sample implementations need unit tests.  (was: This is a follow-up to SOLR-14613.)
[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232874#comment-17232874 ]

ASF subversion and git services commented on LUCENE-8982:
----------------------------------------------------------

Commit fd3ffd0d38aaeaa8629943f69dca2ff04afcfbfa in lucene-solr's branch refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fd3ffd0 ]

LUCENE-8982: make native builds disabled by default (CI complains).
[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232880#comment-17232880 ]

Uwe Schindler commented on LUCENE-8982:
---------------------------------------

Hi, would it be possible to build the stuff, if a toolchain is there? It looks like Gradle looks for all different types of tool chains and then gives up. My idea: If it finds windows, it builds windows. If it finds gcc it builds linux, macos same.

My biggest problem is still: How to handle releases of binary artifacts? Do we really want to make this dependent of the release manager's local system?
[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232887#comment-17232887 ]

Dawid Weiss commented on LUCENE-8982:
-------------------------------------

gradle will pick up the toolchain suitable for the platform if it finds any - the CI just doesn't have it.

bq. My biggest problem is still: How to handle releases of binary artifacts? Do we really want to make this dependent of the release manager's local system?

This is what I actually raised in the issue's comment too. I don't think we should include binaries in releases. We should strive to make it java-only (and if somebody really wants to, they can compile native modules locally, for a given platform).
[GitHub] [lucene-solr] HoustonPutman commented on pull request #2020: SOLR-14949: Ability to customize Solr Docker build
HoustonPutman commented on pull request #2020: URL: https://github.com/apache/lucene-solr/pull/2020#issuecomment-728176570

That's fair, we can certainly move it. I included it there because that seemed like the place to put help files for gradle usage. All of the files used in the `help.gradle` are in that directory, but I guess there's no actual requirement for that.
[GitHub] [lucene-solr] HoustonPutman commented on pull request #1769: SOLR-14789: Absorb the docker-solr repo.
HoustonPutman commented on pull request #1769: URL: https://github.com/apache/lucene-solr/pull/1769#issuecomment-728185520

The extra step exists because there was no consensus around how to do official release images. If we want to decide that the official image should be built the same way as it currently is in the project (via the local build), then we can get rid of the sub-module and the extra step. However, if we want to have the official image use the official release binaries, as it does in `docker-solr`, then we will need to keep the submodule.

I would have preferred to have all of this done in one module, but the gradle docker plugin only supports building one image per module. So if we want to build multiple images (which is necessary for supporting the two image types, local and release), we need two modules.

I am all for not adding support for the official binary release strategy and consolidating into one docker file. I just don't want to make that decision unilaterally.
[GitHub] [lucene-solr] uschindler commented on pull request #2022: LUCENE-9004: KNN vector search using NSW graphs
uschindler commented on pull request #2022: URL: https://github.com/apache/lucene-solr/pull/2022#issuecomment-728207517

Hi, this broke javadocs:

```
/home/jenkins/workspace/Lucene-Solr-master-Linux/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java:43: error: heading used out of sequence: , compared to implicit preceding heading:
 * Hyperparameters
   ^
1 error
```

This line is wrong: https://github.com/apache/lucene-solr/blob/b36b4af22bb76dc42b466b818b417bcbc0deb006/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java#L43 (should be ``)
[jira] [Reopened] (LUCENE-9004) Approximate nearest vector search
[ https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler reopened LUCENE-9004:
-----------------------------------

This commit broke Javadocs:

{noformat}
/home/jenkins/workspace/Lucene-Solr-master-Linux/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java:43: error: heading used out of sequence: , compared to implicit preceding heading:
 * Hyperparameters
   ^
1 error
{noformat}

This line is wrong: https://github.com/apache/lucene-solr/blob/b36b4af22bb76dc42b466b818b417bcbc0deb006/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java#L43 (should be )

Maybe this is not detected in Java 11, but later javac versions complain hard.

> Approximate nearest vector search
> ---------------------------------
>
>                 Key: LUCENE-9004
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9004
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Michael Sokolov
>            Assignee: Michael Sokolov
>            Priority: Major
>             Fix For: master (9.0)
>
>         Attachments: hnsw_layered_graph.png
>
>          Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> "Semantic" search based on machine-learned vector "embeddings" representing terms, queries and documents is becoming a must-have feature for a modern search engine. SOLR-12890 is exploring various approaches to this, including providing vector-based scoring functions. This is a spinoff issue from that.
> The idea here is to explore approximate nearest-neighbor search. Researchers have found that an approach based on navigating a graph that partially encodes the nearest neighbor relation at multiple scales can provide accuracy > 95% (as compared to exact nearest neighbor calculations) at a reasonable cost. This issue will explore implementing HNSW (hierarchical navigable small-world) graphs for the purpose of approximate nearest vector search (often referred to as KNN or k-nearest-neighbor search).
> At a high level the way this algorithm works is this. First assume you have a graph that has a partial encoding of the nearest neighbor relation, with some short and some long-distance links. If this graph is built in the right way (has the hierarchical navigable small world property), then you can efficiently traverse it to find nearest neighbors (approximately) in log N time where N is the number of nodes in the graph. I believe this idea was pioneered in [1]. The great insight in that paper is that if you use the graph search algorithm to find the K nearest neighbors of a new document while indexing, and then link those neighbors (undirectedly, ie both ways) to the new document, then the graph that emerges will have the desired properties.
> The implementation I propose for Lucene is as follows. We need two new data structures to encode the vectors and the graph. We can encode vectors using a light wrapper around {{BinaryDocValues}} (we also want to encode the vector dimension and have efficient conversion from bytes to floats). For the graph we can use {{SortedNumericDocValues}} where the values we encode are the docids of the related documents. Encoding the interdocument relations using docids directly will make it relatively fast to traverse the graph since we won't need to lookup through an id-field indirection. This choice limits us to building a graph-per-segment since it would be impractical to maintain a global graph for the whole index in the face of segment merges. However graph-per-segment is a very natural fit at search time - we can traverse each segment's graph independently and merge results as we do today for term-based search.
> At index time, however, merging graphs is somewhat challenging. While indexing we build a graph incrementally, performing searches to construct links among neighbors. When merging segments we must construct a new graph containing elements of all the merged segments. Ideally we would somehow preserve the work done when building the initial graphs, but at least as a start I'd propose we construct a new graph from scratch when merging. The process is going to be limited, at least initially, to graphs that can fit in RAM since we require random access to the entire graph while constructing it: in order to add links bidirectionally we must continually update existing documents.
> I think we want to express this API to users as a single joint {{KnnGraphField}} abstraction that joins together the vectors and the graph as a single joint field type. Mostly it just looks like a vector-valued field, but has this graph attached to it.
> I'll push a branch with my POC and would love to hear comments. It has many nocommits, basic design is not really set, there is no Query implementation
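As a rough companion to the description above, this is the greedy best-first traversal the design relies on, written in plain Java and independent of the patch: starting from an entry point, repeatedly expand the closest unexpanded candidate and keep the k best nodes seen so far, stopping once the closest remaining candidate is farther than the worst kept result. The {{neighbors}} and {{distToQuery}} hooks are hypothetical stand-ins for the graph storage and the vector similarity; at index time the same search finds the neighbors of a newly added document, which are then linked back to it.

{code:java}
import java.util.Comparator;
import java.util.HashSet;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.function.IntFunction;
import java.util.function.IntToDoubleFunction;

class GreedyGraphSearchSketch {

  /** Returns up to k node ids that are approximately nearest to the query (unsorted). */
  static int[] search(int entryPoint, int k,
                      IntFunction<int[]> neighbors,       // adjacency lists of the graph
                      IntToDoubleFunction distToQuery) {  // distance from a node's vector to the query
    Comparator<Integer> byDistance = Comparator.comparingDouble(distToQuery::applyAsDouble);
    PriorityQueue<Integer> candidates = new PriorityQueue<>(byDistance);          // closest first
    PriorityQueue<Integer> results = new PriorityQueue<>(byDistance.reversed());  // farthest first
    Set<Integer> visited = new HashSet<>();

    candidates.add(entryPoint);
    results.add(entryPoint);
    visited.add(entryPoint);

    while (!candidates.isEmpty()) {
      int current = candidates.poll();
      // Stop when the closest unexpanded candidate is already farther than the worst kept result.
      if (results.size() >= k
          && distToQuery.applyAsDouble(current) > distToQuery.applyAsDouble(results.peek())) {
        break;
      }
      for (int neighbor : neighbors.apply(current)) {
        if (!visited.add(neighbor)) {
          continue; // already seen
        }
        if (results.size() < k
            || distToQuery.applyAsDouble(neighbor) < distToQuery.applyAsDouble(results.peek())) {
          candidates.add(neighbor);
          results.add(neighbor);
          if (results.size() > k) {
            results.poll(); // evict the current farthest result
          }
        }
      }
    }
    return results.stream().mapToInt(Integer::intValue).toArray();
  }
}
{code}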
[jira] [Comment Edited] (LUCENE-9004) Approximate nearest vector search
[ https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232941#comment-17232941 ] Uwe Schindler edited comment on LUCENE-9004 at 11/16/20, 5:28 PM: -- This commit broke Javadocs: Hi, this broke javadocs: {noformat} > /home/jenkins/workspace/Lucene-Solr-master-Linux/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java:43: > error: heading used out of sequence: , compared to implicit preceding > heading: * Hyperparameters ^ 1 error {noformat} This line is wrong: https://github.com/apache/lucene-solr/blob/b36b4af22bb76dc42b466b818b417bcbc0deb006/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java#L43 (should be ) Maybe this is not detected in Java 11, but later javac version complain hardly (e.g. Java 14) was (Author: thetaphi): This commit broke Javadocs: Hi, this broke javadocs: {noformat} > /home/jenkins/workspace/Lucene-Solr-master-Linux/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java:43: > error: heading used out of sequence: , compared to implicit preceding > heading: * Hyperparameters ^ 1 error {noformat} This line is wrong: https://github.com/apache/lucene-solr/blob/b36b4af22bb76dc42b466b818b417bcbc0deb006/lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java#L43 (should be ) Maybe this is not detected in Java 11, but later javac version complain hardly. > Approximate nearest vector search > - > > Key: LUCENE-9004 > URL: https://issues.apache.org/jira/browse/LUCENE-9004 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael Sokolov >Assignee: Michael Sokolov >Priority: Major > Fix For: master (9.0) > > Attachments: hnsw_layered_graph.png > > Time Spent: 6h 40m > Remaining Estimate: 0h > > "Semantic" search based on machine-learned vector "embeddings" representing > terms, queries and documents is becoming a must-have feature for a modern > search engine. SOLR-12890 is exploring various approaches to this, including > providing vector-based scoring functions. This is a spinoff issue from that. > The idea here is to explore approximate nearest-neighbor search. Researchers > have found an approach based on navigating a graph that partially encodes the > nearest neighbor relation at multiple scales can provide accuracy > 95% (as > compared to exact nearest neighbor calculations) at a reasonable cost. This > issue will explore implementing HNSW (hierarchical navigable small-world) > graphs for the purpose of approximate nearest vector search (often referred > to as KNN or k-nearest-neighbor search). > At a high level the way this algorithm works is this. First assume you have a > graph that has a partial encoding of the nearest neighbor relation, with some > short and some long-distance links. If this graph is built in the right way > (has the hierarchical navigable small world property), then you can > efficiently traverse it to find nearest neighbors (approximately) in log N > time where N is the number of nodes in the graph. I believe this idea was > pioneered in [1]. The great insight in that paper is that if you use the > graph search algorithm to find the K nearest neighbors of a new document > while indexing, and then link those neighbors (undirectedly, ie both ways) to > the new document, then the graph that emerges will have the desired > properties. > The implementation I propose for Lucene is as follows. We need two new data > structures to encode the vectors and the graph. 
We can encode vectors using a > light wrapper around {{BinaryDocValues}} (we also want to encode the vector > dimension and have efficient conversion from bytes to floats). For the graph > we can use {{SortedNumericDocValues}} where the values we encode are the > docids of the related documents. Encoding the interdocument relations using > docids directly will make it relatively fast to traverse the graph since we > won't need to lookup through an id-field indirection. This choice limits us > to building a graph-per-segment since it would be impractical to maintain a > global graph for the whole index in the face of segment merges. However > graph-per-segment is a very natural at search time - we can traverse each > segments' graph independently and merge results as we do today for term-based > search. > At index time, however, merging graphs is somewhat challenging. While > indexing we build a graph incrementally, performing searches to construct > links among neighbors. When merging segments we must construct a new graph > containing elements of all the merged segments. Ideally we would somehow > preserve the work done when building the initial graphs, but at least as a > start I'd propose
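To make the proposal above more concrete, here is a minimal, single-layer NSW-style sketch in Java. It is illustrative only and is not Lucene's HnswGraph: it assumes dot-product similarity, a random entry point, a small fixed candidate beam, and hypothetical class and method names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;
import java.util.Set;

/** Illustrative single-layer NSW-style index; not Lucene's implementation. */
public class NswSketch {
  private final int maxConn;                            // neighbors linked per inserted vector
  private final List<float[]> vectors = new ArrayList<>();
  private final List<Set<Integer>> neighbors = new ArrayList<>();
  private final Random random = new Random(42);

  public NswSketch(int maxConn) { this.maxConn = maxConn; }

  private static float dot(float[] a, float[] b) {
    float s = 0f;
    for (int i = 0; i < a.length; i++) {
      s += a[i] * b[i];
    }
    return s;
  }

  /** Greedy best-first walk from a random entry point; returns up to topK node ids. */
  public List<Integer> search(float[] query, int topK) {
    if (vectors.isEmpty()) {
      return Collections.emptyList();
    }
    Comparator<Integer> bySimilarity =
        Comparator.comparingDouble((Integer id) -> -dot(query, vectors.get(id)));
    int entry = random.nextInt(vectors.size());
    Set<Integer> visited = new HashSet<>();
    visited.add(entry);
    PriorityQueue<Integer> candidates = new PriorityQueue<>(bySimilarity);
    candidates.add(entry);
    List<Integer> collected = new ArrayList<>();
    // Expand the most similar unexplored node first; stop after a small fixed beam.
    while (!candidates.isEmpty() && collected.size() < 4 * topK) {
      int node = candidates.poll();
      collected.add(node);
      for (int nbr : neighbors.get(node)) {
        if (visited.add(nbr)) {
          candidates.add(nbr);
        }
      }
    }
    collected.sort(bySimilarity);
    return collected.subList(0, Math.min(topK, collected.size()));
  }

  /** Index a vector: find its approximate nearest neighbors first, then link both ways. */
  public void add(float[] vector) {
    List<Integer> nearest = search(vector, maxConn);
    int id = vectors.size();
    vectors.add(vector);
    neighbors.add(new HashSet<>(nearest));
    for (int nbr : nearest) {
      neighbors.get(nbr).add(id);   // undirected link, as in the description above
    }
  }
}
```

Lucene's real implementation adds hierarchical layers and tunable parameters (beam width, maximum connections per node), but the search-then-link insertion loop above is the core idea the description sketches.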
[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232957#comment-17232957 ] Uwe Schindler commented on LUCENE-8982: --- bq. gradle will pick up the toolchain suitable for the platform if it finds any - the CI just doesn't have it. That's fine. My idea was: Can't we build the native dependencies if the toolchain is there, but not fail if it isn't and just print a warning? > Make NativeUnixDirectory pure java now that direct IO is possible > - > > Key: LUCENE-8982 > URL: https://issues.apache.org/jira/browse/LUCENE-8982 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/misc >Reporter: Michael McCandless >Assignee: Dawid Weiss >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > {{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO > to write newly merged segments. Direct IO bypasses the kernel's buffer cache > and write cache, making merge writes "invisible" to the kernel, though the > reads for merging the N segments are still going through the kernel. > But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the > {{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in > pure java code, so we should now fix {{NativeUnixDirectory}} to not use JNI > anymore. > We should also run some more realistic benchmarks seeing if this option > really helps nodes that are doing concurrent indexing (merging) and searching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232963#comment-17232963 ] Uwe Schindler commented on LUCENE-8982: --- In short: change the default to "true" if toolchain is there. > Make NativeUnixDirectory pure java now that direct IO is possible > - > > Key: LUCENE-8982 > URL: https://issues.apache.org/jira/browse/LUCENE-8982 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/misc >Reporter: Michael McCandless >Assignee: Dawid Weiss >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > {{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO > to write newly merged segments. Direct IO bypasses the kernel's buffer cache > and write cache, making merge writes "invisible" to the kernel, though the > reads for merging the N segments are still going through the kernel. > But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the > {{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in > pure java code, so we should now fix {{NativeUnixDirectory}} to not use JNI > anymore. > We should also run some more realistic benchmarks seeing if this option > really helps nodes that are doing concurrent indexing (merging) and searching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232968#comment-17232968 ] Dawid Weiss commented on LUCENE-8982: - You can set -Pbuild.native=true manually on those VMs where you know the tools are available... I don't think duplicating whatever logic gradle uses to detect those toolchains automatically is worth the effort, to be honest. The logic is probably in gradle's sources somewhere. I don't know if it can be done more easily than by copying from their code. > Make NativeUnixDirectory pure java now that direct IO is possible > - > > Key: LUCENE-8982 > URL: https://issues.apache.org/jira/browse/LUCENE-8982 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/misc >Reporter: Michael McCandless >Assignee: Dawid Weiss >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > {{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO > to write newly merged segments. Direct IO bypasses the kernel's buffer cache > and write cache, making merge writes "invisible" to the kernel, though the > reads for merging the N segments are still going through the kernel. > But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the > {{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in > pure java code, so we should now fix {{NativeUnixDirectory}} to not use JNI > anymore. > We should also run some more realistic benchmarks seeing if this option > really helps nodes that are doing concurrent indexing (merging) and searching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley opened a new pull request #2083: SOLR-15001 Docker: require init_var_solr.sh
dsmiley opened a new pull request #2083: URL: https://github.com/apache/lucene-solr/pull/2083 https://issues.apache.org/jira/browse/SOLR-15001 There are two distinct commits here. The second is just to the build.gradle. I struggled with the Docker gradle build because the inputs/outputs were not configured correctly, which confused Gradle's cool incremental build. I also disagree with defining a verbose class inside a build file when it won't be re-used by other modules; just define it ad hoc instead. Build automation is best suited to scripting languages IMO. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on pull request #1769: SOLR-14789: Absorb the docker-solr repo.
dsmiley commented on pull request #1769: URL: https://github.com/apache/lucene-solr/pull/1769#issuecomment-728271138 I'm glad you are amenable to changes, and that the complexity & Docker image weight I see will melt away if we only produce an image from the Solr assembly. That is identical to the "official release" except for packaging: a plain dir vs a tgz of the same. I can appreciate there were unknowns causing you to add this extra baggage because it might be useful, but I prefer to follow a KISS/YAGNI philosophy so that we don't pay complexity debt on something not yet needed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] HoustonPutman commented on pull request #2083: SOLR-15001 Docker: require init_var_solr.sh
HoustonPutman commented on pull request #2083: URL: https://github.com/apache/lucene-solr/pull/2083#issuecomment-728316337 I agree the Test class is cumbersome; the reason I added it was to allow for easy inclusion/exclusion rules for test cases. I think this PR loses that functionality. You could add it back in with
```groovy
testsInclude = propertyOrEnvOrDefault("solr.docker.tests.include", "SOLR_DOCKER_TESTS_INCLUDE", "")
testsExclude = propertyOrEnvOrDefault("solr.docker.tests.exclude", "SOLR_DOCKER_TESTS_EXCLUDE", "")
```
You would just need to edit `help/docker.txt` to make sure it's up to date with how to use the test inputs. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] msokolov commented on pull request #2022: LUCENE-9004: KNN vector search using NSW graphs
msokolov commented on pull request #2022: URL: https://github.com/apache/lucene-solr/pull/2022#issuecomment-728317354 Thanks Uwe, I'll push a fix soon. True enough, I am building with Java 11, which is not as fussy about such things. Indeed, I'm used to using arbitrary heading levels to achieve different presentation, but I guess that's not OK! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
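For readers wondering what "headings in sequence" means here, a hedged illustration (not the actual HnswGraph.java change): newer javadoc's doclint rejects heading levels that skip or go out of order, so headings inside a doc comment should descend one level at a time rather than being chosen for visual effect.

```java
/**
 * Example class comment (illustrative only; the exact starting level expected
 * by doclint depends on the javadoc version).
 *
 * <h2>Overview</h2>
 * High-level description goes here.
 *
 * <h3>Hyperparameters</h3>
 * Sub-section content; the level descends by exactly one from the preceding
 * heading instead of jumping several levels.
 */
public final class HeadingSequenceDemo {}
```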
[jira] [Created] (SOLR-15005) RequestHandlerBase's logger name should point to the implementation class
David Smiley created SOLR-15005: --- Summary: RequestHandlerBase's logger name should point to the implementation class Key: SOLR-15005 URL: https://issues.apache.org/jira/browse/SOLR-15005 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: logging Reporter: David Smiley RequestHandlerBase is an abstract class that defines a private static Logger with a logger name of this very class. I think it should point to the implementing class (getClass()). This would require it be non-static. It's used in just one spot, from a method that isn't static, so this will work. Do we go farther and declare as protected and remove static loggers in all subclasses, so long as they aren't being referenced from static methods there? See recent comments at the end of SOLR-8330 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
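To illustrate the trade-off being proposed, here is a sketch with made-up class names (not Solr's actual code): the question is whether the logger name is fixed at the base class or derived from the concrete instance via getClass().

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical sketch of the two options being weighed; names are illustrative.
abstract class HandlerBaseSketch {
  // Option A: static logger bound to the base class. Every subclass's messages
  // appear under HandlerBaseSketch, and it is usable from static methods.
  private static final Logger STATIC_LOG = LoggerFactory.getLogger(HandlerBaseSketch.class);

  // Option B: instance logger bound to getClass(). Messages appear under the
  // concrete subclass name, but the field cannot be static.
  protected final Logger log = LoggerFactory.getLogger(getClass());

  void handle(String name) {
    STATIC_LOG.info("handled {} (always logged as HandlerBaseSketch)", name);
    log.info("handled {} (logged under the concrete subclass)", name);
  }
}

class SearchHandlerSketch extends HandlerBaseSketch {}
```

Option B gives subclass-specific logger names at the cost of a per-instance field that static methods cannot use, which is the constraint noted in the description.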
[jira] [Commented] (SOLR-8330) Restrict logger visibility throughout the codebase to private so that only the file that declares it can use it
[ https://issues.apache.org/jira/browse/SOLR-8330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233112#comment-17233112 ] David Smiley commented on SOLR-8330: FYI [~hossman]; you may have an opinion as well since I see you filed SOLR-4833 which refers to http://slf4j.org/faq.html#declared_static I looked at that; the analysis didn't consider the class hierarchy / subclassing factor. I filed SOLR-15005 to change RequestHandlerBase's logger. While just that one little change is probably not controversial, please weigh in on whether or not the subclasses should remove the loggers they declare in favor of the RHB one that will be in scope. > Restrict logger visibility throughout the codebase to private so that only > the file that declares it can use it > --- > > Key: SOLR-8330 > URL: https://issues.apache.org/jira/browse/SOLR-8330 > Project: Solr > Issue Type: Sub-task >Affects Versions: 6.0 >Reporter: Jason Gerlowski >Assignee: Anshum Gupta >Priority: Major > Labels: logging > Fix For: 5.4, 6.0 > > Attachments: SOLR-8330-combined.patch, SOLR-8330-detector.patch, > SOLR-8330-detector.patch, SOLR-8330.patch, SOLR-8330.patch, SOLR-8330.patch, > SOLR-8330.patch, SOLR-8330.patch, SOLR-8330.patch, SOLR-8330.patch > > > As Mike Drob pointed out in Solr-8324, many loggers in Solr are > unintentionally shared between classes. Many instances of this are caused by > overzealous copy-paste. This can make debugging tougher, as messages appear > to come from an incorrect location. > As discussed in the comments on SOLR-8324, there also might be legitimate > reasons for sharing loggers between classes. Where any ambiguity exists, > these instances shouldn't be touched. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley merged pull request #2079: SOLR-14998 Update log level to DEBUG for ClusterStatus in Collections…
dsmiley merged pull request #2079: URL: https://github.com/apache/lucene-solr/pull/2079 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14998) any Collections Handler actions should be logged at debug level
[ https://issues.apache.org/jira/browse/SOLR-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233118#comment-17233118 ] ASF subversion and git services commented on SOLR-14998: Commit 2d583eaba7ab8eb778bebbc5557bae29ea481830 in lucene-solr's branch refs/heads/master from Nazerke Seidan [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2d583ea ] SOLR-14998: logging: info->debug in CollectionsHandler (#2079) Because it's almost always redundant with HttpSolrCall's admin request log. Co-authored-by: Nazerke Seidan > any Collections Handler actions should be logged at debug level > --- > > Key: SOLR-14998 > URL: https://issues.apache.org/jira/browse/SOLR-14998 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Nazerke Seidan >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > CLUSTERSTATUS is logged in CollectionsHandler at INFO level but the cluster > status is already logged in HttpSolrCall at INFO. In CollectionsHandler INFO > level should be set to DEBUG to avoid a lot of noise. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14998) any Collections Handler actions should be logged at debug level
[ https://issues.apache.org/jira/browse/SOLR-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233121#comment-17233121 ] ASF subversion and git services commented on SOLR-14998: Commit 4d904e523c9bc36f36403b9f3831b0563f3a1f79 in lucene-solr's branch refs/heads/branch_8x from Nazerke Seidan [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4d904e5 ] SOLR-14998: logging: info->debug in CollectionsHandler (#2079) Because it's almost always redundant with HttpSolrCall's admin request log. Co-authored-by: Nazerke Seidan (cherry picked from commit 2d583eaba7ab8eb778bebbc5557bae29ea481830) > any Collections Handler actions should be logged at debug level > --- > > Key: SOLR-14998 > URL: https://issues.apache.org/jira/browse/SOLR-14998 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Nazerke Seidan >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > CLUSTERSTATUS is logged in CollectionsHandler at INFO level but the cluster > status is already logged in HttpSolrCall at INFO. In CollectionsHandler INFO > level should be set to DEBUG to avoid a lot of noise. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-14998) any Collections Handler actions should be logged at debug level
[ https://issues.apache.org/jira/browse/SOLR-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-14998. - Fix Version/s: 8.8 Resolution: Fixed Merged. Thanks Nazerke! > any Collections Handler actions should be logged at debug level > --- > > Key: SOLR-14998 > URL: https://issues.apache.org/jira/browse/SOLR-14998 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Nazerke Seidan >Priority: Minor > Fix For: 8.8 > > Time Spent: 10m > Remaining Estimate: 0h > > CLUSTERSTATUS is logged in CollectionsHandler at INFO level but the cluster > status is already logged in HttpSolrCall at INFO. In CollectionsHandler INFO > level should be set to DEBUG to avoid a lot of noise. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on pull request #2083: SOLR-15001 Docker: require init_var_solr.sh
dsmiley commented on pull request #2083: URL: https://github.com/apache/lucene-solr/pull/2083#issuecomment-728356772 Ah; thanks for explaining the purpose of that configuration; now I see. I'll remove these changes from this PR and have a separate PR dedicated to overhauling/simplifying the Docker build. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-15005) RequestHandlerBase's logger name should point to the implementation class
[ https://issues.apache.org/jira/browse/SOLR-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233133#comment-17233133 ] Chris M. Hostetter commented on SOLR-15005: --- I'm not sure I really follow the rationale/suggestion being made here ... even after reading the recent comment in SOLR-8330. In my personal opinion: If I'm reading a log message, I would much rather know the *code* that's logging the message than which particular subclass called the method that ran that code ... it does not make sense to me that the same block of code in RequestHandlerBase might use one Logger when subclassed by SearchHandler, and a different Logger when subclassed by UpdateRequestHandler. In general, the suggestion that if {{Foo extends Bar}}, then any code path through _an instance of Foo_ should use Foo's logger -- even inside a method implemented in Bar that Foo inherits -- seems just as weird to me as suggesting that if my Foo instance calls out to some static method in YakUtils, the YakUtils method should (somehow) also use my Foo logger. For that matter: what logger should be used if an _instance_ of Foo calls a static method in Bar? What if a _static_ method in Foo calls a static method in Bar? ... all of these permutations make me very, very scared of how confusing it would be if _sometimes_ code in Bar used its own logger, but other times it used some other caller-specific logger. Going back to the specific context of this jira: If you care _which_ handler is logging that message, then changing the Logger used based on the class doesn't really help you anyway -- there can/will be many instances of SearchHandler -- this is what MDC values are for, and we could (should?) certainly put the "name" of the handler (ie: {{/update}} vs {{/select}} vs {{/query}}) in the MDC context for logging if folks find that useful. Although I would suggest that at a certain point, instead of putting tons of info in the MDC, it makes sense to keep the MDC small and mainly focus on having a UUID logged that can be used to correlate different log entries (I think not too long back Jason added a UUID that was included for distributed request tracing, but IDK if it's part of the MDC as well). > RequestHandlerBase's logger name should point to the implementation class > - > > Key: SOLR-15005 > URL: https://issues.apache.org/jira/browse/SOLR-15005 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: logging >Reporter: David Smiley >Priority: Minor > > RequestHandlerBase is an abstract class that defines a private static Logger > with a logger name of this very class. I think it should point to the > implementing class (getClass()). This would require it be non-static. It's > used in just one spot, from a method that isn't static, so this will work. > Do we go farther and declare as protected and remove static loggers in all > subclasses, so long as they aren't being referenced from static methods there? > See recent comments at the end of SOLR-8330 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
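For context on the MDC suggestion above, a minimal SLF4J sketch; the key name "handler" and the class are illustrative assumptions, not Solr's actual MDC keys or code.

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

// Illustrative only: an MDC value identifies which handler is serving the request,
// regardless of which class's logger emits the message.
class MdcSketch {
  private static final Logger log = LoggerFactory.getLogger(MdcSketch.class);

  void handleRequest(String handlerName) {
    MDC.put("handler", handlerName);      // e.g. "/select" or "/update"
    try {
      // A pattern layout can render %X{handler} next to the logger name.
      log.info("processing request");
    } finally {
      MDC.remove("handler");
    }
  }
}
```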
[jira] [Resolved] (SOLR-15005) RequestHandlerBase's logger name should point to the implementation class
[ https://issues.apache.org/jira/browse/SOLR-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-15005. - Resolution: Won't Fix Fair enough Hoss... I enjoy reading your wisdom. Indeed you make a good point about MDC; that should ameliorate wanting to know which handler is logging. > RequestHandlerBase's logger name should point to the implementation class > - > > Key: SOLR-15005 > URL: https://issues.apache.org/jira/browse/SOLR-15005 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: logging >Reporter: David Smiley >Priority: Minor > > RequestHandlerBase is an abstract class that defines a private static Logger > with a logger name of this very class. I think it should point to the > implementing class (getClass()). This would require it be non-static. It's > used in just one spot, from a method that isn't static, so this will work. > Do we go farther and declare as protected and remove static loggers in all > subclasses, so long as they aren't being referenced from static methods there? > See recent comments at the end of SOLR-8330 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-15006) replace the coreNodeName variables to replicaName
Noble Paul created SOLR-15006: - Summary: replace the coreNodeName variables to replicaName Key: SOLR-15006 URL: https://issues.apache.org/jira/browse/SOLR-15006 Project: Solr Issue Type: Task Security Level: Public (Default Security Level. Issues are Public) Reporter: Noble Paul Assignee: Noble Paul {{coreNodeName}} makes no sense. it's just the replica name -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-15006) replace the coreNodeName variables to replicaName
[ https://issues.apache.org/jira/browse/SOLR-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-15006: -- Description: {{coreNodeName}} makes no sense. It's just the replica name. This is a backward-compatible change; it won't change any attributes that are a part of ZK messages/commands etc. was:{{coreNodeName}} makes no sense. it's just the replica name > replace the coreNodeName variables to replicaName > - > > Key: SOLR-15006 > URL: https://issues.apache.org/jira/browse/SOLR-15006 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Noble Paul >Priority: Major > > {{coreNodeName}} makes no sense. It's just the replica name. > This is a backward-compatible change; it won't change any attributes that are > a part of ZK messages/commands etc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jtibshirani commented on a change in pull request #2047: LUCENE-9592: Use doubles in VectorUtil to maintain precision.
jtibshirani commented on a change in pull request #2047: URL: https://github.com/apache/lucene-solr/pull/2047#discussion_r524826537 ## File path: lucene/core/src/java/org/apache/lucene/util/VectorUtil.java ## @@ -25,47 +25,22 @@ private VectorUtil() { } - public static float dotProduct(float[] a, float[] b) { -float res = 0f; -/* - * If length of vector is larger than 8, we use unrolled dot product to accelerate the - * calculation. - */ -int i; -for (i = 0; i < a.length % 8; i++) { - res += b[i] * a[i]; -} -if (a.length < 8) { - return res; -} -float s0 = 0f; -float s1 = 0f; -float s2 = 0f; -float s3 = 0f; -float s4 = 0f; -float s5 = 0f; -float s6 = 0f; -float s7 = 0f; -for (; i + 7 < a.length; i += 8) { - s0 += b[i] * a[i]; - s1 += b[i + 1] * a[i + 1]; - s2 += b[i + 2] * a[i + 2]; - s3 += b[i + 3] * a[i + 3]; - s4 += b[i + 4] * a[i + 4]; - s5 += b[i + 5] * a[i + 5]; - s6 += b[i + 6] * a[i + 6]; - s7 += b[i + 7] * a[i + 7]; + public static double dotProduct(float[] a, float[] b) { Review comment: Simply changing the test to use a larger epsilon sounds good to me. After thinking about this more, I'm not sure we want to optimize for the precision of these individual calculations. Many high-dimensional vectors are already an approximation to an original object, like a piece of text. And I've heard of practitioners choosing less precise representations (like bfloat16) for each vector element to save space, and still achieving acceptable results. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
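As a standalone illustration of the precision trade-off discussed here (this is not the Lucene code or its test), float accumulation drifts slightly from a double reference as the dimension grows, which is why a modest epsilon in the test comparison absorbs the difference:

```java
import java.util.Random;

public class DotPrecisionDemo {
  public static void main(String[] args) {
    Random r = new Random(42);
    int dim = 1024;
    float[] a = new float[dim];
    float[] b = new float[dim];
    for (int i = 0; i < dim; i++) {
      a[i] = r.nextFloat();
      b[i] = r.nextFloat();
    }

    float floatSum = 0f;
    double doubleSum = 0d;
    for (int i = 0; i < dim; i++) {
      floatSum += a[i] * b[i];              // float accumulation: rounding error grows with dimension
      doubleSum += (double) a[i] * b[i];    // double accumulation: reference value
    }
    double error = Math.abs(floatSum - doubleSum);
    System.out.printf("float=%f double=%f abs error=%.2e%n", floatSum, doubleSum, error);

    // A small relative epsilon, rather than exact equality, absorbs the accumulation error.
    double epsilon = 1e-4 * Math.abs(doubleSum);
    System.out.println("within epsilon: " + (error <= epsilon));
  }
}
```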
[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233183#comment-17233183 ] Zach Chen commented on LUCENE-8982: --- Sorry for the late response, just got off work and see this. >From the discussion it seems the assumption / reality here is that cpp >toolchain may or may not be available in the VMs. However, since Lucene does >have native code and scheduled build can discover any change that breaks the >native-java integration early on (there was actually one commit before this >that broke it), should the build in general assume cpp toolchain to be there >in the VMs (and add them if they are missing), but still have >-Pbuild.native=false as default to not break builds for others and have a few >VMs with cpp toolchain intentionally left out to test for compatibility? > Make NativeUnixDirectory pure java now that direct IO is possible > - > > Key: LUCENE-8982 > URL: https://issues.apache.org/jira/browse/LUCENE-8982 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/misc >Reporter: Michael McCandless >Assignee: Dawid Weiss >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > {{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO > to write newly merged segments. Direct IO bypasses the kernel's buffer cache > and write cache, making merge writes "invisible" to the kernel, though the > reads for merging the N segments are still going through the kernel. > But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the > {{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in > pure java code, so we should now fix {{NativeUnixDirectory}} to not use JNI > anymore. > We should also run some more realistic benchmarks seeing if this option > really helps nodes that are doing concurrent indexing (merging) and searching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233183#comment-17233183 ] Zach Chen edited comment on LUCENE-8982 at 11/17/20, 1:50 AM: -- Sorry for the late response, just got off work and see this. >From the discussion it seems the assumption / reality here is that cpp >toolchain may or may not be available in the VMs. However, since Lucene does >have native code and scheduled build can discover any change that breaks the >native-java integration early on (there was actually one commit before this >that broke it), should the build in general assume cpp toolchain to be there >in the VMs (and add them if they are missing) to execute the compilation and >tests, but still have -Pbuild.native=false as default to not break builds for >others and have a few VMs with cpp toolchain intentionally left out to test >for compatibility? was (Author: zacharymorn): Sorry for the late response, just got off work and see this. >From the discussion it seems the assumption / reality here is that cpp >toolchain may or may not be available in the VMs. However, since Lucene does >have native code and scheduled build can discover any change that breaks the >native-java integration early on (there was actually one commit before this >that broke it), should the build in general assume cpp toolchain to be there >in the VMs (and add them if they are missing), but still have >-Pbuild.native=false as default to not break builds for others and have a few >VMs with cpp toolchain intentionally left out to test for compatibility? > Make NativeUnixDirectory pure java now that direct IO is possible > - > > Key: LUCENE-8982 > URL: https://issues.apache.org/jira/browse/LUCENE-8982 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/misc >Reporter: Michael McCandless >Assignee: Dawid Weiss >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > {{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO > to write newly merged segments. Direct IO bypasses the kernel's buffer cache > and write cache, making merge writes "invisible" to the kernel, though the > reads for merging the N segments are still going through the kernel. > But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the > {{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in > pure java code, so we should now fix {{NativeUnixDirectory}} to not use JNI > anymore. > We should also run some more realistic benchmarks seeing if this option > really helps nodes that are doing concurrent indexing (merging) and searching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jtibshirani opened a new pull request #2084: LUCENE-9592: Loosen equality checks in TestVectorUtil.
jtibshirani opened a new pull request #2084: URL: https://github.com/apache/lucene-solr/pull/2084 TestVectorUtil occasionally fails because of floating point errors. This change slightly increases the epsilon in equality checks -- testing shows that this will greatly decrease the chance of failure. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jtibshirani commented on a change in pull request #2047: LUCENE-9592: Use doubles in VectorUtil to maintain precision.
jtibshirani commented on a change in pull request #2047: URL: https://github.com/apache/lucene-solr/pull/2047#discussion_r524835397 ## File path: lucene/core/src/java/org/apache/lucene/util/VectorUtil.java ## @@ -25,47 +25,22 @@ private VectorUtil() { } - public static float dotProduct(float[] a, float[] b) { -float res = 0f; -/* - * If length of vector is larger than 8, we use unrolled dot product to accelerate the - * calculation. - */ -int i; -for (i = 0; i < a.length % 8; i++) { - res += b[i] * a[i]; -} -if (a.length < 8) { - return res; -} -float s0 = 0f; -float s1 = 0f; -float s2 = 0f; -float s3 = 0f; -float s4 = 0f; -float s5 = 0f; -float s6 = 0f; -float s7 = 0f; -for (; i + 7 < a.length; i += 8) { - s0 += b[i] * a[i]; - s1 += b[i + 1] * a[i + 1]; - s2 += b[i + 2] * a[i + 2]; - s3 += b[i + 3] * a[i + 3]; - s4 += b[i + 4] * a[i + 4]; - s5 += b[i + 5] * a[i + 5]; - s6 += b[i + 6] * a[i + 6]; - s7 += b[i + 7] * a[i + 7]; + public static double dotProduct(float[] a, float[] b) { Review comment: I opened https://github.com/apache/lucene-solr/pull/2047, it if it looks okay I can close out this PR. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233183#comment-17233183 ] Zach Chen edited comment on LUCENE-8982 at 11/17/20, 2:25 AM: -- Sorry for the late response, just got off work and saw this. >From the discussion it seems the assumption / reality here is that cpp >toolchain may or may not be available in the VMs. However, since Lucene does >have native code and scheduled build can discover any change that breaks the >native-java integration early on (there was actually one commit before this >that broke it), should the build in general requires cpp toolchain to be there >in the VMs (and add them if they are missing) to execute the compilation and >tests, but still have -Pbuild.native=false as default to not break builds for >others and have a few VMs with cpp toolchain intentionally left out to test >for compatibility? was (Author: zacharymorn): Sorry for the late response, just got off work and saw this. >From the discussion it seems the assumption / reality here is that cpp >toolchain may or may not be available in the VMs. However, since Lucene does >have native code and scheduled build can discover any change that breaks the >native-java integration early on (there was actually one commit before this >that broke it), should the build in general require cpp toolchain to be there >in the VMs (and add them if they are missing) to execute the compilation and >tests, but still have -Pbuild.native=false as default to not break builds for >others and have a few VMs with cpp toolchain intentionally left out to test >for compatibility? > Make NativeUnixDirectory pure java now that direct IO is possible > - > > Key: LUCENE-8982 > URL: https://issues.apache.org/jira/browse/LUCENE-8982 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/misc >Reporter: Michael McCandless >Assignee: Dawid Weiss >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > {{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO > to write newly merged segments. Direct IO bypasses the kernel's buffer cache > and write cache, making merge writes "invisible" to the kernel, though the > reads for merging the N segments are still going through the kernel. > But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the > {{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in > pure java code, so we should now fix {{NativeUnixDirectory}} to not use JNI > anymore. > We should also run some more realistic benchmarks seeing if this option > really helps nodes that are doing concurrent indexing (merging) and searching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233183#comment-17233183 ] Zach Chen edited comment on LUCENE-8982 at 11/17/20, 2:25 AM: -- Sorry for the late response, just got off work and saw this. >From the discussion it seems the assumption / reality here is that cpp >toolchain may or may not be available in the VMs. However, since Lucene does >have native code and scheduled build can discover any change that breaks the >native-java integration early on (there was actually one commit before this >that broke it), should the build in general require cpp toolchain to be there >in the VMs (and add them if they are missing) to execute the compilation and >tests, but still have -Pbuild.native=false as default to not break builds for >others and have a few VMs with cpp toolchain intentionally left out to test >for compatibility? was (Author: zacharymorn): Sorry for the late response, just got off work and see this. >From the discussion it seems the assumption / reality here is that cpp >toolchain may or may not be available in the VMs. However, since Lucene does >have native code and scheduled build can discover any change that breaks the >native-java integration early on (there was actually one commit before this >that broke it), should the build in general assume cpp toolchain to be there >in the VMs (and add them if they are missing) to execute the compilation and >tests, but still have -Pbuild.native=false as default to not break builds for >others and have a few VMs with cpp toolchain intentionally left out to test >for compatibility? > Make NativeUnixDirectory pure java now that direct IO is possible > - > > Key: LUCENE-8982 > URL: https://issues.apache.org/jira/browse/LUCENE-8982 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/misc >Reporter: Michael McCandless >Assignee: Dawid Weiss >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > {{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO > to write newly merged segments. Direct IO bypasses the kernel's buffer cache > and write cache, making merge writes "invisible" to the kernel, though the > reads for merging the N segments are still going through the kernel. > But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the > {{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in > pure java code, so we should now fix {{NativeUnixDirectory}} to not use JNI > anymore. > We should also run some more realistic benchmarks seeing if this option > really helps nodes that are doing concurrent indexing (merging) and searching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233183#comment-17233183 ] Zach Chen edited comment on LUCENE-8982 at 11/17/20, 2:32 AM: -- Sorry for the late response, just got off work and saw this. >From the discussion it seems the assumption / reality here is that cpp >toolchain may or may not be available in the VMs. However, since Lucene does >have native code and scheduled build can discover any change that breaks the >native-java integration early on (there was actually one commit before this >that broke it), should the CI builds in general require cpp toolchain to be >there in the VMs (and add them if they are missing) to execute the compilation >and tests, but in gradle still have -Pbuild.native=false as default to not >break builds for others in development, and have a few CI VMs with cpp >toolchain intentionally left out to test for compatibility? was (Author: zacharymorn): Sorry for the late response, just got off work and saw this. >From the discussion it seems the assumption / reality here is that cpp >toolchain may or may not be available in the VMs. However, since Lucene does >have native code and scheduled build can discover any change that breaks the >native-java integration early on (there was actually one commit before this >that broke it), should the build in general requires cpp toolchain to be there >in the VMs (and add them if they are missing) to execute the compilation and >tests, but still have -Pbuild.native=false as default to not break builds for >others and have a few VMs with cpp toolchain intentionally left out to test >for compatibility? > Make NativeUnixDirectory pure java now that direct IO is possible > - > > Key: LUCENE-8982 > URL: https://issues.apache.org/jira/browse/LUCENE-8982 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/misc >Reporter: Michael McCandless >Assignee: Dawid Weiss >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > {{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO > to write newly merged segments. Direct IO bypasses the kernel's buffer cache > and write cache, making merge writes "invisible" to the kernel, though the > reads for merging the N segments are still going through the kernel. > But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the > {{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in > pure java code, so we should now fix {{NativeUnixDirectory}} to not use JNI > anymore. > We should also run some more realistic benchmarks seeing if this option > really helps nodes that are doing concurrent indexing (merging) and searching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-15000) Solr based enterprise level, one-stop search center products with high performance, high reliability and high scalability
[ https://issues.apache.org/jira/browse/SOLR-15000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bai sui resolved SOLR-15000. Resolution: Done > Solr based enterprise level, one-stop search center products with high > performance, high reliability and high scalability > - > > Key: SOLR-15000 > URL: https://issues.apache.org/jira/browse/SOLR-15000 > Project: Solr > Issue Type: Wish > Security Level: Public(Default Security Level. Issues are Public) > Components: Admin UI >Reporter: bai sui >Priority: Minor > Attachments: add-collection-step-2-expert.png, > add-collection-step-2.png > > h2. Summary > I have developed an enterprise application based on Solr, named TIS. With TIS you > can quickly build an enterprise search service. TIS includes three > components: > - offline index building platform > Data is exported from an RDBMS (MySQL, SQL Server and so on) through > full table scanning, and the wide table is then constructed by a local MR tool > or built directly with Spark. > - incremental real-time channel > Incremental data is transmitted to Kafka, real-time stream computation is carried out > by Flink, and the results are submitted to the search engine to ensure that the data in the search > engine and the database stay consistent in near real time. > - search engine > currently based on Solr 8 > TIS integrates these components seamlessly and brings users a one-stop, out-of-the-box > experience. > h2. My question > I want to contribute my code back to the community, but TIS focuses on Enterprise > Application Search, just as Elasticsearch focuses on visual analysis of time-series > data. Because Solr is a general search product, *I don't think TIS can > be merged directly into Solr. Is it possible for TIS to become a new incubation > project under Apache?* > h2. TIS main Features > - The schema and solrconfig storage are separated from ZK and stored in > MySQL. Version management is provided, and users can roll back to > a historical version of the configuration. > !add-collection-step-2-expert.png|width=500! > !add-collection-step-2.png|width=500! >Schema editing mode can be switched between visual editing mode or > advanced expert mode > - Define wide table rules based on the selected data table > - An offline index building component is provided. Outside the collection, > the data is built into Lucene segment files. The segment files are then > returned to the local disk where the solrcore is located, and the solrcore is > reloaded so the new index takes effect. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
dweiss commented on a change in pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r524942578 ## File path: lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java ## @@ -74,12 +65,12 @@ * * @lucene.experimental */ -public class NativeUnixDirectory extends FSDirectory { Review comment: I think we should make a copy of the NativeUnixDirectory, modify this to direct IO, then perhaps benchmark how they perform? If we replace in-place we won't be able to do it (unless you compile from different git commits). Then any removal of native code, should it follow-up, would be a cleaner patch as well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
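For background on the pure-java direct IO path discussed in this PR, here is a minimal standalone sketch. It is not the Lucene code: the file name, buffer sizing, and error handling are illustrative, and it assumes a JDK that exposes com.sun.nio.file.ExtendedOpenOption.DIRECT and a file system that supports direct IO.

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class DirectIoSketch {
  public static void main(String[] args) throws Exception {
    Path path = Paths.get("direct-io-demo.bin");  // hypothetical output file
    // Direct IO requires the buffer address, position, and transfer size to be block-aligned.
    int blockSize = (int) Files.getFileStore(path.toAbsolutePath().getParent()).getBlockSize();
    try (FileChannel channel = FileChannel.open(path,
        StandardOpenOption.CREATE, StandardOpenOption.WRITE, ExtendedOpenOption.DIRECT)) {
      // Over-allocate, then take an aligned slice so the write meets the alignment rules.
      ByteBuffer buffer = ByteBuffer.allocateDirect(2 * blockSize).alignedSlice(blockSize);
      while (buffer.hasRemaining()) {
        buffer.put((byte) 0);                     // fill with dummy data
      }
      buffer.flip();
      channel.write(buffer);                      // bypasses the OS page cache
    }
  }
}
```

This is the kind of call NativeUnixDirectory previously needed JNI for; keeping a copy of the old directory around, as suggested above, makes it straightforward to benchmark the two paths against each other.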