mvolikas commented on code in PR #1833:
URL: https://github.com/apache/stormcrawler/pull/1833#discussion_r2971712673
##########
external/solr/src/main/java/org/apache/stormcrawler/solr/SolrConnection.java:
##########
@@ -134,8 +132,8 @@ private void updateAsync(Update update) {
synchronized (lock) {
lastUpdate = System.currentTimeMillis();
- CloudHttp2SolrClient cloudHttp2SolrClient = (CloudHttp2SolrClient) client;
- DocCollection col = cloudHttp2SolrClient.getClusterState().getCollection(collection);
+ CloudSolrClient cloudClient = (CloudSolrClient) client;
+ DocCollection col = cloudClient.getClusterState().getCollection(collection);
Review Comment:
`cloudClient.getClusterState()` is deprecated. We can use
`cloudClient.getClusterStateProvider().getClusterState()` instead.
##########
external/solr/src/main/java/org/apache/stormcrawler/solr/SolrConnection.java:
##########
@@ -195,29 +187,27 @@ private void flushUpdates(
}
UpdateRequest updateRequest = new UpdateRequest();
- updateRequest.add(docs);
- updateRequest.deleteById(deletionIds);
+ if (!docs.isEmpty()) {
+ updateRequest.add(docs);
+ }
+ if (!deletionIds.isEmpty()) {
+ updateRequest.deleteById(deletionIds);
+ }
List<Update> batch = new ArrayList<>(waitingUpdates);
waitingUpdates.clear();
- // Get the async client
- LBHttp2SolrClient lbHttp2SolrClient = cloudHttp2SolrClient.getLbClient();
- LBSolrClient.Req req = new LBSolrClient.Req(updateRequest, endpoints);
-
- lbHttp2SolrClient
- .requestAsync(req)
- .whenComplete(
- (futureResponse, throwable) -> {
- if (throwable != null) {
- LOG.error("Exception caught while updating", throwable);
-
- // The request failed => add the batch back to the pending updates
- synchronized (lock) {
- waitingUpdates.addAll(batch);
- }
- }
- });
+ CompletableFuture.runAsync(
Review Comment:
This wraps a synchronous request, so the call still blocks; it just blocks a
different thread. That is why we used `lbHttp2SolrClient.requestAsync`, which
starts a "real" async network request to Solr (see #1488 and the discussion
[here](https://lists.apache.org/thread/4n7z4vkslrjj3wsb5t5fvghvyvvwy71k)).
We could keep both the old logic and the new `CloudSolrClient` by doing the
following:
1. When building the `CloudSolrClient`, specify an `HttpJettySolrClient`
(like you do for the `ConcurrentUpdateJettySolrClient`) to ensure that the
created `LBSolrClient` will be an `LBAsyncSolrClient`.
```
HttpJettySolrClient jettyClient = new HttpJettySolrClient.Builder().build();
CloudSolrClient.Builder builder =
new CloudSolrClient.Builder(
Collections.singletonList(zkHost), Optional.empty())
.withHttpClient(jettyClient);
```
2. Get the async client before making a request.
```
LBAsyncSolrClient lbAsyncSolrClient = (LBAsyncSolrClient)
((CloudHttp2SolrClient) cloudClient).getLbClient();
```
##########
external/solr/configsets/docs/conf/solrconfig.xml:
##########
@@ -18,7 +18,7 @@ specific language governing permissions and limitations
under the License.
-->
<config>
- <luceneMatchVersion>9.0.0</luceneMatchVersion>
+ <luceneMatchVersion>10.0.0</luceneMatchVersion>
Review Comment:
The default configsets in Solr `10.0.0` use
`<luceneMatchVersion>10.3</luceneMatchVersion>`, so we might as well pin to
this version instead (minor).
##########
external/solr/src/test/java/org/apache/stormcrawler/solr/SolrContainerTest.java:
##########
@@ -74,24 +74,24 @@ protected ExecResult createCollection(String collectionName, int shards)
"/opt/solr/bin/solr",
"zk",
"upconfig",
- "-n",
+ "--conf-name",
collectionName,
- "-d",
+ "--conf-dir",
"/opt/solr/server/solr/configsets/" + collectionName,
- "-z",
+ "--zk-host",
"localhost:9983");
// Create the collection
return container.execInContainer(
"/opt/solr/bin/solr",
"create",
- "-c",
+ "--name",
collectionName,
- "-n",
+ "--conf-name",
Review Comment:
Some options (like `--conf-name`) are not documented in the
[Solr Control Script Reference](https://solr.apache.org/guide/solr/latest/deployment-guide/solr-control-script-reference.html),
but they are shown when running `bin/solr create --help`, so I guess they are OK to use.