In my quest to support indexing from files located in Azure storage (as
opposed to standard disk-based files), I have the following question.
The Solr request parser always attempts to create a File object from
CommonParams.STREAM_FILE. (The request is configured for remote load, but
CommonParams.STREAM_FILE is still required because it carries the relative
Azure path.) The system doesn't crash, but it still doesn't seem like the
right thing to do.
SolrRequestParsers.buildRequestFrom(....) {
  ...........................
  // Handle streaming files
  strs = params.getParams( CommonParams.STREAM_FILE );
  if( strs != null ) {
    if( !enableRemoteStreams ) {
      throw new SolrException( ErrorCode.BAD_REQUEST, "Remote Streaming is disabled." );
    }
    for( final String file : strs ) {
      // =======> this always tries to create a File object
      ContentStreamBase stream = new ContentStreamBase.FileStream( new File(file) );
      if( contentType != null ) {
        stream.setContentType( contentType );
      }
      streams.add( stream );
    }
  }
  ...........................
Questions:
1. While I can retrieve the relative Azure file path in my plugin and
download the file from there, I was wondering whether Solr actually supports
custom request parsers and, if so, how to plug them in. The Solr config
only gives me the configuration options shown below.
2. What is the recommended approach? As I mentioned above, I "think" I can
get this working by intercepting the fake File object in my plugin,
extracting the path, and handling it differently as desired.
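To make point 2 concrete, here is a minimal standalone sketch of the idea, not actual Solr plugin code. It relies only on java.io.File: constructing a File does no disk access, which is presumably why nothing crashes, and getPath() returns the string that was passed in (with separators normalized for the platform). The class name, the helper toBlobPath, and the sample path are all hypothetical, and how the plugin actually gets hold of the path (from the stream or straight from the stream.file parameter) is left open.

```java
import java.io.File;

class StreamFilePathSketch {

    // Hypothetical helper: recover the relative Azure path from the
    // placeholder File that was built from the stream.file value.
    static String toBlobPath(File placeholder) {
        // getPath() gives back the constructor argument with platform
        // separators; convert them to '/' for use as a blob path.
        return placeholder.getPath().replace(File.separatorChar, '/');
    }

    public static void main(String[] args) {
        // Hypothetical stream.file value; nothing is read from disk here.
        File placeholder = new File("container/docs/sample.json");

        // The File is only a path wrapper, so it can point at nothing at all.
        System.out.println("exists on local disk: " + placeholder.exists());
        System.out.println("blob path: " + toBlobPath(placeholder));
    }
}
```

The recovered string would then be handed to whatever code downloads the blob from Azure, which is outside this sketch.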
Thanks
<!-- Request Parsing
These settings indicate how Solr Requests may be parsed, and
what restrictions may be placed on the ContentStreams from
those requests
enableRemoteStreaming - enables use of the stream.file
and stream.url parameters for specifying remote streams.
multipartUploadLimitInKB - specifies the max size (in KiB) of
Multipart File Uploads that Solr will allow in a Request.
formdataUploadLimitInKB - specifies the max size (in KiB) of
form data (application/x-www-form-urlencoded) sent via
POST. You can use POST to pass request parameters not
fitting into the URL.
addHttpRequestToContext - if set to true, it will instruct
the requestParsers to include the original HttpServletRequest
object in the context map of the SolrQueryRequest under the
key "httpRequest". It will not be used by any of the existing
Solr components, but may be useful when developing custom
plugins.
*** WARNING ***
The settings below authorize Solr to fetch remote files. You
should make sure your system has some authentication before
using enableRemoteStreaming="true"
-->
<requestParsers enableRemoteStreaming="true"
multipartUploadLimitInKB="2048000"
formdataUploadLimitInKB="2048"
addHttpRequestToContext="false"/>