In my quest to support indexing files located in Azure storage (as
opposed to standard disk-based files), I have the following question.

The Solr request parser always attempts to create a File object for each
stream.file value. My request is configured for remote streaming, but
CommonParams.STREAM_FILE is still required because it carries the
relative Azure path. The system doesn't crash, but this still doesn't
seem like the right thing to do.

SolrRequestParsers.buildRequestFrom(....) {
...........................
    // Handle streaming files
    strs = params.getParams( CommonParams.STREAM_FILE );
    if( strs != null ) {
      if( !enableRemoteStreams ) {
        throw new SolrException( ErrorCode.BAD_REQUEST, "Remote Streaming is disabled." );
      }
      for( final String file : strs ) {
        // =======> this always tries to create a File object
        ContentStreamBase stream = new ContentStreamBase.FileStream( new File(file) );
        if( contentType != null ) {
          stream.setContentType( contentType );
        }
        streams.add( stream );
      }
    }
...........................
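
For what it's worth, the likely reason nothing crashes at that line is
that constructing a java.io.File only wraps a path string; the JDK does
no disk access until the file is actually opened or queried. A minimal
stdlib-only sketch of that behavior (the class name and the sample path
are made up for illustration; no Solr classes are involved):

```java
import java.io.File;

public class FileObjectDemo {

    /**
     * Returns true if the path is still absent on disk after a File
     * object has been constructed for it, i.e. construction alone did
     * not create anything.
     */
    static boolean stillAbsentAfterConstruction(String path) {
        File f = new File(path); // wraps the path string only, no I/O here
        return !f.exists();      // exists() is the first call that touches the disk
    }

    public static void main(String[] args) {
        // Hypothetical relative Azure path, made unique so it cannot
        // collide with a real local file.
        String azurePath = "mycontainer/docs/sample-" + System.nanoTime() + ".json";

        File f = new File(azurePath);
        System.out.println("exists: " + f.exists());
        System.out.println("length: " + f.length());
        System.out.println("path:   " + f.getPath());
        System.out.println("absent: " + stillAbsentAfterConstruction(azurePath));
    }
}
```

So the File here is effectively a path holder, and a failure would only
be expected once something tries to read the stream.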

Questions:
1. While I can retrieve the relative Azure file path in my plugin and
download the file from there, I was wondering whether Solr actually
supports custom request parsers and, if so, how to plug them in. The
requestParsers element in solrconfig.xml (quoted below) only exposes
configuration options.
2. What is the recommended approach? As I mentioned above, I "think" I can
get this working by intercepting the fake File object in my plugin,
extracting the path, and handling it differently as desired.
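
To make the workaround in (2) concrete, here is a rough sketch of what I
have in mind. It is stdlib-only and makes several assumptions:
BlobFetcher is a hypothetical interface standing in for whatever Azure
client ends up doing the download, toBlobName and readAll are
illustrative helper names, and the sample path is invented. None of
these are Solr or Azure SDK APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class AzurePathSketch {

    /** Hypothetical download abstraction; not a Solr or Azure SDK type. */
    interface BlobFetcher {
        InputStream open(String blobName) throws IOException;
    }

    /** Converts the path held by the placeholder File into a '/'-separated blob name. */
    static String toBlobName(File placeholder) {
        // File stores the path with the platform separator, so normalize it back to '/'.
        String path = placeholder.getPath().replace(File.separatorChar, '/');
        // Blob names are treated as relative here, so drop any leading slashes.
        while (path.startsWith("/")) {
            path = path.substring(1);
        }
        return path;
    }

    /** Reads the whole blob as UTF-8 text via the fetcher instead of the local disk. */
    static String readAll(BlobFetcher fetcher, File placeholder) throws IOException {
        try (InputStream in = fetcher.open(toBlobName(placeholder));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the real Azure download, for demonstration only.
        BlobFetcher fake = name ->
            new ByteArrayInputStream(("content of " + name).getBytes(StandardCharsets.UTF_8));

        System.out.println(readAll(fake, new File("mycontainer/docs/a.json")));
    }
}
```

In a real handler it may be simpler to read the stream.file value
straight from the request parameters rather than recover it from the
File, but I have not verified which is cleaner.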

Thanks

    <!-- Request Parsing

         These settings indicate how Solr Requests may be parsed, and
         what restrictions may be placed on the ContentStreams from
         those requests

         enableRemoteStreaming - enables use of the stream.file
         and stream.url parameters for specifying remote streams.

         multipartUploadLimitInKB - specifies the max size (in KiB) of
         Multipart File Uploads that Solr will allow in a Request.

         formdataUploadLimitInKB - specifies the max size (in KiB) of
         form data (application/x-www-form-urlencoded) sent via
         POST. You can use POST to pass request parameters not
         fitting into the URL.

         addHttpRequestToContext - if set to true, it will instruct
         the requestParsers to include the original HttpServletRequest
         object in the context map of the SolrQueryRequest under the
         key "httpRequest". It will not be used by any of the existing
         Solr components, but may be useful when developing custom
         plugins.

         *** WARNING ***
         The settings below authorize Solr to fetch remote files. You
         should make sure your system has some authentication before
         using enableRemoteStreaming="true"

      -->
    <requestParsers enableRemoteStreaming="true"
                    multipartUploadLimitInKB="2048000"
                    formdataUploadLimitInKB="2048"
                    addHttpRequestToContext="false"/>
