Hi Bustaa,

Can you paste your data-config.xml? 

Also, did you consider using ManifoldCF [1] to crawl/index your CMS? What CMS 
are you using?

[1] http://manifoldcf.apache.org/release/trunk/en_US/end-user-documentation.html#repositoryconnectiontypes
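
In the meantime, one possible workaround for the error quoted below: the exception suggests the whole list is being handed to the file data source as a single path, so it may help to emit one row per file instead of one row with an array. The following is only a sketch of that splitting logic in plain Java, not a tested DIH setup. The class name SplitFilesTransformer and the BASE constant are made up for illustration; the field name di_file and the base path are taken from the debug output. The same logic could be ported into the existing PrependPath script function or a custom transformer, assuming the transformer is allowed to return a list of rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns one row whose "di_file" value is a
// comma-separated list into one row per file.
class SplitFilesTransformer {
    // Assumed base directory, copied from the debug output in this thread.
    static final String BASE = "/Users/b/Sites/";

    public Object transformRow(Map<String, Object> row) {
        List<Map<String, Object>> rows = new ArrayList<>();
        Object raw = row.get("di_file");
        if (raw == null) {
            // Nothing to split: pass the row through unchanged.
            rows.add(row);
            return rows;
        }
        for (String name : raw.toString().split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty entries such as a trailing comma
            }
            // Copy all other columns and replace di_file with a single path.
            Map<String, Object> copy = new HashMap<>(row);
            copy.put("di_file", BASE + trimmed);
            rows.add(copy);
        }
        return rows;
    }
}
```

Note the trade-off: with one row per file, each file would most likely end up as its own Solr document, so the unique key would need to differ per file (for example by using the file path), otherwise later rows may overwrite earlier ones.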

On Wednesday, January 29, 2014 1:03 PM, Bustaa <bus...@gmail.com> wrote:
Hello Solr Users,

I'm trying to get Tika's "BinFileDataSource" to take the filenames
from a multi-valued field (array), but I'm getting the following
exception:

Debug output from running dataimport (shortened):


          "query",
          "<<< LONG SQL-QUERY >>>",
          "time-taken",
          "0:0:0.11",
          null,
          "----------- row #1-------------",
          "di_description",
          "asdad",
          "di_longtitle",
          "",
          "di_file",
          "fileadmin/user_upload/dateien/abc/file1.pdf,fileadmin/user_upload/dateien/abc/file2.pdf",
          "di_title",
          "test",
          "di_date",
          "2014-01-30T00:00:00Z",
          "di_notes",
          "",
          null,
          "---------------------------------------------",
          "transformer:script:PrependPath",
          [
            null,
            "---------------------------------------------",
            "di_description",
            "asdad",
            "di_longtitle",
            "",
            "di_file",
            [
              "/Users/b/Sites/fileadmin/user_upload/dateien/abc/file1.pdf",
              "/Users/b/Sites/fileadmin/user_upload/dateien/abc/file2.pdf"
            ],
            "di_title",
            "test",
            "di_date",
            "2014-01-30T00:00:00Z",
            "di_notes",
            "",
            null,
            "---------------------------------------------",
            "entity:binaryImport",
            [
              "query",
              "[/Users/b/Sites/fileadmin/user_upload/dateien/abc/file1.pdf, /Users/b/Sites/fileadmin/user_upload/dateien/abc/file2.pdf]",
              "EXCEPTION",
              "java.lang.RuntimeException: java.io.FileNotFoundException: Could not find file: [/Users/b/Sites/fileadmin/user_upload/dateien/abc/file1.pdf, /Users/b/Sites/fileadmin/user_upload/dateien/abc/file1.pdf] <<< MORE STACKTRACE >>>",
              "time-taken",
              "0:0:0.1"
            ]
          ]
        ]
      ]

Is there a way to get Tika's "BinFileDataSource" to accept
multiple values, or is there a workaround? (The CMS we are using saves
the files comma-separated into one big text field.)

Thanks in advance,

Sam
