Man you people are fast!

One caveat: there is a bug in Solr/Lucene where memory from previously
indexed fields is kept around, so indexing giant text files can run
out of memory when it should not. This bug is fixed in trunk.
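
To make the DIH approach quoted below concrete, here is a rough sketch
of a data-config.xml that joins the MySQL metadata with the text file
named in each row. The database name, credentials, table and column
names, the /data base directory, and the 'text' schema field are all
invented for the example, so adjust them to your setup. Since the files
are plain text, PlainTextEntityProcessor is enough and Tika is not
strictly needed:

  <dataConfig>
    <!-- metadata rows come from MySQL; file bodies from the filesystem -->
    <dataSource name="db" type="JdbcDataSource"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost/mydb" user="solr" password="secret"/>
    <dataSource name="fs" type="FileDataSource" encoding="UTF-8"/>
    <document>
      <!-- one Solr document per database row -->
      <entity name="doc" dataSource="db"
              query="SELECT id, title, description, tags, file_path FROM files">
        <field column="id" name="id"/>
        <field column="title" name="title"/>
        <field column="description" name="description"/>
        <field column="tags" name="tags"/>
        <!-- nested entity reads the txt file named by the current row -->
        <entity name="body" dataSource="fs"
                processor="PlainTextEntityProcessor"
                url="/data/${doc.file_path}">
          <!-- PlainTextEntityProcessor exposes the file contents as 'plainText' -->
          <field column="plainText" name="text"/>
        </entity>
      </entity>
    </document>
  </dataConfig>

Once that is wired up, you kick off indexing with the usual handler
URL, e.g. http://localhost:8983/solr/dataimport?command=full-import
(assuming the handler is registered at /dataimport in solrconfig.xml).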

On 4/17/10, Lance Norskog <goks...@gmail.com> wrote:
> The DataImportHandler can let you fetch the file name from the
> database record, and then load the file as a field and process the
> text with Tika.
>
> It will not be easy :) but it is possible.
>
> http://wiki.apache.org/solr/DataImportHandler
>
> On 4/17/10, Serdar Sahin <anlamar...@gmail.com> wrote:
>> Hi,
>>
>> I am rather new to Solr and have a question.
>>
>> We have around 200,000 txt files stored in our file cloud. The file
>> paths look similar to this:
>>
>> file/97/8f/840/fa4-1.txt
>> file/a6/9d/ab0/ca2-2.txt etc.
>>
>> and we also store the metadata (title, description, tags, etc.)
>> about these files in a MySQL server. What I want to do is index the
>> title, description, tags, and other data from MySQL, fetch the txt
>> file from the file server, and link them as one record for searching,
>> but I could not figure out how to automate this process. I can supply
>> the path from the SQL query, e.g. SELECT id, title, description,
>> file_path, and then Solr could use this path to retrieve the txt
>> file, but I don't know whether that is possible.
>>
>> What is the best way to index these files together with their title,
>> description, and tags, without coding in Java (Perl is OK)? These txt
>> files are large, between 100 KB and 10 MB, so storing them in the
>> database is a last resort.
>>
>> Thanks,
>>
>> Serdar
>>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>


-- 
Lance Norskog
goks...@gmail.com
