>  what amount/rate of generated text files are you thinking about?

I have about 1 TB of text files coming in every couple of minutes, in
real time.  In about 10 minutes I will have 4 TB of text files.

>  Do you just have one of these text files, containing many reports?
>  Do you have many of these text files each containing one report?
>  Also, is the report a single line, that has been wrapped for email?

These files rotate every hour.  Each text file contains many reports,
and the reports are not wrapped for email.

Is there an effective way to have Solr consistently index my text
files?

Please note that these files all have the same format.



On Wed, Apr 15, 2009 at 1:58 AM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> On Tue, Apr 14, 2009 at 10:37 PM, Alex Vu <alex.v...@gmail.com> wrote:
>
> > I just want to be able to index my text file, and other files that carry
> > the same format but with different IP addresses, ports, etc.
> >
>
> Alex, Solr consumes XML (in a specific format) and CSV. It can consume plain
> text through ExtractingRequestHandler. It can index DBs, other XML formats.
>
> You can write a Java program, parse your text file, and use Solrj client to
> send data to Solr. You could also write a program in any language you want
> and convert those text files to CSV or XML and post them to Solr.
>
> http://wiki.apache.org/solr/UpdateXmlMessages
> http://wiki.apache.org/solr/UpdateCSV
> http://wiki.apache.org/solr/Solrj
>
>
> >
> >  I will have the traffic flow running in real-time.  Do you think Solr will
> > be able to index a bunch of my text files in real time?
> >
>
> I don't think Solr is very suitable for this task. You can add the files to
> Solr at any time but you won't be able to search on them immediately. You
> should batch the commits (you can also use the maxDocs/maxTime properties in
> the autoCommit section in solrconfig.xml)
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
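
The "convert those text files to CSV" suggestion above could look roughly like
the sketch below. It is only an illustration: the thread never shows the real
report format, so the whitespace-separated layout, the field names in FIELDS,
and the helper names parse_report and csv_batches are all assumptions. Each
yielded chunk is a CSV document with a header row, which is the kind of input
the UpdateCSV page describes.

```python
import csv
import io
from itertools import islice

# Hypothetical field layout; the actual report format is not given in the thread.
FIELDS = ["id", "timestamp", "src_ip", "src_port", "dst_ip", "dst_port"]


def parse_report(line, doc_id):
    """Parse one whitespace-separated report line; return None if it is malformed."""
    parts = line.split()
    if len(parts) != 5:
        return None
    timestamp, src_ip, src_port, dst_ip, dst_port = parts
    return {
        "id": doc_id,
        "timestamp": timestamp,
        "src_ip": src_ip,
        "src_port": src_port,
        "dst_ip": dst_ip,
        "dst_port": dst_port,
    }


def csv_batches(lines, batch_size=1000):
    """Yield CSV strings (header plus up to batch_size rows) from report lines."""
    parsed = (parse_report(line, f"report-{i}") for i, line in enumerate(lines))
    docs = (doc for doc in parsed if doc is not None)  # skip malformed lines
    while True:
        chunk = list(islice(docs, batch_size))
        if not chunk:
            return
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
        writer.writeheader()
        writer.writerows(chunk)
        yield buf.getvalue()
```

Posting each chunk separately and committing only occasionally (or leaving
commits to autoCommit) follows the advice above about batching commits. The
batch size of 1000 is an arbitrary placeholder, not a tuned value.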
