Re: Problem to start solr-4.0.0-BETA with tomcat-6.0.20

2012-09-21 Thread sabman
I am having the same problem after upgrading from 3.2 to 4.0. I have the sharedLib="lib" added in the tag and I still get the same error. I deleted all the files from the Solr home directory and copied the files from the 4.0 package. I still see this error. Where else could the old lib files be referen

avoid overwrite in DataImportHandler

2011-12-07 Thread sabman
I have a unique ID defined for the documents I am indexing. I want to avoid overwriting the documents that have already been indexed. I am using XPathEntityProcessor and TikaEntityProcessor to process the documents. The DataImportHandler does not seem to have the option to set overwrite=false. I h
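One possible workaround outside the DataImportHandler is to ask Solr whether a document with a given unique ID already exists before sending it for indexing. The sketch below only builds the lookup URL and interprets the JSON reply; the select URL and the field name `id` are assumptions, not taken from the post.

```python
import json
from urllib.parse import urlencode

# Assumed location of the select handler; adjust host, port, and core.
SOLR_SELECT = "http://localhost:8983/solr/select"


def exists_query_url(doc_id, id_field="id", select_url=SOLR_SELECT):
    """Build a query URL that matches at most the document with this unique ID."""
    # Escape backslashes and double quotes so the ID can sit inside a quoted phrase.
    escaped = doc_id.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": '%s:"%s"' % (id_field, escaped), "rows": "0", "wt": "json"}
    return select_url + "?" + urlencode(params)


def already_indexed(response_body):
    """Return True if the JSON reply from Solr reports at least one match."""
    return json.loads(response_body)["response"]["numFound"] > 0
```

A caller would fetch the URL, pass the response body to `already_indexed`, and skip the document when it returns True. This costs one extra request per document, so it probably only suits small imports.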

Error with Extracting PDF metadata

2011-07-29 Thread sabman
I am using Solr 3.3 and I am trying to extract and index metadata from PDF files. I am using the DataImportHandler with the TikaEntityProcessor to add the documents. Here are the fields as defined in my schema.xml file: So I suppose the metadata information should b

Using ScriptTransformer to send an HTTP Request

2011-07-21 Thread sabman
I am using Solr to index RSS feeds and I am using DataImportHandler to parse the URLs and then index them. Now I have implemented a web service that takes a URL, creates a thumbnail image, and stores it in a local directory. So here is what I want to do: After the URL is parsed, I want to send

Indexing PDF documents with no UniqueKey

2011-07-15 Thread sabman
I want to index PDF (and other rich) documents. I am using the DataImportHandler. Here is how my schema.xml looks: . . link As you can see I have set link as the unique key so that when the indexing happens documents are not duplicated again.

Re: Updating the data-config file

2011-06-24 Thread sabman
Thanks. I will look into this and see how it goes. -- View this message in context: http://lucene.472066.n3.nabble.com/Updating-the-data-config-file-tp3101241p3104470.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Updating the data-config file

2011-06-23 Thread sabman
Ahh! That's interesting! I understand what you mean. Since RSS and Atom feeds have the same structure, parsing them would be the same but I can do the for each different URLs. These URLs can be obtained from a db, a file or through the request parameters, right?

Re: Updating the data-config file

2011-06-23 Thread sabman
So you mean I cannot update the data-config programmatically? I don't understand how the request parameters would be of use to me. This is how my data-config file looks: http://rss.slashdot.org/Slashdot/slashdot" processor="XPathEntityProcess

Updating the data-config file

2011-06-23 Thread sabman
So I have some RSS feeds that I want to index using Solr. I am using the DataImportHandler and I have added the instructions on how to parse the feeds in the data-config file. Now if a user wants to add more RSS feeds to index, do I have to programmatically instruct Solr to update the config file?
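One way to avoid editing the config file for every new feed is to pass the feed URL as a request parameter on the import call, since the DataImportHandler can read request parameters in data-config through `${dataimporter.request.<name>}`. The sketch below only builds such import URLs. The base URL, the `/dataimport` handler path, and the parameter name `feed_url` are assumptions, and the entity in data-config would need its url attribute set to `${dataimporter.request.feed_url}`.

```python
from urllib.parse import urlencode

# Assumed values: adjust to the actual Solr host, core, and handler path.
SOLR_BASE = "http://localhost:8983/solr"
DIH_PATH = "/dataimport"


def build_import_url(feed_url, base=SOLR_BASE, path=DIH_PATH):
    """Return a full-import URL that passes one feed URL as a request parameter."""
    params = {
        "command": "full-import",
        "clean": "false",      # keep documents from earlier imports
        "commit": "true",
        "feed_url": feed_url,  # hypothetical name, read as ${dataimporter.request.feed_url}
    }
    return base + path + "?" + urlencode(params)


def build_import_urls(feed_urls):
    """One import request per feed; the list could come from a database or a file."""
    return [build_import_url(u) for u in feed_urls]


if __name__ == "__main__":
    for url in build_import_urls(["http://rss.slashdot.org/Slashdot/slashdot"]):
        print(url)  # fetch with urllib.request.urlopen(url) to start the import
```

As far as I know a handler runs only one import at a time, so the requests would have to be sent one after another rather than in parallel.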

Re: DIH Scheduling

2011-06-21 Thread sabman
Thanks. Using curl would be an option but ideally I want to implement it using this scheduler. I want to add Solr as part of another application package and send it to clients. So rather than asking them to run a cron job it would be easier to have Solr configured to run the scheduler.
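For comparison, the curl option amounts to a periodic HTTP GET against the import handler, which a small helper shipped with the application could send instead of a cron job. This is a minimal sketch of that external-trigger route, not of the scheduler described on the wiki; the import URL is an assumption and the function names are made up for illustration.

```python
import time
from urllib.request import urlopen

# Assumed URL: host, port, and handler path depend on the installation.
IMPORT_URL = "http://localhost:8983/solr/dataimport?command=full-import"


def trigger_import(url=IMPORT_URL, fetch=urlopen):
    """Send one GET request to the import handler and return the HTTP status code."""
    with fetch(url, timeout=30) as response:
        return response.status


def run_every(interval_seconds, action, rounds, sleep=time.sleep):
    """Call action() `rounds` times, pausing interval_seconds between calls."""
    results = []
    for i in range(rounds):
        results.append(action())
        if i < rounds - 1:
            sleep(interval_seconds)
    return results
```

For example, `run_every(3600, trigger_import, rounds=24)` would request an import once an hour for a day. The `fetch` and `sleep` arguments exist only so the functions can be exercised without a running Solr instance.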

DIH Scheduling

2011-06-21 Thread sabman
There is information here (http://wiki.apache.org/solr/DataImportHandler#Scheduling) about scheduling, but I don't understand how to use it. I am not a Java developer, so maybe I am missing something obvious. Based on instructions http://stackoverflow.com/questions/3206171/how-can-i-schedule-dat