DataImportHandler - Automatic scheduling of delta imports in Solr in windows 7

2014-04-10 Thread harshrossi
I am using *DataImportHandler* for indexing data in Solr. Currently I index
the data manually by choosing the full-import or delta-import command on the
Solr Admin screen.

I am using Windows 7 and would like to automate the process by running the
commands at a fixed interval through Windows Task Scheduler or something
similar, e.g. indexing data into Solr every two minutes.

From a few sites I learned that I need to create a *batch file* containing a
command that runs the import, and then run that batch file from the *Windows
Task Scheduler*. But none of them gave any examples.

I am not sure what to put in the batch file or how to link it with the
scheduler.

Can someone provide me the code and the steps to accomplish it?

Thanks a lot in advance.
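
For reference, a minimal sketch of what the scheduled job could run. It is
written in Python rather than as a batch file, and it assumes several things
not stated in this thread: Solr listening at http://localhost:8983/solr, a
core named collection1, and the DataImportHandler registered at /dataimport
in solrconfig.xml. The function names are hypothetical. The script only builds
the delta-import URL and sends an HTTP GET to it.

```python
import sys
import urllib.parse
import urllib.request


def build_dataimport_url(base_url, core, command="delta-import"):
    """Return the DataImportHandler URL for the given core and command."""
    # Assumption: the handler is registered as /dataimport in solrconfig.xml.
    params = urllib.parse.urlencode(
        {"command": command, "commit": "true", "wt": "json"}
    )
    return "{}/{}/dataimport?{}".format(base_url.rstrip("/"), core, params)


def trigger_import(url, timeout=30):
    """Send the request and return Solr's response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Only contact Solr when a core name is given on the command line,
# e.g. "python delta_import.py collection1".
if __name__ == "__main__" and len(sys.argv) > 1:
    solr_base = "http://localhost:8983/solr"  # assumed location of Solr
    print(trigger_import(build_dataimport_url(solr_base, sys.argv[1])))
```

Saved as, say, C:\scripts\delta_import.py (a hypothetical path), it could be
registered to run every two minutes with something like:
schtasks /create /sc minute /mo 2 /tn "SolrDeltaImport" /tr "python C:\scripts\delta_import.py collection1"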




--
View this message in context: 
http://lucene.472066.n3.nabble.com/DataImportHandler-Automatic-scheduling-of-delta-imports-in-Solr-in-windows-7-tp4130565.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: DataImportHandler - Automatic scheduling of delta imports in Solr in windows 7

2014-04-11 Thread harshrossi
Yes, that is all fine with me. The only thing that worries me is what needs
to go in the batch file.
I will try a sample batch file and get back with any questions.

Thank you 





'qt' parameter is not working in search call of SolrPhpClient

2014-04-17 Thread harshrossi
I am using SolrPhpClient for interacting with Solr via PHP.

I have defined a custom request handler (/select_test) that uses 'edismax'
in the Solr config file:

  
 
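<requestHandler name="/select_test" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="wt">json</str>

    <str name="defType">edismax</str>
    <str name="qf">
      text name topic description
    </str>
    <str name="df">text</str>
    <str name="mm">100%</str>
    <str name="q.alt">*:*</str>
    <str name="rows">10</str>
    <str name="fl">*,score</str>

    <str name="mlt.qf">
      text name topic description
    </str>
    <str name="mlt.fl">text,name,topic,description</str>
    <int name="mlt.count">3</int>
  </lst>
</requestHandler>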
 
  

I set the 'qt' parameter to '/select_test' in the $search_options array and
pass that array to the search function of Apache_Solr_Service as below:

$search_options = array(
    'qt' => '/select_test',
    'fq' => 'topic:games',
    'sort' => 'name desc'
);

$result = $solr->search($query, 0, 10, $search_options);

It does not call the custom request handler at all. The call goes to the
default '/select' handler in the Solr config file.

Just to confirm, I put the custom request handler's settings into the default
handler and it worked.

Why is this happening? Am I not setting it right?

Please help!
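
A possible explanation, offered as an assumption rather than something
confirmed in this thread: as far as I can tell, Apache_Solr_Service::search()
always sends its request to the 'select' path, and recent Solr versions only
consult 'qt' on /select when handleSelect="true" is set on requestDispatcher,
and not for handler names that start with '/'. In that case the handler has to
be addressed by its path. The Python sketch below shows the request URL that
would reach /select_test directly with the same options. build_search_url is
a hypothetical helper, and the base URL and core name are placeholders.

```python
from urllib.parse import urlencode


def build_search_url(base_url, handler, query, offset=0, limit=10, extra=None):
    """Build a search URL that addresses a request handler by its path."""
    params = {"q": query, "start": offset, "rows": limit}
    params.update(extra or {})
    # The handler name becomes part of the path; no 'qt' parameter is sent.
    return "{}/{}?{}".format(
        base_url.rstrip("/"), handler.strip("/"), urlencode(params)
    )


url = build_search_url(
    "http://localhost:8983/solr/collection1",  # assumed base URL and core
    "/select_test",
    "halo",  # example query text
    extra={"fq": "topic:games", "sort": "name desc"},
)
print(url)
```

With SolrPhpClient itself this would mean pointing the search URL at the
handler path instead of passing 'qt'. How best to do that depends on the
client version.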



--
View this message in context: 
http://lucene.472066.n3.nabble.com/qt-parameter-is-not-working-in-search-call-of-SolrPhpClient-tp4131934.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: 'qt' parameter is not working in search call of SolrPhpClient

2014-04-20 Thread harshrossi
Yes, I know, but I am using the SolrPhpClient API, where the search()
function accesses the '/select' request handler by default. So I used the
'qt' parameter to reach '/select_test', as described in this link:

Non-Default Request Handler

  

I have used 'qt' as described in the link, but it still points to the
default '/select'.

Any suggestions?








Tika: url issue

2014-06-04 Thread harshrossi
Hi,

   I am working on Solr, using DataImportHandler to index rich documents
such as PDF, Word, and image files.
I am using TikaEntityProcessor to extract the contents of the files.

I have one small issue with setting the value of the 'url' entry.

My data-config.xml file is like so:

[data-config.xml contents not preserved in the archive]


The thing is, the file path is stored in an unusual pattern in the database.
"doc_url" is the field in the db which stores the url or file path. The file
path is stored in this way:
 *D:\Games\CS2\setup.doc#D:\Games\CS2\setup.doc#*
i.e. the path is stored twice, separated by a '#'. I am not sure why it is
done this way. It was done by our client.

All I need is the single file path, i.e. D:\Games\CS2\setup.doc
I am passing the url value to Tika as *url="${db_link.LINK}"*
But *${db_link.LINK}* contains the path exactly as it comes from the
database. I have tried using a script transformer, splitting the path string
on '#' and taking the first part in the method *getFilePath(row)*, but no
luck.

I am still getting the path as stored in the db. This gives a *FileNotFound*
exception when trying to index it, which is expected because the path is
incorrect.

What can be done to get only the path and drop the rest, i.e. the '#' and
everything after it?

Help would be much appreciated :)
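
To make the intended transformation concrete, here is a small Python sketch
of the split-on-'#' logic described above. first_path is a hypothetical name,
and this is plain Python rather than the DIH script transformer, so it only
illustrates the string handling, not how the result is written back into the
row.

```python
def first_path(doc_url):
    """Return the first non-empty segment of a '#'-separated path string."""
    for segment in doc_url.split("#"):
        segment = segment.strip()
        if segment:
            return segment
    return ""  # nothing usable in the input


stored = r"D:\Games\CS2\setup.doc#D:\Games\CS2\setup.doc#"
print(first_path(stored))
```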







--
View this message in context: 
http://lucene.472066.n3.nabble.com/Tika-url-issue-tp4139781.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Tika: url issue

2014-06-06 Thread harshrossi
Thanks for the help. In the end I solved it using RegexTransformer.

In the db_link entity I used RegexTransformer and set the link field as:

[field definition not preserved in the archive]

and in the tika-doc entity I set the url value as:
${db_link.link}
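
The field definition did not survive in this message, so the exact pattern is
unknown. As an illustration only, a regex of the following shape would keep
everything before the first '#'. It is shown with Python's re module. The
pattern and the name extract_link are assumptions, not the configuration
actually used.

```python
import re

# Capture everything up to (not including) the first '#'.
FIRST_PATH = re.compile(r"^([^#]+)")


def extract_link(doc_url):
    """Return the text before the first '#', or None if there is none."""
    match = FIRST_PATH.match(doc_url)
    return match.group(1) if match else None


print(extract_link(r"D:\Games\CS2\setup.doc#D:\Games\CS2\setup.doc#"))
```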


