DIH when using XML Files questions

2011-09-27 Thread Gabriel Cooper
I'm researching using DataImportHandler to import my data files with
FileDataSource and FileListEntityProcessor, and I have a couple of questions
before I get started that I'm hoping you can help with.

1) I would like to put a file on the local filesystem in the configured
location and have Solr see and process the file without additional effort on
my part.
1a) Is this doable in any way? From what I've seen, this is not supported
and I must manually call a URL (e.g.
http://foo/solr/dataimport?command=full-import).
1b) The manual, URL-based invocation method seems perfectly logical in a
database-oriented world, where one might schedule an update to run regularly,
but in my case I have a couple of identical indexes I load balance between and
don't want to run the same hefty query multiple times in parallel. As such,
I'm doing one query, writing the results to an XML file, and pushing that file
to each box, where I then want it processed. I'd like the process to
be as automated as possible.
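
One way to automate 1b) outside Solr could be a small polling script on each
box that notices a newly pushed file and calls the import URL itself. This is
only a sketch under assumptions: WATCH_DIR and the function names are
hypothetical, and IMPORT_URL simply reuses the example URL from 1a).

```python
import os
import urllib.request

# Assumed values for illustration; neither is a confirmed setup.
WATCH_DIR = "/data/solr-import"
IMPORT_URL = "http://foo/solr/dataimport?command=full-import"


def find_new_files(watch_dir, seen):
    """Return XML files in watch_dir that are not yet in the seen set."""
    current = {name for name in os.listdir(watch_dir) if name.endswith(".xml")}
    return sorted(current - seen)


def trigger_import(url=IMPORT_URL, opener=urllib.request.urlopen):
    """Issue the HTTP request that starts a DIH import; return the HTTP status."""
    with opener(url) as response:
        return response.status


def poll_once(watch_dir, seen, trigger=trigger_import):
    """Check the directory once and trigger an import if new files appeared."""
    new_files = find_new_files(watch_dir, seen)
    if new_files:
        trigger()
        seen.update(new_files)
    return new_files
```

poll_once(WATCH_DIR, seen) could be called from a loop with a sleep, or from
cron. Pushing the file under a temporary name and renaming it to *.xml once
the copy is complete would keep the script from picking up a half-written
file.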

2) I would like any files processed by Solr to be deleted after they've been
imported. I haven't seen any way to do this currently. I thought I might be
able to subclass something, but FileListEntityProcessor, for example,
doesn't seem to give any handles at the right time in the workflow to delete
a file.
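
Failing a hook inside FileListEntityProcessor, the deletion could be done by
the same external script that triggers the import: list the files before
starting, wait until the handler reports it is idle, then remove only those
files. The sketch below assumes the response to command=status contains
<str name="status">idle</str> when no import is running; that format should
be checked against the actual Solr version, and both function names are
hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def import_is_idle(status_xml):
    """Return True if a DIH status response reports 'idle'.

    Assumes the response contains <str name="status">idle</str> as a direct
    child of the root element; verify against your own Solr output.
    """
    root = ET.fromstring(status_xml)
    node = root.find("./str[@name='status']")
    return node is not None and node.text == "idle"


def delete_processed(paths):
    """Delete each file that still exists; return the paths actually removed."""
    removed = []
    for path in paths:
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

An idle status only says the import has finished, not that it succeeded, so a
real script would also want to inspect the rest of the status response before
deleting anything.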

3) When reading the DIH documentation, I ran across this statement: "When
delta-import command is executed, it reads the start time stored in
*conf/dataimport.properties*. It uses that timestamp to run delta queries and
after completion, updates the timestamp in *conf/dataimport.properties*." If
it really does update the date to the completion date, what happens to any
files added between the start and end dates? Are they lost?
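
To make the concern in 3) concrete, the toy model below compares the two
possible behaviours: recording the time the import started versus the time it
completed. It does not reflect how DIH is actually implemented; the function
names and the integer timestamps are made up purely for illustration.

```python
def files_picked_up(file_mtimes, last_index_time):
    """Names of files whose modification time is newer than last_index_time."""
    return sorted(
        name for name, mtime in file_mtimes.items() if mtime > last_index_time
    )


def missed_files(file_mtimes, import_start, import_end, record_end_time):
    """Files that arrived during an import and would be skipped by the next one.

    record_end_time=True models storing the completion time as the timestamp;
    record_end_time=False models storing the start time.
    """
    arrived_during = {
        name
        for name, mtime in file_mtimes.items()
        if import_start < mtime <= import_end
    }
    recorded = import_end if record_end_time else import_start
    picked_next = set(files_picked_up(file_mtimes, recorded))
    return sorted(arrived_during - picked_next)
```

With an import running from t=100 to t=110 and a file arriving at t=105, the
model reports that file as missed when the completion time is recorded, and
not missed when the start time is recorded.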

4) For delta imports, I don't see any mention of how processed files are
ordered, other than that it tries not to re-import files older than the
timestamp recorded in conf/dataimport.properties. In cases where order
matters, does it order the files by name or creation date or ...?

Thanks for any help,

Gabriel.


possible to do arithmetic on returned values?

2011-12-09 Thread Gabriel Cooper

Is there a way to manipulate the results coming back from SOLR?

I have a SOLR 3.5 index that contains values in cents (e.g. "100" in the 
index represents $1.00) and in certain contexts (e.g. CSV export) I'd 
like to divide by 100 for that field to provide a user-friendly "in 
dollars" number. To do this I played around with Function Queries for a
while without realizing they're limited to relevancy scores, and later
found "DocTransformers" in 4.0, which sound right from their description
but don't exist in 3.5.
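
One fallback that needs nothing from Solr would be to convert the field on
the client after the CSV comes back. The sketch below assumes the export has
a header row and that the cents column is called "price"; both the column
name and the function names are hypothetical.

```python
import csv
import io
from decimal import Decimal


def cents_to_dollars(cents):
    """Convert an integer cents string such as '100' to the string '1.00'."""
    return str((Decimal(cents) / 100).quantize(Decimal("0.01")))


def convert_csv(csv_text, field):
    """Return csv_text with the named column converted from cents to dollars."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[field] = cents_to_dollars(row[field])
        writer.writerow(row)
    return out.getvalue()
```

Decimal is used instead of float so that the division by 100 never introduces
rounding noise into currency values.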


Is there anything else I haven't considered?

Thanks for any help

Gabriel Cooper.


manipulate the results coming back from SOLR? (was: possible to do arithmetic on returned values?)

2011-12-12 Thread Gabriel Cooper
I'm hoping I just got lost in the shuffle due to posting on a Friday 
night. Is there a way to change a field's data via some function, e.g. 
add, subtract, product, etc.?



On 12/9/11 4:17 PM, Gabriel Cooper wrote:

Is there a way to manipulate the results coming back from SOLR?

I have a SOLR 3.5 index that contains values in cents (e.g. "100" in the
index represents $1.00) and in certain contexts (e.g. CSV export) I'd
like to divide by 100 for that field to provide a user-friendly "in
dollars" number. To do this I played around with Function Queries for a
while without realizing they're limited to relevancy scores, and later
found "DocTransformers" in 4.0, which sound right from their description
but don't exist in 3.5.

Is there anything else I haven't considered?

Thanks for any help

Gabriel Cooper.