Hi guys,
thanks for the answers, you helped me a lot. I wrote a PHP script for this
problem.
Thank you
--
View this message in context:
http://lucene.472066.n3.nabble.com/Indexing-Fixed-length-file-tp4225807p4227163.html
Sent from the Solr - User mailing list archive at Nabble.com.
Ah yes, I should have made my example use tabs, though that currently would
have required also adding “&separator=%09” to the params.
I definitely support the use of tabs for what they were intended: delimiting
columns of data. +1, thanks for that mention, Alex.
> On Aug 28, 2015, at 1:38 P
Erik's version might be better with tabs, though, to avoid CSV's
requirements on escaping commas, quotes, etc. And maybe trim those
fields a bit, either in awk or in a URP inside Solr.
But it would definitely work.
Regards,
Alex.
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
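Putting those two suggestions together, Erik's one-liner could emit tabs and do the trimming inside awk itself. This is just a sketch: the two-character id and three-character value widths below are made up to show trimming, not taken from anyone's real layout.

```shell
# Tab-separated variant of the fixed-width split, trimming each field in awk.
# The widths (2 and 3) and the padded sample input are assumptions for illustration.
echo "Q 36 " | awk -v OFS='\t' '{
  id  = substr($0, 1, 2); gsub(/^ +| +$/, "", id)   # strip padding spaces
  val = substr($0, 3, 3); gsub(/^ +| +$/, "", val)
  print id, val
}'
```

The result could then be piped to bin/post as before, with "&separator=%09" added to the params so Solr parses it as TSV.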
How about this incantation:
$ bin/solr create -c fw
$ echo "Q36" | awk -v OFS=, '{ print substr($0, 1, 1), substr($0, 2, 2) }' |
bin/post -c fw -params "fieldnames=id,val&header=false" -type text/csv -d
$ curl 'http://localhost:8983/solr/fw/select?q=*:*&wt=csv'
val,_version_,id
36,151076711525200
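The same substr() approach scales to wider records; for example, with a hypothetical third three-character field appended to the sample line:

```shell
# Hypothetical three-field layout: 1-char id, 2-char val, 3-char code.
# The third field and its width are assumptions, not from the original example.
echo "Q36NYC" | awk -v OFS=, '{
  print substr($0, 1, 1), substr($0, 2, 2), substr($0, 4, 3)
}'
```

That output could then go through bin/post with "fieldnames=id,val,code&header=false", same as above.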
If you use DataImportHandler, you can combine LineEntityProcessor with
RegexTransformer to split each line into a bunch of fields:
https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler#UploadingStructuredDataStoreDatawiththeDataImportHand
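A minimal data-config sketch for that combination might look something like this. The file path, field widths, and field names are all assumptions; the idea is that LineEntityProcessor exposes each line as "rawLine" and RegexTransformer's capture groups map the fixed-width slices onto the columns listed in groupNames:

```xml
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <entity name="lines"
            processor="LineEntityProcessor"
            url="/path/to/fixed-width.txt"
            rootEntity="true"
            transformer="RegexTransformer">
      <!-- split the raw line: group 1 -> id (1 char), group 2 -> val (2 chars) -->
      <field column="split" regex="^(.{1})(.{2})$"
             groupNames="id,val" sourceColName="rawLine"/>
    </entity>
  </document>
</dataConfig>
```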
Hi Tim,
I haven’t heard of people indexing this kind of input with Solr, but the format
is quite similar to CSV/TSV files, except that the fields have fixed widths
and the separators are omitted.
You could write a short script to insert separators (e.g. commas) at these
points (but b
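Such a separator-inserting script can be as small as one sed substitution. A sketch, again assuming the one-character id / two-character value layout of the "Q36" example elsewhere in this thread:

```shell
# Insert a comma after the first character of each line,
# turning fixed-width "Q36" into CSV "Q,36" (the width is an assumption).
# For more fields, chain further substitutions at each column boundary.
echo "Q36" | sed 's/^\(.\{1\}\)/\1,/'
```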
Solr doesn't know anything about such a file. The post program expects
well-defined structures; see the XML and JSON formats in example/exampledocs.
So you either have to transform the data into the form expected by the bin/post
tool, or perhaps you can use the CSV import, see:
https://cwiki.apache