Thank you for your answer.

Regarding 3.): The file is on server A, my program is on server B, and Solr is on server C. If I use a normal HTTP (REST) POST, my program has to fetch the file content from server A to server B and then post it from server B to server C, because there is no open connection between A and C. So the file has to be transmitted twice. Is there a way to tell Solr to read the file _directly_ from server A (e.g. via SMB)?
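To illustrate the kind of request I have in mind, here is a rough sketch. It assumes Solr's remote streaming is enabled (the `enableRemoteStreaming` setting in solrconfig.xml), that the share from server A is mounted on server C so Solr can open the file itself, and that the extracting handler is available at `/update/extract`. The host name, core name, mount path and field names are made up for the example; the `literal.*` parameters would carry my additional attributes. I have not verified that this works in my setup.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_url(solr_base, core, file_path, literals, commit=True):
    """Build an /update/extract URL that asks Solr to read file_path itself.

    file_path is resolved on the Solr host (server C), not on the client,
    so only this small request travels from server B to server C.
    """
    params = [("stream.file", file_path)]
    # Each literal.<field> parameter adds a fixed field value to the document.
    for name, value in literals.items():
        params.append(("literal." + name, value))
    if commit:
        params.append(("commit", "true"))
    return f"{solr_base.rstrip('/')}/{core}/update/extract?{urlencode(params)}"


def index_remote_file(solr_base, core, file_path, literals):
    """Send the request; the file content never passes through the caller."""
    url = build_extract_url(solr_base, core, file_path, literals)
    with urlopen(Request(url, data=b"", method="POST"), timeout=30) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical values: adjust host, core, path and fields to the real setup.
    print(build_extract_url(
        "http://serverC:8983/solr",
        "mycore",
        "/mnt/serverA/docs/report.pdf",
        {"id": "doc-1", "department": "sales"},
    ))
```

If server C cannot reach server A at all, this does not help, and the file would still have to go through server B.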
Thank you

Amrit Sarkar <sarkaramr...@gmail.com> wrote on Fri, 13 Oct 2017 at 12:51:

> Hi,
>
> > 1.) I created a core and tried to simplify the managed-schema file. But if
> > I remove all "unnecessary" fields/fieldtypes, I get errors like: field
> > "_version_" is missing, type "boolean" is missing and so on. Why do I have
> > to define these types/fields? Which fields/fieldtypes are required?
>
> Solr expects the primitive field names and types in the schema, though a
> better explanation should be there. "_version_" and a unique id field are
> mandatory for each document, as "_version_" contains the current version of
> the document, which is used for sync across nodes and for atomic updates of
> documents.
>
> > 2.) Can I modify the managed-schema remotely/programmatically, e.g. with a
> > POST request, or only by editing the managed-schema file directly?
>
> Sure, the Schema API has been available for a while:
> https://lucene.apache.org/solr/guide/6_6/schema-api.html
>
> > 3.) When I have a service (SolrNet client) that pushes a file from a
> > fileserver to Solr, will it cause twice the traffic (from the fileserver
> > to my service and from the service to Solr)? Is there a chance to index the
> > file directly? (I need to add additional attributes to the index document.)
>
> Two times traffic? Where? Solr will receive the docs once, so we are good on
> that part. Please use SolrJ to index documents if possible, as it is the
> most up-to-date client; if you are on SolrCloud, use CloudSolrClient.
> Regarding indexing files directly, you can use the DIH (DataImportHandler),
> depending on the file format (CSV, XML, JSON), but mind that it is single
> threaded.
>
> Hope this clarifies some of it.
>
> Amrit Sarkar
> Search Engineer
> Lucidworks, Inc.
> 415-589-9269
> www.lucidworks.com
> Twitter http://twitter.com/lucidworks
> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>
> On Fri, Oct 13, 2017 at 3:10 PM, startrekfan <startrekfa...@freenet.de>
> wrote:
>
> > Hello,
> >
> > I have some Solr related questions:
> >
> > 1.) I created a core and tried to simplify the managed-schema file. But if
> > I remove all "unnecessary" fields/fieldtypes, I get errors like: field
> > "_version_" is missing, type "boolean" is missing and so on. Why do I have
> > to define these types/fields? Which fields/fieldtypes are required?
> >
> > 2.) Can I modify the managed-schema remotely/programmatically, e.g. with a
> > POST request, or only by editing the managed-schema file directly?
> >
> > 3.) When I have a service (SolrNet client) that pushes a file from a
> > fileserver to Solr, will it cause twice the traffic (from the fileserver
> > to my service and from the service to Solr)? Is there a chance to index the
> > file directly? (I need to add additional attributes to the index document.)
> >
> > Thank you