Maybe this is expert stuff, but we keep our schema, solrconfig, and everything else checked into source control.
I wrote a Python thingy to hit the cluster through the load balancer, get the zkHost string from status, upload the files to ZooKeeper (kazoo is a nice library), link the config, then do an async reload.

I’ve been thinking about timestamping the config directories so I can roll back to a previous config if the reload fails.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Mar 7, 2017, at 12:47 PM, OTH <omer.t....@gmail.com> wrote:
>
> In the reference guide, in the chapter named "The Well Configured Solr
> Instance", it says (I'm copying+pasting from the PDF version):
>
> Switching from Managed Schema to Manually Edited schema.xml
>> If you have started Solr with managed schema enabled and you would like to
>> switch to manually editing a schema.xml file, you should take the following steps:
>> Rename the managed-schema file to schema.xml.
>> Modify solrconfig.xml to replace the schemaFactory class.
>> Remove any ManagedIndexSchemaFactory definition if it exists.
>> Add a ClassicIndexSchemaFactory definition as shown above
>> Reload the core(s).
>> If you are using SolrCloud, you may need to modify the files via
>> ZooKeeper. The bin/solr script provides an
>> easy way to download the files from ZooKeeper and upload them back after
>> edits. See the section ZooKeeper Operations for more information.
>> IndexConfig in SolrConfig
>> The <indexConfig> section of solrconfig.xml defines low-level behavior of
>> the Lucene index writers.
>> By default, the settings are commented out in the sample solrconfig.xml
>> included with Solr, which means the defaults are used. In most cases,
>> the defaults are fine.
>> <indexConfig>
>> ...
>> </indexConfig>
>> Parameters covered in this section:
>> Writing New Segments
>> Merging Index Segments
>> Compound File Segments
>> Index Locks
>> Other Indexing Settings
>> Writing New Segments
>> ramBufferSizeMB
>> Once accumulated document updates exceed this much memory space (defined
>> in megabytes), then the pending updates are flushed. This can also create
>> new segments or trigger a merge. Using this setting is generally
>> preferable to maxBufferedDocs. If both maxBufferedDocs and ramBufferSizeMB
>> are set in solrconfig.xml, then a flush will occur when either limit is
>> reached. The default is 100Mb.
>> <ramBufferSizeMB>100</ramBufferSizeMB>
>> maxBufferedDocs
>> Sets the number of document updates to buffer in memory before they are
>> flushed as a new segment. This may also trigger a merge. The default Solr
>> configuration sets to flush by RAM usage (ramBufferSizeMB).
>> <maxBufferedDocs>1000</maxBufferedDocs>
>> useCompoundFile
>> Controls whether newly written (and not yet merged) index segments should
>> use the Compound File Segment format. The default is false.
>> <useCompoundFile>false</useCompoundFile>
>> To have full control over your schema.xml file, you may also want to
>> disable schema guessing, which allows unknown fields to be added to the
>> schema during indexing. The properties that enable this feature are
>> discussed in the section Schemaless Mode
>
>
> On Wed, Mar 8, 2017 at 1:32 AM, Phil Scadden <p.scad...@gns.cri.nz> wrote:
>
>> I would second that guide could be clearer on that. I read and reread
>> several times trying to get my head around the schema.xml/managed-schema
>> bit. I came away from first cursory reading with the idea that
>> managed-schema was mostly for schema-less mode and only after some stuff
>> ups and puzzling over comments in the basic-config schema file itself did I
>> go back for more careful re-read. I am still not sure that I have got all
>> the nuances. My understanding is:
>>
>> If you don’t want ability to edit it via admin UI or config api, rename to
>> schema.xml. Unclear whether you have to make changes to other configs to do
>> this. Also unclear to me whether there was any upside at all to using
>> schema.xml? Why degrade functionality? Does the capacity for schema.xml
>> only exist for backward compatibility?
>>
>> If you want to run schema-less, you have to use managed-schema????? (I
>> didn’t delve too deep into this).
>>
>> In the end, I used basic-config to create core and then hacked
>> managed-schema from there.
>>
>>
>> I would have to say the "basic-config" seems distinctly more than basic.
>> It is still a huge file. I thought perhaps I could delete every unused
>> field type, but worried there were some "system" dependencies. Ie if you
>> want *target type wildcard queries do you need to have text_general_reverse
>> and a copy to it? If you always explicitly set only defined fields in a
>> custom indexer, then can you dump the whole dynamic fields bit?
>> Notice: This email and any attachments are confidential and may not be
>> used, published or redistributed without the prior written consent of the
>> Institute of Geological and Nuclear Sciences Limited (GNS Science). If
>> received in error please destroy and immediately notify GNS Science. Do not
>> copy or disclose the contents.
>>
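
P.S. For anyone who wants the rough shape of the deploy flow described at the top of this message, here is a minimal sketch. It is not the actual script, and several things in it are assumptions: that "status" means /solr/admin/info/system and that its JSON carries a zkHost key, that linking a config means setting configName on the /collections/<name> znode (which is what zkcli's linkconfig does in Solr 6.x), and that the helper names, the timestamp format, and the "reload-" request id prefix are all made up for illustration.

```python
import json
import os
import time
import urllib.parse
import urllib.request


def stamped_name(base, now=None):
    """Append a UTC timestamp so older config sets stay around for rollback."""
    return "%s-%s" % (base, time.strftime("%Y%m%dT%H%M%S", time.gmtime(now)))


def config_znodes(local_dir, config_name):
    """Yield (local_path, znode_path) for every file under local_dir."""
    for root, _dirs, files in os.walk(local_dir):
        for name in sorted(files):
            local_path = os.path.join(root, name)
            rel = os.path.relpath(local_path, local_dir).replace(os.sep, "/")
            yield local_path, "/configs/%s/%s" % (config_name, rel)


def put(zk, znode, data):
    """Create the znode, or overwrite it if it already exists."""
    if zk.exists(znode):
        zk.set(znode, data)
    else:
        zk.create(znode, data, makepath=True)


def upload_config(zk, local_dir, config_name):
    """Copy a checked-out config directory to /configs/<config_name>."""
    for local_path, znode in config_znodes(local_dir, config_name):
        with open(local_path, "rb") as handle:
            put(zk, znode, handle.read())


def link_config(zk, collection, config_name):
    """Point the collection at config_name (assumed Solr 6.x znode layout)."""
    znode = "/collections/%s" % collection
    props = {}
    if zk.exists(znode):
        raw, _stat = zk.get(znode)
        props = json.loads(raw.decode("utf-8")) if raw else {}
    props["configName"] = config_name
    put(zk, znode, json.dumps(props).encode("utf-8"))


def solr_get(base_url, path, params=None):
    """GET a Solr endpoint through the load balancer and parse the JSON."""
    query = dict(params or {}, wt="json")
    url = "%s%s?%s" % (base_url.rstrip("/"), path, urllib.parse.urlencode(query))
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def deploy(base_url, collection, local_dir, base_name):
    """Upload, link, and kick off an async reload; returns names to poll with."""
    from kazoo.client import KazooClient  # third-party: pip install kazoo

    # Assumption: the system info response exposes the ensemble as "zkHost".
    zk_host = solr_get(base_url, "/admin/info/system")["zkHost"]
    config_name = stamped_name(base_name)
    zk = KazooClient(hosts=zk_host)
    zk.start()
    try:
        upload_config(zk, local_dir, config_name)
        link_config(zk, collection, config_name)
    finally:
        zk.stop()
    request_id = "reload-" + config_name  # hypothetical id scheme
    solr_get(base_url, "/admin/collections",
             {"action": "RELOAD", "name": collection, "async": request_id})
    return config_name, request_id
```

Something like deploy("http://lb.example.com:8983/solr", "mycollection", "conf", "mycollection") would be the entry point (host and names are placeholders). Polling the Collections API with action=REQUESTSTATUS and the returned request id should show whether the reload finished. If it failed, rolling back would be a matter of calling link_config with the previous timestamped name and reloading again.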