Found the Entity Includes - thanks.

On Thu, Aug 4, 2016 at 4:22 PM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> Thanks!
>
> The schema is a copy of the techproducts sample.
>
> Entire include here - and I take your point about the possibility of
> malformation - thanks.
>
> I assumed (perhaps wrongly) that I could duplicate the <schema ...>
> </schema> arrangement from the schema.xml file.
>
> I'm unfamiliar with xml entity includes, but I'll go look them up...
>
> <?xml version="1.0" encoding="UTF-8" ?>
> <schema name="example" version="1.6">
>
>   <!-- ngram field to support suggestions / lookahead search on title
>        (and category, contentType)-->
>   <copyField source="foobar" dest="text"/>
>   <field name="suggestion_ngram_for_title" type="text_suggest_ngram" indexed="true" stored="false"/>
>   <field name="displayurl" type="text_general" indexed="true" stored="true" multiValued="false"/>
>   <field name="productVersionId" type="string" indexed="true" stored="true" multiValued="false"/>
>   <field name="caption" type="text_general" indexed="true" stored="true" multiValued="false"/>
>   <field name="documentId" type="string" indexed="true" stored="true" multiValued="false"/>
>   <!--<field name="category" type="string" indexed="true" stored="true" multiValued="true"/>-->
>   <field name="contentType" type="text_special_synonym" indexed="true" stored="true" multiValued="false"/>
>   <!-- Do NOT assume that much thought went into using int on the
>        following field. This is testing only!-->
>   <field name="preference_" type="int" indexed="true" stored="true" multiValued="false"/>
>
>   <field name="meta_doc_type" type="text_general" indexed="true" stored="true" multiValued="false"/>
>   <!--<field name="content" type="text_general" indexed="true" stored="true" multiValued="false"/>-->
>
>   <!-- STATdx Weighting fields here.
>        These are not part of the document, but are used to calculate relevancy scores -->
>   <field name="category_weight" type="double" indexed="true" stored="true"/> <!-- used for rule one - weighting docs on general usefulness -->
>
>   <!-- Main body of document extracted by SolrCell.
>        NOTE: This field is not indexed by default, since it is also copied to "text"
>        using copyField below. This is to save space. Use this field for returning and
>        highlighting document content. Use the "text" field to search the content. -->
>   <field name="content" type="text_en" indexed="false" stored="true" multiValued="true"/> *//HERE IS WHERE "CONTENT" IS DEFINED*
>
>   <!-- test for parsing statdx-provided html in content field. text_html has
>        been modified to clean html -->
>   <field name="html_content" type="text_html" indexed="true" stored="true" multiValued="true"/>
>
>   <!-- Text fields from SolrCell to search by default in our catch-all field -->
>   <copyField source="title" dest="text"/>
>   <copyField source="author" dest="text"/>
>   <copyField source="description" dest="text"/>
>   <copyField source="keywords" dest="text"/>
>   <copyField source="content" dest="text"/> *//THROWING ERROR ABOUT "CONTENT" NOT EXISTING HERE*
>   <copyField source="content_type" dest="text"/>
>   <copyField source="resourcename" dest="text"/>
>   <copyField source="url" dest="text"/>
>
>   <!-- Create a string version of author for faceting -->
>   <copyField source="author" dest="author_s"/>
>
>   <!-- Above, multiple source fields are copied to the [text] field.
>        Another way to map multiple source fields to the same
>        destination field is to use the dynamic field syntax.
>        copyField also supports a maxChars to copy setting. -->
>
>   <copyField source="*_en" dest="text"/>
>
>
>   <!-- a copy of text_general.
>        Used to handle the rule that says that docs with "table"
>        and "tsm" in the contentType field should show at the top of results IF any of the
>        following terms are in the search term submitted by the user:
>        [TNM, AJCC, Stage, Staging, FIGO] Note the special synonym file in the xml below.
>        Note to self: Expand this documentation if we end up adding more "special" synonyms -->
>   <fieldType name="text_special_synonym" class="solr.TextField" positionIncrementGap="100">
>     <analyzer type="index">
>       <tokenizer class="solr.StandardTokenizerFactory"/>
>       <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
>       <!-- in this example, we will only use synonyms at query time
>       <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>       -->
>       <filter class="solr.LowerCaseFilterFactory"/>
>     </analyzer>
>     <analyzer type="query">
>       <tokenizer class="solr.StandardTokenizerFactory"/>
>       <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
>       <!-- Special synonym file here!!!! -->
>       <filter class="solr.SynonymFilterFactory" synonyms="contentType_synonyms.txt" ignoreCase="true" expand="true"/>
>       <filter class="solr.LowerCaseFilterFactory"/>
>     </analyzer>
>   </fieldType>
>
> </schema>
>
> On Thu, Aug 4, 2016 at 3:55 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:
>
>> you mentioned that the problem only happens when you use xinclude, but you
>> haven't shown us the details of your xinclude -- what exactly does your
>> schema.xml look like (with the xinclude call) and what exactly does the
>> file being included look like (entire contents)
>>
>> (I suspect the problem you are seeing is related to the way xinclude
>> doesn't really support "snippets" of malformed xml, and instead requires
>> some root tag -- i can't imagine what root tag you are using in the
>> included file that would play nicely with mixing/matching field
>> declarations. ...
>> using xml entity includes may be a simpler/safer option)
>>
>>
>> : Date: Thu, 4 Aug 2016 15:47:00 -0600
>> : From: John Bickerstaff <j...@johnbickerstaff.com>
>> : Reply-To: solr-user@lucene.apache.org
>> : To: solr-user@lucene.apache.org
>> : Subject: Re: Problems using fieldType text_general in copyField
>> :
>> : I would call this a bug...
>> :
>> : I'm going out on a limb and say that if you define a field in the included
>> : XML file, you will get this error.
>> :
>> : As long as the field is defined first in schema.xml, you can "copyField" it
>> : or whatever in the include file, but apparently fields MUST be created in
>> : the schema.xml file.
>> :
>> : That makes use of the include for custom things somewhat moot - at least in
>> : my situation.
>> :
>> : I'd love to be wrong by the way, but that's what my tests suggest right
>> : now...
>> :
>> : On Thu, Aug 4, 2016 at 1:37 PM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
>> :
>> : > Summary:
>> : >
>> : > Using xinclude to include an xml file into schema.xml
>> : >
>> : > The following line
>> : >
>> : > <copyField source="content" dest="text"/>
>> : >
>> : > generates an error about a field being "not a glob and not matching an
>> : > explicit field" even though I declare the field in the line just above.
>> : >
>> : > This seems to happen only for fieldType text_general?
>> : >
>> : > ============
>> : >
>> : > Explanation:
>> : >
>> : > I need a little help - keep getting an error when trying to use the
>> : > ability to include an additional XML file. I may be overlooking something,
>> : > but if so, I need help to see it.
>> : >
>> : > I have the following two lines which throw zero errors when part of
>> : > schema.xml:
>> : >
>> : > <field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/>
>> : > <copyField source="content" dest="text"/>
>> : >
>> : > However, when I put this into an include file and use xinclude, then I get
>> : > this error when starting Solr.
>> : >
>> : > - *statdx_shard1_replica3:* org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
>> : >   Could not load conf for core statdx_shard1_replica3: Can't load schema schema.xml:
>> : >   copyField source :'content' is not a glob and doesn't match any explicit
>> : >   field or dynamicField.
>> : >
>> : > Given that I am defining the field in the line right above the copyField
>> : > statement, I'm confused about why this works fine in schema.xml but NOT in
>> : > an included file.
>> : >
>> : > I experimented and found that any field of type "text_general" will throw
>> : > this same error if it is part of the included xml file. Other fieldTypes
>> : > that I tried (string, int, double) did not have this issue.
>> : >
>> : > I'm using Solr 5.4, although I'm pulling custom config into an included
>> : > file for purposes of moving to 6.1
>> : >
>> : > I have the following list of copyField commands in the included xml file,
>> : > and get no errors on any but the "content" one. It just so happens that
>> : > "content" is the only field of type "text_general" in there.
>> : >
>> : > Any hints greatly appreciated.
>> : >
>> : > <copyField source="title" dest="text"/>
>> : > <copyField source="author" dest="text"/>
>> : > <copyField source="description" dest="text"/>
>> : > <copyField source="keywords" dest="text"/>
>> : > <copyField source="content" dest="text"/>
>> : > <copyField source="content_type" dest="text"/>
>> : > <copyField source="resourcename" dest="text"/>
>> : > <copyField source="url" dest="text"/>
>> : >
>>
>> -Hoss
>> http://www.lucidworks.com/
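
For reference, here is a minimal sketch of the entity-include approach Hoss suggests above. It is an illustration under assumptions, not a tested Solr configuration: the entity name custom_fields and the file name custom-fields.xml are hypothetical, and how the SYSTEM path gets resolved may depend on the Solr version and on whether the config lives on disk or in ZooKeeper. The idea is that schema.xml declares an external entity in a DOCTYPE and references it inside <schema>, so the fragment's elements are parsed as direct children of <schema>:

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- schema.xml: "custom_fields" and "custom-fields.xml" are hypothetical names -->
<!DOCTYPE schema [
  <!ENTITY custom_fields SYSTEM "custom-fields.xml">
]>
<schema name="example" version="1.6">
  <!-- ... fields and fieldTypes copied from the techproducts sample ... -->

  <!-- the XML parser substitutes the contents of custom-fields.xml here -->
  &custom_fields;
</schema>
```

The included file would then hold only the sibling elements, with no <schema> wrapper and no DOCTYPE of its own. The two lines below are the ones from the original report:

```xml
<!-- custom-fields.xml (hypothetical name): a fragment, not a complete document -->
<field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/>
<copyField source="content" dest="text"/>
```

Because an external entity is spliced in at the point of the reference, the fragment does not need a single root element, which sidesteps the root-tag concern Hoss raises about xinclude.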