Thank you for your thorough response. Things make more sense now. Back to the drawing board.
Alan.

On Tue, Feb 15, 2011 at 10:23 AM, Jonathan Rochkind <rochk...@jhu.edu> wrote:

> You can't just send arbitrary XML to Solr for update, no. You need to send
> a Solr Update Request in XML. You can write software that transforms that
> arbitrary XML to a Solr update request; for simple cases it could even just
> be XSLT. There are also a variety of other mediator pieces that come with
> Solr for doing updates; you can send updates in comma-separated-value
> format, or you can use the Data Import Handler to, in some not-too-complicated
> cases, embed the translation from your arbitrary XML to Solr documents in
> your Solr instance itself.
>
> But you can't just send arbitrary XML to the Solr update handler, no.
>
> No matter what method you use to send documents to Solr, you're going to
> have to think about what you want your Solr schema to look like -- what
> fields of what types. And then map your data to it. In Solr, unlike in an
> RDBMS, what you want your schema to look like has a lot to do with what
> kinds of queries you will want it to support; it can't just be done based on
> the nature of the data alone.
>
> Jonathan
>
>
> On 2/15/2011 12:45 PM, alan bonnemaison wrote:
>
>> Erick,
>>
>> I think you put your finger on the problem. Our XML files (which we get from
>> our suppliers) do *not* look like that.
>>
>> That's what a typical file looks like:
>>
>> <insert_list>...................<result><result
>> outcome="PASS"></result><parameter_list><string_parameter name="SN"
>> value="NOVAL" /><string_parameter name="RECEIVER" value="000907010391"
>> /><string_parameter name="Model" value="R16-500" />...<string_parameter
>> name="WorkCenterID" value="PREP" /><string_parameter name="SiteID"
>> value="CTCA" /><string_parameter name="RouteID" value="ADV"
>> /><string_parameter name="LineID" value="Line5" /></parameter_list><config
>> enable_sfcs_comm="true" enable_param_db_comm="false"
>> force_param_db_update="false" driver_platform="LABVIEW" mode="PROD"
>> driver_revision="2.0"></config></insert_list>
>>
>> Obviously, nothing like <add><doc>....</doc></add>
>>
>> By the way, querying q=*:* retrieved "HTTP error 500 Null pointer
>> exception", which leads me to believe that my index is 100% empty.
>>
>> What I am trying to do cannot be done, correct? I just don't want to waste
>> anyone's time.
>>
>> Thanks,
>>
>> Alan.
>>
>>
>> On Tue, Feb 15, 2011 at 6:01 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>>
>>> Can we see a small sample of an xml file you're posting? Because it should
>>> look something like
>>> <add>
>>>   <doc>
>>>     <field name="stbmodel">R16-500</field>
>>>     more fields here.
>>>   </doc>
>>> </add>
>>>
>>> Take a look at the Solr admin page after you've indexed data to see what's
>>> actually in your index; I suspect what's in there isn't what you expect.
>>>
>>> Try querying q=*:* just for yucks to see what the documents returned look
>>> like.
>>>
>>> I suspect your index doesn't contain anything like what you think, but
>>> that's only a guess...
>>>
>>> Best
>>> Erick
>>>
>>> On Mon, Feb 14, 2011 at 7:15 PM, alan bonnemaison <kg6...@gmail.com> wrote:
>>>
>>>> Hello!
>>>>
>>>> We receive from our suppliers hardware manufacturing data in XML files. On a
>>>> typical day, we get 25,000 files.
>>>> That is why I chose to implement Solr.
>>>>
>>>> The file names are made of eleven fields separated by tildes, like so:
>>>>
>>>> CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML
>>>>
>>>> Our R&D guys want to be able to search each field of the XML file names
>>>> (OR operation) but they don't care to search the file contents. Ideally,
>>>> they would like to query all files where "stbmodel" equals "R16-500"
>>>> or "result" is "P" or "filedate" is "20110125"...you get the idea.
>>>>
>>>> I defined in schema.xml each data field like so (from left to right --
>>>> sorry for the long list):
>>>>
>>>> <field name="location" type="textgen" indexed="false" stored="true" multiValued="false"/>
>>>> <field name="scriptid" type="textgen" indexed="false" stored="true" multiValued="false"/>
>>>> <field name="slotid" type="textgen" indexed="false" stored="true" multiValued="false"/>
>>>> <field name="workcenter" type="textgen" indexed="false" stored="false" multiValued="false"/>
>>>> <field name="workcenterid" type="textgen" indexed="false" stored="false" multiValued="false"/>
>>>> <field name="result" type="string" indexed="true" stored="true" multiValued="false"/>
>>>> <field name="computerid" type="textgen" indexed="false" stored="true" multiValued="false"/>
>>>> <field name="stbmodel" type="textgen" indexed="true" stored="true" multiValued="false"/>
>>>> <field name="receiver" type="string" indexed="true" stored="true" multiValued="false"/>
>>>> <field name="filedate" type="textgen" indexed="false" stored="true" multiValued="false"/>
>>>> <field name="filetime" type="textgen" indexed="false" stored="true" multiValued="false"/>
>>>>
>>>> Also, I defined as unique key the field "receiver". But no results are
>>>> returned by my queries.
>>>> I made sure to update my index like so: "java -jar
>>>> apache-solr-1.4.1/example/exampledocs/post.jar *XML".
>>>>
>>>> I am obviously missing something. Is there a way to configure schema.xml to
>>>> search for file names? I welcome your input.
>>>>
>>>> Al.
>>>>
>>
>> -- AB. ---- Sent from my Gmail account.
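
For anyone landing on this thread later: what Jonathan and Erick describe (mapping your own data onto an <add><doc>...</doc></add> update message) could look roughly like the sketch below. It only uses the file name, since that is all that needs to be searchable here. The function names are made up for illustration, and the field order is an assumption: the sample file name has "41" and then "P" in positions 6 and 7, and the original message says "result" is "P", so "computerid" is placed before "result" here even though the schema list has them the other way round. Adjust to whatever the real order is.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePath

# ASSUMPTION: left-to-right order of the eleven tilde-separated fields.
# Positions 6 and 7 are a guess based on the sample name ("41" then "P")
# and the statement in the thread that "result" is "P".
FIELD_NAMES = [
    "location", "scriptid", "slotid", "workcenter", "workcenterid",
    "computerid", "result", "stbmodel", "receiver", "filedate", "filetime",
]


def filename_to_fields(filename):
    """Split a tilde-separated file name into a {field: value} dict."""
    stem = PurePath(filename).stem  # drop any directory and the .XML extension
    values = stem.split("~")
    if len(values) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} fields, got {len(values)}: {filename}"
        )
    return dict(zip(FIELD_NAMES, values))


def build_add_request(filenames):
    """Build one <add> update message with a <doc> per file name."""
    add = ET.Element("add")
    for name in filenames:
        doc = ET.SubElement(add, "doc")
        for field, value in filename_to_fields(name).items():
            ET.SubElement(doc, "field", name=field).text = value
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~"
        "000912239878~20110125~212321.XML"
    )
    print(build_add_request([sample]))
```

The resulting XML still has to be sent to the Solr update handler (for example by saving it to a file and posting it with post.jar, as in the original message). Note also that a field can only be searched if it is declared indexed="true" in schema.xml, so every field the R&D group wants to query would need that setting.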