Thank you for your thorough response. Things make more sense now. Back to
the drawing board.

Alan.

On Tue, Feb 15, 2011 at 10:23 AM, Jonathan Rochkind <rochk...@jhu.edu> wrote:

> You can't just send arbitrary XML to Solr for update, no.  You need to send
> a Solr Update Request in XML. You can write software that transforms that
> arbitrary XML to a Solr update request; for simple cases it could even just
> be XSLT.  There are also a variety of other mediator pieces that come with
> Solr for doing updates; you can send updates in comma-separated-value
> format, or you can use the Data Import Handler to, in some not-too-complicated
> cases, embed the translation from your arbitrary XML to Solr documents in
> your Solr instance itself.
>
> But you can't just send arbitrary XML to the Solr update handler, no.
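A minimal sketch of the kind of transform described above, assuming Python's standard library. The sample input is a simplified, hypothetical stand-in for a supplier file (the real files carry more elements than this thread shows), and the `FIELD_MAP` from supplier parameter names to Solr field names is an assumption for illustration only.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for a supplier file; only the
# <string_parameter> elements are used by this sketch.
SUPPLIER_XML = """
<insert_list>
  <parameter_list>
    <string_parameter name="RECEIVER" value="000907010391" />
    <string_parameter name="Model" value="R16-500" />
    <string_parameter name="SiteID" value="CTCA" />
    <string_parameter name="LineID" value="Line5" />
  </parameter_list>
</insert_list>
"""

# Assumed mapping from supplier parameter names to Solr field names.
# Parameters that are not listed here are skipped.
FIELD_MAP = {
    "RECEIVER": "receiver",
    "Model": "stbmodel",
    "SiteID": "location",
}


def to_solr_add(supplier_xml):
    """Build a Solr <add><doc>...</doc></add> message from supplier XML."""
    root = ET.fromstring(supplier_xml)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for param in root.iter("string_parameter"):
        solr_name = FIELD_MAP.get(param.get("name"))
        if solr_name is None:
            continue  # no Solr field assumed for this parameter
        field = ET.SubElement(doc, "field", name=solr_name)
        field.text = param.get("value")
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(to_solr_add(SUPPLIER_XML))
```

The resulting string has the `<add><doc>` shape the update handler expects and could be written to a file and posted in place of the raw supplier XML.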
>
> No matter what method you use to send documents to solr, you're going to
> have to think about what you want your Solr schema to look like -- what
> fields of what types.  And then map your data to it.  In Solr, unlike in an
> rdbms, what you want your schema to look like has a lot to do with what
> kinds of queries you will want it to support; it can't just be done based on
> the nature of the data alone.
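Since the queries in this thread are against the file-name fields, the mapping step could be sketched as below. This is an illustration under stated assumptions, not a tested recipe: the left-to-right order in `FIELD_ORDER` is guessed from the schema listing and the example queries quoted further down (positions 6 and 7 are assumed to be `computerid` and `result`, because the sample name has "P" in position 7). Note also that a field has to be declared with indexed="true" in schema.xml before Solr can search on it.

```python
import os
import xml.etree.ElementTree as ET

# Assumed left-to-right order of the eleven tilde-separated fields.
# The order of "computerid" and "result" is a guess (see lead-in).
FIELD_ORDER = [
    "location", "scriptid", "slotid", "workcenter", "workcenterid",
    "computerid", "result", "stbmodel", "receiver", "filedate", "filetime",
]


def filename_to_fields(path):
    """Split a tilde-separated file name into a {field: value} dict."""
    stem, _ext = os.path.splitext(os.path.basename(path))
    parts = stem.split("~")
    if len(parts) != len(FIELD_ORDER):
        raise ValueError(
            f"expected {len(FIELD_ORDER)} fields, got {len(parts)}: {path}"
        )
    return dict(zip(FIELD_ORDER, parts))


def fields_to_add_xml(fields):
    """Wrap the fields in a Solr <add><doc> update message."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        ET.SubElement(doc, "field", name=name).text = value
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    name = ("CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500"
            "~000912239878~20110125~212321.XML")
    print(fields_to_add_xml(filename_to_fields(name)))
```

With documents built this way, a query along the lines of `stbmodel:R16-500 OR result:P OR filedate:20110125` only has something to match on for the fields that are indexed.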
>
> Jonathan
>
>
> On 2/15/2011 12:45 PM, alan bonnemaison wrote:
>
>> Erick,
>>
>> I think you put your finger on the problem. Our XML files (which we get
>> from our suppliers) do *not* look like that.
>>
>> This is what a typical file looks like:
>>
>> <insert_list>...................<result><result
>> outcome="PASS"></result><parameter_list><string_parameter name="SN"
>> value="NOVAL" /><string_parameter name="RECEIVER" value="000907010391"
>> /><string_parameter name="Model" value="R16-500" />...<string_parameter
>> name="WorkCenterID" value="PREP" /><string_parameter name="SiteID"
>> value="CTCA" /><string_parameter name="RouteID" value="ADV"
>> /><string_parameter name="LineID" value="Line5" /></parameter_list><config
>> enable_sfcs_comm="true" enable_param_db_comm="false"
>> force_param_db_update="false" driver_platform="LABVIEW" mode="PROD"
>> driver_revision="2.0"></config></insert_list>
>>
>> Obviously, nothing like <add><doc>....</doc></add>
>>
>> By the way, querying q=*:* retrieved "HTTP error 500 Null pointer
>> exception", which leads me to believe that my index is 100% empty.
>>
>> What I am trying to do cannot be done, correct? I just don't want to waste
>> anyone's time.................
>>
>> Thanks,
>>
>> Alan.
>>
>>
>> On Tue, Feb 15, 2011 at 6:01 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>>
>>> Can we see a small sample of an xml file you're posting? Because it
>>> should look something like
>>> <add>
>>>   <doc>
>>>     <field name="stbmodel">R16-500</field>
>>>        more fields here.
>>>   </doc>
>>> </add>
>>>
>>> Take a look at the Solr admin page after you've indexed data to see
>>> what's actually in your index, I suspect what's in there isn't what you
>>> expect.
>>>
>>> Try querying q=*:* just for yucks to see what the documents returned look
>>> like.
>>>
>>> I suspect your index doesn't contain anything like what you think, but
>>> that's only a guess...
>>>
>>> Best
>>> Erick
>>>
>>> On Mon, Feb 14, 2011 at 7:15 PM, alan bonnemaison <kg6...@gmail.com> wrote:
>>>
>>>> Hello!
>>>>
>>>> We receive hardware manufacturing data from our suppliers in XML files.
>>>> On a typical day, we get 25,000 files. That is why I chose to implement Solr.
>>>>
>>>> The file names are made of eleven fields separated by tildes, like so:
>>>>
>>>> CTCA~PRE~PREP~1010123~ONTDTVP5A~41~P~R16-500~000912239878~20110125~212321.XML
>>>>
>>>> Our R&D guys want to be able to search each field of the XML file names
>>>> (OR operation), but they don't care to search the file contents. Ideally,
>>>> they would like to query for all files where "stbmodel" equals "R16-500"
>>>> or "result" is "P" or "filedate" is "20110125"... you get the idea.
>>>>
>>>> I defined each data field in schema.xml like so (from left to right;
>>>> sorry for the long list):
>>>>
>>>>   <field name="location"     type="textgen" indexed="false" stored="true"  multiValued="false"/>
>>>>   <field name="scriptid"     type="textgen" indexed="false" stored="true"  multiValued="false"/>
>>>>   <field name="slotid"       type="textgen" indexed="false" stored="true"  multiValued="false"/>
>>>>   <field name="workcenter"   type="textgen" indexed="false" stored="false" multiValued="false"/>
>>>>   <field name="workcenterid" type="textgen" indexed="false" stored="false" multiValued="false"/>
>>>>   <field name="result"       type="string"  indexed="true"  stored="true"  multiValued="false"/>
>>>>   <field name="computerid"   type="textgen" indexed="false" stored="true"  multiValued="false"/>
>>>>   <field name="stbmodel"     type="textgen" indexed="true"  stored="true"  multiValued="false"/>
>>>>   <field name="receiver"     type="string"  indexed="true"  stored="true"  multiValued="false"/>
>>>>   <field name="filedate"     type="textgen" indexed="false" stored="true"  multiValued="false"/>
>>>>   <field name="filetime"     type="textgen" indexed="false" stored="true"  multiValued="false"/>
>>>>
>>>> Also, I defined the field "receiver" as the unique key. But no results are
>>>> returned by my queries. I made sure to update my index like so: "java -jar
>>>> apache-solr-1.4.1/example/exampledocs/post.jar *XML".
>>>>
>>>> I am obviously missing something. Is there a way to configure schema.xml to
>>>> search for file names? I welcome your input.
>>>>
>>>> Al.
>>>>
>>>>
>>
>>


-- 
AB.
----
Sent from my Gmail account.
