Skipping fields from XML
Hi,

I want to index a perfectly good Solr XML file into a Solr/Lucene instance. The problem is that the XML contains many fields that I don't want indexed. When I try to index the file, Solr returns an error because the XML contains fields that I have not declared in my schema.xml.

How can I tell Solr to skip unwanted fields and only index the fields that I have declared in my schema.xml? I know it must have something to do with a catch-all setting and/or copyFields, but I cannot get the configuration right.

To be clear: I want Solr to index/store only a few fields from the XML file and skip all the others.

An answer or a link to a good reference would help.
Re: Skipping fields from XML
Perfect! That resolved my issue.

BTW, this was my first posting on this list. I must say that the responses were quick and to the point. Good community help!

On Thu, Jul 30, 2009 at 10:58 AM, AHMET ARSLAN wrote:
> > How can I tell Solr to skip unwanted fields and only index
> > the fields that I have declared in my schema.xml?
>
> More precisely: (taken from schema.xml)

-- 
Met vriendelijke groet / Kind regards,

Edwin Stauthamer
Adviser Search & Collaboration
Emid Consult
T: +31 (0) 70 8870700
M: +31 (0) 6 4555 4994
E: estautha...@emidconsult.com
I: http://www.emidconsult.com
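For context: the example schema.xml distributed with Solr 1.x includes a field type named `ignored` (neither indexed nor stored) and a commented-out catch-all dynamic field named `*` that uses it, which is the kind of setting this thread is about. Where the schema cannot be changed, a similar effect could be achieved on the client side by stripping undeclared fields before posting. The following is only a minimal sketch of that alternative, not the schema-based fix from the reply. It assumes the standard `<add><doc><field name="...">` update format; the field names in `DECLARED_FIELDS` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the field names that are declared in schema.xml.
DECLARED_FIELDS = {"id", "title", "body"}


def strip_undeclared_fields(solr_xml, declared=DECLARED_FIELDS):
    """Return Solr update XML with every <field> not in `declared` removed."""
    root = ET.fromstring(solr_xml)
    for doc in root.iter("doc"):
        # Copy the child list first: removing while iterating skips elements.
        for field in list(doc.findall("field")):
            if field.get("name") not in declared:
                doc.remove(field)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<add><doc>"
        '<field name="id">1</field>'
        '<field name="title">Hello</field>'
        '<field name="internal_code">XYZ</field>'
        "</doc></add>"
    )
    print(strip_undeclared_fields(sample))
```

The schema-side catch-all is usually the simpler option, since it needs no extra processing step; the filter above mainly helps when the same XML has to go to an instance whose schema you do not control.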
Re: Recreating SOLR index after a schema change - without having to re-post the data
That is a shame. I have a lot of experience with Autonomy IDOL, and being able to quickly reindex the content without going back to the original source is great. You just export, update the config, and import (= reindex) to see whether, for instance, performance is better, or simply to move the information to another server.

Of course, this only works when no fields have been added, etc.

On Fri, Jul 31, 2009 at 2:59 PM, Erik Hatcher wrote:
> On Jul 31, 2009, at 7:01 AM, Vannia Rajan wrote:
>
>> On Fri, Jul 31, 2009 at 3:22 PM, Erik Hatcher wrote:
>>
>>> You'll have to reindex your documents from scratch. Such is the nature
>>> of changing the schema of an index. It's always a great idea (in fact,
>>> I'd say mandatory) to have a full reindex process handy.
>>
>> Thank you for your response. Yes, i need to make the setup handy to
>> query & repost to solr - till this new feature is included in SOLR.
>
> It's only tractable to do this if the original field values are stored,
> which is quite prohibitive in many cases. So I don't think this is a
> feature that you'll see in Solr any time soon.
>
> Erik

-- 
Edwin Stauthamer
Re: More like *these*? (recommendation system)
You don't have to create a new handler for this. Just do some preprocessing on the result set that comes back from your first "id:1 OR id:2 OR id:3" query.

So:
- Post your query.
- Get the relevant text nodes from the result set (XSL processing is great for that).
- Combine the text.
- Send that text back to Solr as one query.

You can build that logic into your frontend or use some intermediate layer for it. We (Emid Consult) have created a product that does these things for us, which we call SEAL. SEAL stands for Search Engine Abstraction Layer (implemented in PHP). Search-engine-specific query cooking and extra processing for added functionality (like "More like this") are done in that layer. SEAL can sit between the frontend and a search engine (like Solr, Autonomy IDOL or Exalead).

On Fri, Jul 31, 2009 at 2:08 PM, Andrew Ingram wrote:
> Hi all,
> I'm trying various methods of building a user-specific product
> recommendation system and one idea is to use solr's MLT functionality.
>
> For each customer I have a list of items they've bought, and I want to find
> similar items that are new to the site.
>
> The problem is that MLT operates on each result found (if I send it an id,
> it will return a list for that id, if I send it lots of ids it will return a
> list for EACH result), what I really want is to return a single list based
> on the combined factors of all items return by the initial query.
>
> So if I search for "id:1 OR id:2 OR id:3", I want the MLT result to be a
> single list of items, rather than 3 lists.
>
> Is this possible without writing a completely new handler?
>
> Regards,
> Andrew Ingram

-- 
Edwin Stauthamer
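The "combine the text, send it back as one query" step described above could look roughly like the following. This is a minimal sketch and not how SEAL does it: it assumes the first query's results are already parsed into plain dicts, the field names `title` and `description` are hypothetical, and picking the most frequent terms is just one simple way to condense the combined text.

```python
import re
from collections import Counter


def combined_mlt_query(docs, text_fields=("title", "description"), max_terms=10):
    """Build a single query string from the text of several result documents."""
    counts = Counter()
    for doc in docs:
        for name in text_fields:
            text = str(doc.get(name, "")).lower()
            # Keep alphanumeric runs only, so no Lucene query syntax
            # characters end up in the generated query.
            counts.update(re.findall(r"[a-z0-9]+", text))
    terms = [term for term, _ in counts.most_common(max_terms)]
    # Exclude the documents the text came from (the items already bought).
    seen_ids = " OR ".join(str(doc["id"]) for doc in docs)
    return "(" + " ".join(terms) + ") AND NOT id:(" + seen_ids + ")"


if __name__ == "__main__":
    bought = [
        {"id": 1, "title": "Red running shoes"},
        {"id": 2, "title": "Red rain jacket"},
    ]
    print(combined_mlt_query(bought))
```

The resulting string would be sent to Solr as the `q` parameter of an ordinary search. In practice you would probably also drop stop words and very short terms before counting.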
Re: Recreating SOLR index after a schema change - without having to re-post the data
Simple but effective ;-)

On Fri, Jul 31, 2009 at 3:23 PM, Erik Hatcher wrote:
> There certainly could be some intermediate storage of documents prior to
> indexing, but as far as the Lucene index goes it is inherently a one-way
> process. Solr could facilitate this pretty easily... with an update
> processor that wrote the documents coming in to some other storage (one
> option: simple Solr XML files on the filesystem). So hope is not lost.
>
> Erik

-- 
Edwin Stauthamer
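Erik's idea of keeping incoming documents as simple Solr XML files can be approximated on the client side, before the documents are posted. The sketch below is not a Solr update processor. It is a small writer that stores each document as a replayable `<add>` file, assuming documents are plain dicts with a filesystem-safe `id`; the function name and `archive_dir` are hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def archive_doc(doc, archive_dir):
    """Write one document as a Solr <add> XML file and return the file path."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        # A list or tuple becomes several <field> elements (multi-valued field).
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            ET.SubElement(doc_el, "field", name=name).text = str(v)
    os.makedirs(archive_dir, exist_ok=True)
    # Assumes the id is safe to use as a file name.
    path = os.path.join(archive_dir, str(doc["id"]) + ".xml")
    ET.ElementTree(add).write(path, encoding="utf-8", xml_declaration=True)
    return path


if __name__ == "__main__":
    print(archive_doc({"id": "42", "title": "Hello", "tags": ["a", "b"]}, "solr_archive"))
```

Reindexing after a schema change then amounts to posting the archived files to the update handler again, without going back to the original source.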