I guess that depends on what you mean by re-index, but here are some guesses. All of them share the assumption that you can determine *what* you want to index from the various sites. That is, you have some way of identifying the content you care about.

Solr won't help you at all in identifying what you really want; it just follows the orders you give it when you tell it to index content.

- If you already have junk in your Solr index that you want to remove, you can delete by query (and risk removing valuable stuff). You could also reindex from scratch.

- *Assuming* you have a unique key defined, and you're really asking about updating documents, you don't have to do anything special. If your schema.xml file has <uniqueKey> identifying a particular field, just add your document again and Solr will automatically delete the old version and add the new one.

If none of this makes sense, perhaps you can give us a better idea of what updating means in your use case...

This forum concentrates on Solr; there's a Nutch forum that'll help you there, and I haven't a clue about Drupal.

Best
Erick

On Sat, Oct 30, 2010 at 3:34 PM, Eric Martin <e...@makethembite.com> wrote:

> Hi everyone,
>
> I'm new, which won't be hard to figure out after I ask this question:
>
> I use Drupal/Solr/Nutch
>
> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/solr/conf/schema.xml?view=markup
>
> Solr specific:
>
> How do I re-index for specific content only? I am starting a legal index
> specifically geared for law students and lawyers. I am crawling law-related
> sites, but I really don't want to index law firms, just the law content on
> places like:
>
> http://www.ecasebriefs.com/blog/law/
> http://www.lawnix.com/cases/cases-index/
> http://www.oyez.org/
> http://www.4lawnotes.com/
> http://www.docstoc.com/documents/education/law-school/case-briefs
> http://www.lawschoolcasebriefs.com/
> http://dictionary.findlaw.com
>
> As I was saying, while crawling I get all kinds of extrinsic information put
> into the Solr index. How do I combat that?
>
> I am assuming (cough) that I can do this, but I am really at a loss as to
> where I start to look to get this done. I prefer to learn and I definitely
> don't want to waste anyone's time.
>
> Non-Solr specific:
>
> Does anyone here help with Nutch, or is this Solr only?
>
> I am sorry if I am asking elementary questions and am asking in the wrong
> place. I just need to be pointed to the right place. I'm sort of
> lost. (Imagine that.)
>
> Thanks
>
> Eric
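
A rough sketch of the two options in the reply above (delete by query, and re-adding a document under the same unique key), in Python. It only builds and posts Solr's standard XML update messages. Everything specific here is an assumption for illustration: the update URL is the one the stock example install listens on, the field names `id` and `title` and the `host:` query in the usage function are hypothetical and depend on your own schema.xml, and the re-add only replaces the old document if `id` is the field named in <uniqueKey>.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: Solr's example install on its default port; adjust for your setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def delete_by_query_xml(query):
    """Build a <delete><query>...</query></delete> update message."""
    root = ET.Element("delete")
    ET.SubElement(root, "query").text = query
    return ET.tostring(root, encoding="unicode")


def add_doc_xml(fields):
    """Build an <add><doc>...</doc></add> message from a dict of field values."""
    root = ET.Element("add")
    doc = ET.SubElement(root, "doc")
    for name, value in fields.items():
        ET.SubElement(doc, "field", name=name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def post_update(xml_body, url=SOLR_UPDATE_URL):
    """POST one XML update message to Solr and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def run_example():
    """Hypothetical usage; needs a running Solr, so it is not called here."""
    # Remove everything matching a query (the field name is schema-dependent).
    post_update(delete_by_query_xml("host:www.example-lawfirm.com"))
    # Re-add a document; with <uniqueKey>id</uniqueKey> this replaces the old copy.
    post_update(add_doc_xml({"id": "http://www.oyez.org/", "title": "Oyez"}))
    # Changes become visible to searches only after a commit.
    post_update("<commit/>")
```

Building the messages with ElementTree rather than string formatting means characters such as & and < in queries or field values get escaped correctly. Try the delete query as an ordinary search first to see exactly what it would remove.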