Index an URL

Tolga Tue, 15 May 2012 09:13:35 -0700

Hi,

I have a few questions, please bear with me:

1- I have a theory: Nutch can be used to index into Solr when we don't have access to the URL's file system, while we can use curl when we do have access. Am I correct?

2- A tutorial I have been reading talks about different levels of id. Is there such a thing (exid6, exid7, etc.)?

3- When I run

curl "http://localhost:8983/solr/update/extract?literal.id=exid7&commit=true" -F "myfile=@serialized-form.html"

I get: ERROR: [doc=exid7] unknown field 'ignored_link'. Is this something exid7 gives me? Where does this field ignored_link come from? Do I need to add all these fields to schema.xml in order not to get such an error? What is the safest way?
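
For reference, the curl command in question 3 can be sketched in Python roughly as follows. This is only a minimal illustration of what the request contains, not a recommended client: the function name build_extract_request is hypothetical, the part's content type is an assumption (curl may pick a different one), and the request is only built here, not sent.

```python
import uuid
from urllib.parse import urlencode
from urllib.request import Request


def build_extract_request(base_url, doc_id, filename, file_bytes, boundary=None):
    """Build (but do not send) a POST resembling the curl command above.

    Hypothetical helper for illustration only.
    """
    # literal.id and commit are the two query parameters used in the curl URL.
    query = urlencode({"literal.id": doc_id, "commit": "true"})
    url = f"{base_url}/update/extract?{query}"

    # curl's -F "myfile=@file" uploads the file as a multipart/form-data
    # part named "myfile"; the part content type below is an assumption.
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="myfile"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return Request(url, data=head + file_bytes + tail, headers=headers, method="POST")


# Example: the same id and file name as in the curl command.
req = build_extract_request(
    "http://localhost:8983/solr", "exid7", "serialized-form.html", b"<html></html>"
)
print(req.full_url)
```

Sending it would be a matter of passing req to urllib.request.urlopen, which needs a Solr instance listening on localhost:8983. Note that exid7 appears only as the value of literal.id, i.e. the id assigned to this one document.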


Regards,
