Hi, I have a few questions, please bear with me:
1- I have a theory: Nutch can be used to index into Solr when we don't have access to the URL's file system, while curl can be used when we do have access. Am I correct?
2- A tutorial I have been reading talks about different levels of id. Is there such a thing (exid6, exid7, etc.)?
3- When I run curl "http://localhost:8983/solr/update/extract?literal.id=exid7&commit=true" -F "myfile=@serialized-form.html", I get ERROR: [doc=exid7] unknown field 'ignored_link'. Is this error caused by exid7? Where does the ignored_link field come from? Do I need to add all such fields to schema.xml to avoid this error? What is the safest way?
Regards,