We are trying to index PDFs and other documents with Solr Cell/Tika. So far, our crawler downloads the documents and posts them to Solr Cell. This works: the documents get indexed and some fields are filled.
Our crawler is written in Perl. We prepare the following params to post to Solr:

    my $params = "&literal.id=$url";
    $params .= "&lowernames=true";
    $params .= "&fmap.content=body";
    $params .= "&fmap.keywords=_stichwort";
    $params .= "&fmap.description=_kurzbeschreibung";
    $params .= "&fmap.subject=_kurzbeschreibung";
    $params .= "&fmap.author=_autor";
    $params .= "&literal.__source=extern";
    $params .= "&literal.__doctype=$doctype";
    $params .= "&literal.__mikronav=extern";
    $params .= "&literal.__intern=0";
    $params .= "&literal.visiblePath=$url";
    $params .= "&literal._dokumententyp=Extern";

All the fields we set with literal are stored in our index, all but one: visiblePath is not stored. We have tried this with some other fields that are defined in schema.xml. We can set fields with literal only if they begin with an underscore; fields not starting with an underscore can't be set this way.

visiblePath is defined as

    <field name="visiblePath" type="string" indexed="false" stored="true" />

Is there any configuration option in solrconfig.xml or schema.xml which prevents this? The strange thing is that literal.id works.

In schema.xml we also have

    <dynamicField name="ignored_*" type="ignored" multiValued="true"/>

so that all the "unknown" fields (not defined in schema.xml) which Solr Cell/Tika finds are completely ignored and therefore not saved.

Thanks,
Markus
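
P.S. For reference, here is a minimal sketch of how a parameter string like the one above could be assembled with the values percent-encoded, so that a URL containing characters such as & or = is passed as a single literal value. This is not the crawler's actual code: the helper names escape_param and build_params are hypothetical, the example URL is made up, and the sketch only builds the string (it does not talk to Solr). It is not claimed to explain the visiblePath behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything except the RFC 3986
# "unreserved" characters. The value is treated as a character string
# and converted to UTF-8 bytes first.
sub escape_param {
    my ($value) = @_;
    utf8::encode($value);
    $value =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $value;
}

# Hypothetical helper: build the same parameters as in the post, in the
# same order. The keys are fixed and URL-safe, so only values are escaped.
sub build_params {
    my ($url, $doctype) = @_;
    my @pairs = (
        [ 'literal.id'             => $url ],
        [ 'lowernames'             => 'true' ],
        [ 'fmap.content'           => 'body' ],
        [ 'fmap.keywords'          => '_stichwort' ],
        [ 'fmap.description'       => '_kurzbeschreibung' ],
        [ 'fmap.subject'           => '_kurzbeschreibung' ],
        [ 'fmap.author'            => '_autor' ],
        [ 'literal.__source'       => 'extern' ],
        [ 'literal.__doctype'      => $doctype ],
        [ 'literal.__mikronav'     => 'extern' ],
        [ 'literal.__intern'       => '0' ],
        [ 'literal.visiblePath'    => $url ],
        [ 'literal._dokumententyp' => 'Extern' ],
    );
    return join '', map { '&' . $_->[0] . '=' . escape_param($_->[1]) } @pairs;
}

# Example with a made-up URL that contains a space, '?', '&' and '='.
my $params = build_params('http://example.com/a b.pdf?x=1&y=2', 'pdf');
print "$params\n";
```

With the values escaped, every & left in the string separates two parameters, which makes it easier to see exactly which field names are sent to Solr.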