There is an example of using the mail DIH in the Solr distribution.
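
As a rough sketch of what a minimal data-config.xml for mail could look like: this assumes the root element is dataConfig, as in other DIH configs, and that MailEntityProcessor needs no separate dataSource element because it connects to the mail server itself. The user, password and folders values are just the placeholders from your message.

```xml
<dataConfig>
  <!-- Sketch only: no dataSource element is declared, on the assumption
       that MailEntityProcessor opens the IMAP connection itself. -->
  <document>
    <!-- user, password and folders are placeholders copied from the question -->
    <entity processor="MailEntityProcessor"
            user="someb...@gmail.com"
            password="something"
            host="imap.gmail.com"
            protocol="imaps"
            folders="x,y,z"/>
  </document>
</dataConfig>
```

The example shipped with the distribution is the better reference, so please check it for any further attributes before relying on this.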

[]s,
Lucas Frare Teixeira .·.
- lucas...@gmail.com
- lucastex.com.br
- blog.lucastex.com
- twitter.com/lucastex


On Sun, Nov 8, 2009 at 1:56 PM, Michael Lackhoff <mich...@lackhoff.de> wrote:

> I would like to start using DIH to index some RSS-Feeds and mail folders.
>
> To get started I tried the RSS example from the wiki but, as it is, Solr
> complains about the missing id field. After some experimenting I found
> two ways to fill the id:
>
> - <copyField source="link" dest="id"/> in schema.xml
> This works but isn't very flexible. Perhaps I have other types of
> records with a real id or a multivalued link-field. Then this solution
> would break.
>
> - Changing the id field to type "uuid"
> Again I would like to keep real ids where I have them and not a random
> UUID.
>
> What didn't work but looks like the potentially best solution is to fill
> the id in my data-config by using the link twice:
>  <field column="link"         xpath="/RDF/item/link" />
>  <field column="id"           xpath="/RDF/item/link" />
> This would be a definition just for this single data source but I don't
> get any docs (also no error message). No trace of any inserts whatsoever.
> Is it possible to fill the id that way?
>
> Another question regarding MailEntityProcessor
> I found this example:
> <document>
>   <entity processor="MailEntityProcessor"
>           user="someb...@gmail.com"
>           password="something"
>           host="imap.gmail.com"
>           protocol="imaps"
>           folders = "x,y,z"/>
> </document>
>
> But what is the dataSource (the enclosing tag to document)? That is, what
> would a minimal but complete data-config.xml look like to index mails
> from an IMAP server?
>
> And finally, is it possible to combine the definitions for several
> RSS-Feeds and Mail-accounts into one data-config? Or do I need a
> separate config file and request handler for each of them?
>
> -Michael
>
