Thank you very much for your answer, Erick.
My apologies for the previous email; my English is not very good and I'm
new to mailing lists.
The problem is that I'm indexing emails through the Data Import Handler,
reading a Gmail account over IMAPS, so that I can search the email list in
the future. Only some of the emails are indexed, and I can't find out why
the rest are not.
Below is my DIH configuration.
<dataConfig>
<document>
<entity
name="gmail"
processor="MailEntityProcessor"
transformer="LogTransformer"
user="[email protected]"
password="password"
host="imap.gmail.com"
protocol="imaps"
fetchMailsSince="2010-01-01 00:00:00"
folders="inbox"
deltaFetch="false"
processAttachement="false"
batchSize="100"
fetchSize="1024"
recurse="true" />
</document>
</dataConfig>
All of my emails are dated after “2010-01-01 00:00:00”.
I ran a full import and no errors were reported, but the status shows that
28 documents were added, while the console reports 35 messages.
Below are the status screen first, and then part of the console output.
Status:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>
<lst name="initArgs">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</lst>
<str name="command">status</str>
<str name="status">idle</str>
<str name="importResponse"/>
<lst name="statusMessages">
<str name="Total Requests made to DataSource">0</str>
<str name="Total Rows Fetched">28</str>
<str name="Total Documents Skipped">0</str>
<str name="Full Dump Started">2011-03-22 15:55:12</str>
<str name="">
Indexing completed. Added/Updated: 28 documents. Deleted 0 documents.
</str>
<str name="Committed">2011-03-22 15:55:20</str>
<str name="Optimized">2011-03-22 15:55:20</str>
<str name="Total Documents Processed">28</str>
<str name="Time taken ">0:0:8.520</str>
</lst>
<str name="WARNING">
This response format is experimental. It is likely to change in the future.
</str>
</response>
Console output:
“…
Mar 22, 2011 3:55:14 PM org.apache.solr.handler.dataimport.MailEntityProcessor connectToMailBox
INFO: Connected to mailbox
Mar 22, 2011 3:55:15 PM org.apache.solr.handler.dataimport.MailEntityProcessor$FolderIterator next
INFO: Opened folder : inbox
Mar 22, 2011 3:55:15 PM org.apache.solr.handler.dataimport.MailEntityProcessor$FolderIterator next
INFO: Added its children to list :
Mar 22, 2011 3:55:15 PM org.apache.solr.handler.dataimport.MailEntityProcessor$FolderIterator next
INFO: NO children :
Mar 22, 2011 3:55:16 PM org.apache.solr.handler.dataimport.MailEntityProcessor$MessageIterator <init>
INFO: Total messages : 35
Mar 22, 2011 3:55:16 PM org.apache.solr.handler.dataimport.MailEntityProcessor$MessageIterator <init>
INFO: Search criteria applied. Batching disabled
Mar 22, 2011 3:55:19 PM org.apache.solr.handler.dataimport.DocBuilder finish
INFO: Import completed successfully
…”
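To put a number on the gap between the two outputs above, here is a rough sketch that reads the status response and the console log and subtracts one count from the other. The function names (`status_counts`, `total_messages`, `missing_messages`) and the trimmed sample strings are my own and are not part of Solr; the script only measures the difference, it does not explain it.

```python
import re
import xml.etree.ElementTree as ET


def status_counts(status_xml: str) -> dict:
    """Collect the purely numeric entries of <lst name="statusMessages">."""
    root = ET.fromstring(status_xml)
    counts = {}
    for lst in root.iter("lst"):
        if lst.get("name") != "statusMessages":
            continue
        for entry in lst.findall("str"):
            text = (entry.text or "").strip()
            if text.isdigit():  # skips dates and the free-text summary
                counts[(entry.get("name") or "").strip()] = int(text)
    return counts


def total_messages(log_text: str) -> int:
    """Sum every 'Total messages : N' line found in the console log."""
    return sum(int(n) for n in re.findall(r"Total messages\s*:\s*(\d+)", log_text))


def missing_messages(status_xml: str, log_text: str) -> int:
    """Messages seen in the mailbox minus documents processed by the import."""
    processed = status_counts(status_xml).get("Total Documents Processed", 0)
    return total_messages(log_text) - processed


# Trimmed copies of the status response and log shown above (assumed samples).
STATUS_SAMPLE = """<response>
<lst name="statusMessages">
<str name="Total Rows Fetched">28</str>
<str name="Total Documents Skipped">0</str>
<str name="Total Documents Processed">28</str>
</lst>
</response>"""

LOG_SAMPLE = "INFO: Total messages : 35\nINFO: Search criteria applied. Batching disabled\n"

print(missing_messages(STATUS_SAMPLE, LOG_SAMPLE))  # prints 7
```

With the numbers from my import this gives 7 messages that were counted in the inbox but not indexed.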
Regards,
Matias.
2011/3/22 Erick Erickson <[email protected]>
> Not unless you provide a lot more data. Have you
> inspected the Solr logs and seen any anomalies?
>
> Please review:
> http://wiki.apache.org/solr/UsingMailingLists
>
> Best
> Erick
>
> On Mon, Mar 21, 2011 at 3:56 PM, Matias Alonso <[email protected]>
> wrote:
> > Hi,
> >
> >
> > I’m using the Data Import Handler to index emails.
> >
> > The problem is that not all the emails were indexed when I did a full
> > import.
> >
> > Does anyone have any idea?
> >
> >
> > Regards,
> >
> > --
> > Matias.
> >
>