Q) What are some typical values for this "content" field (i.e. how
many different words does the content field contain for each document)?

A) They are indexed from Word and PDF documents; the longest is 278 pages
(about 372,000 bytes when indexed into Solr). There are thousands of
different words in each document.

Regards,
Edwin


On 2 September 2015 at 19:45, Yonik Seeley <[email protected]> wrote:

> On Wed, Sep 2, 2015 at 1:19 AM, Zheng Lin Edwin Yeo
> <[email protected]> wrote:
> > The type of field is text_general.
>
> What are some typical values for this "content" field (i.e. how many
> different words does the content field contain for each document)?
>
> -Yonik
>
> > I found that the problem mainly happens in the content field of the
> > collections with rich text documents.
> > It works fine for other files, and also collections indexed with CSV
> > documents, even if the fieldType is text_general.
> >
> > Regards,
> > Edwin
> >
> >
> > On 2 September 2015 at 12:12, Yonik Seeley <[email protected]> wrote:
> >
> >> On Tue, Sep 1, 2015 at 11:51 PM, Zheng Lin Edwin Yeo
> >> <[email protected]> wrote:
> >> > No, I've tested it several times after committing it.
> >>
> >> Hmmm, well something is really wrong for this orders of magnitude
> >> difference.  I've never seen anything like that and we should
> >> definitely try to get to the bottom of it.
> >> What is the type of the field?
> >>
> >> -Yonik
> >>
>
