How to get term frequency count grouped by date?

Dipanjan Kailthya Fri, 13 Jun 2014 06:13:36 -0700

Hi everyone,

I have a simple schema that looks like this:


<uniqueKey>key</uniqueKey>

<fields>
    <field name="_version_" type="int" indexed="true" stored="true" multiValued="false"/>
    <field name="text" type="text" />
    <field name="date" type="int" />
    <field name="key" type="string_key" />
</fields>

The "text" field type is defined as:

<fieldType name="text" class="solr.TextField" indexed="true" stored="true"
           termVectors="true" termPositions="true" termOffsets="true">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
    </analyzer>
</fieldType>


I have indexed some documents based on this schema, for example:

<doc>
    <str name="text">the quick fox jumped over the lazy dog</str>
    <str name="key">302afeec-b8ef-4675-aec7-585a6b9120cf</str>
    <int name="date">20140601</int>
    <int name="_version_">-1987051520</int>
</doc>

Now, I'd like to get the sum of the term frequencies (not document
frequencies) grouped by date for a particular term. For example, if the
term is 'the', and I have 10 documents with date 20140601 all with the same
string as in the example, then I want to see 20 against 'the' (it appears
twice in each sentence) and 10 against 'fox' (it appears once).
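
To make the expected numbers concrete, here is a minimal sketch in plain Python (no Solr involved) of the aggregation I am after. The function name term_frequency_by_date is made up for illustration, and the whitespace split is only an assumed stand-in for what StandardTokenizer does on this simple sentence:

```python
from collections import defaultdict

def term_frequency_by_date(docs, term):
    """Sum the occurrences of `term` over all documents, grouped by date."""
    totals = defaultdict(int)
    for doc in docs:
        # Assumption: a whitespace split approximates the tokenizer here.
        # No lowercasing is applied, since the schema defines no filters.
        tokens = doc["text"].split()
        totals[doc["date"]] += tokens.count(term)
    return dict(totals)

# Ten identical documents, all dated 20140601, as in the example above.
docs = [
    {"text": "the quick fox jumped over the lazy dog", "date": 20140601}
    for _ in range(10)
]

print(term_frequency_by_date(docs, "the"))  # {20140601: 20}
print(term_frequency_by_date(docs, "fox"))  # {20140601: 10}
```

This is the per-date total I would like Solr to return, rather than the document count of 10 for both terms.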

How do I form a query to do this? So far I've tried these approaches:

1. Faceting on the date field: I seem to get only document frequencies:
http://localhost:8983/solr/select?q=text:the&facet=true&facet.field=date&rows=0

2. Using the term vector component or termfreq(): with either one I was
only able to get term frequency counts on a per-document basis.

Is there a way to combine these two approaches, or another way to get the
term frequencies aggregated per day?
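
For reference, the only fallback I can think of is to do the summing on the client. The following is a rough sketch, not something I know to be the recommended way: it requests termfreq() as a pseudo-field per document and adds the values up per date. The helper names, the "tf" alias, the rows value, and the hand-written sample response are all assumptions for illustration; a real index would need paging through all matching documents.

```python
from collections import defaultdict
from urllib.parse import urlencode

def build_query_url(term, base="http://localhost:8983/solr/select", rows=1000):
    """Build a select URL that returns date and termfreq() for each match."""
    params = {
        "q": f"text:{term}",
        # "tf" is an arbitrary alias chosen for the function-query pseudo-field.
        "fl": f"date,tf:termfreq(text,'{term}')",
        "rows": rows,  # assumed page size; large result sets need paging
        "wt": "json",
    }
    return base + "?" + urlencode(params)

def sum_tf_by_date(response):
    """Sum the per-document 'tf' values of a parsed JSON response by date."""
    totals = defaultdict(int)
    for doc in response["response"]["docs"]:
        totals[doc["date"]] += doc["tf"]
    return dict(totals)

# Hand-written sample shaped like a Solr JSON response, not real output.
sample = {"response": {"docs": [
    {"date": 20140601, "tf": 2},
    {"date": 20140601, "tf": 2},
    {"date": 20140602, "tf": 1},
]}}

print(build_query_url("the"))
print(sum_tf_by_date(sample))  # {20140601: 4, 20140602: 1}
```

This pulls every matching document back to the client, which is what I would like to avoid if Solr can do the aggregation itself.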

Thanks in advance for your help.

-Dipanjan
