Never mind, I found I can just add another fq so that I'm not getting the 0s
back, which makes it quick to add them up on my end.

So the solution is:
collection1/query?q=crawl_id:40&fq=text:%22matched%20text%22&fl=termfreq(text,%27matched%20text%27)&rows=1000000&tv=false
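
For anyone who finds this later, here is a rough Python sketch of the "add it up on my end" step. It only builds the query string and sums the per-document values. The helper names, the wt=json parameter and the sample response are my own assumptions for illustration, and I'm assuming each returned document carries the value under a key equal to the fl expression, so check that against your own response.

```python
from urllib.parse import urlencode


def build_termfreq_query(crawl_id, field, phrase, rows=1000000):
    """Build the query path and return it with the fl expression.

    Hypothetical helper: the fq keeps only documents that contain the
    phrase, so no zero counts come back.
    """
    fl = f"termfreq({field},'{phrase}')"
    params = {
        "q": f"crawl_id:{crawl_id}",
        "fq": f'{field}:"{phrase}"',
        "fl": fl,
        "rows": rows,
        "tv": "false",
        "wt": "json",  # assumption: ask for JSON explicitly
    }
    # urlencode joins the parameters with '&' and percent-encodes the values
    return "collection1/query?" + urlencode(params), fl


def sum_termfreq(response, key):
    """Sum the raw per-document counts found under `key`."""
    docs = response.get("response", {}).get("docs", [])
    return sum(int(doc.get(key, 0)) for doc in docs)


url, key = build_termfreq_query(40, "text", "matched text")

# Made-up response in the assumed shape, not real Solr output.
sample = {"response": {"docs": [{key: 2}, {key: 5}]}}
total = sum_termfreq(sample, key)  # 7 for this sample
```

In practice you would fetch the URL with whatever HTTP client you already use and pass the parsed JSON to sum_termfreq().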

Thanks for your help!


Akos (Aki) Balogh
M: 617-682-0066
Co-Founder, MarketMuse
https://www.MarketMuse.com

On Wed, Feb 4, 2015 at 5:34 PM, Aki Balogh <a...@marketmuse.com> wrote:

> PS - I found that termfreq() actually returns the raw tf, i.e. an integer
> for each document. However, I have to fetch the response and add the values
> up on my end.
>
> Unfortunately totaltermfreq() sums the similarity-modified tf values.
>
> Is there a way to just get the sum of the termfreq() values?
>
> On Wed, Feb 4, 2015 at 4:58 PM, Aki Balogh <a...@marketmuse.com> wrote:
>
>> Is there a way to set solr to only return raw tf (i.e. by maybe turning
>> off the DefaultSimilarity), so I could use ttf() to get the sum of raw tf
>> values?
>>
>> Or do I need to parse each tf value, square it and add them up in
>> post-processing?
>>
>>
>> Thx,
>> Aki
>>
>> On Wed, Feb 4, 2015 at 4:39 PM, Ahmet Arslan <iori...@yahoo.com.invalid>
>> wrote:
>>
>>> Hi,
>>>
>>> So you want raw tf. The tf method is implemented as the square root of
>>> raw tf, so you can recover it with the reverse operation:
>>> 1.424 * 1.424 = 2.03, which rounds to the int 2
>>>
>>> Ahmet
>>>
>>>
>>>
>>>
>>> On Wednesday, February 4, 2015 11:31 PM, Aki Balogh <a...@marketmuse.com>
>>> wrote:
>>> Hi Ahmet,
>>>
>>> Thank you for your idea, very helpful. I can indeed get tf values through
>>> the tf and ttf function queries.
>>>
>>> Since tf uses Similarity, I'm getting back some floats (i.e. "dog occurs
>>> 1.424 times"), when I was expecting ints.
>>> Is there a way to get back ints (simple word count)?
>>>
>>> Thanks,
>>> Aki
>>>
>>>
>>>
>>> On Wed, Feb 4, 2015 at 3:41 PM, Ahmet Arslan <iori...@yahoo.com.invalid>
>>> wrote:
>>>
>>> > Hi Aki,
>>> >
>>> > How about tf function query?
>>> > https://cwiki.apache.org/confluence/display/solr/Function+Queries
>>> >
>>> > Ahmet
>>> >
>>> >
>>> >
>>> > On Wednesday, February 4, 2015 7:59 PM, Aki Balogh <a...@marketmuse.com> wrote:
>>> > I'm using solr TermVectorComponent to get term frequencies for specific
>>> > terms in a corpus. I.e. I query for "q=dog" and want to get back term
>>> > frequencies for "dog" in the corpus.
>>> >
>>> > However, when I request term frequencies, I get back ALL term
>>> > frequencies for ALL matching documents, which is generating a massive
>>> > response and wasting I/O.
>>> >
>>> > Instead, I would like to get tf for ONLY the terms that are an exact
>>> > match to the term in my query.
>>> >
>>> > Word count like this seems like it would be a common use case, but I
>>> > didn't see it in the code.
>>> >
>>> >
>>> > http://grepcode.com/file_/repo1.maven.org/maven2/org.dspace.dependencies.solr/dspace-solr-core/1.4.0.1/org/apache/solr/handler/component/TermVectorComponent.java#78
>>> >
>>> > Is there a way to get this behavior without having to modify the source
>>> > code?
>>> >
>>> > Thanks,
>>> > Aki
>>> >
>>>
>>
>>
>
