Never mind -- I found I can just add another fq, so I'm not getting the documents with a termfreq of 0 back, which makes it quick to add the values up on my end.

So the solution is:

collection1/query?q=crawl_id:40&fq=text:%22matched%20text%22&fl=termfreq(text,%27matched%20text%27)&rows=1000000&tv=false

Thanks for your help!

Akos (Aki) Balogh
M: 617-682-0066
Co-Founder, MarketMuse
https://www.MarketMuse.com

On Wed, Feb 4, 2015 at 5:34 PM, Aki Balogh <a...@marketmuse.com> wrote:

> PS - I found that termfreq() actually returns the raw tf, i.e. an integer
> for each document. However, I have to get the request and add them up on my
> end.
>
> Unfortunately totaltermfreq() sums the similarity-modified tf values.
>
> Is there a way to just get the sum of the termfreq() values?
>
> Akos (Aki) Balogh
> M: 617-682-0066
> Co-Founder, MarketMuse
> https://www.MarketMuse.com
>
> On Wed, Feb 4, 2015 at 4:58 PM, Aki Balogh <a...@marketmuse.com> wrote:
>
>> Is there a way to set solr to only return raw tf (i.e. by maybe turning
>> off the DefaultSimilarity), so I could use ttf() to get the sum of raw tf
>> values?
>>
>> Or do I need to parse each tf value, square it and add them up in
>> post-processing?
>>
>> Thx,
>> Aki
>>
>> On Wed, Feb 4, 2015 at 4:39 PM, Ahmet Arslan <iori...@yahoo.com.invalid>
>> wrote:
>>
>>> Hi,
>>>
>>> So you want raw tf. tf method implemented as square root of raw tf. So
>>> you can re-obtain it by reverse operation.
>>> 1.424 * 1.424 = 2.02 = int = 2
>>>
>>> Ahmet
>>>
>>> On Wednesday, February 4, 2015 11:31 PM, Aki Balogh <a...@marketmuse.com>
>>> wrote:
>>> Hi Ahmet,
>>>
>>> Thank you for your idea, very helpful. I can indeed get tf values through
>>> the tf and ttf function queries.
>>>
>>> Since tf uses Similarity, I'm getting back some floats (i.e. "dog occurs
>>> 1.424 times"), when I was expecting ints.
>>> Is there a way to get back ints (simple word count)?
>>>
>>> Thanks,
>>> Aki
>>>
>>> On Wed, Feb 4, 2015 at 3:41 PM, Ahmet Arslan <iori...@yahoo.com.invalid>
>>> wrote:
>>>
>>> > Hi Aki,
>>> >
>>> > How about tf function query?
>>> > https://cwiki.apache.org/confluence/display/solr/Function+Queries
>>> >
>>> > Ahmet
>>> >
>>> > On Wednesday, February 4, 2015 7:59 PM, Aki Balogh <a...@marketmuse.com>
>>> > wrote:
>>> > I'm using solr TermVectorComponent to get term frequencies for specific
>>> > terms in a corpus. I.e. I query for "q=dog" and want to get back term
>>> > frequencies for "dog" in the corpus.
>>> >
>>> > However, when I request term frequencies, I get back ALL term frequencies
>>> > for ALL matching documents, which is generating a massive response and
>>> > wasting I/O.
>>> >
>>> > Instead, I would like to get tf for ONLY the terms that are an exact match
>>> > to the term in my query.
>>> >
>>> > Word count like this seems like it would be a common use case, but I didn't
>>> > see it in the code.
>>> >
>>> > http://grepcode.com/file_/repo1.maven.org/maven2/org.dspace.dependencies.solr/dspace-solr-core/1.4.0.1/org/apache/solr/handler/component/TermVectorComponent.java#78
>>> >
>>> > Is there a way to get this behavior without having to modify the source
>>> > code?
>>> >
>>> > Thanks,
>>> > Aki
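
For anyone finding this thread later: the "add it up on my end" step from the top of this message could look roughly like the Python sketch below. It is only an illustration under a few assumptions that are not in the thread: the base URL is a made-up local Solr address, the response is requested as JSON with wt=json, and the termfreq() value is given the alias "tf" in fl so that it is easy to pick out of each document. The helper names are hypothetical too.

```python
from urllib.parse import urlencode


def build_termfreq_url(base_url, crawl_id, phrase, rows=1000000):
    """Build a query URL shaped like the one in this thread.

    Assumptions (not from the thread): wt=json and the "tf:" alias in fl.
    """
    params = [
        ("q", f"crawl_id:{crawl_id}"),
        # The extra fq keeps only documents that contain the phrase,
        # so no documents with a termfreq of 0 come back.
        ("fq", f'text:"{phrase}"'),
        ("fl", f"tf:termfreq(text,'{phrase}')"),
        ("rows", rows),
        ("tv", "false"),
        ("wt", "json"),
    ]
    return f"{base_url}?{urlencode(params)}"


def sum_termfreq(solr_response, key="tf"):
    """Add up the per-document termfreq values in a parsed JSON response."""
    docs = solr_response.get("response", {}).get("docs", [])
    return sum(int(doc.get(key, 0)) for doc in docs)


# Hand-written sample in the usual Solr JSON layout; not real data.
sample = {"response": {"numFound": 3, "docs": [{"tf": 2}, {"tf": 1}, {"tf": 4}]}}
print(sum_termfreq(sample))

# Hypothetical local endpoint, for illustration only.
print(build_termfreq_url("http://localhost:8983/solr/collection1/query", 40, "matched text"))
```

In real use the URL would be fetched with any HTTP client, the body parsed as JSON, and the resulting dict passed to sum_termfreq().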