Solr & Heatmap & Geotools

2015-05-23 Thread Joseph Obernberger
Hi All - I've been working with GeoTools to build a heat map based on
location data coming back from SolrCloud, using this nifty feature
where you can facet on location
(https://issues.apache.org/jira/browse/SOLR-7005) and generate a raster.
I've been able to take this data, build a GridCoverage2D object, and
.show() it OK - it works nicely (it even displays a nice map), but I don't
know how to apply the HeatmapProcess to it.  What I've tried so far is:


HeatmapProcess process = new HeatmapProcess();
SimpleFeatureCollection sfc = null;
try {
    sfc = FeatureUtilities.wrapGridCoverage(rasterCov);
} catch (TransformException ex) {
    System.out.println("Error: " + ex);
} catch (SchemaException ex) {
    System.out.println("Error: " + ex);
}
if (sfc == null) {
    System.out.println("Bummer");
    System.exit(1);
}

GridCoverage2D cov = process.execute(sfc, // data
        1024,     // radius
        "0",      // weightAttr
        1,        // pixelsPerCell
        bounds,   // outputEnv
        1024,     // outputWidth
        1024,     // outputHeight
        monitor); // monitor
cov.show();

So far all I get is an empty frame.  Any tips, or links on how to use 
HeatmapProcess?  Anyone trying anything like this?

Thanks very much!

-Joe


Multivalued OR query with equal score/rankings when any one value matches

2015-05-23 Thread Troy Collinsworth
I am querying a multivalued String field for multiple values. When only
one value matches, the score is higher for the lower value (890) and lower
for the higher value (931). I swapped the value order and it had no effect,
so it isn't positional. I want the score to be the same irrespective of
which value matched. I also still want the score to be highest when both
values match. The solution has to work for 'n' values, not just two as in
this example.

solr-5.1.0


Indexed data and example scores:

"userIds": ["931","890"] "score": 9.600372
"userIds": ["890"] "score": 2.5523214
"userIds": ["931"] "score": 2.247865

The results for 890 and 931 need to have the same score, since each
matched one query value, so that they will be returned mixed together. I
realize they won't be randomly sorted; however, if they at least have the
same score I can recognize and randomize them before use.

Queries:

"q": "userIds:931 OR userIds:890", "fl": "*,score"

An edismax query using bq shows the same symptoms.

"defType": "edismax", "fl": "*,score", "bq": "userIds:931 OR userIds:890"

Adding boost values to the query will change the order, but still doesn't
give the desired behavior of results with only one value having equal score
and being mixed together in the results.

"defType": "edismax", "fl": "*,score", "bq": "userIds:931^2 OR userIds:890^1"

Also tried quoting the values with no effect.

-Troy


Re: Multivalued OR query with equal score/rankings when any one value matches

2015-05-23 Thread Troy Collinsworth
>
> Thanks, that was it. Being new to this I wouldn't have thought of that.
>
> docfreq for 890 is 12
> docfreq for 931 is 19
>
> I found this post
> 
> on how to disable idf which I will try.
>
> -Troy
>
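To illustrate why the docFreq values above produce different scores, here is a minimal sketch of the idf term used by Lucene's classic TF-IDF similarity (the default in Solr 5.1), idf = 1 + ln(numDocs / (docFreq + 1)). The class name and the numDocs value are assumptions for illustration only; the thread does not state the index size.

```java
// Hypothetical standalone illustration; it mirrors the classic TF-IDF idf
// formula and does not call any Lucene or Solr code.
class IdfExample {
    // idf = 1 + ln(numDocs / (docFreq + 1))
    static double idf(long docFreq, long numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        long numDocs = 1000; // assumed index size, not taken from the thread
        System.out.println("idf for 890 (docFreq 12): " + idf(12, numDocs));
        System.out.println("idf for 931 (docFreq 19): " + idf(19, numDocs));
    }
}
```

Whatever the index size, the rarer value (890) gets the larger idf, which matches the higher score reported for it.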


Re: Multivalued OR query with equal score/rankings when any one value matches

2015-05-23 Thread Yonik Seeley
On Sat, May 23, 2015 at 1:29 PM, Troy Collinsworth wrote:
> While trying to query a multivalued String field for multiple values, when
> any one value matches the score is higher for the lower value and lower for
> the higher. I swapped the value order and it had no effect so it isn't
> positional. I want the score to be the same irrespective of the value
> matched. I also still want the score highest when both values match.

It's a bit cumbersome, but you can make each clause a constant score query.
http://yonik.com/solr/query-syntax/#ConstantScoreQuery

userIds:890^=1 userIds:931^=1
or I think the following should work as well:
userIds:(890^=1 931^=1)

-Yonik
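Since the original question asked for something that works for 'n' values, the second form above could be generated along these lines. This is only a sketch: the class and method names are hypothetical (not part of Solr or SolrJ), and it assumes the values are simple tokens that need no query escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Solr or SolrJ.
class ConstantScoreClauses {
    // Builds field:(v1^=1 v2^=1 ...) so that each matching value
    // contributes the same constant score to the document.
    // Assumes values contain no characters that need escaping.
    static String build(String field, List<String> values) {
        return values.stream()
                .map(v -> v + "^=1")
                .collect(Collectors.joining(" ", field + ":(", ")"));
    }

    public static void main(String[] args) {
        System.out.println(build("userIds", Arrays.asList("890", "931")));
    }
}
```

Documents matching several values should still rank above single-value matches, because the constant clause scores are summed.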


SolrCloud 4.8 - Transaction log size over 1GB

2015-05-23 Thread Vincenzo D'Amore
Hi,

Looking at tlog sizes, I see there are many collections whose transaction
logs take up more than 1GB of space.
The tlogs keep growing, and the code that adds new documents never does a
hard commit.

The question is: must I fix the code that updates the collections, or can I
do a hard commit externally using the Collections API or via the admin
console?

Thanks, any help is appreciated,
Vincenzo


Re: SolrCloud 4.8 - Transaction log size over 1GB

2015-05-23 Thread Shawn Heisey
On 5/23/2015 8:56 PM, Vincenzo D'Amore wrote:
> Looking at tlog sizes, I see there are many collections whose transaction
> logs take up more than 1GB of space.
> The tlogs keep growing, and the code that adds new documents never does a
> hard commit.
>
> The question is: must I fix the code that updates the collections, or can I
> do a hard commit externally using the Collections API or via the admin
> console?

I strongly recommend that you configure autoCommit in your
solrconfig.xml with openSearcher set to false.

I will usually recommend an interval of 300000 (five minutes) but others
recommend an interval of 15000 (15 seconds).  This kind of autoCommit
generally does happen very quickly, but my philosophy is to keep the
impact of any commit as low as possible ... which means doing them as
infrequently as possible.

The tradeoff on the autoCommit interval is the size of each individual
transaction log.  If you are indexing documents very quickly, you
probably want a shorter interval.

The example autoCommit config that you can find on the following wiki
page also has maxDocs ... it's up to you whether you include that part
of the config.

http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup_due_to_the_transaction_log

Thanks,
Shawn
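A minimal sketch of the kind of autoCommit block described above, assuming the five-minute interval. The maxTime value is in milliseconds, the optional maxDocs element is left out, and the block is assumed to sit inside the updateHandler section of solrconfig.xml.

```xml
<!-- Hard commit every five minutes without opening a new searcher.
     The interval is illustrative; shorten it if indexing is very fast. -->
<autoCommit>
  <maxTime>300000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
```

With openSearcher set to false, these commits start a new transaction log but do not make newly indexed documents visible to searches.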



Re: SolrCloud 4.8 - Transaction log size over 1GB

2015-05-23 Thread Vincenzo D'Amore
Thanks Shawn,

Maybe this is a silly question, but I looked around and didn't find an
answer...
Could I update solrconfig.xml for the collection while the instances
are running, or should I restart the cluster/reload the cores?



Re: SolrCloud 4.8 - Transaction log size over 1GB

2015-05-23 Thread Shawn Heisey
On 5/23/2015 9:41 PM, Vincenzo D'Amore wrote:
> Thanks Shawn,
> 
> Maybe this is a silly question, but I looked around and didn't find an
> answer...
> Could I update solrconfig.xml for the collection while the instances
> are running, or should I restart the cluster/reload the cores?

You can upload a new config to zookeeper with the zkcli program while
Solr is running, and nothing will change, at least not immediately.  The
new config will take effect when you reload the collection or restart
all the Solr instances.

Thanks,
Shawn
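For the reload step, a collection can be reloaded through the Collections API RELOAD action. The sketch below only builds the request URL; the class name, base URL, and collection name are assumptions for illustration, and sending the request is left to whatever HTTP client is in use.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of Solr or SolrJ.
class ReloadUrl {
    // Returns the Collections API URL that reloads the named collection.
    static String reloadUrl(String solrBase, String collection) {
        try {
            return solrBase + "/admin/collections?action=RELOAD&name="
                    + URLEncoder.encode(collection, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Base URL and collection name are placeholders.
        System.out.println(reloadUrl("http://localhost:8983/solr", "mycollection"));
    }
}
```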