Re: Regarding google maps polyline to use IsWithin(POLYGON(())) in solr

2016-03-15 Thread David Smiley
Hi Pradeep, Are you seeing an error when it doesn't work? I believe a shape overlapping itself will cause an error from JTS. If you do see that, then you can ask Spatial4j (used by Lucene/Solr) to attempt to deal with it in a number of ways. See "validationRule": https://locationtech.github.io/

Re: Regarding google maps polyline to use IsWithin(POLYGON(())) in solr

2016-03-19 Thread David Smiley
JTS doesn't have any vertex limit on geometries, so I don't know why your query isn't working. On Wed, Mar 16, 2016 at 1:58 AM Pradeep Chandra < pradeepchandra@gmail.com> wrote: > Hi Sir, > > Let me give some clarification on IsWithin(POLYGON(())) query...It is not > giving any result for

Re: Seasonal searches in SOLR 5.x

2016-03-22 Thread David Smiley
Hi, I suggest having a "season" field (or whatever you might want to call it) using DateRangeField but simply use a nominal year value. So basically all durations would be within this nominal year. For some docs that span new-years, this might mean 2 durations and that's okay. Also it's okay if
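
A minimal sketch of the nominal-year idea above, assuming the nominal year is 2000 (any fixed year would do; the thread does not name one) and that each season is given as (month, day) start and end pairs. The helper name season_ranges is hypothetical. It emits range strings in the "[start TO end]" form that DateRangeField accepts, and splits a season that spans new year into two durations:

```python
from datetime import date

# Assumption: the thread does not name a nominal year; 2000 is used here
# because it is a leap year, so Feb 29 remains a valid date.
NOMINAL_YEAR = 2000


def season_ranges(start_md, end_md, year=NOMINAL_YEAR):
    """Return DateRangeField-style range strings for a season.

    start_md and end_md are (month, day) tuples. A season that wraps
    past Dec 31 is split into two ranges inside the nominal year.
    """
    start = date(year, *start_md)
    end = date(year, *end_md)
    if start <= end:
        return [f"[{start.isoformat()} TO {end.isoformat()}]"]
    # The season spans new year: index two durations for the same doc.
    return [
        f"[{start.isoformat()} TO {year}-12-31]",
        f"[{year}-01-01 TO {end.isoformat()}]",
    ]
```

A document whose season runs from December 1 to February 28 would then carry two values in a multiValued "season" field, one per returned range.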

Re: Facet heatmaps: cluster coordinates based on average position of docs

2016-04-19 Thread David Smiley
Hi Anton, Perhaps you should request a more detailed / high-res heatmap, and then work with that, perhaps using some clustering technique? I confess I don't work on the UI end of things these days. p.s. I'm on vacation this week, so I won't respond quickly ~ David On Thu, Apr 7, 2016 at 3:43 P

Re: issues doing a spatial query

2016-04-28 Thread David Smiley
Hi. This makes sense to me. The point 49.8,-97.1 is in your query box. The box is lower-left to upper-right, so your box is actually an almost world-wrapping one grabbing all longitudes except -93 to -92. Maybe you mean to switch your left & right. On Sun, Apr 24, 2016 at 8:03 PM GW wrote: >
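
To illustrate why swapping left and right matters, here is a small sketch of the longitude test for a lower-left to upper-right box. The function name lon_in_box is hypothetical and the logic is a simplification, not Solr's actual implementation; the longitudes -92, -93 and -97.1 are the ones mentioned in the reply:

```python
def lon_in_box(lon, left, right):
    """True if lon lies in a box whose west edge is left and east edge is right."""
    if left <= right:
        # Ordinary box: a simple interval test.
        return left <= lon <= right
    # left > right: the box runs east from `left`, crosses the dateline,
    # and ends at `right`, so it covers everything except (right, left).
    return lon >= left or lon <= right


# Box written with left=-92 and right=-93 wraps almost the whole world.
wrapping_hit = lon_in_box(-97.1, left=-92, right=-93)
# Swapping the edges gives the narrow box that was probably intended.
narrow_hit = lon_in_box(-97.1, left=-93, right=-92)
```

With the original edge order the point at longitude -97.1 matches; with the edges swapped it does not.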

Re: Solr - index polygons from csv

2016-04-28 Thread David Smiley
Hi. To use polygons, you need to add JTS, otherwise you get an unsupported shape error. See https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide it involves not only adding a JTS lib to your classpath (ideal spot is WEB-INF/lib ) but also adding a spatialContextFactory att

Re: relaxed vs. improved validation in solr.TrieDateField

2016-05-06 Thread David Smiley
Sorry to hear that, Uwe Reh. If this is just in your input/index data, then this could be handled with a URP, maybe even an existing URP. See ParseDateFieldUpdateProcessorFactory which uses the Joda-time API. I am not sure if that will work; I'm a little doubtful in fact, since Solr now uses the J

Re: Boosting by calculated distance buckets

2015-02-14 Thread David Smiley
: &boost=recip(geodist(),1,20,20) ~ David Smiley Freelance Apache Lucene/Solr Search Consultant/Developer http://www.linkedin.com/in/davidwsmiley sraav wrote > I hit a block when I ran into a use case where I had to boost on ranges of > distances calculated at query time. This is the
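
For intuition about the boost above: Solr's recip(x,m,a,b) function evaluates a/(m*x+b), and geodist() returns a distance in kilometers. The sketch below just reproduces that arithmetic outside Solr; the sample distances are illustrative:

```python
def recip(x, m, a, b):
    """Mirror of Solr's recip function query: a / (m*x + b)."""
    return a / (m * x + b)


def distance_boost(distance_km):
    """Boost produced by recip(geodist(),1,20,20) for a given distance."""
    return recip(distance_km, 1, 20, 20)


# The boost decays smoothly from 1.0 at the query point.
boosts = {d: distance_boost(d) for d in (0, 20, 180)}
```

So a document at the query point gets a multiplier of 1.0, one 20 km away gets 0.5, and one 180 km away gets 0.1.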

Re: Boosting by calculated distance buckets

2015-02-17 Thread David Smiley
Raav, You may need to actually subscribe to the solr-user list. Nabble seems to not be working too well. p.s. I’m on vacation this week so I can’t be very responsive. First of all... it's not clear you actually want to *boost* (since you seem to not care about the relevancy score), it seems you wa

Re: Solr join + Boost in single query

2015-03-03 Thread David Smiley
No, not without writing something custom anyway. It'd be difficult to make it fast if there's a lot of documents to join on. sraav wrote > David, > > Is it possible to write a query to join two cores and either bring back > data from the two cores or to boost on the data coming back from either

Re: Price Range Faceting Based on Date Constraints

2015-05-21 Thread David Smiley
Another more modern option, very related to this, is to use DateRangeField in 5.0. You have full 64 bit precision. More info is in the Solr Ref Guide. If Alessandro sticks with RPT, then the best reference to give is this: http://wiki.apache.org/solr/SpatialForTimeDurations ~ David https://www

Re: Highlighting phone numbers

2016-05-18 Thread David Smiley
Perhaps an easy thing to try is to see if the FastVectorHighlighter yields any different results. There are some nuances to the highlighters -- it might. Failing that, this is likely due to your analysis chain, and where exactly the offsets point to, which you can see/debug in Solr's analysis screen.

Re: Facet heatmaps: cluster coordinates based on average position of docs

2016-05-18 Thread David Smiley
t to add average positions of documents in cell? > I think I've seen hand-rolled heatmap capabilities added to Solr (i.e. no custom Solr hacking) that went about it kinda like that. stats.facet on some geohash (or similar), then average lat & average lon. ~ David > 2016-04-20 4:2
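
A client-side sketch of the "average lat & average lon per cell" idea. It swaps in a plain fixed-size degree grid for the geohash the reply mentions, purely to stay self-contained; the function name and the cell_deg parameter are hypothetical:

```python
import math
from collections import defaultdict


def average_positions(points, cell_deg=1.0):
    """Group (lat, lon) points into grid cells and average each cell.

    Returns {cell_key: (avg_lat, avg_lon, count)}. The grid is a simple
    stand-in for a geohash prefix, not the same cell layout.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cell = sums[key]
        cell[0] += lat
        cell[1] += lon
        cell[2] += 1
    return {
        key: (lat_sum / n, lon_sum / n, n)
        for key, (lat_sum, lon_sum, n) in sums.items()
    }
```

Inside Solr the same roll-up would come from stats on a cell field, as the reply suggests; this version only shows the arithmetic on already-fetched points.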

Re: Issues with coordinates in Solr during updating of fields

2016-06-13 Thread David Smiley
Zheng, There are a few Solr FieldTypes that are basically composite fields -- a virtual field of other fields. AFAIK they are all spatial related. You don't necessarily need to pay attention to the fact that gps_1_coordinate exists under the hood unless you wish to customize the options on that f

Re: error rendering solr spatial in geoserver

2016-06-29 Thread David Smiley
For polygons in 6.0 you need to set spatialContextFactory="org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory" -- see https://cwiki.apache.org/confluence/display/solr/Spatial+Search and the example. And of course as you probably already know, put the JTS jar on Solr's classpath. What

Re: error rendering solr spatial in geoserver

2016-07-01 Thread David Smiley
r 5 node reloads the configuration. > > --Ere > > 30.6.2016, 3.46, David Smiley kirjoitti: > > For polygons in 6.0 you need to set > > > spatialContextFactory="org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory" > > -- see > > https://cwiki.apache

Re: error indexing spatial

2016-07-25 Thread David Smiley
Hi tig. Most likely, you didn't repeat the first point as the last. Even though it's redundant, nonetheless this is what WKT (and some other spatial formats) calls for. ~ David On Wed, Jul 20, 2016 at 10:13 PM tkg_cangkul wrote: > hi i try to indexing spatial format to solr 5.5.0 but i've got
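
A small sketch of the ring-closing rule: when building WKT from a list of vertices, repeat the first point as the last if the ring is not already closed. The helper name polygon_wkt is hypothetical, and the points are assumed to be (x, y) pairs, that is (lon, lat):

```python
def polygon_wkt(points):
    """Build a single-ring WKT POLYGON, closing the ring if needed.

    points is a sequence of (x, y) pairs, i.e. (lon, lat).
    """
    ring = list(points)
    if len(ring) < 3:
        raise ValueError("a polygon ring needs at least three distinct points")
    if ring[0] != ring[-1]:
        # WKT requires the first vertex to be repeated as the last one.
        ring.append(ring[0])
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"
```

Passing a ring that is already closed leaves it unchanged, so the helper is safe to apply to every shape before indexing.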

Re: Need Help Resolving Unknown Shape Definition Error

2016-08-15 Thread David Smiley
Hello Jennifer, The spatial documentation is largely this page: https://cwiki.apache.org/confluence/display/solr/Spatial+Search (however note the online version is always for the latest Solr release. You can download a PDF versioned against your Solr version). To do polygon searches, you both nee

Re: Sorting on DateRangeField?

2016-09-09 Thread David Smiley
Hi Alex, DateRangeField extends some spatial stuff, which has that error message in it, not in DateRangeField proper. You cannot sort on a DateRangeField. If you want to... try adding either one plain docValues field if you just have date instances, or a pair of them to hold a min & max and pick

Re: request SOLR - spatial field with Intersect and Contains functions

2016-09-19 Thread David Smiley
Hi Leo, You should use two spatial fields for this -- one is for an indexed Box/Envelope, and another for an indexed LineString. The indexed box should use either BBoxField or RptWithGeometrySpatialField, and the LineString field should use RptWithGeometrySpatialField. If you have an older inst

Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread David Smiley
It should, I think... what happens? Can you ascertain the nature of the results? ~ David On Tue, Sep 20, 2016 at 5:35 AM Sandeep Khanzode wrote: > For Solr 6.1.0 > This works .. -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z > > This works .. {!field f=schedule op=Contains}[2016-08-26T12

Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread David Smiley
": "json", "_": > "1474373612202" } }, "error": { "msg": "Invalid Date in Date Math > String:'[2016-08-26T12:00:12Z'", "code": 400 }} > SRK > > On Tuesday, September 20, 2016 5:34 PM, David Smiley

Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread David Smiley
ch up. > Thanks a lot ... SRK > > Show original message On Tuesday, September 20, 2016 5:54 PM, David > Smiley wrote: > > > OH! Ok the moment the query no longer starts with "{!", the query is > parsed by defType (for 'q') and will default to

Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread David Smiley
ps://cwiki.apache.org/confluence/display/solr/Working+with+Dates "More DateRangeField Details" mentions "op". {!lucene df=dateRange op=Contains}... would also work. I don't know of any other local-param used in this way. On Tue, Sep 20, 2016 at 11:21 PM David Smiley

Re: Migrating to Solr 6.1.0 from 5.5.0

2016-09-29 Thread David Smiley
Arjun, Your input is a POLYGON -- as seen in the error message. The "Try JTS" was hopefully a clue -- on https://cwiki.apache.org/confluence/display/solr/Spatial+Search search for "JTS" and you should see how to set the spatialContextFactory to JTS, and a mention of needing JTS jar. I'll try and

Re: Heatmap in JSON facet API

2016-11-01 Thread David Smiley
I plan on adding this in the near future... hopefully for Solr 6.4. On Mon, Oct 31, 2016 at 7:06 AM Никита Веневитин wrote: > I've built query as described in Heatmap Faceting (https://cwiki.apache.org/confluence/x/ZYDxAQ), > but I would like to get same results using JSON facet API > > 2016-10-

How-To: Secure Solr by IP Address

2016-11-04 Thread David Smiley
I was just researching how to secure Solr by IP address and I finally figured it out. Perhaps this might go in the ref guide but I'd like to share it here anyhow. The scenario is where only "localhost" should have full unfettered access to Solr, whereas everyone else (notably web clients) can onl

Re: How-To: Secure Solr by IP Address

2016-11-04 Thread David Smiley
Not to knock the other suggestions, but a benefit to securing Jetty like this is that *everyone* can do this approach. On Fri, Nov 4, 2016 at 9:54 AM john saylor wrote: > hi > > any firewall worth it's name should be able to do this. in fact, that is > one of several things that a firewall was d

Re: Highlighter not working on some documents

2017-06-11 Thread David Smiley
Probably the most common reason is the default hl.maxAnalyzedChars -- thus your highlightable text might not be in the first 51200 chars of text. The first Solr release with the unified highlighter had an even lower default of 10k chars. On Fri, Jun 9, 2017 at 9:58 PM Phil Scadden wrote: > Trie
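
A rough way to test this theory on a stored document is to check whether the first occurrence of the term falls inside the analyzed window. The sketch below uses a naive case-insensitive substring search rather than the field's real analysis chain, so treat it only as a first approximation; the function name is hypothetical, and 51200 is the default quoted above:

```python
DEFAULT_MAX_ANALYZED_CHARS = 51200  # default for hl.maxAnalyzedChars per the reply


def match_within_analyzed_window(text, term, max_chars=DEFAULT_MAX_ANALYZED_CHARS):
    """True if the first occurrence of term ends within the first max_chars.

    Naive substring check; the real highlighter works on analyzed tokens.
    """
    pos = text.lower().find(term.lower())
    return pos != -1 and pos + len(term) <= max_chars
```

If this returns False for a document that fails to highlight, raising hl.maxAnalyzedChars on the request is the obvious next experiment.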

Re: Issue with highlighter

2017-06-14 Thread David Smiley
> Beware of NOT plus OR in a search. That will certainly produce no highlights. (eg test -results when default op is OR) Seems like a bug to me; the default operator shouldn't matter in that case I think since there is only one clause that has no BooleanQuery.Occur operator and thus the OR/AND sho

Re: Polygon search query working but NOT Multipolygon

2017-06-28 Thread David Smiley
Hi Puneeta, So what does your field type definition look like? I'd imagine you're using RptWithGeometrySpatialField. And what is your Solr version? BTW note the settings here https://locationtech.github.io/spatial4j/apidocs/org/locationtech/spatial4j/context/jts/JtsSpatialContextFactory.html

Re: Polygon search query working but NOT Multipolygon

2017-06-28 Thread David Smiley
I suggest using RptWithGeometry field, and with that change remove distErrPct and maxDistErr. See the ref guide, and note the geometry cache option. BTW spatialContextFactory can simply be "jts". If this fixes the issue, then the issue was related to grid approximation. BTW you never quite said

Re: Solr 5.5 - spatial intersects query returns results outside of search box

2017-06-28 Thread David Smiley
> On Jun 27, 2017, at 3:28 AM, Leila Gonzales wrote: > > { > >"id": "5230", > >"location_geo": > ["ENVELOPE(-75.0,-75.939723,39.3597224,38.289722)"] > > } This is an unusual rectangle. Remember this is minX, maxX, maxY, minY. Thus this rectangl
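
The kind of pre-indexing check discussed in this thread can be sketched as follows, relying only on the ENVELOPE(minX, maxX, maxY, minY) argument order stated above. The function name check_envelope is hypothetical, and whether a minX greater than maxX is a mistake or an intentional dateline crossing depends on the data:

```python
import re

_ENVELOPE = re.compile(
    r"\s*ENVELOPE\(\s*([^,]+),\s*([^,]+),\s*([^,]+),\s*([^)]+)\)\s*"
)


def check_envelope(wkt):
    """Return a list of warnings for a suspicious ENVELOPE string.

    Argument order is minX, maxX, maxY, minY.
    """
    match = _ENVELOPE.fullmatch(wkt)
    if match is None:
        raise ValueError(f"not an ENVELOPE: {wkt!r}")
    min_x, max_x, max_y, min_y = (float(g) for g in match.groups())
    warnings = []
    if min_x > max_x:
        warnings.append("minX > maxX: rectangle wraps around the dateline")
    if min_y > max_y:
        warnings.append("minY > maxY: latitudes are swapped")
    return warnings
```

Run against the value quoted in the thread, this flags the longitude order, since -75.0 is greater than -75.939723.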

Re: Solr 5.5 - spatial intersects query returns results outside of search box

2017-06-28 Thread David Smiley
ecking in my > Solr indexing script to trap for this type of coordinate mismatch. > > -Original Message----- > From: David Smiley [mailto:david.w.smi...@gmail.com] > Sent: Wednesday, June 28, 2017 8:21 AM > To: solr-user@lucene.apache.org > Subject: Re: Solr 5.5 - sp

Re: Polygon search query working but NOT Multipolygon

2017-06-28 Thread David Smiley
https://lucene.apache.org/solr/guide/6_6/spatial-search.html#SpatialSearch-RptWithGeometrySpatialField > On Jun 28, 2017, at 11:32 AM, puneeta wrote: > > Hi David, > I am sorry ,I did not

Re: Spatial Search based on the amount of docs, not the distance

2017-06-28 Thread David Smiley
Deniz didn't mention document-to-document distance sort but he/she didn't say it wasn't that case either. Anyway, FYI at the Lucene level with LatLonPoint there is some sophisticated BKD search code to efficiently return the top N distance ordered documents (where you supply N). Although as f

Re: Polygon search query working but NOT Multipolygon

2017-06-28 Thread David Smiley
This polygon is fairly rectangular with one side having a ton of points. Nonetheless the query point is clearly far apart from it (it's much lower: smaller 'y' dimension). On Wed, Jun 28, 2017 at 10:17 PM puneeta wrote: > Hi David, > Actually my polygon had too many coordinates, so i just omit

Re: Not highlighting "and" and "or"?

2017-06-28 Thread David Smiley
Hi Walter, No they are not. Does debug=query show that these words are in your parsed query? On Wed, Jun 28, 2017 at 5:13 PM Walter Underwood wrote: > Is there some special casing in the highlighter to skip query syntax > words? The words “and” and “or” don’t get highlighted. > > This is in 6.5

Re: Issue: Hit Highlighting Working Inconsistently in Solr 6.6

2017-07-14 Thread David Smiley
Does hl.method=unified help any? Perhaps you need to set hl.fl? or hl.requireFieldMatch=false? (although it should default to false already) On Fri, Jul 14, 2017 at 6:52 PM Vikram Oberoi wrote: > Hi! > > Just wanted to close the loop here. > > I'm pretty sure this has something to do with the

Re: The unified highlighter html escaping. Seems rather extreme...

2017-07-20 Thread David Smiley
The escaping does appear excessive. Please file a bug to the Lucene project in Apache JIRA. On Fri, May 26, 2017 at 11:26 AM Michael Joyner wrote: > Isn't the unified html escaper a rather bit extreme in it's escaping? > > It makes it hard to deal with for simple post-processing. > > The origin

Re: Spatial search with arbitrary rectangle?

2017-08-29 Thread David Smiley
Hi, The "rectangular area" refers to a hypothetical map UI. In this scenario, the UI ought to communicate the lat-lon of each corner. The geofilt and bbox query parsers don't handle that; they only take a point and distance. RE projections: You may or may not need to care depending on exactly w

Re: Sorting by distance resources with WKT polygon data

2017-09-19 Thread David Smiley
Hello, Sorry for the belated response. Solr only supports sorting from point or rectangles in the index. For rectangles use BBoxField. For points, ideally use the new LatLonPointSpatialField; failing that use LatLonType. You can use RPT for point data but I don't recommend sorting with it; use

Re: Solr Spatial Query Problem Hk.

2017-10-04 Thread David Smiley
Hi, Firstly, if Solr returns an error referencing an exception then you can look in Solr's logs for the stack trace, which helps debugging problems a ton (at least for Solr devs). I suspect that the problem here is that your schema might have a dynamic field where *coordinates is defined to be a

Re: Retrieve DocIdSet from Query in lucene 5.x

2017-10-24 Thread David Smiley
See SolrIndexSearcher.getDocSet. It may not be identical to what you want but following what it does on through to DocSetUtil.createDocSet may be enlightening. On Fri, Oct 20, 2017 at 5:10 PM Jamie Johnson wrote: > I am trying to migrate some old code that used to retrieve DocIdSets from > filt

Re: Sum area polygon solr

2017-11-01 Thread David Smiley
Hi, Ah, no -- sorry. If you want to roll up your sleeves and write a Solr plugin (a ValueSource in this case, perhaps) then you could lookup the index polygon and then call out to JTS to compute the intersection and then ask it for the area. But that's going to be a very heavyweight computation

Re: Search opening hours

2016-11-24 Thread David Smiley
I just saw this conversation now. I didn't read every word but I have to ask immediately: does DateRangeField address your needs? https://cwiki.apache.org/confluence/display/solr/Working+with+Dates It was introduced in 5.0. On Wed, Nov 16, 2016 at 4:59 AM O. Klein wrote: > Above implementation

Re: Search opening hours

2016-11-28 Thread David Smiley
Let's say you wanted to do ranges over some integer. Simply convert those integers to dates, such as java.time.Instant.ofEpochSecond(myInteger).toString(). It's more efficient to convert to seconds (as in this example) as a base instead of milliseconds because the internal date oriented tree has 1000
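
For indexing clients that are not written in Java, the same integer-to-date conversion can be sketched in Python. The function name int_to_solr_date is hypothetical; it treats the integer as seconds since the epoch and prints the UTC ISO-8601 form Solr expects for whole seconds:

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def int_to_solr_date(value):
    """Map an integer to a UTC timestamp string, treating it as epoch seconds."""
    moment = _EPOCH + timedelta(seconds=value)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def int_range_to_date_range(low, high):
    """Express an integer interval as a DateRangeField-style range string."""
    return f"[{int_to_solr_date(low)} TO {int_to_solr_date(high)}]"
```

An integer interval such as 10 to 20 then becomes a ten-second date range starting at 1970-01-01T00:00:10Z.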

Re: How to identify documents failed in a batch request?

2016-12-17 Thread David Smiley
If you enable the "TolerantUpdateProcessor" Solr-side, you can add documents in bulk allowing some to fail and know which did: http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/update/processor/TolerantUpdateProcessorFactory.html On Sat, Dec 17, 2016 at 5:05 PM S G wrote: > Hi, > >

Re: Solr 6.4 new SynonymGraphFilter help for multi-word synonyms

2017-02-03 Thread David Smiley
Solr _does_ have a query parser that doesn't suffer from this problem -- SimpleQParser chosen as the string "simple". https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-SimpleQueryParser In this case, see the "WHITESPACE" operator feature which can be toggled. Configure to

Re: Boolean expression for spatial query

2017-03-02 Thread David Smiley
I recommend the MULTIPOINT approach. BTW if you go the route of multiple OR'ed sub-clauses, I recommend avoiding the _query_ syntax which predates Solr 4.x's (4.2?) ability to embed fully the sub-clauses more naturally; though you need to beware of the gotcha of needing to add a leading space. If

Re: Error with polygon search

2017-03-21 Thread David Smiley
Hello Hank, The online version of the reference guide is always for the latest Solr release. I think your configuration would work in the latest release. Prior to Solr 6, the Spatial4J library had a different Java package location: replace "org.locationtech.spatial4j" with "com.spatial4j.core
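
The package rename can be captured in a tiny helper for tooling that generates field type definitions for several Solr versions. The function name is hypothetical; the two package prefixes and the Solr 6 cutoff are the ones given in this reply, and the class name is the JtsSpatialContextFactory quoted elsewhere in this listing:

```python
def jts_context_factory_class(solr_major_version):
    """Fully qualified JtsSpatialContextFactory name for a Solr major version."""
    if solr_major_version >= 6:
        package = "org.locationtech.spatial4j"
    else:
        # Prior to Solr 6 the Spatial4j library lived in a different package.
        package = "com.spatial4j.core"
    return f"{package}.context.jts.JtsSpatialContextFactory"
```

The returned string is what would go into the spatialContextFactory attribute of the field type.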

Re: DateRangeField and Faceting

2017-04-26 Thread David Smiley
Hi Stephen, I agree that it would be nice if the JSON faceting module worked with DateRangeField. Sadly Solr has several faceting engines (classic, JSON Facets, analytics contrib) and there has yet to be any effort to corral them. My sense is that JSON Faceting is where effort should go, and as yo

Re: Spatial Search: can not use FieldCache on a field which is neither indexed nor has doc values: latitudeLongitude_0_coordinate

2017-04-30 Thread David Smiley
Frederick, RE LatLonType: Weird. Is the dynamic field "_coordinate" defined? It should be; ensure it has indexed=true on it. I forget whether indexed needs to be set on that or on the LLT field that refers to it, but to be sure, set it on both. RE LatLonPointSpatialField: You should use this for sure

Re: why MULTILINESTRING can contains polygon in solr spatial search

2017-06-02 Thread David Smiley
Hi, Solr 4.7 is old but is probably okay. Is it easy to try a 6.x version? (note Spatial4j java package names have changed). There's also multiple new pertinent options to your scenario: https://locationtech.github.io/spatial4j/apidocs/org/locationtech/spatial4j/context/jts/JtsSpatialContextFact

Re: Spatial maxDistErr changes

2014-04-02 Thread David Smiley
Good question Steve, You'll have to re-index right off. ~ David p.s. Sorry I didn't reply sooner; I just switched jobs and reconfigured my mailing list subscriptions Steven Bower wrote > If am only indexing point shapes and I want to change the maxDistErr from > 0.09 (1m res) to 0.00045 wi

Re: Rectangle with rotation in Solr

2018-09-13 Thread David Smiley
Polygon is the only way. On Wed, Aug 29, 2018 at 7:46 AM Zahra Aminolroaya wrote: > I have locations with 4-tuple (longitude,latitude) which are like > rectangles > and I want to index them. Solr BBoxField with minX, maxX, maxY and minY, > only considers rectangles which does not have rotations.

Re: Geofilt and distance measurement problems using SpatialRecursivePrefixTreeFieldType field type

2018-12-20 Thread David Smiley
Hi Peter, Use of an RPT field for distance sorting/boosting is to be avoided where possible because it's very inefficient at this specific use-case. Simply use LatLonType for this task, and continue to use RPT for the filter/search use-case. Also I see you putting a space between the coordinates

Re: Geofilt and distance measurement problems using SpatialRecursivePrefixTreeFieldType field type

2018-12-23 Thread David Smiley
our data. In my defence that is > far from obvious in the documentation. > > Thanks again for your help. > > Cheers, > Peter. > > -Original Message- > From: David Smiley [mailto:david.w.smi...@gmail.com] > Sent: 21 December 2018 04:44 > To:

Re: Solr 7.2.1 Stream API throws null pointer execption when used with collapse filter query

2019-01-03 Thread David Smiley
File a JIRA issue please On Thu, Jan 3, 2019 at 5:20 PM gopikannan wrote: > Hi, >I am getting null pointer exception when streaming search is done with > collapse filter query. When debugged the last element in FixedBitSet array > is null. Please let me know if I can raise an issue. > > > ht

Re: regarding debugging solr in eclipse

2019-01-18 Thread David Smiley
On Fri, Jan 18, 2019 at 9:20 AM Scott Stults < sstu...@opensourceconnections.com> wrote: > This blog article might help: > > https://opensourceconnections.com/blog/2013/04/13/how-to-debug-solr-with-eclipse/ > > I don't use Eclipse but I believe things are better now than the instructions given. T

Re: Nested geofilt query for LTR feature

2019-03-20 Thread David Smiley
en call the "geodist" function query. Additionally if you dump the full stack trace here, it might be helpful. Getting a RuntimeException suggests we need to do a better job of wrapping/cleaning errors internally. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in

Re: Range query syntax on a polygon field is returning all documents

2019-03-20 Thread David Smiley
Can you try one other query syntax e.g. bbox query parser to see if the problem goes away? I doubt this is it but you seem to point to the syntax being related. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Mon, Mar 18, 2019 at 12:24 AM M

Re: Slower indexing speed in Solr 8.0.0

2019-04-03 Thread David Smiley
What/where is this benchmark? I recall once Ishan was working with a volunteer to set up something like Lucene has but sadly it was not successful On Wed, Apr 3, 2019 at 6:04 AM Đạt Cao Mạnh wrote: > Hi guys, > > I'm seeing the same problems with Shalin nightly indexing benchmark. This > happen

Re: Slower indexing speed in Solr 8.0.0

2019-04-03 Thread David Smiley
Hi Edwin, I'd like to rule something out. Does your schema define a field "_root_"? If you don't have nested documents then remove it. Its presence adds indexing weight in 8.0 that was not there previously. I'm not sure how much though; I've hoped small bu

Re: Spatial Search using two separate fields for lat and long

2019-04-13 Thread David Smiley
with the lat & lon separately. Your spatial field could be stored=false, and the separate fields would be stored but otherwise not be indexed or have other characteristics that add weight. The result is efficient; no redundancies. ~ David Smiley Apache Lucene/Solr Search Devel

Re: Sorting results for spatial search

2018-02-01 Thread David Smiley
quote: "The problem is that this includes children that DON’T touch the search area in the sum. How can I only include the shapes from the first query above in my sort?" Unless I'm misunderstanding your intent, I think this is a simple matter of adding the spatial filter to the parent join query y

Re: InetAddressPoint support in Solr or other IP type?

2018-03-23 Thread David Smiley
Hi, For IPv4, use TrieIntField with precisionStep=8 For IPv6 https://issues.apache.org/jira/browse/SOLR-6741 There's nothing there yet; you could help out if you are familiar with the codebase. Or you might try something relatively simple involving edge ngrams. ~ David On Thu, Mar 22, 2018 a
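
One possible way to turn a dotted IPv4 address into a value for a signed 32-bit int field is sketched below. The reply does not say which mapping to use; this sketch assumes an order-preserving offset (subtract 2^31) so that numeric range queries still correspond to address ranges. The function names are hypothetical:

```python
import ipaddress

_OFFSET = 2 ** 31  # shifts the unsigned 0..2^32-1 range into signed int range


def ipv4_to_sortable_int(address):
    """Map a dotted IPv4 string to a signed 32-bit int, preserving order."""
    return int(ipaddress.IPv4Address(address)) - _OFFSET


def cidr_to_int_range(cidr):
    """Return (low, high) field values covering every address in a CIDR block."""
    network = ipaddress.IPv4Network(cidr, strict=True)
    low = int(network.network_address) - _OFFSET
    high = int(network.broadcast_address) - _OFFSET
    return low, high
```

A subnet filter then becomes an ordinary range query over the two returned bounds, provided the same mapping was applied at index time.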

Re: InetAddressPoint support in Solr or other IP type?

2018-03-27 Thread David Smiley
something I was > missing since I couldn't find any discussion on this. > > Michael Cooper > > -Original Message- > From: David Smiley [mailto:david.w.smi...@gmail.com] > Sent: Friday, March 23, 2018 5:14 PM > To: solr-user@lucene.apache.org > Subject: Re: InetAdd

Re: Copying a SolrCloud collection to other hosts

2018-03-27 Thread David Smiley
The backup/restore API is intended to address this. https://builds.apache.org/job/Solr-reference-guide-master/javadoc/making-and-restoring-backups.html Erick's advice is good (and I once drafted docs for the same scheme years ago as well), but I consider it dated -- it's what people had to do befo

Re: Copying a SolrCloud collection to other hosts

2018-03-28 Thread David Smiley
tures in that tool have been incorporated into > Solr itself these days, but I still use clonecollection/copycollection > regularly. (most recently with Solr 7.2) > > > On 3/27/18, 9:55 PM, "David Smiley" wrote: > > The backup/restore API is intended to address

Re: querying vs. highlighting: complete freedom?

2018-04-02 Thread David Smiley
Hi Arturas, Both Erick and I had a go at improving the documentation here. I hope it's clearer. https://builds.apache.org/job/Solr-reference-guide-master/javadoc/highlighting.html The docs for hl.fl, hl.q, hl.qparser were all updated. The meat of the change was a new note in hl.fl including an e

Re: PreAnalyzed FieldType, and simultaneously importing JSON

2018-04-02 Thread David Smiley
Hello Markus, It appears you are not familiar with PreAnalyzedUpdateProcessor? Using that is much more flexible -- you could have different URP chains for your use-cases. IMO PreAnalyzedField ought to go away. I argued for the URP version and thus its superiority to the FieldType here: https://

Re: querying vs. highlighting: complete freedom?

2018-04-03 Thread David Smiley
Thanks for your review! On Tue, Apr 3, 2018 at 6:56 AM Arturas Mazeika wrote: ... > What I missed at the beginning of the documentation is the minimal set of > requirements that is reacquired to have highlighting sensible: somehow I > have a feeling that one needs some of the information stored

Re: querying vs. highlighting: complete freedom?

2018-04-03 Thread David Smiley
On Tue, Apr 3, 2018 at 10:51 AM Arturas Mazeika wrote: ... > Similarly, there's the > hl.qparser parameter, but the documentation of that parameter is not as > rich (the documentation says, that the default value is lucene). I am > wondering are there other alternatives available? In case you ar

Re: PreAnalyzed URP and SchemaRequest API

2018-04-05 Thread David Smiley
Is this really a problem when you could easily enough create a TextField and call setTokenStream? Does your remote client have Solr-core and all its dependencies on the classpath? That's one way to do it... and presumably the direction you are going because you're asking how to work with PreAnal

Re: PreAnalyzed URP and SchemaRequest API

2018-04-12 Thread David Smiley
Ah ok. I've wondered how much value there is in pre-analysis. The serialization of the analyzed form in JSON is bulky. If you can share any results, I'd be interested to hear how it went. It's an optimization so you should be able to know how much better it is. Of course it isn't for everybody

Re: PreAnalyzed URP and SchemaRequest API

2018-04-13 Thread David Smiley
Yes I could imagine big gains from this strategy if OpenNLP is in the analysis chain ;-) On Fri, Apr 13, 2018 at 5:01 PM Markus Jelsma wrote: > Hello David, > > If JSON serialization is too bulky, we could also opt for > SimplePreAnalyzed right? At least as a FieldType it is possible, if not > w

Re: ClassCastException: o.a.l.d.Field cannot be cast to o.a.l.d.StoredField

2018-04-26 Thread David Smiley
I'm not sure but I wonder why you would want to cast it in the first place. Field is the base class; all its subclasses are in one way or another utilities/conveniences. In other words, if you ever see code casting Field to some subclass, there's a good chance it's fundamentally wrong or making

Re: Highlighter throwing InvalidTokenOffsetsException for field with large number of synonyms

2018-04-26 Thread David Smiley
Yay! I'm glad the UnifiedHighlighter is serving you well. I was about to suggest it. If you think the fragmentation/snippeting could be improved in a general way then post a JIRA for consideration. Note: identical results with the original Highlighter is a non-goal. On Mon, Apr 23, 2018 at 10:

Re: ClassCastException: o.a.l.d.Field cannot be cast to o.a.l.d.StoredField

2018-04-26 Thread David Smiley
> but how would a DocumentTransformer affect UpdateLog replay? Oh right; nevermind that silly theory ;-) On Thu, Apr 26, 2018 at 10:42 AM Markus Jelsma wrote: > Hello David, > > Yes it was sporadic indeed, but how would a DocumentTransformer affect > UpdateLog replay? > > We removed the cast, n

Re: Impact/Performance of maxDistErr

2018-05-29 Thread David Smiley
Hello Jens, With solr.RptWithGeometrySpatialField, you always get an accurate result thanks to the "WithGeometry" part. The "Rpt" part is a grid index, and most of the parameters pertain to that. maxDistErr controls the highest resolution grid. No shape will be indexed to higher resolutions than

Re: Impact/Performance of maxDistErr

2018-05-30 Thread David Smiley
hat helps a lot to understand! > Best Regards > > Jens > > P.S. Currently the only search we are doing on the polygon is > Contains(POINT(x,y)) > > > Am 29.05.2018 um 13:30 schrieb David Smiley: > > Hello Jens, > With solr.RptWithGeometrySpatialField, you always get

Re: Syntax error while parsing Spatial Query as string

2020-02-14 Thread David Smiley
n 9.0 as it's obsolete. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Fri, Feb 14, 2020 at 6:47 AM vas aj wrote: > Hi team, > > I am using Lucene 6.6.2, Spatial4j 0.7, lucene-spatial-extras 6.6.2. I am > trying to create a Spat

Re: Unified highlighter- unable to get results - can get results with original and termvector highlighters

2020-05-22 Thread David Smiley
Hello, Did you get it to work eventually? Try setting hl.weightMatches=false and see if that helps. Whether this helps or not, I'd like to have a deeper understanding of the internal structure of the Query (not the original query string). What query parser are you using? If you pass debug=quer

Re: Highlighting Solr 8

2020-05-22 Thread David Smiley
What did you end up doing, Eric? Did you migrate to the Unified Highlighter? ~ David On Wed, Oct 16, 2019 at 4:36 PM Eric Allen wrote: > Thanks for the reply. > > Currently we are migrating from solr4 to solr8 under solr 4 we wrote our > own highlighter because the provided one was too slow fo

Re: Unified highlighter with storeOffsetsWithPositions and termVectors giving an exception

2020-05-22 Thread David Smiley
FWIW I tried this on the techproducts schema with a modification to the name field, but did not see the issue. I suspect you did not re-index after making these schema changes. If you did, then also check that the collection (or core) truly started fresh (never had any previous schema) because if

Re: unified highlighter methods works unexpected

2020-05-22 Thread David Smiley
diagnose the underlying problem and possibly fix. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Thu, Apr 2, 2020 at 9:02 AM Szűcs Roland wrote: > Hi All, > > I use Solr 8.4.1 and implement suggester functionality. As part of the > suggestio

Re: Alternate Fields for Unified Highlighter

2020-05-22 Thread David Smiley
t'd be nice if Solr had a DocTransformer to accomplish that. I know it's been a while; I'm curious how the UH has been working for you, assuming you are using it. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley On Sun, Jun 2, 2019 at

Re: hl.preserveMulti in Unified highlighter?

2020-05-22 Thread David Smiley
Hi Walter, No, the UnifiedHighlighter does not behave as if this setting were true. The docs say: `hl.preserveMulti`:: If `true`, multi-valued fields will return all values in the order they were saved in the index. If `false`, the default, only values that match the highlight request will be re

Re: Creating custom PassageFormatter

2020-05-22 Thread David Smiley
You've probably gotten your answer by now, but "no". Basically, you'd need to specify your own subclass of UnifiedSolrHighlighter in solrconfig.xml like this: Error loading class 'solr.highlight.CustomPassageFormatter'". > > Example from solrconfig.xml: > class="solr.highlight.CustomPassageFormat

Re: hl.preserveMulti in Unified highlighter?

2020-05-23 Thread David Smiley
h of a performance hit from > > essentially removing the offset usage, but our highlighted fields aren't > > extremely large :-) > > > > Hope that helps! > > Anthony > > > > *Anthony Groves* | Technical Lead, Search > > > > O'Reilly Media, Inc.

Re: highlighting a whole html document using Unified highlighter

2020-05-24 Thread David Smiley
s a problem, and the root cause is here: LUCENE-5734 <https://issues.apache.org/jira/browse/LUCENE-5734> It's on my long TODO list but hasn't bitten me lately so I've neglected it. ~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley O

Re: highlighting a whole html document using Unified highlighter

2020-05-24 Thread David Smiley
cument as html document ? > (preserving the field data coming from meta-tags and not strip the html > tags) > > Then I could use solr.HTMLStripCharFilterFactory for analysis. > > Thank You, > > Serkan, > > > > > -Original Message- > From: Davi

Re: unified highlighter performance in solr 8.5.1

2020-05-25 Thread David Smiley
Wow that's terrible! So this problem is for SENTENCE in particular, and it's a regression in 8.5? I'll see if I can reproduce this with the Lucene benchmark module. I figure you have some meaty text, like "page" size or longer? ~ David On Mon, May 25, 2020 at 10:38 AM Michal Hlavac wrote: >

Re: unified highlighter performance in solr 8.5.1

2020-05-26 Thread David Smiley
Please create an issue. I haven't reproduced it yet but it seems unlikely to be user-error. ~ David On Mon, May 25, 2020 at 9:28 AM Michal Hlavac wrote: > Hi, > > I have field: > stored="true" indexed="false" storeOffsetsWithPositions="true"/> > > and configuration: > true > unified > true >

Re: unified highlighter performance in solr 8.5.1

2020-05-27 Thread David Smiley
s > > > > On Tuesday, 26 May 2020 17:44:52 CEST David Smiley wrote: > > > Please create an issue. I haven't reproduced it yet but it seems > unlikely > > > to be user-error. > > > > > > ~ David > > > > > > >

Re: Why Did It Match?

2020-05-29 Thread David Smiley
I've used the highlighter in the past for this but it has to do a lot more work than "explain". Typically that extra work is analysis of the fields' text again. Still, the highlighter can make sense when the individual fields aren't otherwise searchable because you are searching on an aggregate c

Re: Facet Performance

2020-06-17 Thread David Smiley
I strongly recommend setting indexed=true on a field you facet on for the purposes of efficient refinement (fq=field:value). Strictly speaking, though, it isn't required, as you have discovered. ~ David On Wed, Jun 17, 2020 at 9:02 AM Michael Gibney wrote: > facet.method=enum works by executing a query (ag
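
For illustration, a facet request and its follow-up refinement might be built like this. It is a sketch under assumptions: the function name and the field name "category" are hypothetical, and the filter uses Solr's {!term} query parser rather than the plain field:value form mentioned above, simply so that values containing spaces or special characters need no escaping.

```python
def facet_with_refinement(facet_field, selected_value=None):
    """Build Solr request parameters for faceting, optionally refined.

    Hypothetical helper: returns a list of (name, value) pairs and does
    not talk to Solr.
    """
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", facet_field),
    ]
    if selected_value is not None:
        # Refinement: restrict results to the facet value the user chose.
        # This filter is what benefits from indexed=true on the field.
        params.append(("fq", "{!term f=%s}%s" % (facet_field, selected_value)))
    return params


if __name__ == "__main__":
    # "category" and "hard drive" are placeholder field and value names.
    for name, value in facet_with_refinement("category", "hard drive"):
        print(name, "=", value)
```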

Re: Master Slave Terminology

2020-06-17 Thread David Smiley
priv...@lucene.apache.org but it should have been public and expect it to spill out to the dev list today. ~ David On Wed, Jun 17, 2020 at 11:14 AM Mike Drob wrote: > Hi Jan, > > Can you link to the discussion? I searched the dev list and didn’t see > anything, is it on slack or a jira or some

Re: unified highlighter performance in solr 8.5.1

2020-07-03 Thread David Smiley
I think we should flip the default of hl.fragsizeIsMinimum to be 'true', so that the behavior is close to what preceded 8.5. (a) it was very recently (<= 8.4) the previous behavior and so may require less tuning for users in 8.6 henceforth (b) it's significantly faster for long text -- seems to be 2
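
Until a default changes, the setting can be passed explicitly on each request. The sketch below is an assumption-laden illustration: the helper name, the field name "content", and the fragment size of 200 are invented, while hl.bs.type, hl.fragsize and hl.fragsizeIsMinimum are existing Unified Highlighter parameters.

```python
from urllib.parse import urlencode


def sentence_highlight_params(query, field, fragsize=200, fragsize_is_minimum=True):
    """Return a URL-encoded query string for sentence-based highlighting.

    Hypothetical helper: it only formats parameters; the values chosen
    here are placeholders, not recommendations from the thread.
    """
    params = [
        ("q", query),
        ("hl", "true"),
        ("hl.method", "unified"),
        ("hl.fl", field),
        ("hl.bs.type", "SENTENCE"),
        ("hl.fragsize", str(fragsize)),
        # Treat hl.fragsize as a minimum, as proposed above for the default.
        ("hl.fragsizeIsMinimum", "true" if fragsize_is_minimum else "false"),
    ]
    return urlencode(params)


if __name__ == "__main__":
    print(sentence_highlight_params("content:lucene", "content"))
```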
