A problem with deploying Solr 6.5: 404 error reported

2017-04-23 Thread David
Dear manager,


   I have a problem deploying Solr 6.5. My environment is Windows 7 + JDK 8u131 + 
Tomcat 9.0 + Solr 6.5. Java runs successfully, Tomcat runs successfully, and 
Solr 6.5 has been deployed. When I enter 
http://localhost:8080/solr/index.html in Firefox, a 404 error is reported. 
The detail is "The origin server did not find a current representation for the 
target resource or is not willing to disclose that one exists." Could you tell 
me why this happened and how to solve the problem? Thank you!


Sincerely yours,
David.Wu

server won't start using configs from Drupal

2009-07-23 Thread david
I've downloaded solr-2009-07-21.tgz and followed the instructions at http://drupal.org/node/343467 
including retrieving the solrconfig.xml and schema.xml files from the Drupal apachesolr module.


The server seems to start properly with the original solrconfig.xml and 
schema.xml files.

When I try to start up the server with the Drupal-supplied files, I get errors 
on the command line, and a 500 error from the server.


solrconfig.xml: http://pastebin.com/m23d14a2
schema.xml: http://pastebin.com/m2e79f304
output of http://localhost:8983/solr/admin/: http://pastebin.com/m410fa74d


The following looks to me like the important bits, but I'm not a Java coder, so I 
could easily be wrong.

command line extract:

22/07/2009 5:58:54 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: analyzer without class or tokenizer 
& filter list
(plus lots of WARN messages)

extract from browser at http://localhost:8983/solr/admin/

org.apache.solr.common.SolrException: Unknown fieldtype 'text' specified on 
field title
(snip lots of stuff)
org.apache.solr.common.SolrException: analyzer without class or tokenizer & 
filter list
(snip lots of stuff)
org.apache.solr.common.SolrException: Error loading class 
'solr.CharStreamAwareWhitespaceTokenizerFactory'

(snip lots of stuff)
Caused by: java.lang.ClassNotFoundException: 
solr.CharStreamAwareWhitespaceTokenizerFactory

Nothing in apache logs...

solr logs contain this:
127.0.0.1 - - [22/07/2009:08:01:10 +] "GET /solr/admin/ HTTP/1.1" 500 10292

Any help greatly appreciated.

David.


Re: server won't start using configs from Drupal

2009-07-24 Thread david



Otis Gospodnetic wrote:

I think the problem is CharStreamAwareWhitespaceTokenizerFactory, which used to 
live in Solr (when Drupal schema.xml for Solr was made), but has since moved to 
Lucene.  I'm half guessing. :)

 Otis
--


Thanks, but unfortunately I have no idea about Java. Do you know when that 
change was made?

regards,

David.



Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



- Original Message 

From: david 
To: solr-user@lucene.apache.org
Sent: Thursday, July 23, 2009 9:59:53 PM
Subject: server won't start using configs from Drupal 

(snip lots of stuff)




Re: server won't start using configs from Drupal

2009-07-24 Thread david



Koji Sekiguchi wrote:

David,

Try to change solr.CharStreamAwareWhitespaceTokenizerFactory to 
solr.WhitespaceTokenizerFactory

in your schema.xml and reboot Solr.
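Koji's one-line change can be scripted rather than applied by hand; a minimal sketch (the snippet and the idea of scripting it are illustrative additions, not part of the original thread — file reading/writing is omitted):

```python
import re

def fix_schema(schema_xml: str) -> str:
    """Replace the tokenizer factory that moved out of Solr's core
    with the plain whitespace tokenizer, as suggested above."""
    return re.sub(
        r"solr\.CharStreamAwareWhitespaceTokenizerFactory",
        "solr.WhitespaceTokenizerFactory",
        schema_xml,
    )

snippet = '<tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>'
print(fix_schema(snippet))
# <tokenizer class="solr.WhitespaceTokenizerFactory"/>
```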



That worked... thanks...

David.


Koji


david wrote:



(snip lots of stuff)








Changing schema without having to reindex

2010-05-28 Thread David

Hi,

Can anyone tell me if it is possible to change the schema without having 
to reindex? I want to change the stored fields specifically.  Any help 
would be appreciated, thanks.




Range query on long value

2010-06-04 Thread David

Hi,

I have an issue with range queries on a long value in our dataset (the 
dataset is fairly large, but I believe the problem still exists for 
smaller datasets).  When I query the index with a range such as id:[1 
TO 2000], I get values back that are well outside that range.  It's as if 
the range query is ignoring the values and doing something like id:[* TO 
*]. We are running Solr 1.3.  The value is set as the unique key for the 
index.


Our schema is similar to this:

  <field name="id" type="long" ... required="true" />
  <field name="..." ... required="false" />
  <field name="..." ... required="false" />
  .
  .
  .
  <field name="..." ... required="false" />

  <uniqueKey>id</uniqueKey>

Has anyone else had this problem?  If so, how did you correct it?  
Thanks in advance.


Re: Range query on long value

2010-06-04 Thread David

On 10-06-04 05:11 PM, Ahmet Arslan wrote:
   

I have an issue with range queries on a long value in our dataset
(snip lots of stuff)
 

You need to use the sortable long type in Solr 1.3.0 (type="slong") for range 
queries to work correctly. The default schema.xml has an explanation of the 
sortable types (sint, slong, etc.).



   
Thanks for the fast response Ahmet.  This fixed my issue, but I have a 
question: is there a performance hit if I change other fields to a sortable 
type, even if I'm not sure they will ever be used for range searches?
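For background on why the plain type misbehaves: Solr 1.3's non-sortable numeric types were indexed as ordinary strings, so range queries compared terms lexicographically, while the sortable types encode numbers so that string order matches numeric order. The effect is easy to reproduce (a Python sketch added for illustration, not Solr code):

```python
ids = [9, 150, 2000, 1000000]

# Lexicographic range check -- effectively what id:[1 TO 2000] did
# against a plain (non-sortable) long field in Solr 1.3:
lex = [n for n in ids if "1" <= str(n) <= "2000"]

# Numeric range check -- what the slong encoding achieves:
num = [n for n in ids if 1 <= n <= 2000]

print(lex)  # [150, 2000, 1000000] -- 9 is missing, 1000000 wrongly included
print(num)  # [9, 150, 2000]
```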


Re: How to check when a search exceeds the threshold of timeAllowed parameter

2015-12-23 Thread David Santamauro



On 12/23/2015 01:42 AM, William Bell wrote:

I agree that when using timeAllowed in the header info there should be an
entry that indicates timeAllowed triggered.


If I'm not mistaken, there is: partialResults:true appears in the response header:

  "responseHeader":{ "partialResults":true }




This is the only reason why we have not used timeAllowed. So this is a
great suggestion. Something like <bool name="partialResults">true</bool>?
That would be great.

  <lst name="responseHeader">
    <int name="status">0</int>
    <bool name="partialResults">true</bool>
    <int name="QTime">107</int>
    <lst name="params">
      <str name="q">*:*</str>
      <str name="timeAllowed">1000</str>
    </lst>
  </lst>



On Tue, Dec 22, 2015 at 6:43 PM, Vincenzo D'Amore 
wrote:


Well... I can write everything, but really, all this just to understand
when the timeAllowed parameter triggers a partial answer? I mean, isn't 
there anything set in the response when it is partial?

On Wed, Dec 23, 2015 at 2:38 AM, Walter Underwood 
wrote:


We need to know a LOT more about your site. Number of documents, size of
index, frequency of updates, length of queries approximate size of server
(CPUs, RAM, type of disk), version of Solr, version of Java, and features
you are using (faceting, highlighting, etc.).

After that, we’ll have more questions.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)



On Dec 22, 2015, at 4:58 PM, Vincenzo D'Amore 

wrote:


Hi All,

my website is under pressure, there is a big number of concurrent
searches.

When the connected users are too many, the searches become so slow that
in some cases users have to wait many seconds.
The queue of searches becomes so long that, in some cases, servers are
blocked trying to serve all these requests.
As far as I know this is because some searches are very expensive, and when
many expensive searches clog the queue the server becomes unresponsive.

In order to quickly work around this herd effect, I have added a
default timeAllowed of 15 seconds, and this seems to help a lot.

But during stress tests I'm unable to understand when and which requests
are affected by the timeAllowed parameter.

Just to be clear, I have configured the timeAllowed parameter in a SolrCloud
environment; given that partial results may be returned (if there are any),
how can I know when this happens? When does the timeAllowed parameter trigger
a partial answer?

Best regards,
Vincenzo



--
Vincenzo D'Amore
email: v.dam...@gmail.com
skype: free.dev
mobile: +39 349 8513251






--
Vincenzo D'Amore
email: v.dam...@gmail.com
skype: free.dev
mobile: +39 349 8513251







date difference faceting

2016-01-08 Thread David Santamauro


Hi,

I have two date fields, d_a and d_b, both of type solr.TrieDateField, 
that represent different events associated with a particular document. 
The interval between these dates is relevant for corner-case statistics. 
The interval is calculated as the difference: sub(d_b,d_a) and I've been 
able to


  stats=true&stats.field={!func}sub(d_b,d_a)

What I ultimately would like to report is the interval represented as a 
range, which could be seen as facet.query


(pseudo code)
  facet.query=sub(d_b,d_a)[ * TO 8640 ] // day
  facet.query=sub(d_b,d_a)[ 8641 TO 60480 ] // week
  facet.query=sub(d_b,d_a)[ 60481 TO 259200 ] // month
etc.

Aside from actually indexing the difference in a separate field, is 
there something obvious I'm missing? I'm on SOLR 5.2 in cloud mode.


thanks
David


Re: date difference faceting

2016-01-08 Thread David Santamauro


For anyone wanting to know an answer, I used

facet.query={!frange l=0 u=3110400}ms(d_b,d_a)
facet.query={!frange l=3110401 u=6220800}ms(d_b,d_a)
facet.query={!frange l=6220801 u=15552000}ms(d_b,d_a)

etc ...

Not the prettiest nor most efficient but accomplishes what I need 
without re-indexing TBs of data.
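The repeated facet.query parameters can be generated rather than written by hand; a small sketch (the function name is hypothetical, and the bucket bounds are copied from the queries above):

```python
def frange_facets(lower_field: str, upper_field: str, buckets):
    """Build one {!frange} facet.query per (low, high) millisecond bucket,
    faceting on the interval ms(upper_field, lower_field)."""
    return [
        f"{{!frange l={lo} u={hi}}}ms({upper_field},{lower_field})"
        for lo, hi in buckets
    ]

buckets = [(0, 3110400), (3110401, 6220800), (6220801, 15552000)]
for q in frange_facets("d_a", "d_b", buckets):
    print(q)
# {!frange l=0 u=3110400}ms(d_b,d_a)
# {!frange l=3110401 u=6220800}ms(d_b,d_a)
# {!frange l=6220801 u=15552000}ms(d_b,d_a)
```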


thanks.

On 01/08/2016 12:09 PM, Erick Erickson wrote:

I'm going to side-step your primary question and say that it's nearly
always best to do your calculations up-front during indexing to make
queries more efficient and thus serve more requests on the same
hardware. This assumes that the stat you're interested in is
predictable of course...

Best,
Erick

On Fri, Jan 8, 2016 at 2:23 AM, David Santamauro
 wrote:


(snip lots of stuff)


solr-5.3.1 admin console not showing properly

2016-01-13 Thread David Cao
I installed and started solr following instructions from solr wiki as this
... (on a Redhat server)

cd ~/
tar zxf /tmp/solr-5.3.1.tgz
cd solr-5.3.1/bin
./solr start -f


Solr starts fine. But when opening the console in a browser
(http://server-ip:8983/solr/admin.html), it shows a partially rendered page
with the highlighted message "*SolrCore Initialization Failures*" and a whole
bunch of WARN messages of this nature,

55724 WARN  (qtp1018134259-20) [   ] o.e.j.s.ServletHandler Error for
/solr/css/styles/common.css
java.lang.NoSuchMethodError:
javax/servlet/http/HttpServletRequest.isAsyncSupported()Z
at
org.eclipse.jetty.servlet.DefaultServlet.sendData(DefaultServlet.java:922)
at
org.eclipse.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:533)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:723)
at
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:808)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1669)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:206)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at
org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:801)


There was also a line at the start of the console log,

1784 WARN  (main) [   ] o.e.j.s.SecurityHandler
ServletContext@o.e.j.w.WebAppContext@1c662fe5{/solr,file:/root/solr-5.3.1/server/solr-webapp/webapp/,STARTING}{/root/solr-5.3.1/server/solr-webapp/webapp}
has uncovered http methods for path: /


Any ideas? Is there anything I need to do to configure the classpath?

thanks a lot!
david


Fwd: solr-5.3.1 admin console not showing properly

2016-01-14 Thread David Cao
Hi there,

(snip lots of stuff)
david


Re: solr-5.3.1 admin console not showing properly

2016-01-14 Thread David Cao
Hi Jan,

The JVM is from IBM, based on JRE 1.7.

IBM J9 VM (build 2.6, JRE 1.7.0 Linux amd64-64 Compressed References
20141216_227497 (JIT enabled, AOT enabled)


The box I am using is just a dev vm box, using 'root' is temporary ...

Thanks
david

On Thu, Jan 14, 2016 at 6:53 AM, David Cao  wrote:

> Hi there,
>
> (snip lots of stuff)


SolrCloud replicas out of sync

2016-01-22 Thread David Smith
I have a SolrCloud v5.4 collection with 3 replicas that appear to have fallen 
permanently out of sync.  Users started to complain that the same search, 
executed twice, sometimes returned different result counts.  Sure enough, our 
replicas are not identical:

>> shard1_replica1:  89867 documents / version 1453479763194
>> shard1_replica2:  89866 documents / version 1453479763194
>> shard1_replica3:  89867 documents / version 1453479763191

I do not think this discrepancy is going to resolve itself.  The Solr Admin 
screen reports all 3 replicas as “Current”.  The last modification to this 
collection was 2 hours before I captured this information, and our auto commit 
time is 60 seconds.  

I have a lot of concerns here, but my first question is if anyone else has had 
problems with out of sync replicas, and if so, what they have done to correct 
this?

Kind Regards,

David



Re: Read time out exception - exactly 10 minutes after starting committing

2016-01-25 Thread David Andrews
I just got bit by this today.  I tracked it down to the default solr.xml file 
in ./server/solr/solr.xml with the following:

  <shardHandlerFactory name="shardHandlerFactory" class="HttpShardHandlerFactory">
    <int name="socketTimeout">${socketTimeout:600000}</int>
    <int name="connTimeout">${connTimeout:60000}</int>
  </shardHandlerFactory>

I’m on Solr 5.3.1 now, and I wasn’t having this problem with 4.10.3; sure 
enough, 4.10.3 has the values at 0 (i.e. no socket timeout).

-David

> On Jan 24, 2016, at 3:18 PM, Shawn Heisey  wrote:
> 
> On 1/23/2016 9:24 PM, adfel70 wrote:
>> 1. I am getting the "read time out" from the Solr Server.
>> Not from my client, but from the server client when it tries to reach other
>> instances while committing.
>> 
>> 2. I reduced the filter cache autowarmCount to 512, and seems to fix the
>> problem. It now takes only several seconds to commit!
> 
> Do you have any configuration for ShardHandler in your solrconfig.xml?
> 
> https://wiki.apache.org/solr/SolrConfigXml#Configuration_of_Shard_Handlers_for_Distributed_searches
> 
> This is where the client built into Solr can be configured with a socket
> timeout.
> 
> Regarding your cache configuration, even an autowarmCount of 512 is
> quite high.  I have configured a value of *four* for my filterCache,
> because anything higher resulted in unacceptable commit times.  You may
> need to experiment with your configuration for best results.
> 
> Thanks,
> Shawn
> 



Re: SolrCloud replicas out of sync

2016-01-26 Thread David Smith
Thanks Jeff!  A few comments

>>
>> Although you could probably bounce a node and get your document counts back 
>> in sync (by provoking a check)
>>
 

If the check is a simple doc count, that will not work. We have found that 
replica1 and replica3, although they contain the same doc count, don’t have the 
SAME docs.  They each missed at least one update, but of different docs.  This 
also means none of our three replicas are complete.

>>
>>it’s interesting that you’re in this situation. It implies to me that at some 
>>point the leader couldn’t write a doc to one of the replicas,
>>

That is our belief as well. We experienced a datacenter-wide network disruption 
of a few seconds, and user complaints started the first workday after that 
event.  

The most interesting log entry during the outage is this:

"1/19/2016, 5:08:07 PM ERROR null DistributedUpdateProcessorRequest says it is 
coming from leader,​ but we are the leader: 
update.distrib=FROMLEADER&distrib.from=http://dot.dot.dot.dot:8983/solr/blah_blah_shard1_replica3/&wt=javabin&version=2";

>>
>> You might watch the achieved replication factor of your updates and see if 
>> it ever changes
>>

This is a good tip. I’m not sure I like the implication that any failure to 
write all 3 of our replicas must be retried at the app layer.  Is this really 
how SolrCloud applications must be built to survive network partitions without 
data loss? 

Regards,

David
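For reference, the watch-and-retry approach Jeff describes hinges on the achieved replication factor (rf) that Solr echoes back when an update request carries min_rf; a minimal decision helper (a sketch — the HTTP plumbing and retry loop are omitted, and the helper name is hypothetical):

```python
def achieved_rf_ok(update_response: dict, min_rf: int) -> bool:
    """True when the update reached at least min_rf replicas.
    Solr reports the achieved replication factor as 'rf' in the
    response header when the request included a min_rf parameter."""
    rf = update_response.get("responseHeader", {}).get("rf")
    return rf is not None and rf >= min_rf

# e.g. with 3 replicas, ask for min_rf=3 and retry or alert when rf < 3:
resp = {"responseHeader": {"status": 0, "rf": 2}}
print(achieved_rf_ok(resp, 3))  # False -> the write should be retried or logged
```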


On 1/26/16, 12:20 PM, "Jeff Wartes"  wrote:

>
>My understanding is that the "version" represents the timestamp the searcher 
>was opened, so it doesn’t really offer any assurances about your data.
>
>Although you could probably bounce a node and get your document counts back in 
>sync (by provoking a check), it’s interesting that you’re in this situation. 
>It implies to me that at some point the leader couldn’t write a doc to one of 
>the replicas, but that the replica didn’t consider itself down enough to check 
>itself.
>
>You might watch the achieved replication factor of your updates and see if it 
>ever changes:
>https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance
> (See Achieved Replication Factor/min_rf)
>
>If it does, that might give you clues about how this is happening. Also, it 
>might allow you to work around the issue by trying the write again.
>
>
>
>
>
>
>On 1/22/16, 10:52 AM, "David Smith"  wrote:
>
>>(snip lots of stuff)


Re: SolrCloud replicas out of sync

2016-01-27 Thread David Smith
Jeff, again, very much appreciate your feedback.  

It is interesting — the article you linked to by Shalin is exactly why we 
picked SolrCloud over ES, because (eventual) consistency is critical for our 
application and we will sacrifice availability for it.  To be clear, after the 
outage, NONE of our three replicas are correct or complete.

So we definitely don’t have CP yet — our very first network outage resulted in 
multiple overlapped lost updates.  As a result, I can’t pick one replica and 
make it the new “master”.  I must rebuild this collection from scratch, which I 
can do, but that requires downtime which is a problem in our app (24/7 High 
Availability with few maintenance windows).


So, I definitely need to “fix” this somehow.  I wish I could outline a 
reproducible test case, but as the root cause is likely very tight timing 
issues and complicated interactions with Zookeeper, that is not really an 
option.  I’m happy to share the full logs of all 3 replicas though if that 
helps.

I am curious though if the thoughts have changed since 
https://issues.apache.org/jira/browse/SOLR-5468 of seriously considering a 
“majority quorum” model, with rollback?  Done properly, this should be free of 
all lost update problems, at the cost of availability.  Some SolrCloud users 
(like us!!!) would gladly accept that tradeoff.  

Regards

David


On 1/26/16, 4:32 PM, "Jeff Wartes"  wrote:

>
>Ah, perhaps you fell into something like this then? 
>https://issues.apache.org/jira/browse/SOLR-7844
>
>That says it’s fixed in 5.4, but that would be an example of a split-brain 
>type incident, where different documents were accepted by different replicas 
>who each thought they were the leader. If this is the case, and you actually 
>have different data on each replica, I’m not aware of any way to fix the 
>problem short of reindexing those documents. Before that, you’ll probably need 
>to choose a replica and just force the others to get in sync with it. I’d 
>choose the current leader, since that’s slightly easier.
>
>Typically, a leader writes an update to its transaction log, then sends the 
>request to all replicas, and when those all finish it acknowledges the update. 
>If a replica gets restarted, and is less than N documents behind, the leader 
>will only replay that transaction log. (Where N is the numRecordsToKeep 
>configured in the updateLog section of solrconfig.xml)
>
>What you want is to provoke the heavy-duty process normally invoked if a 
>replica has missed more than N docs, which essentially does a checksum and 
>file copy on all the raw index files. FetchIndex would probably work, but it’s 
>a replication handler API originally designed for master/slave replication, so 
>take care: https://wiki.apache.org/solr/SolrReplication#HTTP_API
>Probably a lot easier would be to just delete the replica and re-create it. 
>That will also trigger a full file copy of the index from the leader onto the 
>new replica.
>
>I think design decisions around Solr generally use CP as a goal. (I sometimes 
>wish I could get more AP behavior!) See posts like this: 
>http://lucidworks.com/blog/2014/12/10/call-maybe-solrcloud-jepsen-flaky-networks/
> 
>So the fact that you encountered this sounds like a bug to me.
>That said, another general recommendation (of mine) is that you not use Solr 
>as your primary data source, so you can rebuild your index from scratch if you 
>really need to. 
>
>
>
>
>
>
>On 1/26/16, 1:10 PM, "David Smith"  wrote:
>
>>Thanks Jeff!  A few comments
>>
>>>>
>>>> Although you could probably bounce a node and get your document counts 
>>>> back in sync (by provoking a check)
>>>>
>> 
>>
>>If the check is a simple doc count, that will not work. We have found that 
>>replica1 and replica3, although they contain the same doc count, don’t have 
>>the SAME docs.  They each missed at least one update, but of different docs.  
>>This also means none of our three replicas are complete.
>>
>>>>
>>>>it’s interesting that you’re in this situation. It implies to me that at 
>>>>some point the leader couldn’t write a doc to one of the replicas,
>>>>
>>
>>That is our belief as well. We experienced a datacenter-wide network 
>>disruption of a few seconds, and user complaints started the first workday 
>>after that event.  
>>
>>The most interesting log entry during the outage is this:
>>
>>"1/19/2016, 5:08:07 PM ERROR null DistributedUpdateProcessor Request says it 
>>is coming from leader, but we are the leader: 
>>update.distrib=FROMLEADER&distrib.from=http://dot.dot.dot.dot:8983/solr/blah_blah_shard1_replica3/&wt=javabin&version
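[Editor's note] Deleting and re-creating a replica via the Collections API, as suggested above, could be sketched roughly as follows. The host, collection, shard, and replica names are hypothetical placeholders; the commands are echoed here rather than executed so you can inspect them first.

```shell
# Hypothetical names -- substitute your own cluster's values.
SOLR="http://localhost:8983/solr"
COLL="blah_blah"

# 1) Drop the out-of-sync replica...
DELETE_URL="${SOLR}/admin/collections?action=DELETEREPLICA&collection=${COLL}&shard=shard1&replica=core_node3"
# 2) ...then add a fresh one, which triggers a full index fetch (file copy)
#    from the current leader.
ADD_URL="${SOLR}/admin/collections?action=ADDREPLICA&collection=${COLL}&shard=shard1"

# When ready, pass each to curl, e.g.: curl "$DELETE_URL"
echo "$DELETE_URL"
echo "$ADD_URL"
```

Wait for the new replica to reach the "active" state in clusterstate before deleting and re-adding the next one.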

Re: SolrCloud replicas out of sync

2016-01-27 Thread David Smith
Sure.  Here is our SolrCloud cluster:

   + Three (3) instances of Zookeeper on three separate (physical) servers.  
The ZK servers are beefy and fairly recently built, with 2x10 GigE (bonded) 
Ethernet connectivity to the rest of the data center.  We recognize the 
importance of the stability and responsiveness of ZK to the stability of 
SolrCloud as a whole.

   + 364 collections, all with single shards and a replication factor of 3.  
Currently housing only 100,000,000 documents in aggregate.  Expected to grow to 
25 billion+.  The size of a single document would be considered “large”, by the 
standards of what I’ve seen posted elsewhere on this mailing list. 

We are always open to ZK recommendations from you or anyone else, particularly 
for running a SolrCloud cluster of this size.

Kind Regards,

David



On 1/27/16, 12:46 PM, "Jeff Wartes"  wrote:

>
>If you can identify the problem documents, you can just re-index those after 
>forcing a sync. Might save a full rebuild and downtime.
>
>You might describe your cluster setup, including ZK. It sounds like you’ve 
>done your research, but improper ZK node distribution could certainly 
>invalidate some of Solr’s assumptions.
>
>
>
>
>On 1/27/16, 7:59 AM, "David Smith"  wrote:
>
>>Jeff, again, very much appreciate your feedback.  
>>
>>It is interesting — the article you linked to by Shalin is exactly why we 
>>picked SolrCloud over ES, because (eventual) consistency is critical for our 
>>application and we will sacrifice availability for it.  To be clear, after 
>>the outage, NONE of our three replicas are correct or complete.
>>
>>So we definitely don’t have CP yet — our very first network outage resulted 
>>in multiple overlapped lost updates.  As a result, I can’t pick one replica 
>>and make it the new “master”.  I must rebuild this collection from scratch, 
>>which I can do, but that requires downtime which is a problem in our app 
>>(24/7 High Availability with few maintenance windows).
>>
>>
>>So, I definitely need to “fix” this somehow.  I wish I could outline a 
>>reproducible test case, but as the root cause is likely very tight timing 
>>issues and complicated interactions with Zookeeper, that is not really an 
>>option.  I’m happy to share the full logs of all 3 replicas though if that 
>>helps.
>>
>>I am curious though if the thoughts have changed since 
>>https://issues.apache.org/jira/browse/SOLR-5468 of seriously considering a 
>>“majority quorum” model, with rollback?  Done properly, this should be free 
>>of all lost update problems, at the cost of availability.  Some SolrCloud 
>>users (like us!!!) would gladly accept that tradeoff.  
>>
>>Regards
>>
>>David
>>
>>



Re: SolrCloud replicas out of sync

2016-01-29 Thread David Smith
Tomás,

Good find, but I don’t think the rate of updates was high enough during the 
network outage to create the overrun situation described in the ticket.

I did notice that one of the proposed fixes, 
https://issues.apache.org/jira/browse/SOLR-8586, is an entire-index consistency 
check between leader and replica.  I really hope they are able to get this to 
work.  Ideally, the replicas would never become (permanently) inconsistent, but 
given that they do, it is crucial that SolrCloud can internally detect and fix, 
no matter what caused it or how long ago it happened.


Regards,

David



On 1/28/16, 1:08 PM, "Tomás Fernández Löbbe"  wrote:

>Maybe you are hitting the reordering issue described in SOLR-8129?
>
>Tomás
>
>On Wed, Jan 27, 2016 at 11:32 AM, David Smith 
>wrote:
>
>> Sure.  Here is our SolrCloud cluster:
>>
>>+ Three (3) instances of Zookeeper on three separate (physical)
>> servers.  The ZK servers are beefy and fairly recently built, with 2x10
>> GigE (bonded) Ethernet connectivity to the rest of the data center.  We
>> recognize importance of the stability and responsiveness of ZK to the
>> stability of SolrCloud as a whole.
>>
>>+ 364 collections, all with single shards and a replication factor of
>> 3.  Currently housing only 100,000,000 documents in aggregate.  Expected to
>> grow to 25 billion+.  The size of a single document would be considered
>> “large”, by the standards of what I’ve seen posted elsewhere on this
>> mailing list.
>>
>> We are always open to ZK recommendations from you or anyone else,
>> particularly for running a SolrCloud cluster of this size.
>>
>> Kind Regards,
>>
>> David
>>
>>
>>
>> On 1/27/16, 12:46 PM, "Jeff Wartes"  wrote:
>>
>> >
>> >If you can identify the problem documents, you can just re-index those
>> after forcing a sync. Might save a full rebuild and downtime.
>> >
>> >You might describe your cluster setup, including ZK. it sounds like
>> you’ve done your research, but improper ZK node distribution could
>> certainly invalidate some of Solr’s assumptions.
>> >
>> >
>> >
>> >
>> >On 1/27/16, 7:59 AM, "David Smith"  wrote:
>> >
>> >>Jeff, again, very much appreciate your feedback.
>> >>
>> >>It is interesting — the article you linked to by Shalin is exactly why
>> we picked SolrCloud over ES, because (eventual) consistency is critical for
>> our application and we will sacrifice availability for it.  To be clear,
>> after the outage, NONE of our three replicas are correct or complete.
>> >>
>> >>So we definitely don’t have CP yet — our very first network outage
>> resulted in multiple overlapped lost updates.  As a result, I can’t pick
>> one replica and make it the new “master”.  I must rebuild this collection
>> from scratch, which I can do, but that requires downtime which is a problem
>> in our app (24/7 High Availability with few maintenance windows).
>> >>
>> >>
>> >>So, I definitely need to “fix” this somehow.  I wish I could outline a
>> reproducible test case, but as the root cause is likely very tight timing
>> issues and complicated interactions with Zookeeper, that is not really an
>> option.  I’m happy to share the full logs of all 3 replicas though if that
>> helps.
>> >>
>> >>I am curious though if the thoughts have changed since
>> https://issues.apache.org/jira/browse/SOLR-5468 of seriously considering
>> a “majority quorum” model, with rollback?  Done properly, this should be
>> free of all lost update problems, at the cost of availability.  Some
>> SolrCloud users (like us!!!) would gladly accept that tradeoff.
>> >>
>> >>Regards
>> >>
>> >>David
>> >>
>> >>
>>
>>



docValues error

2016-02-28 Thread David Santamauro


I'm porting a 4.8 schema to 5.3 and I came across this new error when I 
tried to group.field=f1:


unexpected docvalues type SORTED_SET for field 'f1' (expected=SORTED). 
Use UninvertingReader or index with docvalues.


f1 is defined as

positionIncrementGap="100">

  



  


  required="true" />


Notice that I don't have docValues defined. I realize the field type 
doesn't allow docValues so why does this group request fail with a 
docValues error? It did work with 4.8


Any clue would be appreciated, thanks

David


Re: docValues error

2016-02-29 Thread David Santamauro


So I started over (deleted all documents), re-deployed configs to 
zookeeper and reloaded the collection.


This error still appears when I group.field=f1

unexpected docvalues type SORTED_SET for field 'f1' (expected=SORTED). 
Use UninvertingReader or index with docvalues.


What exactly does this error mean and why am I getting it with a field 
that doesn't even have docValues defined?


Why is the DocValues code being used when docValues are not defined 
anywhere in my schema.xml?



null:java.lang.IllegalStateException: unexpected docvalues type 
SORTED_SET for field 'f1' (expected=SORTED). Use UninvertingReader or 
index with docvalues.

at org.apache.lucene.index.DocValues.checkField(DocValues.java:208)
at org.apache.lucene.index.DocValues.getSorted(DocValues.java:264)
	at 
org.apache.lucene.search.grouping.term.TermFirstPassGroupingCollector.doSetNextReader(TermFirstPassGroupingCollector.java:92)
	at 
org.apache.lucene.search.SimpleCollector.getLeafCollector(SimpleCollector.java:33)
	at 
org.apache.lucene.search.MultiCollector.getLeafCollector(MultiCollector.java:117)
	at 
org.apache.lucene.search.TimeLimitingCollector.getLeafCollector(TimeLimitingCollector.java:144)
	at 
org.apache.lucene.search.MultiCollector.getLeafCollector(MultiCollector.java:117)

at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:763)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
	at 
org.apache.solr.search.grouping.CommandHandler.searchWithTimeLimiter(CommandHandler.java:233)
	at 
org.apache.solr.search.grouping.CommandHandler.execute(CommandHandler.java:160)
	at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:398)


etc ...



On 02/28/2016 05:31 PM, David Santamauro wrote:


I'm porting a 4.8 schema to 5.3 and I came across this new error when I
tried to group.field=f1:

unexpected docvalues type SORTED_SET for field 'f1' (expected=SORTED).
Use UninvertingReader or index with docvalues.

f1 is defined as

 
   
 
 
 
   
 

   

Notice that I don't have docValues defined. I realize the field type
doesn't allow docValues so why does this group request fail with a
docValues error? It did work with 4.8

Any clue would be appreciated, thanks

David


Re: docValues error

2016-02-29 Thread David Santamauro




On 02/29/2016 06:05 AM, Mikhail Khludnev wrote:

On Mon, Feb 29, 2016 at 12:43 PM, David Santamauro <
david.santama...@gmail.com> wrote:


unexpected docvalues type SORTED_SET for field 'f1' (expected=SORTED). Use
UninvertingReader or index with docvalues.


  DocValues is the first-class API for accessing the forward view of the index,
i.e. it replaced FieldCache. The error is caused by an attempt to group by a
multivalued field, which the documentation explicitly states is unsupported.



You will have noticed below that the field definition does not contain 
multiValued="true"




On 02/28/2016 05:31 PM, David Santamauro wrote:



f1 is defined as

  

  
  
  

  





Re: docValues error

2016-02-29 Thread David Santamauro



On 02/29/2016 07:59 AM, Tom Evans wrote:

On Mon, Feb 29, 2016 at 11:43 AM, David Santamauro
 wrote:

You will have noticed below that the field definition does not contain
multiValued="true"


What version of the schema are you using? In pre 1.1 schemas,
multiValued="true" is the default if it is omitted.


1.5

Other single-valued fields (tint, string) group correctly. The move from 
4.8 to 5.3 has crippled grouping on populated, single-valued 
solr.TextField fields -- at least for me.


Re: docValues error

2016-02-29 Thread David Santamauro


thanks Shawn, that seems to be the error exactly.

On 02/29/2016 09:22 AM, Shawn Heisey wrote:

On 2/28/2016 3:31 PM, David Santamauro wrote:


I'm porting a 4.8 schema to 5.3 and I came across this new error when
I tried to group.field=f1:

unexpected docvalues type SORTED_SET for field 'f1' (expected=SORTED).
Use UninvertingReader or index with docvalues.

f1 is defined as

 
   
 
 
 
   
 

   

Notice that I don't have docValues defined. I realize the field type
doesn't allow docValues so why does this group request fail with a
docValues error? It did work with 4.8

Any clue would be appreciated, thanks


It sounds like you are running into pretty much exactly what I did with 5.x.

https://issues.apache.org/jira/browse/SOLR-8088

I had to create a copyField that's a string (StrField) type and include
docValues on that field.  I still can't use my tokenized field like I
want to, as I do in 4.x.

Thanks,
Shawn
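[Editor's note] The workaround Shawn describes in SOLR-8088 amounts to adding an untokenized copy of the field and grouping on that instead. A sketch, with hypothetical field names, might look like this in schema.xml:

```xml
<!-- Untokenized, docValues-enabled copy of f1, for grouping/sorting only -->
<field name="f1_str" type="string" indexed="true" stored="false" docValues="true"/>
<copyField source="f1" dest="f1_str"/>
<!-- then query with group.field=f1_str instead of group.field=f1 -->
```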



Re: Regarding google maps polyline to use IsWithin(POLYGON(())) in solr

2016-03-15 Thread David Smiley
Hi Pradeep,

Are you seeing an error when it doesn't work?  I believe a shape
overlapping itself will cause an error from JTS.  If you do see that, then
you can ask Spatial4j (used by Lucene/Solr) to attempt to deal with it in a
number of ways.  See "validationRule":
https://locationtech.github.io/spatial4j/apidocs/org/locationtech/spatial4j/context/jts/JtsSpatialContextFactory.html
Probably try validationRule="repairBuffer0".

If it still doesn't work (and if you can't use what I say next), I
suggest debugging this at the JTS level.  You might then wind up
submitting a question to the JTS list.

Spatial4j extends the WKT syntax with a BUFFER() syntax which is possibly
easier/better than your approach of manually building up the buffered path
with your own code to produce a large polygon to send to Solr.  You would
do something like BUFFER(LINESTRING(...),0.001), where the second argument is
the distance in degrees if you have geo="true", otherwise whatever units your
data was put in.
a native BufferedLineString shape.  But FYI it doesn't support geo="true"
very well (i.e. working in degrees); the buffer will be skewed very much
away from the equator.  So you could set geo="false" and supply, say,
web-mercator bounding box and work in that Euclidean/2D projected space.

Another FYI, Lucene has a "Geo3d" package within the Spatial3d module that
has a native implementation of a buffered LineString as well, one that
works on the surface of the earth.  It hasn't yet been hooked into
Spatial4j, after which Solr would need no changes.  There's a user "Chris"
who is working on that; it's filed here:
https://github.com/locationtech/spatial4j/issues/134

Good luck.

~ David


On Tue, Mar 15, 2016 at 2:45 PM Pradeep Chandra <
pradeepchandra@gmail.com> wrote:

> Hi Sir,
>
> I want to draw a polyline along the route given by google maps (from one
> place to another place).
>
> I applied the logic of calculating parallel lines between the two markers
> on the route, on both sides of the route. Because of the non-linear nature
> of the route, in some cases the polyline overlaps itself.
>
> Finally what I am willing to do is by drawing that polyline along the
> route. I will give that polygon go Solr in order to get the results within
> the polygon. But where the problem I am getting is because of the
> overlapping nature of polyline, the Solr is not taking that shape.
>
> Can you suggest a way to draw a polyline along the route, or let me know
> whether there is any way to fetch data with that type of polyline in Solr?
>
> I constructed a polygon with 300 points, but Solr gives no results for it,
> whereas it does return results for polygons with fewer than 200 points. Can
> you tell me the maximum number of points allowed when constructing a polygon
> in Solr, or whether it is restricted to some number of points?
>
> I am sending some images of my final desired one & my applied one. Please
> find those attachments.
>
> Thanks and Regards
> M Pradeep Chandra
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
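[Editor's note] The BUFFER() form suggested above goes into an ordinary spatial filter query. A sketch follows; the field name "geo", the collection name, and the coordinates are hypothetical, and with geo="false" the 0.001 is in the units of your projected space. The filter is echoed rather than executed.

```shell
# Buffered-linestring filter using the Spatial4j BUFFER() WKT extension.
FQ='{!field f=geo}Intersects(BUFFER(LINESTRING(10 10, 20 20, 30 15), 0.001))'
# Run with:
#   curl -G "http://localhost:8983/solr/places/select" \
#        --data-urlencode "q=*:*" --data-urlencode "fq=$FQ"
echo "$FQ"
```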


Re: Regarding google maps polyline to use IsWithin(POLYGON(())) in solr

2016-03-19 Thread David Smiley
JTS doesn't have any vertex limit on the geometries.  So I don't know why
your query isn't working.

On Wed, Mar 16, 2016 at 1:58 AM Pradeep Chandra <
pradeepchandra@gmail.com> wrote:

> Hi Sir,
>
> Let me give some clarification on the IsWithin(POLYGON(())) query... It is
> not giving any results for polygons beyond 213 points...
>
> Thanks
> M Pradeep Chandra
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Regarding-google-maps-polyline-to-use-IsWithin-POLYGON-in-solr-tp4263975p4264046.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: Seasonal searches in SOLR 5.x

2016-03-22 Thread David Smiley
Hi,

I suggest having a "season" field (or whatever you might want to call it)
using DateRangeField but simply use a nominal year value.  So basically all
durations would be within this nominal year.  For some docs that span
new-years, this might mean 2 durations and that's okay.  Also it's okay if
you have multiple values and it's okay if your calculations result in some
that overlap; you needn't make them distinct; it'll all get coalesced in
the index.

If for some reason you wind up going the route of abusing point data for
durations, I recommend this link:
http://wiki.apache.org/solr/SpatialForTimeDurations
and it most definitely does not require polygons (and thus JTS); I'm not
sure what gave you that impression.  It's all rectangles & points.

~ David

On Mon, Mar 21, 2016 at 1:29 PM Ioannis Kirmitzoglou <
ioanniskirmitzog...@gmail.com> wrote:

> Hi all,
>
> I would like to implement seasonal date searches on date ranges. I’m using
> SOLR 5.4.1 and have indexed date ranges using a DateRangeField (let’s call
> this field date_ranges).
> Each document in SOLR corresponds to a biological sample and each sample
> was collected during a date range that can span from a single day to
> multiple years. For my application it makes sense to enable seasonal
> searches, ie find samples that were collected during a specific period of
> the year (e.g. summer, or February). In this type of search, the year that
> the sample was collected is not relevant, only the days of the year. I’ve
> been all over SOLR documentation and I haven’t been able to find anything
> that will enable do me that. The closest I got was a post with instructions
> on how to use a spatial field to do date searches (
> https://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/).
> Using the logic in that post I was able to come up with a solution but it’s
> rather complex and needs polygon searches (which in turn means installing
> the JTS Topology suite).
> Before committing to that I would like to ask for your input and whether
> there’s an easier way to do these types of searches.
>
> Many thanks,
>
> Ioannis
>
> -
> Ioannis Kirmitzoglou, PhD
> Bioinformatician - Scientific Programmer
> Imperial College, London
> www.vectorbase.org
> www.vigilab.org
>
> --
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
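[Editor's note] The nominal-year approach above can be sketched as: index every collection period as a DateRangeField range inside a single placeholder year (2000 here), then search with an Intersects filter on that same nominal year. The field name "season" and collection name are hypothetical; the filter is echoed rather than executed.

```shell
# "Summer" search against a nominal-year DateRangeField -- the real
# collection year never appears in the index, only days-of-year do.
FQ='{!field f=season op=Intersects}[2000-06-01 TO 2000-08-31]'
# Run with:
#   curl -G "http://localhost:8983/solr/samples/select" \
#        --data-urlencode "q=*:*" --data-urlencode "fq=$FQ"
echo "$FQ"
```

A sample spanning new-years would be indexed with two ranges, e.g. [2000-12-15 TO 2000-12-31] and [2000-01-01 TO 2000-01-10].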


Re: Deleted documents and expungeDeletes

2016-03-30 Thread David Santamauro



On 03/30/2016 08:23 AM, Jostein Elvaker Haande wrote:

On 30 March 2016 at 12:25, Markus Jelsma  wrote:

Hello - with TieredMergePolicy and default reclaimDeletesWeight of 2.0, and 
frequent updates, it is not uncommon to see a ratio of 25%. If you want deletes 
to be reclaimed more often, e.g. weight of 4.0, you will see very frequent 
merging of large segments, killing performance if you are on spinning disks.


Most of our installations are on spinning disks, so if I want a more
aggressive reclaim, this will impact performance. This is of course
something that I do not desire, so I'm wondering if scheduling a
commit with 'expungeDeletes' during off peak business hours is a
better approach than setting up a more aggressive merge policy.



As far as my experimentation with @expungeDeletes goes, if the data you 
indexed and committed with @expungeDeletes didn't touch any segments 
containing deleted documents, and wasn't enough data to trigger a merge 
with such a segment, then no deleted documents will be removed. Basically, 
@expungeDeletes expunges deletes only in segments affected by the commit. 
If you have a large update that touches many segments containing deleted 
documents and you use @expungeDeletes, it could be just as 
resource-intensive as an optimize.


My setting for reclaimDeletesWeight:
  <double name="reclaimDeletesWeight">5.0</double>

It keeps the deleted documents down to ~ 10% without any noticable 
impact on resources or performance. But I'm still in the testing phase 
with this setting.
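[Editor's note] The scheduled off-peak commit discussed above is a plain update call with expungeDeletes set. A sketch follows; the host and core name are placeholders, and the command is echoed rather than executed.

```shell
# Hard commit that also expunges deletes from the segments the commit touches.
URL="http://localhost:8983/solr/mycore/update?commit=true&expungeDeletes=true"
# Run with: curl "$URL"  (e.g. from an off-peak cron job)
echo "$URL"
```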




Re: Deleted documents and expungeDeletes

2016-04-01 Thread David Santamauro


The docs on reclaimDeletesWeight say:

"Controls how aggressively merges that reclaim more deletions are 
favored. Higher values favor selecting merges that reclaim deletions."


I can't imagine you would notice anything after only a few commits. I 
have many shards that size or larger and what I do occasionally is to 
loop an optimize, setting maxSegments with decremented values, e.g.,


for maxSegments in $( seq 40 -1 20 ); do
  # optimize down to $maxSegments segments (host/collection are placeholders)
  curl "http://localhost:8983/solr/mycollection/update?optimize=true&maxSegments=${maxSegments}"
done

It's definitely a poor-man's hack and is clearly not the most efficient 
way of optimizing, but it does remove deletes without requiring double 
or triple the disk space that a full optimize requires. I can usually 
reclaim 100-300GB of disk space in a collection that is currently ~ 2TB 
-- not inconsequential.


Seeing you only have 1.6M documents, perhaps an index rebuild isn't out 
of the question? I did just that on a test collection with 100M 
documents. Starting with 0 deleted docs, a reclaimDeletesWeight=5.0 and 
probably about 1-3% document turnover per week (updates) over the last 3 
months and my deleted percentage is staying below 10%.


If that's not an option, keeping reclaimDeletesWeight at 5.0 and using 
expungeDeletes=true on commit will get that percentage down over time.


//


On 04/01/2016 04:49 AM, Jostein Elvaker Haande wrote:

On 30 March 2016 at 17:46, Erick Erickson  wrote:

through a clever bit of reflection, you can set the
reclaimDeletesWeight variable from solrconfig by including something
like <double name="reclaimDeletesWeight">5</double> (going from memory
here, you'll get an error on startup if I've messed it up.)


I added the following to my solrconfig a couple of days ago:

<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">8</int>
  <int name="segmentsPerTier">8</int>
  <double name="reclaimDeletesWeight">5.0</double>
</mergePolicy>

There has been several commits and the core is current according to
SOLR admin, however I'm still seeing a lot of deleted docs. These are
my current core statistics.

Last Modified:4 minutes ago
Num Docs:1 675 255
Max Doc:2 353 476
Heap Memory Usage:208 464 267
Deleted Docs:678 221
Version:1 870 539
Segment Count:39

Index size is close to 149GB.

So at the moment, I'm seeing a deleted docs to max docs percentage
ratio of 28.81%. With 'reclaimDeletesWeight' set to 5, it doesn't seem to be
deleting away any deleted docs.

Anything obvious I'm missing?



Solr update fails with “Could not initialize class sun.nio.fs.LinuxNativeDispatcher”

2016-04-07 Thread David Moles
Hi folks,

New Solr user here, attempting to apply the following Solr update command via 
curl

curl 'my-solr-server:8983/solr/my-core/update?commit=true' \
  -H 'Content-type:application/json' -d \
  '[{"my_id_field":"some-id-value","my_other_field":{"set":"new-field-value"}}]'

I'm getting an error response with a stack trace that reduces to:

Caused by: java.lang.NoClassDefFoundError: Could not initialize class 
sun.nio.fs.LinuxNativeDispatcher
at sun.nio.fs.LinuxFileSystem.getMountEntries(LinuxFileSystem.java:81)
at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:86)
at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65)
at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44)
at 
sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51)
at 
sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39)
at 
sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368)
at java.nio.file.Files.getFileStore(Files.java:1461)
at org.apache.lucene.util.IOUtils.getFileStore(IOUtils.java:528)
at org.apache.lucene.util.IOUtils.spinsLinux(IOUtils.java:483)
at org.apache.lucene.util.IOUtils.spins(IOUtils.java:472)
at org.apache.lucene.util.IOUtils.spins(IOUtils.java:447)
at 
org.apache.lucene.index.ConcurrentMergeScheduler.initDynamicDefaults(ConcurrentMergeScheduler.java:371)
at 
org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:457)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1817)
at 
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2761)
at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2866)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2833)
at 
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:586)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:95)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalCommit(DistributedUpdateProcessor.java:1635)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1612)
at 
org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:161)
at 
org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:78)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2064)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:450)
... 22 more

It looks like sun.nio.fs can't find its own classes, which seems odd. Solr is 
running with OpenJDK 1.8.0_77 on Amazon Linux AMI release 2016.03.

Does anyone know what might be going on here? Is it an OpenJDK / Amazon Linux 
problem?

--
David Moles
UC Curation Center
California Digital Library




Re: Solr update fails with “Could not initialize class sun.nio.fs.LinuxNativeDispatcher”

2016-04-07 Thread David Moles
Hmm, I wonder whether I *am* using an SSD or spinning disk, in Apache. :) I 
guess I can try to find out.

I forgot to mention, this is with Solr 5.2.1 — is that likely to make much 
difference?

-- 
David Moles
UC Curation Center
California Digital Library










On 4/7/16, 4:19 PM, "Chris Hostetter"  wrote:

>
>That's a strange error to get.
>
>I can't explain why LinuxFileSystem can't load LinuxNativeDispatcher, but 
>you can probably bypass hte entire situation by explicitly configuring 
>ConcurrentMergeScheduler with defaults so that it doesn't try determine 
>wether you are using an SSD or "spinning" disk...
>
>http://lucene.apache.org/core/5_5_0/core/org/apache/lucene/index/ConcurrentMergeScheduler.html
>https://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig#IndexConfiginSolrConfig-MergingIndexSegments
>
>Something like this in your indexConfig settings...
>
><mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
>  <int name="maxMergeCount">42</int>
>  <int name="maxThreadCount">7</int>
></mergeScheduler>
>
>...will force those specific settings, instead of trying to guess 
>defaults.
>
>I haven't tested this, but in theory you can also use something like this to 
>indicate definitively that you are using a spinning disk (or not) but let 
>it pick the appropriate default values for the merge count & 
>threads accordingly ...
>
>
>  true
>
>
>
>
>: Date: Thu, 7 Apr 2016 22:56:54 +
>: From: David Moles 
>: Reply-To: solr-user@lucene.apache.org
>: To: "solr-user@lucene.apache.org" 
>: Subject: Solr update fails with “Could not initialize class
>: sun.nio.fs.LinuxNativeDispatcher”
>: 
>: Hi folks,
>: 
>: New Solr user here, attempting to apply the following Solr update command 
>via curl
>: 
>: curl 'my-solr-server:8983/solr/my-core/update?commit=true' \
>:   -H 'Content-type:application/json' -d \
>:   
>'[{"my_id_field":"some-id-value","my_other_field":{"set":"new-field-value"}}]'
>: 
>: I'm getting an error response with a stack trace that reduces to:
>: 
>: Caused by: java.lang.NoClassDefFoundError: Could not initialize class 
>sun.nio.fs.LinuxNativeDispatcher
>: at sun.nio.fs.LinuxFileSystem.getMountEntries(LinuxFileSystem.java:81)
>: at sun.nio.fs.LinuxFileStore.findMountEntry(LinuxFileStore.java:86)
: at sun.nio.fs.UnixFileStore.<init>(UnixFileStore.java:65)
: at sun.nio.fs.LinuxFileStore.<init>(LinuxFileStore.java:44)
>: at 
>sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:51)
>: at 
>sun.nio.fs.LinuxFileSystemProvider.getFileStore(LinuxFileSystemProvider.java:39)
>: at 
>sun.nio.fs.UnixFileSystemProvider.getFileStore(UnixFileSystemProvider.java:368)
>: at java.nio.file.Files.getFileStore(Files.java:1461)
>: at org.apache.lucene.util.IOUtils.getFileStore(IOUtils.java:528)
>: at org.apache.lucene.util.IOUtils.spinsLinux(IOUtils.java:483)
>: at org.apache.lucene.util.IOUtils.spins(IOUtils.java:472)
>: at org.apache.lucene.util.IOUtils.spins(IOUtils.java:447)
>: at 
>org.apache.lucene.index.ConcurrentMergeScheduler.initDynamicDefaults(ConcurrentMergeScheduler.java:371)
>: at 
>org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:457)
>: at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1817)
>: at 
>org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2761)
>: at 
>org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2866)
>: at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2833)
>: at 
>org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:586)
>: at 
>org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:95)
>: at 
>org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
>: at 
>org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalCommit(DistributedUpdateProcessor.java:1635)
>: at 
>org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1612)
>: at 
>org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:161)
>: at 
>org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
>: at 
>org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:78)
>: at 
>org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
>: at org.apache.solr.core.SolrCore.execute(SolrCore.java:2064)
>: at org.apache.solr.serv

Solr Support for BM25F

2016-04-14 Thread David Cawley
Hello,
I am developing an enterprise search engine for a project and I was hoping
to implement BM25F ranking algorithm to configure the tuning parameters on
a per field basis. I understand BM25 similarity is now supported in Solr
but I was hoping to be able to configure k1 and b for different fields such
as title, description, anchor etc, as they are structured documents.
I am fairly new to Solr, so any help would be appreciated. If this is
possible, any pointers on how to go about implementing it would be greatly
appreciated.

Regards,

David

Current Solr Version 5.4.1
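For reference, per-field k1/b tuning boils down to adjusting BM25's term-saturation (k1) and length-normalization (b) knobs independently per field; Solr's SchemaSimilarityFactory lets you do this by giving each field type its own BM25SimilarityFactory with its own k1 and b. A minimal sketch of the scoring math in Python (not Solr code; the field names and statistics below are made up):

```python
import math

def bm25_term_score(tf, df, n_docs, doc_len, avg_len, k1=1.2, b=0.75):
    """BM25 contribution of one term in one field.
    k1 controls term-frequency saturation, b controls length normalization."""
    idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len))
    return idf * norm

# Hypothetical per-field tuning: short fields like title often get a lower b.
params = {"title": dict(k1=1.2, b=0.3), "description": dict(k1=1.4, b=0.75)}

def doc_score(field_stats):
    """field_stats: field -> (tf, df, n_docs, doc_len, avg_len)."""
    return sum(bm25_term_score(*stats, **params[field])
               for field, stats in field_stats.items())
```

Note that true BM25F sums field-weighted term frequencies into a single saturation step, whereas per-field similarities saturate each field separately and sum the scores; for many structured-document use cases the latter is close enough.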


Re: Facet heatmaps: cluster coordinates based on average position of docs

2016-04-19 Thread David Smiley
Hi Anton,

Perhaps you should request a more detailed / high-res heatmap, and then
work with that, perhaps using some clustering technique?  I confess I don't
work on the UI end of things these days.

p.s. I'm on vacation this week; so I don't respond quickly

~ David

On Thu, Apr 7, 2016 at 3:43 PM Anton K.  wrote:

> I am working with a new Solr feature: facet heatmaps. It works great: I
> create clusters on my map with counts. When a user clicks on a cluster I
> zoom in to that area and may show more clusters or documents (based on the
> current zoom level).
>
> But all my cluster icons (i use round one, see screenshot below) placed
> straight in the center of cluster's rectangles:
>
> https://dl.dropboxusercontent.com/u/1999619/images/map_grid3.png
>
> Some clusters can end up in the sea, and so on. It also feels unnatural in
> my case to have the icons placed so orderly on the world map.
>
> I want to place each cluster's icon at the average coordinates of all the
> docs inside the cluster. Is there any way to achieve this? I tried to use
> the stats component with the facet heatmap, but that isn't implemented yet.
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
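Until heatmap stats exist, one client-side workaround for the icon-placement problem is to request a higher-resolution heatmap and take the count-weighted centroid of the grid cells that fall inside each cluster rectangle. A sketch (the cell coordinates and counts are hypothetical):

```python
def weighted_centroid(cells):
    """cells: (lat, lon, count) triples for the heatmap grid-cell centers
    inside one cluster rectangle. Returns the count-weighted average
    position, so the icon lands where the documents concentrate rather
    than at the rectangle's geometric center (e.g. not in the sea)."""
    total = sum(count for _, _, count in cells)
    if total == 0:
        return None
    lat = sum(la * c for la, _, c in cells) / total
    lon = sum(lo * c for _, lo, c in cells) / total
    return lat, lon
```

Naive averaging of longitudes breaks for clusters straddling the antimeridian; for those, shift longitudes into a continuous range before averaging.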


Re: Replicas for same shard not in sync

2016-04-25 Thread David Smith
Erick,

So that my understanding is correct, let me ask: if one or more replicas are 
down, updates presented to the leader still succeed, right?  If so, tedsolr is 
correct that the Solr client app needs to re-issue updates, if it wants 
stronger guarantees on replica consistency than what Solr provides.

The “Write Fault Tolerance” section of the Solr Wiki makes what I believe is 
the same point:

"On the client side, if the achieved replication factor is less than the 
acceptable level, then the client application can take additional measures to 
handle the degraded state. For instance, a client application may want to keep 
a log of which update requests were sent while the state of the collection was 
degraded and then resend the updates once the problem has been resolved."


https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance


Kind Regards,

David




On 4/25/16, 11:57 AM, "Erick Erickson"  wrote:

>bq: I also read that it's up to the
>client to keep track of updates in case commits don't happen on all the
>replicas.
>
>This is not true. Or if it is it's a bug.
>
>The update cycle is this:
>1> updates get to the leader
>2> updates are sent to all followers and indexed on the leader as well
>3> each replica writes the updates to the local transaction log
>4> all the replicas ack back to the leader
>5> the leader responds to the client.
>
>At this point, all the replicas for the shard have the docs locally
>and can take over as leader.
>
>You may be confusing indexing in batches and having errors with
>updates getting to replicas. When you send a batch of docs to Solr,
>if one of them fails indexing some of the rest of the docs may not
>be indexed. See SOLR-445 for some work on this front.
>
>That said, bouncing servers willy-nilly during heavy indexing, especially
>if the indexer doesn't know enough to retry if an indexing attempt fails may
>be the root cause here. Have you verified that your indexing program
>retries in the event of failure?
>
>Best,
>Erick
>
>On Mon, Apr 25, 2016 at 6:13 AM, tedsolr  wrote:
>> I've done a bit of reading - found some other posts with similar questions.
>> So I gather "Optimizing" a collection is rarely a good idea. It does not
>> need to be condensed to a single segment. I also read that it's up to the
>> client to keep track of updates in case commits don't happen on all the
>> replicas. Solr will commit and return success as long as one replica gets
>> the update.
>>
>> I have a state where the two replicas for one collection are out of sync.
>> One has some updates that the other does not. And I don't have log data to
>> tell me what the differences are. This happened during a maintenance window
>> when the servers got restarted while a large index job was running. Normally
>> this doesn't cause a problem, but it did last Thursday.
>>
>> What I plan to do is select the replica I believe is incomplete and delete
>> it. Then add a new one. I was just hoping Solr had a solution for this -
>> maybe using the ZK transaction logs to replay some updates, or force a
>> resync between the replicas.
>>
>> I will also implement a fix to prevent Solr from restarting unless one of
>> its config files has changed. No need to bounce Solr just for kicks.
>>
>>
>>
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/Replicas-for-same-shard-not-in-sync-tp4272236p4272602.html
>> Sent from the Solr - User mailing list archive at Nabble.com.



Re: issues doing a spatial query

2016-04-28 Thread David Smiley
Hi.
This makes sense to me.  The point 49.8,-97.1 is in your query box.  The
box is lower-left to upper-right, so your box is actually an almost
world-wrapping one grabbing all longitudes except  -93 to -92.  Maybe you
mean to switch your left & right.
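The wrap-around behavior is easy to check outside Solr: in a [lower-left TO upper-right] box, left > right means the box crosses the dateline. A sketch of the longitude test:

```python
def lon_in_box(lon, left, right):
    """Longitude containment for a lat,lon range query written
    lower-left TO upper-right. When left > right the box wraps the
    dateline and contains every longitude EXCEPT the (right, left) gap."""
    if left <= right:
        return left <= lon <= right
    return lon >= left or lon <= right

# The query [49,-92 TO 50,-93] has left=-92, right=-93, so it wraps:
assert lon_in_box(-97.1390697, -92, -93)  # the Winnipeg point matches
assert not lon_in_box(-92.5, -92, -93)    # only -93..-92 is excluded
```

Swapping to [49,-93 TO 50,-92] gives the intended narrow box that excludes -97.1.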

On Sun, Apr 24, 2016 at 8:03 PM GW  wrote:

> I was not getting the results I expected so I started testing with the solr
> webclient
>
> Maybe I don't understand things.
>
> simple test query
>
> q=*:*&fq=locations:[49,-92 TO 50,-93]
>
> I don't understand why I get a result set for longitude range -92 to -93
> when, as far as I understand, there should be zero results.
>
>
> 
>
> {
>   "responseHeader": {
> "status": 0,
> "QTime": 2,
> "params": {
>   "q": "*:*",
>   "indent": "true",
>   "fq": "locations:[49,-92 TO 50,-93]",
>   "wt": "json",
>   "_": "1461541195102"
> }
>   },
>   "response": {
> "numFound": 85,
> "start": 0,
> "docs": [
>   {
> "id": "data.spidersilk.co!337",
> "entity_id": "337",
> "type_id": "simple",
> "gender": "Male",
> "name": "Aviator Sunglasses",
> "short_description": "A timeless accessory staple, the
> unmistakable teardrop lenses of our Aviator sunglasses appeal to
> everyone from suits to rock stars to citizens of the world.",
> "description": "Gunmetal frame with crystal gradient
> polycarbonate lenses in grey. ",
> "size": "",
> "color": "",
> "zdomain": "magento.spidersilk.co",
> "zurl":
> "
> http://magento.spidersilk.co/index.php/catalog/product/view/id/337/s/aviator-sunglasses/
> ",
> "main_image_url":
> "
> http://magento.spidersilk.co/media/catalog/product/cache/0/image/9df78eab33525d08d6e5fb8d27136e95/a/c/ace000a_1.jpg
> ",
> "keywords": "Eyewear  ",
> "data_size": "851,564",
> "category": "Eyewear",
> "final_price_without_tax": "295,USD",
> "image_url": [
>   "
> http://magento.spidersilk.co/media/catalog/product/a/c/ace000a_1.jpg";,
>   "
> http://magento.spidersilk.co/media/catalog/product/a/c/ace000b_1.jpg";
> ],
> "locations": [
>   "37.4463603,-122.1591775",
>   "42.5857514,-82.8873787",
>   "41.6942622,-86.2697108",
>   "49.8522263,-97.1390697"
> ],
> "_version_": 1532418847465799700
>   },
>
>
>
> Thanks,
>
> GW
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: Solr - index polygons from csv

2016-04-28 Thread David Smiley
Hi.

To use polygons, you need to add JTS, otherwise you get an unsupported
shape error.  See
https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide
for details; it involves not only adding the JTS lib to your classpath (the
ideal spot is WEB-INF/lib) but also adding a spatialContextFactory
attribute.  Note that
the value of this attribute is different from 6.0 forward (as seen on the
live page), so get a PDF copy of the ref guide matching the Solr version
you are using if you are not on the latest.  Also, I recommend using
solr.RptWithGeometrySpatialField for indexing non-point data (and it'll
probably work fine for point data too).

When you use geo=false, there are no units or it might have an ignorable
value of degrees.  Essentially it's in whatever units your data is on the
Euclidean 2D plane.
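In other words, with geo=false a radius filter is plain Euclidean distance in whatever units the coordinates themselves use (the query point here looks like a projected CRS); distanceUnits no longer means kilometers. A sketch of the equivalent check:

```python
import math

def within_planar_distance(point, center, d):
    """geofilt-style check on a flat plane: d is in the same arbitrary
    units as the x,y coordinates themselves (e.g. meters if the data is
    in a metric projected CRS), not kilometers."""
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return math.hypot(dx, dy) <= d
```

Note also that bbox filters on the square circumscribing this circle, so I believe it can match slightly more than a true radius check.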

~ David

On Fri, Apr 22, 2016 at 4:33 AM Jan Nekuda  wrote:

> Hello guys,
> I use solr 6 for indexing data with points and polygons.
>
> I have a question about indexing polygons from csv file. I have configured
> type:
>  class="solr.SpatialRecursivePrefixTreeFieldType" geo="false"
> maxDistErr="0.001" worldBounds="ENVELOPE(-1,-1,-1,-1)"
> distErrPct="0.025" distanceUnits="kilometers"/>
>
> and field
>  stored="true"/>
>
> I have tried to import this csv:
>
> kod_adresa,nazev_ulice,cislo_orientacni,cislo_domovni,polygon_mapa,nazev_obec,Nazev_cast_obce,kod_ulice,kod_cast_obce,kod_obec,kod_momc,nazev_momc,Nazev,psc,nazev_vusc,kod_vusc,Nazev_okres,Kod_okres
> 9,,,4,"POLYGON ((-30 -10,-10 -20,-20 -40,-40 -40,-30
> -10))",Vacov,Javorník,,57843,550621,,,Stachy,38473,Jihočeský
> kraj,35,Prachatice,3306
>
> and result is:
>
> Posting files to [base] url http://localhost:8983/solr/ruian/update...
> Entering auto mode. File endings considered are
>
> xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
> POSTing file polygon.csv (text/csv) to [base]
> SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url:
> http://localhost:8983/solr/ruian/update
> SimplePostTool: WARNING: Response: <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader"><int name="status">400</int><int
> name="QTime">3</int></lst><lst name="error"><str
> name="error-class">org.apache.solr.common.SolrException</str><str
> name="root-error-class">java.lang.UnsupportedOperationException</str><str
> name="msg">Couldn't parse shape 'POLYGON ((-30 -10,-10 -20,-20 -40,-40
> -40,-30 -10))' because: java.lang.UnsupportedOperationException:
> Unsupported shape of this SpatialContext. Try JTS or Geo3D.</str><int
> name="code">400</int></lst>
> </response>
> SimplePostTool: WARNING: IOException while reading response:
> java.io.IOException: Server returned HTTP response code: 400 for URL:
> http://localhost:8983/solr/ruian/update
> 1 files indexed.
> COMMITting Solr index changes to http://localhost:8983/solr/ruian/update.
> ..
> Time spent: 0:00:00.036
>
> Could someone give me some advice on how to solve this? Indexing points
> the same way works fine for me.
>
> and one more question:
> I have this field type:
>   class="solr.SpatialRecursivePrefixTreeFieldType"* geo="false*"
> maxDistErr="0.001"
> worldBounds="ENVELOPE(-1,-1,-1,-1)" distErrPct="0.025"
> distanceUnits="kilometers"/>
>
> if I use  geo=false for solr.SpatialRecursivePrefixTreeFieldType and I use
> this query:
>
> http://localhost:8983/solr/ruian/select?indent=on&q=*:*&fq={!bbox%20sfield=mapa}&pt=-818044.37%20-1069122.12&d=20
> <http://localhost:8983/solr/ruian/select?indent=on&q=*:*&fq=%7B!bbox%20sfield=mapa%7D&pt=-818044.37%20-1069122.12&d=20>
> <
> http://localhost:8983/solr/ruian/select?indent=on&q=*:*&fq=%7B!bbox%20sfield=mapa%7D&pt=-818044.37%20-1069122.12&d=20
> >
> for
> getting all object in distance. But I actually don't know in which units
> the distance is with this settings.
>
>
>
> Thank you very much
>
> Jan
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


11:12:25 ERROR SolrCore org.apache.solr.common.SolrException: undefined field 1948

2016-05-05 Thread Garfinkel, David
I'm new to administering Solr, but it is part of my DAM and I'd like to
have a better understanding. If I understand correctly I have a field in my
schema with uuid 1948 that is causing an issue right?

-- 
David Garfinkel
Digital Asset Management/Helpdesk/Systems Support
The Museum of Modern Art
212.708.9866
david_garfin...@moma.org


Re: 11:12:25 ERROR SolrCore org.apache.solr.common.SolrException: undefined field 1948

2016-05-05 Thread Garfinkel, David
Thanks Shawn!

On Thu, May 5, 2016 at 12:14 PM, Shawn Heisey  wrote:

> On 5/5/2016 9:52 AM, Garfinkel, David wrote:
> > I'm new to administering Solr, but it is part of my DAM and I'd like to
> > have a better understanding. If I understand correctly I have a field in
> my
> > schema with uuid 1948 that is causing an issue right?
>
> The data being indexed contains a field *named* 1948.  That is not the
> value of the field, it's the name.  Your schema does not contain a field
> named 1948, so Solr refuses to index the data.
>
> Thanks,
> Shawn
>
>


-- 
David Garfinkel
Digital Asset Management/Helpdesk/Systems Support
The Museum of Modern Art
212.708.9866
david_garfin...@moma.org


Re: relaxed vs. improved validation in solr.TrieDateField

2016-05-06 Thread David Smiley
Sorry to hear that Uwe Reh.

If this is just in your input/index data, then this could be handled with
a URP, maybe even an existing URP.
See ParseDateFieldUpdateProcessorFactory which uses the Joda-time API.  I
am not sure if that will work, I'm a little doubtful in fact since Solr now
uses the Java 8 time API which was taken, more or less, from Joda-time.
But it's worth a shot, any way.  If it doesn't work, let me know and I'll
give you a snippet of JavaScript you can use in your URP chain.
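The difference Uwe hit is strict versus lenient calendar resolution: a lenient parser typically rolls 1997-02-29 over to 1997-03-01, which is presumably why Solr 4.10 accepted these values, while java.time resolves strictly. If the original values must be preserved, one option is to pre-flag bad dates and store the raw string in a separate field; Python's strptime happens to apply the same strictness, so it makes a handy pre-check:

```python
from datetime import datetime

def is_strict_solr_date(value):
    """True only for dates that exist on the calendar; syntactically valid
    but impossible dates (Feb 29 in a non-leap year, month or day 00) are
    rejected, matching Solr 6's stricter java.time behavior."""
    try:
        datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
        return True
    except ValueError:
        return False

assert not is_strict_solr_date("1997-02-29T00:00:00Z")  # 1997 is not a leap year
assert is_strict_solr_date("1996-02-29T00:00:00Z")      # 1996 is
```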

~ David

On Fri, Apr 29, 2016 at 4:07 AM Uwe Reh  wrote:

> Hi,
>
> doing some migration tests (4.10 to 6.0) I noticed an improved
> validation in TrieDateField.
> Syntactically correct but impossible dates are now rejected. (stack trace
> at the end of the mail)
>
> Examples:
> - '1997-02-29T00:00:00Z'
> - '2006-06-31T00:00:00Z'
> - '2000-00-00T00:00:00Z'
> The first two dates are formally OK, but the days do not exist. The
> third date is more suspicious, but it too was accepted by Solr 4.10.
>
> I appreciate this improvement in principle, but I have to respect the
> original data. The dates might be intentionally wrong.
>
> Is there an easy way to get the weaker validation back?
>
> Regards
> Uwe
>
>
> > Invalid Date in Date Math String:'1997-02-29T00:00:00Z'
> > at
> org.apache.solr.util.DateMathParser.parseMath(DateMathParser.java:254)
> > at
> org.apache.solr.schema.TrieField.createField(TrieField.java:726)
> > at
> org.apache.solr.schema.TrieField.createFields(TrieField.java:763)
> > at
> org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:47)
>
> --
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Trouble getting "langid.map.individual" setting to work in Solr 5.0.x

2015-08-03 Thread David Smith
I am trying to use the “langid.map.individual” setting to allow field “a” to 
detect as, say, English, and be mapped to “a_en”, while in the same document, 
field “b” detects as, say, German and is mapped to “b_de”.

What happens in my tests is that the global language is detected (for example, 
German), but BOTH fields are mapped to “_de” as a result.  I cannot get 
individual detection or mapping to work.  Am I misunderstanding the purpose of 
this setting?

Here is the resulting document from my test:


  {
"id": "1005!22345",
"language": [
  "de"
],
"a_de": "A title that should be detected as English with high 
confidence",
"b_de": "Die Einführung einer anlasslosen Speicherung von 
Passagierdaten für alle Flüge aus einem Nicht-EU-Staat in die EU und umgekehrt 
ist näher gerückt. Der Ausschuss des EU-Parlaments für bürgerliche Freiheiten, 
Justiz und Inneres (LIBE) hat heute mit knapper Mehrheit für einen 
entsprechenden Richtlinien-Entwurf der EU-Kommission gestimmt. Bürgerrechtler, 
Grüne und Linke halten die geplante Richtlinie für eine andere Form der 
anlasslosen Vorratsdatenspeicherung, die alle Flugreisenden zu Verdächtigen 
mache.",
"_version_": 1508494723734569000
  }


I expected “a_de” to be “a_en”, and the “language” multi-valued field to have 
“en” and “de”.
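The expected per-field behavior can be sketched as follows (detect is a toy stand-in for the langdetect call, and the real processor works on the SolrInputDocument, not a dict):

```python
def map_individual(doc, detect, fields, lang_field="language"):
    """Expected langid.map.individual behavior: detect each listed field's
    language separately, rename the field with that language suffix, and
    collect every detected language into the language field."""
    out = {k: v for k, v in doc.items() if k not in fields}
    langs = []
    for f in fields:
        lang = detect(doc[f])
        out[f"{f}_{lang}"] = doc[f]
        if lang not in langs:
            langs.append(lang)
    out[lang_field] = langs
    return out

# Toy detector: real language identification would be used here.
detect = lambda text: "de" if "Speicherung" in text else "en"
doc = {"id": "1005!22345",
       "a": "A title that should be detected as English",
       "b": "Die Einführung einer anlasslosen Speicherung von Passagierdaten"}
result = map_individual(doc, detect, ["a", "b"])
assert "a_en" in result and "b_de" in result
assert result["language"] == ["en", "de"]
```

The debug log above instead shows both fields mapped with the globally detected language, which is the behavior in question.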

Here is my configuration in solrconfig.xml:





true
a,b
true
true
language
af:uns,ar:uns,bg:uns,bn:uns,cs:uns,da:uns,el:uns,et:uns,fa:uns,fi:uns,gu:uns,he:uns,hi:uns,hr:uns,hu:uns,id:uns,ja:uns,kn:uns,ko:uns,lt:uns,lv:uns,mk:uns,ml:uns,mr:uns,ne:uns,nl:uns,no:uns,pa:uns,pl:uns,ro:uns,ru:uns,sk:uns,sl:uns,so:uns,sq:uns,sv:uns,sw:uns,ta:uns,te:uns,th:uns,tl:uns,tr:uns,uk:uns,ur:uns,vi:uns,zh-cn:uns,zh-tw:uns
en








The debug output of lang detect, during indexing, is as follows:

---
DEBUG - 2015-08-03 14:37:54.450; 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor; Language 
detected de with certainty 0.964723182276
DEBUG - 2015-08-03 14:37:54.450; 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor; Detected 
main document language from fields [a, b]: de
DEBUG - 2015-08-03 14:37:54.450; 
org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessor; 
Appending field a
DEBUG - 2015-08-03 14:37:54.451; 
org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessor; 
Appending field b
DEBUG - 2015-08-03 14:37:54.453; 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor; Language 
detected de with certainty 0.964723182276
DEBUG - 2015-08-03 14:37:54.453; 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor; Mapping 
field a using individually detected language de
DEBUG - 2015-08-03 14:37:54.454; 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor; Doing 
mapping from a with language de to field a_de
DEBUG - 2015-08-03 14:37:54.454; 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor; Mapping 
field 1005!22345 to de
DEBUG - 2015-08-03 14:37:54.454; org.eclipse.jetty.webapp.WebAppClassLoader; 
loaded class org.apache.solr.common.SolrInputField from 
WebAppClassLoader=525571@80503
DEBUG - 2015-08-03 14:37:54.454; 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor; Removing 
old field a
DEBUG - 2015-08-03 14:37:54.455; 
org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessor; 
Appending field a
DEBUG - 2015-08-03 14:37:54.455; 
org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessor; 
Appending field b
DEBUG - 2015-08-03 14:37:54.456; 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor; Language 
detected de with certainty 0.980402022373
DEBUG - 2015-08-03 14:37:54.456; 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor; Mapping 
field b using individually detected language de
DEBUG - 2015-08-03 14:37:54.456; 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor; Doing 
mapping from b with language de to field b_de
DEBUG - 2015-08-03 14:37:54.456; 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor; Mapping 
field 1005!22345 to de
DEBUG - 2015-08-03 14:37:54.456; 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor; Removing 
old field b
-

From this, my takeaway is that every time the 
LangDetectLanguageIdentifierUpdateProcessor is asked to detect the language, it 
is using field a AND b.  But I can’t quite tell from this output.

Any insight appreciated.

Regards,

David




collection mbeans: requests

2015-08-04 Thread David Santamauro


I have a question about how the stat 'requests' is calculated. I would 
really appreciate it if anyone could shed some light on the figures below.


Assumptions:
  version: 5.2.0
  layout: 8 node solrcloud, no replicas (node71-node78)
  collection: col1
  handler: /search
  stats request: /col1/admin/mbeans?stats=true&cat=QUERYHANDLER&wt=json'

I wrote a simple shell script that grabs the requests stats member from 
every node.


After collection reload
node 71 -- requests: 2
node 72 -- requests: 2
node 73 -- requests: 2
node 74 -- requests: 2
node 75 -- requests: 2
node 76 -- requests: 2
node 77 -- requests: 2
node 78 -- requests: 2
* I assume these are the auto-warm searches


After submitting 1 request (q=*:*)
node 71 -- requests: 4
node 72 -- requests: 3
node 73 -- requests: 3
node 74 -- requests: 3
node 75 -- requests: 3
node 76 -- requests: 4
node 77 -- requests: 3
node 78 -- requests: 3

After resubmitting the same request
node 71 -- requests: 6
node 72 -- requests: 4
node 73 -- requests: 4
node 74 -- requests: 4
node 75 -- requests: 4
node 76 -- requests: 5
node 77 -- requests: 5
node 78 -- requests: 4

If that wasn't strange enough, things get out of control if I add in 
facet.pivot parameter(s)


Fresh after reload (see above, 2 for every node)

Total after a facet.pivot on two fields
node 71 -- requests: 13
node 72 -- requests: 15
node 73 -- requests: 14
node 74 -- requests: 12
node 75 -- requests: 14
node 76 -- requests: 12
node 77 -- requests: 14
node 78 -- requests: 12

I imagine I'm seeing the internal cross-talk between nodes and if so, 
how can one reliably keep stats on the number of "real" requests?


thanks

David


Re: collection mbeans: requests

2015-08-04 Thread David Santamauro


I have your suggested shards.qt set up in another collection for another 
reason but I'll do that redirect here as well, thanks for the confirmation.


On 08/04/2015 10:45 AM, Shawn Heisey wrote:

On 8/4/2015 5:19 AM, David Santamauro wrote:


I have a question about how the stat 'requests' is calculated. I would
really appreciate it if anyone could shed some light on the figures below.

Assumptions:
   version: 5.2.0
   layout: 8 node solrcloud, no replicas (node71-node78)
   collection: col1
   handler: /search
   stats request: /col1/admin/mbeans?stats=true&cat=QUERYHANDLER&wt=json'

I wrote a simple shell script that grabs the requests stats member from
every node.

After collection reload
node 71 -- requests: 2
node 72 -- requests: 2
node 73 -- requests: 2
node 74 -- requests: 2
node 75 -- requests: 2
node 76 -- requests: 2
node 77 -- requests: 2
node 78 -- requests: 2
* I assume these are the auto-warm searches


After submitting 1 request (q=*:*)
node 71 -- requests: 4
node 72 -- requests: 3
node 73 -- requests: 3
node 74 -- requests: 3
node 75 -- requests: 3
node 76 -- requests: 4
node 77 -- requests: 3
node 78 -- requests: 3

After resubmitting the same request
node 71 -- requests: 6
node 72 -- requests: 4
node 73 -- requests: 4
node 74 -- requests: 4
node 75 -- requests: 4
node 76 -- requests: 5
node 77 -- requests: 5
node 78 -- requests: 4

If that wasn't strange enough, things get out of control if I add in
facet.pivot parameter(s)

Fresh after reload (see above, 2 for every node)

Total after a facet.pivot on two fields
node 71 -- requests: 13
node 72 -- requests: 15
node 73 -- requests: 14
node 74 -- requests: 12
node 75 -- requests: 14
node 76 -- requests: 12
node 77 -- requests: 14
node 78 -- requests: 12

I imagine I'm seeing the internal cross-talk between nodes and if so,
how can one reliably keep stats on the number of "real" requests?


Queries on distributed indexes change from the one request that you make
into a request to every shard, to check for relevant documents.  If
relevant documents are found, a second call to those specific shards is
made to retrieve those documents.  So if you have 5 shards in your
index, there could be up to 11 requests counted for a single query.  If
all the shards are on separate nodes, then for that 11-request query,
one of those nodes would count three requests and the others would count
two.
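Shawn's arithmetic can be written down directly, which also explains the uneven per-node numbers earlier in the thread: the node that happens to receive the top-level request counts one extra hit on its handler.

```python
def distributed_request_count(n_shards, shards_with_hits):
    """Total handler requests one client query generates on a distributed
    index: 1 top-level request, a phase-1 query to every shard, and a
    phase-2 document fetch only to the shards that returned relevant docs."""
    return 1 + n_shards + shards_with_hits

assert distributed_request_count(5, 5) == 11  # the 5-shard worst case above
```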

I know what I'm going to say next would work on an index that is
distributed but *not* SolrCloud, and I think it will work in SolrCloud too.

If you add a "shards.qt" parameter to defaults in your main request
handler (usually /select) that points at another, identically configured
handler (perhaps named "/shards") that is also in solrconfig.xml, then
that other handler should receive the distributed requests and the main
handler should only count the "real" requests.  You would be able to
track those numbers separately.

Thanks,
Shawn



Hash of solr documents

2015-08-26 Thread david . davila
Hi,

I have read in a post on the Internet that the hash Solr Cloud calculates 
over the key field, to route each document to a shard, is itself indexed. 
Is this true? If so, is there any way to show this hash for each document?

Thanks,

David

Re: Hash of solr documents

2015-08-26 Thread david . davila
Yes, it's an XY problem :)

We are making the first tests to split our shard (Solr 5.1)

The problem we have is this: the number of documents indexed in the new 
shards is lower than in the original one (19814 and 19653, vs. 61100), and 
it is always the same. We have no idea why Solr is doing this. A problem with 
some documents, or with the segment?

A long time after we changed from "normal" Solr to Solr Cloud, we found 
that the parameter "router" in clusterstate.json was incorrect, because we 
wanted to have "compositeId" and it was set as "explicit". The solution 
was deleting the clusterstate.json and restart Solr. And we are thinking 
that maybe the problem with the SPLIT is related with that: some documents 
are stored with the hash value and others not, and SPLIT needs that to 
distribute them. But I know that this likely has nothing to do with the 
SPLIT problem, it's only an idea. 
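For context, Solr routes each document to the sub-shard whose hash range contains the document's route hash, and SPLITSHARD cuts the parent range at its midpoint; the ranges in the log below (truncated by the archive, presumably 0x0-0x3fffffff and 0x40000000-0x7fffffff) reflect that. A sketch of the containment test, with the MurmurHash3 computation Solr actually performs over the compositeId route key left out:

```python
def sub_shard_for(route_hash, ranges):
    """Which sub-shard a document's 32-bit route hash lands in after a
    split. `ranges` are (lo, hi) inclusive hash bounds per sub-shard."""
    for i, (lo, hi) in enumerate(ranges):
        if lo <= route_hash <= hi:
            return i
    return None  # hash belongs outside this shard's range entirely

# The two halves from the split log, assuming the full 8-digit hex values:
shard2_sub_ranges = [(0x00000000, 0x3fffffff), (0x40000000, 0x7fffffff)]

assert sub_shard_for(0x12345678, shard2_sub_ranges) == 0
assert sub_shard_for(0x50000000, shard2_sub_ranges) == 1
```

If a document's hash falls outside both sub-ranges it ends up in neither sub-shard, which is one way a router mismatch could make documents disappear in a split.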

This is the log, all seem to be normal:

INFO  - 2015-08-26 09:13:47.654; org.apache.solr.handler.admin.CoreAdminHandler; Invoked split action for core: buscon
INFO  - 2015-08-26 09:13:47.656; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO  - 2015-08-26 09:13:47.656; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
INFO  - 2015-08-26 09:13:47.657; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
INFO  - 2015-08-26 09:13:47.657; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO  - 2015-08-26 09:13:47.658; org.apache.solr.update.SolrIndexSplitter; SolrIndexSplitter: partitions=2 segments=1
INFO  - 2015-08-26 09:13:47.922; org.apache.solr.update.SolrIndexSplitter; SolrIndexSplitter: partition #0 partitionCount=2 range=0-3fff
INFO  - 2015-08-26 09:13:47.922; org.apache.solr.update.SolrIndexSplitter; SolrIndexSplitter: partition #0 partitionCount=2 range=0-3fff segment #0 segmentCount=1
INFO  - 2015-08-26 09:22:19.533; org.apache.solr.update.SolrIndexSplitter; SolrIndexSplitter: partition #1 partitionCount=2 range=4000-7fff
INFO  - 2015-08-26 09:22:19.536; org.apache.solr.update.SolrIndexSplitter; SolrIndexSplitter: partition #1 partitionCount=2 range=4000-7fff segment #0 segmentCount=1
INFO  - 2015-08-26 09:30:44.141; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/cores params={targetCore=buscon_shard2_0_replica1&targetCore=buscon_shard2_1_replica1&action=SPLIT&core=buscon&wt=javabin&qt=/admin/cores&version=2} status=0 QTime=1016486 
INFO  - 2015-08-26 09:30:44.387; org.apache.solr.handler.admin.CoreAdminHandler; Applying buffered updates on core: buscon_shard2_0_replica1
INFO  - 2015-08-26 09:30:44.387; org.apache.solr.handler.admin.CoreAdminHandler; No buffered updates available. core=buscon_shard2_0_replica1
INFO  - 2015-08-26 09:30:44.388; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/cores params={name=buscon_shard2_0_replica1&action=REQUESTAPPLYUPDATES&wt=javabin&qt=/admin/cores&version=2} status=0 QTime=2 
INFO  - 2015-08-26 09:30:44.441; org.apache.solr.handler.admin.CoreAdminHandler; Applying buffered updates on core: buscon_shard2_1_replica1
INFO  - 2015-08-26 09:30:44.441; org.apache.solr.handler.admin.CoreAdminHandler; No buffered updates available. core=buscon_shard2_1_replica1
INFO  - 2015-08-26 09:30:44.441; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/cores params={name=buscon_shard2_1_replica1&action=REQUESTAPPLYUPDATES&wt=javabin&qt=/admin/cores&version=2} status=0 QTime=0 
INFO  - 2015-08-26 09:30:44.743; org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 4)




Thanks,

David



From: Anshum Gupta 
To:   "solr-user@lucene.apache.org" , 
Date: 26/08/2015 10:27
Subject: Re: Hash of solr documents



Hi David,

The route key itself is indexed, but not the hash value. Why do you need 
to
know and display the hash value? This seems like an XY problem to me:
http://people.apache.org/~hossman/#xyproblem

On Wed, Aug 26, 2015 at 1:17 AM,  wrote:

> Hi,
>
> I have read in one post in the Internet that the hash Solr Cloud
> calculates over the key field to send each document to a different shard
> is indexed. Is this true? If true, is there any way to show this hash 
for
> each document?
>
> Thanks,
>
> David




-- 
Anshum Gupta



Re: collection API timeout

2015-11-04 Thread Julien David

I forgot to mention that we are using Solr 4.9.0 and zookeeper 3.4.6

Thanks

Julien

Le 04/11/2015 11:37, Julien DAVID - Decalog a écrit :

Hi all,

We have a production environment composed of 6 SolrCloud servers and 3 
ZooKeepers.

We've got around 30 collections, with 6 shards each.
We recently moved from 3 solr to 6, splitting the shards (3 to 6).

As the last weeks were a low-traffic period, we didn't notice any problem.
But since Monday, the Collections API calls systematically time out.
We use calls to CLUSTERSTATUS, but LIST or OVERSEERSTATUS give the same 
results, whatever the node.


We don't have any problem in the qualification environment, which is 
identical except for the load.


The error message is:
CLUSTERSTATUS the collection time out:180s
org.apache.solr.common.SolrException: CLUSTERSTATUS the collection time out:180s
at 
org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:368)
at 
org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:320)
at 
org.apache.solr.handler.admin.CollectionsHandler.handleClusterStatus(CollectionsHandler.java:639)
at 
org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:220)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at 
org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:267)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)

at org.eclipse.jetty.server.Server.handle(Server.java:368)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at 
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
at 
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)

at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
at 
org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at 
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)

at java.lang.Thread.run(Thread.java:745)


Thanks for your help

--
Julien









Re: collection API timeout

2015-11-05 Thread Julien David

Seems I'll need to upgrade to 5.3.1.

Is it possible to upgrade directly from 4.9 to 5.3, or do I need to deploy
all intermediate versions?


Thanks











Re: Arabic analyser

2015-11-11 Thread David Murgatroyd
>So BasisTech works for the latest version of solr?

Yes, our latest Arabic analyzer supports up through 5.3.x. But since the
examples you give are names, it sounds like you might instead/also want our
fuzzy name matcher which will find "عبد الله" not only with "عبدالله" but
also with typos like "عبالله" or even translations into 'English' like
"abdollah". You can visit http://www.basistech.com/solutions/search/solr/
and fill out the form there to learn more (mentioning this thread). See
also http://www.slideshare.net/dmurga/simple-fuzzy-name-matching-in-solr
for a talk I gave at the San Francisco Solr Meet-up in April on how it
plugs in to Solr by creating a special field type you can query just like
any other; this was also presented at Lucene/Solr Revolution last month (
http://lucenerevolution.org/sessions/simple-fuzzy-name-matching-in-solr/).

Best,
David Murgatroyd
(VP, Engineering, Basis Technology)

On Wed, Nov 11, 2015 at 4:31 AM, Mahmoud Almokadem 
wrote:

> Thank Alex,
>
> So BasisTech works for the latest version of solr?
>
> Sincerely,
> Mahmoud
>
> On Tue, Nov 10, 2015 at 5:28 PM, Alexandre Rafalovitch  >
> wrote:
>
> > If this is for a significant project and you are ready to pay for it,
> > BasisTech has commercial solutions in this area I believe.
> >
> > Regards,
> >Alex.
> > 
> > Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> > http://www.solr-start.com/
> >
> >
> > On 10 November 2015 at 08:46, Mahmoud Almokadem 
> > wrote:
> > > Thanks Paul,
> > >
> > > The Arabic analyser applies normalisation and stemming filters only to
> > > single terms coming out of the standard tokenizer.
> > > Gathering all synonyms will be hard work. Should I customise my
> > > tokenizer to handle this case?
> > >
> > > Sincerely,
> > > Mahmoud
> > >
> > >
> > > On Tue, Nov 10, 2015 at 3:06 PM, Paul Libbrecht 
> > wrote:
> > >
> > >> Mahmoud,
> > >>
> > >> there is an arabic analyzer:
> > >>   https://wiki.apache.org/solr/LanguageAnalysis#Arabic
> > >> doesn't it do what you describe?
> > >> Synonyms probably work there too.
> > >>
> > >> Paul
> > >>
> > >> > Mahmoud Almokadem <mailto:prog.mahm...@gmail.com>
> > >> > 9 novembre 2015 17:47
> > >> > Thanks Jack,
> > >> >
> > >> > This is a good solution, but we have more combinations that I think
> > >> > can’t be handled as synonyms like every word starts with ‘عبد’ ‘Abd’
> > >> > and ‘أبو’ ‘Abo’. When using Standard tokenizer on ‘أبو بكر’ ‘Abo
> > >> > Bakr’, It’ll be tokenised to ‘أبو’ and ‘بكر’ and the filters will be
> > >> > applied for each separate term.
> > >> >
> > >> > Is there available tokeniser to tokenise ‘أبو *’ or ‘عبد *' as a
> > >> > single term?
> > >> >
> > >> > Thanks,
> > >> > Mahmoud
> > >> >
> > >> >
> > >> >
> > >> > Jack Krupansky <mailto:jack.krupan...@gmail.com>
> > >> > 9 novembre 2015 16:47
> > >> > Use an index-time (but not query time) synonym filter with a rule
> > like:
> > >> >
> > >> > Abd Allah,Abdallah
> > >> >
> > >> > This will index the combined word in addition to the separate words.
> > >> >
> > >> > -- Jack Krupansky
> > >> >
> > >> > On Mon, Nov 9, 2015 at 4:48 AM, Mahmoud Almokadem <
> > >> prog.mahm...@gmail.com>
> > >> >
> > >> > Mahmoud Almokadem <mailto:prog.mahm...@gmail.com>
> > >> > 9 novembre 2015 10:48
> > >> > Hello,
> > >> >
> > >> > We are indexing Arabic content and facing a problem for tokenizing
> > multi
> > >> > terms phrases like 'عبد الله' 'Abd Allah', so users will search for
> > >> > 'عبدالله' 'Abdallah' without space and need to get the results of
> 'عبد
> > >> > الله' with space. We are using StandardTokenizer.
> > >> >
> > >> >
> > >> > Is there any configurations to handle this case?
> > >> >
> > >> > Thank you,
> > >> > Mahmoud
> > >> >
> > >>
> > >>
> >
>
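Jack's index-time synonym suggestion quoted above can be sketched as follows. This is a toy model of what a synonym rule like "Abd Allah,Abdallah" does at index time; the function name and simplified matching are illustrative assumptions, not Solr's actual SynonymFilter code:

```python
# Toy sketch of index-time synonym expansion: emit the original tokens,
# plus the combined form whenever the multi-word phrase matches, so a
# query for either form finds the document.

def expand_synonyms(tokens, rules):
    """Emit each token, plus any synonym whose left-hand phrase matches."""
    out = list(tokens)
    for phrase, synonym in rules:
        parts = phrase.split()
        for i in range(len(tokens) - len(parts) + 1):
            if tokens[i:i + len(parts)] == parts:
                out.append(synonym)  # combined word indexed alongside parts
    return out

rules = [("Abd Allah", "Abdallah")]
print(expand_synonyms(["Abd", "Allah", "ibn", "Umar"], rules))
# -> ['Abd', 'Allah', 'ibn', 'Umar', 'Abdallah']
```

Because the expansion happens only at index time, the query side needs no synonym filter at all, which is exactly why the suggestion says "index-time (but not query time)".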


Re: Boosting by calculated distance buckets

2015-02-14 Thread David Smiley
Hello,
You can totally boost by calculations that happen on-the-fly on a
per-document basis when you search.  These are called function queries in
Solr.

For your specific example… a solution that doesn’t involve writing a custom
so-called ValueSource in Java would likely mean calculating the distance
multiple times per document for each range.  Instead I suggest a continuous
function, like the reciprocal of the distance.  See the definition of the
formula here: 
https://cwiki.apache.org/confluence/display/solr/Function+Queries#FunctionQueries-AvailableFunctions
  
For ‘m’ provide 1.0.  For ‘a’ and ‘b’ I suggest using the same value set to
roughly 1/10th the distance to the perimeter of the region of relevant
interest — perhaps 1/10th of say 200km.  You will of course fiddle with this
to your liking.  Assuming you use edismax, you could multiply the natural
score by something like:
&boost=recip(geodist(),1,20,20)
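To see how that boost behaves, here is a small sketch of the recip formula from the linked function-query docs, recip(x,m,a,b) = a/(m*x+b), evaluated at a few sample distances; the distances and rounding are illustrative only:

```python
# Sketch of Solr's recip(x, m, a, b) = a / (m*x + b), showing how
# boost=recip(geodist(),1,20,20) decays smoothly with distance (km):
# 1.0 at the search point, halved at 20 km, and tapering off beyond.

def recip(x, m, a, b):
    return a / (m * x + b)

for km in (0, 20, 100, 200):
    print(km, round(recip(km, 1, 20, 20), 3))
```

This is why a continuous function avoids the cost of re-evaluating geodist() once per bucket: one evaluation yields a boost for every distance.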

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley


sraav wrote
> I hit a block when I ran into a use case where I had to boost on ranges of
> distances calculated at query time. This is the case when the distance is
> not present in the document initially but calculated based on the user
> entered lat/long values. 
> 
> 1. Is it required that all the boost parameters be searchable or can we
> boost on dynamic parameters which are calculated ?
> 2. Is there a way to boost on geodist() in a specific range – For example
> – Boost all the cars listed within 20-50kms range(from the search zip) by
> 100. And give a boost of 85 to all the cars listed within 51-80kms range 
> from the search zip. 
> 
> Please provide your feedback and let me know if there are any other
> options that i could try out.





-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
 Independent Lucene/Solr search consultant, 
http://www.linkedin.com/in/davidwsmiley
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-by-calculated-distance-buckets-tp4186504p4186587.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Boosting by calculated distance buckets

2015-02-17 Thread David Smiley
Raav,

You may need to actually subscribe to the solr-user list.  Nabble seems to
not be working too well.
p.s. I’m on vacation this week so I can’t be very responsive

First of all... it's not clear you actually want to *boost* (since you seem
to not care about the relevancy score), it seems you want to *sort* based on
a function query.  So simply sort by the function query instead of using the
'bq' param.

Have you read about geodist() in the Solr Reference Guide?  It returns the
spatial distance.  With that and other function queries like map() you could
do something like sum(map(geodist(),0,40,40,0),map(geodist(),0,20,10,0)) and
you could put that into your main function query.  I purposefully overlapped
the map ranges so that I didn't have to deal with double-counting an edge. 
The only thing I don't like about this is that the distance is going to be
calculated as many times as you reference the function, and it's slow.  So
you may want to write your own function query (internally called a
ValueSource), which is relatively easy to do in Solr.
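As a sanity check of the overlapping map() trick described above, here is a small sketch assuming the documented semantics of Solr's five-argument map(x,min,max,target,default); the distances are illustrative:

```python
# Sketch of sum(map(geodist(),0,40,40,0), map(geodist(),0,20,10,0)):
# map(x, lo, hi, target, default) returns target when lo <= x <= hi,
# else default.  Because the 0-20 range overlaps the 0-40 range, nearby
# documents collect both contributions without explicit edge handling.

def solr_map(x, lo, hi, target, default):
    return target if lo <= x <= hi else default

def bucket_boost(dist_km):
    return solr_map(dist_km, 0, 40, 40, 0) + solr_map(dist_km, 0, 20, 10, 0)

for d in (10, 30, 50):
    print(d, bucket_boost(d))  # 0-20 km -> 50, 20-40 km -> 40, beyond -> 0
```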

~ David


sraav wrote
> David,
> 
> Thank you for your prompt response. I truly appreciate it. Also, My post
> was not accepted the first two times so I am posting it again one final
> time. 
> 
> In my case I want to turn off the dependency on scoring and let solr use
> just the boost values that I pass to each function to sort on. Here is a
> quick example of how I got that to work with non-geo fields which are
> present in the document and are not dynamically calculated. Using edismax
> ofcourse.
> 
> I was able to turn off the scoring (i mean remove the dependency on score)
> on the result set and drive the sort by the boost that I mentioned in the
> below query. In the below function For example - if the "document1"
> matches the date listed it gets a boost = 5. If the same document matches
> the owner AND product  - it will get an additional boost of 5 more. The
> total boost of this "document1" is 10. From whatever I have seen, it
> seems like I was able to turn off or negate the effects of the solr score.
> There was a query norm param that was affecting the boost but it seemed to
> be a constant around 0.70345...most of the time for any fq mentioned).  
> 
> bq = {!func}sum(if(query({!v='datelisted:[2015-01-22T00:00:00.000Z TO
> *]'}),5,0),if(and(query({!v='owner:*BRAVE*'}),query({!v='PRODUCT:*SWORD*'}),5,0))
> 
> What I am trying to do is to add additional boosting function to the
> custom boost that will eventually tie into the above function and boost
> value.
> 
> For example - if "document1" falls in 0-20 KM range i would like to add a
> boost of 50 making the final boost value to be 60. If it falls under
> 20-40KM - i would like to add a boost of 40 and so on.  
> 
> Is there a way we can do this?  Please let me know if I can provide better
> clarity on the use case that I am trying to solve. Thank you David.
> 
> Thanks,
> Raav





-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
 Independent Lucene/Solr search consultant, 
http://www.linkedin.com/in/davidwsmiley
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-by-calculated-distance-buckets-tp4186504p4187112.html
Sent from the Solr - User mailing list archive at Nabble.com.


Problem with queries that includes NOT

2015-02-25 Thread david . davila
Hello,

We have problems with some queries. All of them include the tag NOT, and 
in my opinion, the results don't make any sense.

First problem:

This query "NOT Proc:ID01" returns 95806 results; however, this one,
"NOT Proc:ID01 OR FileType:PDF_TEXT", returns 11484 results. But it should be 
impossible for a query to return fewer results after adding an OR clause.

Second problem. Here the problem is because of the brackets and the NOT 
tag:

 This query:

(NOT Proc:"ID01" AND NOT FileType:PDF_TEXT) AND sys_FileType:PROTOTIPE 
returns 0 documents.

But this query:

(NOT Proc:"ID01" AND NOT FileType:PDF_TEXT AND sys_FileType:PROTOTIPE) 
returns 53 documents, which is correct. So, the problem is the position of 
the bracket. I have checked the same query without NOTs, and it works fine 
returning the same number of results in both cases.  So, I think the 
problem is the combination of the bracket positions and the NOT tag.

This second problem is less important, but the queries come from a web 
page that I'd have to change, so I need to know whether the problem is in 
Solr or not.



This is the part of the schema that applies:





Thank you very much,




David Dávila 

DIT - 915828763


Re: Problem with queries that includes NOT

2015-02-25 Thread david . davila
Hi Shawn,

thank you for your quick response. I will read your links and make some 
tests.

Regards,

David Dávila
DIT - 915828763




De: Shawn Heisey 
Para:   solr-user@lucene.apache.org, 
Fecha:  25/02/2015 13:23
Asunto: Re: Problem with queries that includes NOT



On 2/25/2015 4:04 AM, david.dav...@correo.aeat.es wrote:
> We have problems with some queries. All of them include the tag NOT, and
> in my opinion, the results don't make any sense.
> 
> First problem:
> 
> This query " NOT Proc:ID01 "   returns   95806 results, however this one 
"
> NOT Proc:ID01 OR FileType:PDF_TEXT" returns  11484 results. But it's 
> impossible that adding a tag OR the query has less number of results.
> 
> Second problem. Here the problem is because of the brackets and the NOT 
> tag:
> 
>  This query:
> 
> (NOT Proc:"ID01" AND NOT FileType:PDF_TEXT) AND sys_FileType:PROTOTIPE 
> returns 0 documents.
> 
> But this query:
> 
> (NOT Proc:"ID01" AND NOT FileType:PDF_TEXT AND sys_FileType:PROTOTIPE) 
> returns 53 documents, which is correct. So, the problem is the position 
of 
> the bracket. I have checked the same query without NOTs, and it works 
fine 
> returning the same number of results in both cases.  So, I think the 
> problem is the combination of the bracket positions and the NOT tag.

For the first query, there is a difference between "NOT condition1 OR
condition2" and "NOT (condition1 OR condition2)" ... I can imagine the
first one increasing the document count compared to just "NOT
condition1" ... the second one wouldn't increase it.

Boolean queries in Solr (and very likely Lucene as well) do not always
do what people expect.

http://robotlibrarian.billdueber.com/2011/12/solr-and-boolean-operators/
https://lucidworks.com/blog/why-not-and-or-and-not/

As mentioned in the second link above, you'll get better results if you
use the prefix operators with explicit parentheses.  One word of
warning, though -- the prefix operators do not work correctly if you
change the default operator to AND.

Thanks,
Shawn




Re: Problem with queries that includes NOT

2015-02-26 Thread david . davila
Hi,

I thought that we were using the edismax query parser, but it seems that 
we had configured the dismax parser.
I have made some tests with the edismax parser and it works fine, so I'll 
change it in our production Solr.

Regards,

David Dávila
DIT - 915828763




De: Alvaro Cabrerizo 
Para:   "solr-user@lucene.apache.org" , 
Fecha:  25/02/2015 16:41
Asunto: Re: Problem with queries that includes NOT



Hi,

The edismax parser should be able to handle the query you want to ask. I've
made a test and both of the following queries give me the right result (note
the parentheses):

   - {!edismax}(NOT id:7 AND NOT id:8  AND id:9)   (gives 1 
hit
   the id:9)
   - {!edismax}((NOT id:7 AND NOT id:8)  AND id:9) (gives 1 
hit
   the id:9)

In general, the issue appears when using the lucene query parser and mixing
different boolean clauses (including NOT). Thus, as you commented, the
following queries give different results:


   - NOT id:7 AND NOT id:8  AND id:9   (gives 1 hit the
   id:9)
   - (NOT id:7 AND NOT id:8)  AND id:9 (gives 0 hits when
   expecting 1 )

Since I read the chapter "Limitations of prohibited clauses in 
sub-queries"
from the "Apache Solr 3 Enterprise Search Server" many years ago,  I 
always
add the all-documents query clause *:* to the negative clauses to avoid
the problem you mentioned. Thus I would recommend rewriting the query you
showed us as:

   - (*:* AND NOT Proc:"ID01" AND NOT FileType:PDF_TEXT) AND
   sys_FileType:PROTOTIPE
   - (NOT id:7 AND NOT id:8 AND *:*)  AND id:9 (gives 1 hit
   as expected)

The above query can then be read as: give me all the documents having
PROTOTIPE, except those having ID01 or PDF_TEXT.

Regards.
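The reason the purely negative sub-query matches nothing can be sketched with set semantics. This toy model assumes, as the linked articles describe, that MUST_NOT clauses only subtract documents from a candidate set, so a clause list with no positive clause starts from nothing rather than from all documents:

```python
# Toy model of Lucene boolean evaluation: MUST clauses intersect,
# MUST_NOT clauses subtract.  A sub-query made only of MUST_NOT clauses
# has no starting set to subtract from, hence zero hits; adding *:*
# supplies the starting set.

all_docs = {7, 8, 9}

def boolean_query(positive_sets, negative_sets):
    result = None
    for s in positive_sets:
        result = s if result is None else result & s
    if result is None:        # purely negative: nothing to subtract from
        result = set()
    for s in negative_sets:
        result -= s
    return result

# (NOT id:7 AND NOT id:8) AND id:9  -- the sub-query is purely negative
sub = boolean_query([], [{7}, {8}])
print(sub & {9})                              # 0 hits, surprising

# (*:* AND NOT id:7 AND NOT id:8) AND id:9
sub = boolean_query([all_docs], [{7}, {8}])
print(sub & {9})                              # 1 hit, as expected
```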




On Wed, Feb 25, 2015 at 1:23 PM, Shawn Heisey  wrote:

> On 2/25/2015 4:04 AM, david.dav...@correo.aeat.es wrote:
> > We have problems with some queries. All of them include the tag NOT, and
> > in my opinion, the results don't make any sense.
> >
> > First problem:
> >
> > This query " NOT Proc:ID01 "   returns   95806 results, however this 
one
> "
> > NOT Proc:ID01 OR FileType:PDF_TEXT" returns  11484 results. But it's
> > impossible that adding a tag OR the query has less number of results.
> >
> > Second problem. Here the problem is because of the brackets and the 
NOT
> > tag:
> >
> >  This query:
> >
> > (NOT Proc:"ID01" AND NOT FileType:PDF_TEXT) AND sys_FileType:PROTOTIPE
> > returns 0 documents.
> >
> > But this query:
> >
> > (NOT Proc:"ID01" AND NOT FileType:PDF_TEXT AND sys_FileType:PROTOTIPE)
> > returns 53 documents, which is correct. So, the problem is the 
position
> of
> > the bracket. I have checked the same query without NOTs, and it works
> fine
> > returning the same number of results in both cases.  So, I think the
> > problem is the combination of the bracket positions and the NOT tag.
>
> For the first query, there is a difference between "NOT condition1 OR
> condition2" and "NOT (condition1 OR condition2)" ... I can imagine the
> first one increasing the document count compared to just "NOT
> condition1" ... the second one wouldn't increase it.
>
> Boolean queries in Solr (and very likely Lucene as well) do not always
> do what people expect.
>
> http://robotlibrarian.billdueber.com/2011/12/solr-and-boolean-operators/
> https://lucidworks.com/blog/why-not-and-or-and-not/
>
> As mentioned in the second link above, you'll get better results if you
> use the prefix operators with explicit parentheses.  One word of
> warning, though -- the prefix operators do not work correctly if you
> change the default operator to AND.
>
> Thanks,
> Shawn
>
>



Re: Solr join + Boost in single query

2015-03-03 Thread David Smiley
No, not without writing something custom anyway. It'd be difficult to make it
fast if there's a lot of documents to join on.


sraav wrote
> David,
> 
> Is it possible to write a query to join two cores and either bring back
> data from the two cores or to boost on the data coming back from either of
> the cores? Is that possible with Solr? 
> 
> Raavi





-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
 Independent Lucene/Solr search consultant, 
http://www.linkedin.com/in/davidwsmiley
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-join-Boost-in-single-query-tp4190825p4190849.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Price Range Faceting Based on Date Constraints

2015-05-21 Thread David Smiley
Another more modern option, very related to this, is to use DateRangeField in 
5.0.  You have full 64 bit precision.  More info is in the Solr Ref Guide.

If Alessandro sticks with RPT, then the best reference to give is this:
http://wiki.apache.org/solr/SpatialForTimeDurations

~ David
https://www.linkedin.com/in/davidwsmiley

> On May 21, 2015, at 11:49 AM, Holger Rieß  
> wrote:
> 
> Give geospatial search a chance. Use the 
> 'SpatialRecursivePrefixTreeFieldType' field type, set 'geo' to false.
> The date is located on the X-axis, prices on the Y axis.
> For every price you get a horizontal line between start and end date. Index a 
> rectangle with height 0.001(< 1 cent) and width 'end date - start date'.
> 
> Find all prices that are valid on a given day or in a given date range with 
> the 'geofilt' function.
> 
> The field type could look like (not tested):
> 
> <fieldType name="..." class="solr.SpatialRecursivePrefixTreeFieldType"
>   geo="false" distErrPct="0.025" maxDistErr="0.09" units="degrees"
>   worldBounds="1 0 366 1" />
> 
> Faceting possibly can be done with a facet query for every of your price 
> ranges.
> For example day 20, price range 0-5$, rectangle: 20.0 0.0 
> 21.0 5.0.
> 
> Regards Holger
> 
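Holger's date-by-price mapping quoted above can be sketched like this. The epsilon, sample prices, and rectangle-intersection helper are illustrative assumptions; in Solr the intersection test is what the spatial filter performs server-side:

```python
# Sketch of the non-geo 2-D trick: each (start_day, end_day, price)
# becomes a thin rectangle (x = day axis, y = price axis), and "prices
# valid on day d within range [p_lo, p_hi]" becomes a rectangle
# intersection test.

EPS = 0.001  # rectangle height < 1 cent

def to_rect(start_day, end_day, price):
    return (start_day, price, end_day, price + EPS)

def intersects(rect, day_lo, day_hi, p_lo, p_hi):
    x1, y1, x2, y2 = rect
    return x1 <= day_hi and x2 >= day_lo and y1 <= p_hi and y2 >= p_lo

prices = [to_rect(10, 25, 4.50), to_rect(30, 60, 7.00)]
# Example facet query: day 20, price range 0-5 dollars
print([r for r in prices if intersects(r, 20, 21, 0.0, 5.0)])
```

The facet-query-per-price-range idea then becomes one such rectangle test per bucket.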



fq and defType

2015-06-01 Thread david . davila
Hello,

I need to parse some complicated queries that only work properly with the 
edismax query parser, in both the q and fq parameters. I am testing with 
defType=edismax, but it seems that this setting only affects the q 
parameter. Is there any way to apply edismax to the fq parameter?

Thank you very much, 


David Dávila Atienza
DIT
Teléfono: 915828763
Extensión: 36763

Re: fq and defType

2015-06-01 Thread david . davila
Thank you!

David



De: Shawn Heisey 
Para:   solr-user@lucene.apache.org, 
Fecha:  01/06/2015 18:53
Asunto: Re: fq and defType



On 6/1/2015 10:44 AM, david.dav...@correo.aeat.es wrote:
> I need to parse some complicated queries that only works properly with 
the 
> edismax query parser, in q and fq parameters. I am testing with 
> defType=edismax, but it seems that this clause only affects to the q 
> parameter. Is there any way to set edismax to the fq parameter?

fq={!edismax}querystring

The other edismax parameters on your request (qf, etc) apply to those
filter queries just like they would for the q parameter.

Thanks,
Shawn
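For reference, the {!edismax} local-params prefix is just part of the fq value, so building such a request needs only ordinary URL encoding. The host, core name, and field names below are placeholders, not taken from the thread:

```python
# Sketch of assembling a request where defType applies to q while a
# per-parameter {!edismax} prefix switches the parser for one fq.

from urllib.parse import urlencode

params = {
    "q": "main query",
    "defType": "edismax",                              # applies to q
    "fq": "{!edismax}complicated (filter OR clause)",  # parser set per-fq
    "qf": "title^2 body",                              # shared by q and fq
}
print("http://localhost:8983/solr/core/select?" + urlencode(params))
```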




Looking for help in building a configuration that should be simple

2015-06-02 Thread David Patterson
I've been asked to build a sample configuration of SolrCloud using Solr
4.10.

I want to have two instances (virtual machines) each with two solr nodes.
Let's call the instances 1 and 2, and the nodes 1AO, 1BB, 2AB, and 2BO.  I
want 1AO to be the owner of that shard with 2AB as the backup, and 2BO to
be the owner of its data and have 1BB as its backup.

I also want to use an external ZooKeeper that we already have and trust for
all 4 solr nodes.

Is this something that is doable, and what does it take to make it so?

Thanks.

Dave Patterson


Could not find configName for collection client_active found:null

2015-06-03 Thread David McReynolds
I’m helping someone with this but my zookeeper experience is limited (as in
none). They have purportedly followed the instructions from the wiki.



https://cwiki.apache.org/confluence/display/solr/Setting+Up+an+External+ZooKeeper+Ensemble





Jun 02, 2015 2:40:37 PM org.apache.solr.common.cloud.ZkStateReader
updateClusterState

INFO: Updating cloud state from ZooKeeper...

Jun 02, 2015 2:40:37 PM org.apache.solr.cloud.ZkController
createCollectionZkNode

INFO: Check for collection zkNode:client_active

Jun 02, 2015 2:40:37 PM org.apache.solr.cloud.Overseer$ClusterStateUpdater
updateState

INFO: Update state numShards=null message={

  "operation":"state",

  "state":"down",

  "base_url":"http://10.10.1.178:8983/solr",

  "core":"client_active",

  "roles":null,

  "node_name":"10.10.1.178:8983_solr",

  "shard":null,

  "collection":"client_active",

  "numShards":null,

  "core_node_name":"10.10.1.178:8983_solr_client_active"}

Jun 02, 2015 2:40:37 PM org.apache.solr.cloud.ZkController
createCollectionZkNode

INFO: Creating collection in ZooKeeper:client_active

Jun 02, 2015 2:40:37 PM org.apache.solr.cloud.Overseer$ClusterStateUpdater
updateState

INFO: shard=shard1 is already registered

Jun 02, 2015 2:40:37 PM org.apache.solr.cloud.ZkController getConfName

INFO: Looking for collection configName

Jun 02, 2015 2:40:37 PM org.apache.solr.cloud.ZkController getConfName

INFO: Could not find collection configName - pausing for 3 seconds and
trying again - try: 1

Jun 02, 2015 2:40:37 PM
org.apache.solr.cloud.DistributedQueue$LatchChildWatcher process

INFO: LatchChildWatcher fired on path: /overseer/queue state: SyncConnected
type NodeChildrenChanged

Jun 02, 2015 2:40:37 PM org.apache.solr.common.cloud.ZkStateReader$2 process

INFO: A cluster state change: WatchedEvent state:SyncConnected
type:NodeDataChanged path:/clusterstate.json, has occurred - updating...
(live nodes size: 1)

Jun 02, 2015 2:40:40 PM org.apache.solr.cloud.ZkController getConfName

INFO: Could not find collection configName - pausing for 3 seconds and
trying again - try: 2

Jun 02, 2015 2:40:43 PM org.apache.solr.cloud.ZkController getConfName

INFO: Could not find collection configName - pausing for 3 seconds and
trying again - try: 3

Jun 02, 2015 2:40:46 PM org.apache.solr.cloud.ZkController getConfName

INFO: Could not find collection configName - pausing for 3 seconds and
trying again - try: 4

Jun 02, 2015 2:40:49 PM org.apache.solr.cloud.ZkController getConfName

INFO: Could not find collection configName - pausing for 3 seconds and
trying again - try: 5

Jun 02, 2015 2:40:52 PM org.apache.solr.cloud.ZkController getConfName

SEVERE: Could not find configName for collection client_active

Jun 02, 2015 2:40:52 PM org.apache.solr.core.CoreContainer recordAndThrow

SEVERE: Unable to create core: client_active

org.apache.solr.common.cloud.ZooKeeperException: Could not find configName
for collection client_active found:null

-- 
--
*My hovercraft is full of eels.*


How important is the name of the data collection?

2015-06-08 Thread David Patterson
I'm trying to make two virtual machines, each with one Solr 4.10 SolrCloud
instance, connected to the same external Zookeeper site.

I want to create one data collection with one shard on each of these two
machines.

If I use the "start" method as described in the Apache Solr Reference Guide
for my release, will the two machines be connected if I declare the same
data collection name for both of them? If not, how do I connect them?

(I know the start method can make two solr-cloud instances on ONE virtual
machine, but I want to make one on each of two virtual machines.)

Thanks

Dave P


Re: Highlighting phone numbers

2016-05-18 Thread David Smiley
Perhaps an easy thing to try is to see if the FastVectorHighlighter yields any
different results.  There are some nuances to the highlighters -- it might.

Failing that, this is likely due to your analysis chain, and where exactly the
offsets point to, which you can see/debug in Solr's analysis screen.  You
might have to develop custom analysis components (e.g. custom TokenFilter)
if the offsets aren't what you want.

Good luck,
~ David

On Wed, May 18, 2016 at 9:07 AM marotosg  wrote:

> Hi,
>
> I have a solr multivalued field with a list of phone numbers with many
> different formats. Below field type.
> 
> 
> 
> 
>  pattern="([^0-9])"
> replacement="" replace="all"/>
>  minGramSize="5" maxGramSize="30"
> />
> 
> 
> 
> 
>  pattern="([^0-9])"
> replacement="" replace="all"/>
>  minGramSize="3" maxGramSize="30"
> />
> 
>  class="com.spencerstuart.similarities.SpencerStuartNoSimilarity"/>
> 
>
> I have a requirement to highlight the part of the number matched to explain
> to the user why this record is returned.
>
> If I search for "17573062033" I am able to match many results but the
> full number is highlighted.
>
> 
>   0
>   12
>   
> CoreID,PhoneListS
> true
> PhoneListS:17573062033
> 1463576646314
> 
> 
> PhoneListS
> xml
> true
> 3
>   
> 
> 
>   
> 
>   1757.306.2033
> 
> 10224838
>   
> 
>   1757.306.2033
> 
> 10224840
>   
> 
>   1757.306.2089
>   1757.306.7006
> 
> 10034811
> 
> 
>   
> 
>   1757.306.2033
> 
>   
>   
> 
>   1757.306.2033
> 
>   
>   
> 
>   1757.306.2089
> 
>   
> 
> 
>
> Would it be possible to get the piece of information which matches?
> Something like this 1757.306.2089
>
> thanks
> Sergio
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Highlighting-phone-numbers-tp4277491.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
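A custom offset-aware approach of the kind David describes might, in spirit, map the matched digit run back to positions in the stored value. This client-side sketch is illustrative only; the function name and the em tags are assumptions, not the highlighter's API:

```python
# Sketch: strip non-digits while remembering each digit's original
# position, find the query's digits inside the digit stream, then map
# the match back to the stored value so only the matching portion is
# wrapped in highlight tags.

def highlight_digits(stored, query):
    qdigits = "".join(c for c in query if c.isdigit())
    digits, offsets = [], []
    for i, ch in enumerate(stored):
        if ch.isdigit():
            digits.append(ch)
            offsets.append(i)
    pos = "".join(digits).find(qdigits)
    if pos < 0 or not qdigits:
        return stored
    start = offsets[pos]
    end = offsets[pos + len(qdigits) - 1] + 1
    return stored[:start] + "<em>" + stored[start:end] + "</em>" + stored[end:]

print(highlight_digits("1757.306.2033", "3062033"))
# -> 1757.<em>306.2033</em>
```

Inside Solr, producing this effect would mean a token filter whose tokens carry offsets pointing at the matched span rather than at the whole original token.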


Re: Facet heatmaps: cluster coordinates based on average position of docs

2016-05-18 Thread David Smiley
Sorry for such a belated response; I don't monitor this list as much as I
used to.
My response is within...

On Wed, Apr 20, 2016 at 4:28 AM Anton K.  wrote:

> Thanks for your answer, David, and have a good vacation.
>
> It seems a more detailed heatmap is not a good solution in my case, because I
> need to display a cluster icon with the number of items inside each cluster. So
> if I get a very large number of cells on the map, some of them will overlap.
>

I did not mean to suggest you display one cluster for each non-zero heatmap
cell; I meant you funnel this as input to other client-side heatmap
renderers that do the clustering.  The point of this is to keep the number
of inputs to that renderer manageable instead of potentially a gazillion if
you have that many docs/points.

> I also think about the Stats component for the facet.heatmap feature. Maybe we
> can use the stats component to add average positions of documents in a cell?
>

I think I've seen hand-rolled heatmap capabilities added to Solr (i.e. no
custom Solr hacking) that went about it kinda like that.  stats.facet on
some geohash (or similar), then average lat & average lon.

~ David
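The "average lat and average lon per cell" idea can be sketched like this, using a coarse integer cell key as a stand-in for a real geohash; the cell size and sample points are illustrative:

```python
# Sketch of per-cell clustering: bucket points by a coarse cell key,
# then place each cluster icon at the mean position of its members
# (rather than the cell centre), keeping the count for the icon label.

from collections import defaultdict

def clusters(points, cell_deg=1.0):
    cells = defaultdict(list)
    for lat, lon in points:
        key = (int(lat // cell_deg), int(lon // cell_deg))
        cells[key].append((lat, lon))
    return {
        key: (sum(p[0] for p in pts) / len(pts),   # mean lat
              sum(p[1] for p in pts) / len(pts),   # mean lon
              len(pts))                            # count for the icon
        for key, pts in cells.items()
    }

print(clusters([(51.2, 4.1), (51.4, 4.3), (48.8, 2.3)]))
```

With server-side stats per geohash bucket, the same averaging happens in Solr instead of the client.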


> 2016-04-20 4:28 GMT+03:00 David Smiley :
>
> > Hi Anton,
> >
> > Perhaps you should request a more detailed / high-res heatmap, and then
> > work with that, perhaps using some clustering technique?  I confess I
> don't
> > work on the UI end of things these days.
> >
> > p.s. I'm on vacation this week; so I don't respond quickly
> >
> > ~ David
> >
> > On Thu, Apr 7, 2016 at 3:43 PM Anton K.  wrote:
> >
> > > I am working with new solr feature: facet heatmaps. It works great, i
> > > create clusters on my map with counts. When user click on cluster i
> zoom
> > in
> > > that area and i might show him more clusters or documents (based on
> > current
> > > zoom level).
> > >
> > > But all my cluster icons (i use round one, see screenshot below) placed
> > > straight in the center of cluster's rectangles:
> > >
> > > https://dl.dropboxusercontent.com/u/1999619/images/map_grid3.png
> > >
> > > Some clusters can be in sea and so on. Also it feels not natural in my
> > case
> > > to have icons placed orderly on the world map.
> > >
> > > I want to place cluster's icons in average coords based on coordinates
> of
> > > all my docs inside cluster. Is there any way to achieve this? I am
> trying
> > > to use stats component for facet heatmap but it isn't implemented yet.
> > >
> > --
> > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> > LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> > http://www.solrenterprisesearchserver.com
> >
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: Issues with coordinates in Solr during updating of fields

2016-06-13 Thread David Smiley
Zheng,
There are a few Solr FieldTypes that are basically composite fields -- a
virtual field of other fields.  AFAIK they are all spatial related.  You
don't necessarily need to pay attention to the fact that gps_1_coordinate
exists under the hood unless you wish to customize the options on that
field type in the schema.  e.g. if you don't need it for filtering (perhaps
using RPT for that) then you can set indexed=false.
~ David

On Fri, Jun 10, 2016 at 8:43 PM Zheng Lin Edwin Yeo 
wrote:

> Would like to check, what is the use of the gps_0_coordinate and
> gps_1_coordinate
> field then? Is it just to store the data points, or does it have any other
> use?
>
> When I do the query, I found that we are only querying the gps_field, which
> is something like this:
> http://localhost:8983/solr/collection1/highlight?q=*:*&fq={!geofilt
> pt=1.5,100.0
> sfield=gps d=5}
>
>
> Regards,
> Edwin
>
> On 27 May 2016 at 08:48, Erick Erickson  wrote:
>
> > Should be fine. When the location field is
> > re-indexed (as it is with Atomic Updates)
> > the two fields will be filled back in.
> >
> > Best,
> > Erick
> >
> > On Thu, May 26, 2016 at 4:45 PM, Zheng Lin Edwin Yeo
> >  wrote:
> > > Thanks Erick for your reply.
> > >
> > > It works when I remove the 'stored="true" ' from the gps_0_coordinate
> and
> > > gps_1_coordinate.
> > >
> > > But will this affect the search functions of the gps coordinates in the
> > > future?
> > >
> > > Yes, I am referring to Atomic Updates.
> > >
> > > Regards,
> > > Edwin
> > >
> > >
> > > On 27 May 2016 at 02:02, Erick Erickson 
> wrote:
> > >
> > >> Try removing the 'stored="true" ' from the gps_0_coordinate and
> > >> gps_1_coordinate.
> > >>
> > >> When you say "...tried to do an update on any other fileds" I'm
> assuming
> > >> you're
> > >> talking about Atomic Updates, which require that the destinations of
> > >> copyFields are single valued. Under the covers the location type is
> > >> split and copied to the other two fields so I suspect that's what's
> > going
> > >> on.
> > >>
> > >> And you could also try one of the other types, see:
> > >> https://cwiki.apache.org/confluence/display/solr/Spatial+Search
> > >>
> > >> Best,
> > >> Erick
> > >>
> > >> On Thu, May 26, 2016 at 1:46 AM, Zheng Lin Edwin Yeo
> > >>  wrote:
> > >> > Anyone has any solutions to this problem?
> > >> >
> > >> > I tried to remove the gps_0_coordinate and gps_1_coordinate, but I
> > will
> > >> get
> > >> > the following error during indexing.
> > >> > ERROR: [doc=id1] unknown field 'gps_0_coordinate'
> > >> >
> > >> > Regards,
> > >> > Edwin
> > >> >
> > >> >
> > >> > On 25 May 2016 at 11:37, Zheng Lin Edwin Yeo 
> > >> wrote:
> > >> >
> > >> >> Hi,
> > >> >>
> > >> >> I have an implementation of storing the coordinates in Solr during
> > >> >> indexing.
> > >> >> During indexing, I will only store the value in the field name
> > ="gps".
> > >> For
> > >> >> the field name = "gps_0_coordinate" and "gps_1_coordinate", the
> value
> > >> will
> > >> >> be auto filled and indexed from the "gps" field.
> > >> >>
> > >> >> > >> required="false"/>
> > >> >> > >> stored="true" required="false"/>
> > >> >> > >> stored="true" required="false"/>
> > >> >>
> > >> >> But when I tried to do an update on any other fields in the index,
> > Solr
> > >> >> will try to add another value in the "gps_0_coordinate" and
> > >> >> "gps_1_coordinate". However, as these 2 fields are not
> multi-Valued,
> > it
> > >> >> will lead to an error:
> > >> >> multiple values encountered for non multiValued field
> > gps_0_coordinate:
> > >> >> [1.0,1.0]
> > >> >>
> > >> >> Does anyone knows how we can solve this issue?
> > >> >>
> > >> >> I am using Solr 5.4.0
> > >> >>
> > >> >> Regards,
> > >> >> Edwin
> > >> >>
> > >>
> >
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
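The atomic-update behavior discussed above can be sketched as a request body: only the changed fields are sent, and Solr re-reads the stored `gps` value to re-derive the two coordinate sub-fields, which is why those sub-fields must not themselves be stored. The `comment` field here is hypothetical, used only to stand in for "any other field".

```python
import json

# Atomic ("partial") update body: "set" replaces the value of one field.
# The gps field is NOT resent; Solr rebuilds gps_0_coordinate and
# gps_1_coordinate from the stored gps value during the update.
doc = {
    "id": "id1",                                       # uniqueKey from the thread
    "comment": {"set": "updated via atomic update"},   # hypothetical field
}
payload = json.dumps([doc])
print(payload)
# POST this to /solr/<collection>/update with Content-Type: application/json
```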


Re: json facet - date range & interval

2016-06-28 Thread David Santamauro


Have you tried %-escaping?

json.facet = {
  daterange : { type  : range,
field : datefield,
start : "NOW/DAY%2D10DAYS",
end   : "NOW/DAY",
gap   : "%2B1DAY"
  }
}


On 06/28/2016 01:19 PM, Jay Potharaju wrote:

json.facet={daterange : {type : range, field : datefield, start :
"NOW/DAY-10DAYS", end : "NOW/DAY",gap:"\+1DAY"} }

Escaping the plus sign also gives the same error. Any other suggestions how
can i make this work?
Thanks
Jay

On Mon, Jun 27, 2016 at 10:23 PM, Erick Erickson 
wrote:


First thing I'd do is escape the plus. It's probably being interpreted
as a space.

Best,
Erick

On Mon, Jun 27, 2016 at 9:24 AM, Jay Potharaju 
wrote:

Hi,
I am trying to use the json range facet with a tdate field. I tried the
following but get an error. Any suggestions on how to fix the following
error /examples for date range facets.

json.facet={daterange : {type : range, field : datefield, start
:"NOW-10DAYS", end : "NOW/DAY", gap : "+1DAY" } }

  msg": "Can't add gap 1DAY to value Fri Jun 17 15:49:36 UTC 2016 for

field:

datefield", "code": 400

--
Thanks
Jay
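The escaping problem above comes from URL decoding: in a query string a literal `+` decodes to a space, so the gap must reach Solr percent-encoded as `%2B`. Letting a URL-encoding routine build the query string avoids hand-escaping entirely; a small sketch:

```python
from urllib.parse import urlencode

# urlencode() percent-encodes the "+" in the gap (and the braces) so the
# json.facet parameter arrives at Solr intact.
facet = ('{daterange : {type : range, field : datefield, '
         'start : "NOW/DAY-10DAYS", end : "NOW/DAY", gap : "+1DAY"}}')
qs = urlencode({"q": "*:*", "json.facet": facet})
print("%2B1DAY" in qs)  # the "+" is sent as %2B, not as a space
```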








Re: error rendering solr spatial in geoserver

2016-06-29 Thread David Smiley
For polygons in 6.0 you need to set
spatialContextFactory="org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory"
-- see
https://cwiki.apache.org/confluence/display/solr/Spatial+Search and the
example.  And of course as you probably already know, put the JTS jar on
Solr's classpath.  What likely tripped you up between 5x and 6x is the
change in value of the spatialContextFactory as a result in organizational
package moving "com.spatial4j.core" to "org.locationtech.spatial4j".

On Wed, Jun 29, 2016 at 12:44 PM tkg_cangkul  wrote:

> hi erick, thx for your reply.
>
> i've solve this problem.
> i got this error when i use solr 6.0.0
> so i try to downgrade my solr to version 5.5.0 and it's successfull
>
>
> On 29/06/16 22:39, Erick Erickson wrote:
> > There is not nearly enough information here to say anything very helpful.
> > What does your schema look like for this field?
> > What does the input look like?
> > How are you pulling data from geoserver?
> >
> > You might want to review:
> > http://wiki.apache.org/solr/UsingMailingLists
> >
> > Best,
> > Erick
> >
> > On Wed, Jun 29, 2016 at 2:31 AM, tkg_cangkul  > > wrote:
> >
> > hi, i try to load data spatial from solr with geoserver.
> > when i try to show the layer preview i've got this error message.
> >
> > error
> >
> >
> > anybody can help me pls?
> >
> >
>
> --
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
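The fieldType change described above can also be made through the Schema API rather than by editing schema.xml. A sketch of the request body, using the 6.x factory class named in the reply; the field-type name and the numeric options are illustrative, and the JTS jar must still be on Solr's classpath:

```python
import json

# Schema API command defining an RPT field type that can index polygons,
# with the post-6.0 JTS spatialContextFactory package.
field_type = {
    "add-field-type": {
        "name": "location_rpt_jts",  # illustrative name
        "class": "solr.SpatialRecursivePrefixTreeFieldType",
        "spatialContextFactory":
            "org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory",
        "geo": "true",
        "distErrPct": "0.025",
        "maxDistErr": "0.001",
        "distanceUnits": "kilometers",
    }
}
body = json.dumps(field_type)
print(body)
# POST to /solr/<collection>/schema with Content-Type: application/json
```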


Re: error rendering solr spatial in geoserver

2016-07-01 Thread David Smiley
Sorry, good point Ere; I forgot about that.  I filed an issue:
https://issues.apache.org/jira/browse/SOLR-9270
When I work on that I'll add an upgrading note to the 6x section.

~ David

On Wed, Jun 29, 2016 at 6:31 AM Ere Maijala  wrote:

> It would have been _really_ nice if this had been in the release notes.
> Made me also scratch my head for a while when upgrading to Solr 6.
> Additionally, this makes a rolling upgrade from Solr 5.x a bit more
> scary since you have to update the collection schema to make the Solr 6
> nodes work while making sure that no Solr 5 node reloads the configuration.
>
> --Ere
>
> 30.6.2016, 3.46, David Smiley kirjoitti:
> > For polygons in 6.0 you need to set
> >
> spatialContextFactory="org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory"
> > -- see
> > https://cwiki.apache.org/confluence/display/solr/Spatial+Search and the
> > example.  And of course as you probably already know, put the JTS jar on
> > Solr's classpath.  What likely tripped you up between 5x and 6x is the
> > change in value of the spatialContextFactory as a result in
> organizational
> > package moving "com.spatial4j.core" to "org.locationtech.spatial4j".
> >
> > On Wed, Jun 29, 2016 at 12:44 PM tkg_cangkul 
> wrote:
> >
> >> hi erick, thx for your reply.
> >>
> >> i've solve this problem.
> >> i got this error when i use solr 6.0.0
> >> so i try to downgrade my solr to version 5.5.0 and it's successfull
> >>
> >>
> >> On 29/06/16 22:39, Erick Erickson wrote:
> >>> There is not nearly enough information here to say anything very
> helpful.
> >>> What does your schema look like for this field?
> >>> What does the input look like?
> >>> How are you pulling data from geoserver?
> >>>
> >>> You might want to review:
> >>> http://wiki.apache.org/solr/UsingMailingLists
> >>>
> >>> Best,
> >>> Erick
> >>>
> >>> On Wed, Jun 29, 2016 at 2:31 AM, tkg_cangkul  >>> <mailto:yuza.ras...@gmail.com>> wrote:
> >>>
> >>> hi, i try to load data spatial from solr with geoserver.
> >>> when i try to show the layer preview i've got this error message.
> >>>
> >>> error
> >>>
> >>>
> >>> anybody can help me pls?
> >>>
> >>>
> >>
> >> --
> > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> > LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> > http://www.solrenterprisesearchserver.com
> >
>
> --
> Ere Maijala
> Kansalliskirjasto / The National Library of Finland
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: analyzer for _text_ field

2016-07-15 Thread David Santamauro


The opening and closing single quotes don't match

-data-binary '{ ... }’

it should be:

-data-binary '{ ... }'


On 07/15/2016 02:59 PM, Steve Rowe wrote:

Waldyr, maybe it got mangled by my email client or yours?

Here’s the same command:

   

--
Steve
www.lucidworks.com


On Jul 15, 2016, at 2:16 PM, Waldyr Neto  wrote:

Hy Steves, tks for the help
unfortunately i'm making some mistake

when i try to run



curl -X POST -H 'Content-type: application/json’ \
http://localhost:8983/solr/gettingstarted/schema --data-binary
'{"add-field-type": { "name": "my_new_field_type", "class":
"solr.TextField","analyzer": {"charFilters": [{"class":
"solr.HTMLStripCharFilterFactory"}], "tokenizer": {"class":
"solr.StandardTokenizerFactory"},"filters":[{"class":
"solr.WordDelimiterFilterFactory"}, {"class":
"solr.LowerCaseFilterFactory"}]}},"replace-field": { "name":
"_text_","type": "my_new_field_type", "multiValued": "true","indexed":
"true","stored": "false"}}’

i receave the folow error msg from curl program
:

curl: (3) [globbing] unmatched brace in column 1

curl: (6) Could not resolve host: name

curl: (6) Could not resolve host: my_new_field_type,

curl: (6) Could not resolve host: class

curl: (6) Could not resolve host: solr.TextField,analyzer

curl: (3) [globbing] unmatched brace in column 1

curl: (3) [globbing] bad range specification in column 2

curl: (3) [globbing] unmatched close brace/bracket in column 32

curl: (6) Could not resolve host: tokenizer

curl: (3) [globbing] unmatched brace in column 1

curl: (3) [globbing] unmatched close brace/bracket in column 30

curl: (3) [globbing] unmatched close brace/bracket in column 32

curl: (3) [globbing] unmatched brace in column 1

curl: (3) [globbing] unmatched close brace/bracket in column 28

curl: (3) [globbing] unmatched brace in column 1

curl: (6) Could not resolve host: name

curl: (6) Could not resolve host: _text_,type

curl: (6) Could not resolve host: my_new_field_type,

curl: (6) Could not resolve host: multiValued

curl: (6) Could not resolve host: true,indexed

curl: (6) Could not resolve host: true,stored

curl: (3) [globbing] unmatched close brace/bracket in column 6

cvs1:~ vvisionphp1$

On Fri, Jul 15, 2016 at 2:45 PM, Steve Rowe  wrote:


Hi Waldyr,

An example of changing the _text_ analyzer by first creating a new field
type, and then changing the _text_ field to use the new field type (after
starting Solr 6.1 with “bin/solr start -e schemaless”):

-
PROMPT$ curl -X POST -H 'Content-type: application/json’ \
http://localhost:8983/solr/gettingstarted/schema --data-binary '{
  "add-field-type": {
"name": "my_new_field_type",
"class": "solr.TextField",
"analyzer": {
  "charFilters": [{
"class": "solr.HTMLStripCharFilterFactory"
  }],
  "tokenizer": {
"class": "solr.StandardTokenizerFactory"
  },
  "filters":[{
  "class": "solr.WordDelimiterFilterFactory"
}, {
  "class": "solr.LowerCaseFilterFactory"
  }]}},
  "replace-field": {
"name": "_text_",
"type": "my_new_field_type",
"multiValued": "true",
"indexed": "true",
"stored": "false"
  }}’
-

PROMPT$ curl
http://localhost:8983/solr/gettingstarted/schema/fields/_text_

-
{
  "responseHeader”:{ […] },
  "field":{
"name":"_text_",
"type":"my_new_field_type",
"multiValued":true,
"indexed":true,
"stored":false}}
-

--
Steve
www.lucidworks.com


On Jul 15, 2016, at 12:54 PM, Waldyr Neto  wrote:

Hy, How can i configure the analyzer for the _text_ field?
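The curl failures in this thread are the classic symptom of a typographic "smart" quote pasted from rich text: the command opens with an ASCII `'` but closes with U+2019, so the shell never sees a closed string and splits the JSON into fragments. A quick, illustrative way to spot the offending character before running a pasted command:

```python
# Scan a pasted command for curly quotes that a shell will not treat
# as string delimiters (the cause of the "unmatched brace" curl errors).
def find_curly_quotes(command):
    suspects = {"\u2018", "\u2019", "\u201c", "\u201d"}
    return [(i, ch) for i, ch in enumerate(command) if ch in suspects]

cmd = "curl -X POST -H 'Content-type: application/json\u2019 http://localhost:8983/..."
print(find_curly_quotes(cmd))  # one hit: the closing quote is U+2019
```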







Re: error indexing spatial

2016-07-25 Thread David Smiley
Hi tig.  Most likely, you didn't repeat the first point as the last.  Even
though it's redundant, nonetheless this is what WKT (and some other spatial
formats) calls for.
~ David

On Wed, Jul 20, 2016 at 10:13 PM tkg_cangkul  wrote:

> hi i try to indexing spatial format to solr 5.5.0 but i've got this error
> message.
>
> [image: error1]
>
> [image: error2]
> anybody can help me to solve this pls?
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
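The closing-point rule above is easy to enforce client-side before indexing. A small illustrative helper (not a Solr API) that closes a ring and emits WKT, using the polygon shape from later in this archive as an example format:

```python
# WKT requires a polygon ring to end where it starts. close_ring() repeats
# the first point at the end when the input ring is left open.
def close_ring(points):
    """points: list of (x, y) tuples; returns a ring whose last point == first."""
    if points and points[0] != points[-1]:
        return points + [points[0]]
    return list(points)

def ring_to_wkt(points):
    ring = close_ring(points)
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"

print(ring_to_wkt([(-77.23, 38.922), (-77.23, 38.923),
                   (-77.228, 38.923), (-77.228, 38.922)]))
```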


Re: Need Help Resolving Unknown Shape Definition Error

2016-08-15 Thread David Smiley
Hello Jennifer,

The spatial documentation is largely this page:
https://cwiki.apache.org/confluence/display/solr/Spatial+Search
(however note the online version is always for the latest Solr release. You
can download a PDF versioned against your Solr version).

To do polygon searches, you both need to add the JTS jar (which you already
did), and also to set the spatialContextFactory as the ref guide indicates
-- that you have yet to do and is I think why you see that error.

Another thing I see that looks like a problem is that you set geo=false,
yet didn't set the worldBounds.  Typically geo=true and you get the typical
decimal degree +/- 180, +/- 90 box.  But if you set false then the grid
system  needs to know the extent of your grid.

~ David

On Thu, Aug 11, 2016 at 4:04 PM Jennifer Coston <
jennifer.cos...@raytheon.com> wrote:

>
> Hello,
>
> I am trying to setup a local solr core so that I can perform Spatial
> searches on it. I am using version 5.2.1. I have updated my schema.xml file
> to include the location-rpt fieldType:
>
>  class="solr.SpatialRecursivePrefixTreeFieldType"
> geo="false" distErrPct="0.025" maxDistErr="0.001"
> distanceUnits="degrees" />
>
> And I have defined my field to use this type:
>
>  stored="true" />
>
> I also added the jts-1.4.0.jar file to C:\solr-5.2.1\server\solr-webapp
> \webapp\WEB-INF\lib.
>
> However when I try to add a document through the Solr Admin Console I am
> seeing this response:
>
> {
>   "responseHeader": {
> "status": 400,
> "QTime": 6
>   },
>   "error": {
> "msg": "Unknown Shape definition [POLYGON((-77.23 38.922, -77.23
> 38.923, -77.228 38.923, -77.228 38.922, -77.23 38.922))]",
> "code": 400
>   }
> }
>
> I can submit documents successfully if I remove the positionWkt field. Did
> I miss a configuration step?
>
> Here is the document I am trying to add:
>
> {
> "observationId": "8e09f47f",
> "observationType": "image",
> "startTime": "2015-09-19T21:03:51Z",
> "endTime": "2015-09-19T21:03:51Z",
> "receiptTime": "2016-07-29T15:49:49.328Z",
> "locationLat": 38.9225015078814,
> "locationLon": -77.22900299194423,
> "position": "38.9225015078814,-77.22900299194423",
> "positionWkt": "POLYGON((-77.23 38.922, -77.23 38.923, -77.228
> 38.923, -77.228 38.922, -77.23 38.922))",
> "provider": "a"
> }
>
> Here are the fields I added to the schema.xml file (I started with the
> template, please let me know if you need the whole thing):
>
> observationId
>
> 
> 
> 
>  required="true" multiValued="false"/>
> 
> 
> 
> 
> 
> 
> 
>  stored="true" />
>
> Thank you!
>
> Jennifer

-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
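Both fixes suggested above (set the spatialContextFactory, and give a non-geo grid its extent via worldBounds) can be combined in one field-type definition. A sketch as a Schema API command, using the pre-6.0 `com.spatial4j.core` package appropriate to Solr 5.2.1; the type name and the worldBounds extent are illustrative assumptions:

```python
import json

# RPT field type for planar (geo="false") data: worldBounds tells the grid
# its extent, and spatialContextFactory enables polygon (JTS) support.
field_type = {
    "add-field-type": {
        "name": "location_rpt_planar",  # illustrative name
        "class": "solr.SpatialRecursivePrefixTreeFieldType",
        "spatialContextFactory":
            "com.spatial4j.core.context.jts.JtsSpatialContextFactory",  # 5.x package
        "geo": "false",
        "worldBounds": "ENVELOPE(-180, 180, 90, -90)",  # minX, maxX, maxY, minY (example)
        "distErrPct": "0.025",
        "maxDistErr": "0.001",
        "distanceUnits": "degrees",
    }
}
print(json.dumps(field_type, indent=2))
```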


Re: Sorting on DateRangeField?

2016-09-09 Thread David Smiley
Hi Alex,

DateRangeField extends some spatial stuff, which has that error message in
it, not in DateRangeField proper.  You cannot sort on a DateRangeField.  If
you want to... try adding either one plain docValues field if you just have
date instances, or a pair of them to hold a min & max and pick the right
one to sort on.

The "sorting by the query" in the context of spatial refers to doing a
score sorted sort, noting that the score of a spatial query can be the
distance or some formula involving the distance or possibly overlap of the
shape with something else.  e.g.  q={!geofilt score=distance ...}  This
is documented in the ref guide on the spatial page, including an example
for BBoxField.

&q={!field f=bbox score=overlapRatio}Intersects(ENVELOPE(-10, 20, 15, 10))


I think that example could be simpler using {!bbox} but probably wants to
show different ways to skin this cat, so to speak.

~ David

On Wed, Sep 7, 2016 at 1:49 PM Alexandre Rafalovitch 
wrote:

> So, I tried sorting on a DateRangeField. And I got back:  "Sorting not
> supported on SpatialField: release_date, instead try sorting by
> query."
>
> Two questions:
> 1) Spatial is kind of super-internal info here, the message is rather
> confusing.
> 2) What's "sorting by query" in this case? Can I still sort on the
> field, but with a different syntax?
>
> Regards,
>Alex.
> 
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: script to get core num docs

2016-09-19 Thread David Santamauro


https://cwiki.apache.org/confluence/display/solr/CoreAdmin+API

wget -O- -q '/admin/cores?action=STATUS&core=coreName&wt=json&indent=true' \
  | grep numDocs

# e.g.
wget -O- -q '/admin/cores?action=STATUS&core=alexandria_shard2_replica1&wt=json&indent=1' \
  | grep numDocs | cut -f2 -d':'


On 09/19/2016 11:22 AM, KRIS MUSSHORN wrote:

How can i get the count of docs from a core with bash?
Seems like I have to call Admin/Luke but cant find any specifics.
Thanks
Kris
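Grep and cut work, but since the CoreAdmin STATUS response is JSON it is more robust to parse it. A sketch with the response shape trimmed to the relevant keys (the sample document below is illustrative):

```python
import json

# Sample mimicking the shape of a CoreAdmin STATUS (wt=json) response,
# trimmed to the keys needed for the document count.
sample = json.loads("""
{"status": {"collection1_shard1_replica1": {"index": {"numDocs": 42, "maxDoc": 45}}}}
""")

def num_docs(status_response, core):
    # numDocs lives under status -> <coreName> -> index
    return status_response["status"][core]["index"]["numDocs"]

print(num_docs(sample, "collection1_shard1_replica1"))
```

In a real script the `sample` dict would come from fetching `/admin/cores?action=STATUS&wt=json` and feeding the body to `json.loads`.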



Re: request SOLR - spatial field with Intersect and Contains functions

2016-09-19 Thread David Smiley
Hi Leo,

You should use two spatial fields for this -- one is for an indexed
Box/Envelope, and another for an indexed LineString.  The indexed box
should use either BBoxField or RptWithGeometrySpatialField, and the
LineString field should use RptWithGeometrySpatialField.   If you have an
older installation 5.x version, RptWithGeometrySpatialField may not be
available in which case settle
for solr.SpatialRecursivePrefixTreeFieldType.  When you do a search, it'd
be a search for one field OR the other with the requirements you have for
each.

~ David

On Mon, Sep 19, 2016 at 8:48 AM Leo BRUVRY-LAGADEC <
leo.bruvry.laga...@partenaire-exterieur.ifremer.fr> wrote:

> Hi,
>
> I am trying spatial search in SOLR 5.0 and I don't know how to implement
> a solution for the problem I will try to explain.
>
> On a SOLR server I have indexed a collection of objects that contains
> spacial field :
>
>  multiValued="true" />
>  class="solr.SpatialRecursivePrefixTreeFieldType"
> geo="true"
> distErrPct="0.025"
> maxDistErr="0.09"
> distanceUnits="degrees" />
>
> The spatial data indexed in the field named "geo" can be ENVELOPE or
> LINESTRING :
>
> LINESTRING(-4.6837 48.5792, -4.6835 48.5788, -4.684
> 48.5788, -4.6832 48.579, -4.6837 48.5792, -4.6188 48.6265, -4.6122
> 48.63, -4.615 48.6258, -4.6125 48.6215, -4.6112 48.6218)
>
> or
>
> ENVELOPE(-5.0, -4.0, 49.0, 48.0)
>
> Actually in my application, when I do a SOLR request to get objects that
> are in a spatial area, I do something like this :
>
> q=:&fq=(geo:"Intersects(ENVELOPE(-116.894531, 107.402344, 57.433227,
> -42.146973))")
>
> But I want to change how it work. Now, when the geo field contain an
> ENVELOPE I want to do an CONTAINS request and when it contain a
> LINESTRING I want to do an INTERSECTS request.
>
> example :
>
> If geo = ENVELOPE then q=*:*&fq=(geo:"Contains(ENVELOPE(-116.894531,
> 107.402344, 57.433227, -42.146973))")
>
> If geo = LINESTRING then q=*:*&fq=(geo:"Intersects(ENVELOPE(-116.894531,
> 107.402344, 57.433227, -42.146973))")
>
> How can my application know if the field contain ENVELOPE or LINESTRING ?
>
> Any idea can this be done ?
>
> Best reguards,
> Leo.
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
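The two-field approach in the reply can be expressed as a single filter that ORs the two predicates, one per field, so the application never has to know which shape a given document holds. The field names `geo_box` and `geo_line` below are illustrative:

```python
# Build one fq combining Contains on the indexed-box field with Intersects
# on the indexed-linestring field, per the two-field design in the reply.
def spatial_fq(search_area_wkt):
    return ('(geo_box:"Contains({w})" OR geo_line:"Intersects({w})")'
            .format(w=search_area_wkt))

fq = spatial_fq("ENVELOPE(-116.894531, 107.402344, 57.433227, -42.146973)")
print(fq)
```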


Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread David Smiley
It should, I think... what happens? Can you ascertain the nature of the
results?
~ David

On Tue, Sep 20, 2016 at 5:35 AM Sandeep Khanzode
 wrote:

> For Solr 6.1.0
> This works .. -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z
>
> This works .. {!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
> 2016-08-26T15:00:12Z]
>
>
> Why does this not work?-{!field f=schedule
> op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]
>  SRK

-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread David Smiley
OH!  Ok the moment the query no longer starts with "{!", the query is
parsed by defType (for 'q') and will default to lucene QParser.  So then it
appears we have a clause with a NOT operator.  In this parsing mode,
embedded "{!" terminates at the "}".  This means you can't put the
sub-query text after the "}", you instead need to put it in the special "v"
local-param.  e.g.:
-{!field f=schedule op=Contains v='[2016-08-26T12:00:12Z TO
2016-08-26T15:00:12Z]'}

On Tue, Sep 20, 2016 at 8:15 AM Sandeep Khanzode
 wrote:

> This is what I get ...
> { "responseHeader": { "status": 400, "QTime": 1, "params": { "q":
> "-{!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
> 2016-08-26T15:00:12Z]", "indent": "true", "wt": "json", "_":
> "1474373612202" } }, "error": { "msg": "Invalid Date in Date Math
> String:'[2016-08-26T12:00:12Z'", "code": 400 }}
>  SRK
>
> On Tuesday, September 20, 2016 5:34 PM, David Smiley <
> david.w.smi...@gmail.com> wrote:
>
>
>  It should, I think... what happens? Can you ascertain the nature of the
> results?
> ~ David
>
> On Tue, Sep 20, 2016 at 5:35 AM Sandeep Khanzode
>  wrote:
>
> > For Solr 6.1.0
> > This works .. -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z
> >
> > This works .. {!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
> > 2016-08-26T15:00:12Z]
> >
> >
> > Why does this not work?-{!field f=schedule
> > op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]
> >  SRK
>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
>
>

-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
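The fix above is mechanical enough to wrap in a helper: once a `{!...}` query is a clause (here, negated with `-`), its body must go in the special `v` local-param instead of trailing after the `}`. An illustrative string-assembly sketch:

```python
# Assemble a negated local-params clause with the sub-query text in the
# "v" local-param, as required when "{!" is not at the start of q.
def negated_field_query(field, op, body):
    return "-{{!field f={f} op={op} v='{v}'}}".format(f=field, op=op, v=body)

q = negated_field_query("schedule", "Contains",
                        "[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]")
print(q)
```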


Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread David Smiley
Personally I learned this by poring over Solr's source code some time
ago.  I suppose the only official reference to this stuff is:
https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries
But that page doesn't address the implications for when the syntax is a
clause of a larger query instead of being the whole query (i.e. has "{!"...
but but not at the first char).

On Tue, Sep 20, 2016 at 2:06 PM Sandeep Khanzode
 wrote:

> Wow. Simply awesome!
> Where can I read more about this? I am not sure whether I understand what
> is going on behind the scenes ... like which parser is invoked for !field,
> how can we know which all special local params exist, whether we should
> prefer edismax over others, when is the LuceneQParser invoked in other
> conditions, etc? Would appreciate if you could indicate some references to
> catch up.
> Thanks a lot ...  SRK
>
> On Tuesday, September 20, 2016 5:54 PM, David
> Smiley  wrote:
>
>
>  OH!  Ok the moment the query no longer starts with "{!", the query is
> parsed by defType (for 'q') and will default to lucene QParser.  So then it
> appears we have a clause with a NOT operator.  In this parsing mode,
> embedded "{!" terminates at the "}".  This means you can't put the
> sub-query text after the "}", you instead need to put it in the special "v"
> local-param.  e.g.:
> -{!field f=schedule op=Contains v='[2016-08-26T12:00:12Z TO
> 2016-08-26T15:00:12Z]'}
>
> On Tue, Sep 20, 2016 at 8:15 AM Sandeep Khanzode
>  wrote:
>
> > This is what I get ...
> > { "responseHeader": { "status": 400, "QTime": 1, "params": { "q":
> > "-{!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
> > 2016-08-26T15:00:12Z]", "indent": "true", "wt": "json", "_":
> > "1474373612202" } }, "error": { "msg": "Invalid Date in Date Math
> > String:'[2016-08-26T12:00:12Z'", "code": 400 }}
> >  SRK
> >
> >On Tuesday, September 20, 2016 5:34 PM, David Smiley <
> > david.w.smi...@gmail.com> wrote:
> >
> >
> >  It should, I think... what happens? Can you ascertain the nature of the
> > results?
> > ~ David
> >
> > On Tue, Sep 20, 2016 at 5:35 AM Sandeep Khanzode
> >  wrote:
> >
> > > For Solr 6.1.0
> > > This works .. -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z
> > >
> > > This works .. {!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
> > > 2016-08-26T15:00:12Z]
> > >
> > >
> > > Why does this not work?-{!field f=schedule
> > > op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]
> > >  SRK
> >
> > --
> > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> > LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> > http://www.solrenterprisesearchserver.com
> >
> >
> >
>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
>
>

-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread David Smiley
So that page referenced describes local-params, and describes the special
"v" local-param.  But first, see a list of all query parsers (which lists
"field"): https://cwiki.apache.org/confluence/display/solr/Other+Parsers
and
https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser for
the "lucene" one.

The "op" param is rather unique... it's not defined by any query parser.  A
trick is done in which a custom field type (DateRangeField in this case) is
able to inspect the local-params, and thus define and use params it needs.
https://cwiki.apache.org/confluence/display/solr/Working+with+Dates "More
DateRangeField Details" mentions "op".  {!lucene df=dateRange
op=Contains}... would also work.  I don't know of any other local-param
used in this way.

On Tue, Sep 20, 2016 at 11:21 PM David Smiley 
wrote:

> Personally I learned this by poring over Solr's source code some time
> ago.  I suppose the only official reference to this stuff is:
>
> https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries
> But that page doesn't address the implications for when the syntax is a
> clause of a larger query instead of being the whole query (i.e. has "{!"...
> but but not at the first char).
>
> On Tue, Sep 20, 2016 at 2:06 PM Sandeep Khanzode
>  wrote:
>
>> Wow. Simply awesome!
>> Where can I read more about this? I am not sure whether I understand what
>> is going on behind the scenes ... like which parser is invoked for !field,
>> how can we know which all special local params exist, whether we should
>> prefer edismax over others, when is the LuceneQParser invoked in other
>> conditions, etc? Would appreciate if you could indicate some references to
>> catch up.
>> Thanks a lot ...  SRK
>>
>> On Tuesday, September 20, 2016 5:54 PM, David
>> Smiley  wrote:
>>
>>
>>  OH!  Ok the moment the query no longer starts with "{!", the query is
>> parsed by defType (for 'q') and will default to lucene QParser.  So then
>> it
>> appears we have a clause with a NOT operator.  In this parsing mode,
>> embedded "{!" terminates at the "}".  This means you can't put the
>> sub-query text after the "}", you instead need to put it in the special
>> "v"
>> local-param.  e.g.:
>> -{!field f=schedule op=Contains v='[2016-08-26T12:00:12Z TO
>> 2016-08-26T15:00:12Z]'}
>>
>> On Tue, Sep 20, 2016 at 8:15 AM Sandeep Khanzode
>>  wrote:
>>
>> > This is what I get ...
>> > { "responseHeader": { "status": 400, "QTime": 1, "params": { "q":
>> > "-{!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
>> > 2016-08-26T15:00:12Z]", "indent": "true", "wt": "json", "_":
>> > "1474373612202" } }, "error": { "msg": "Invalid Date in Date Math
>> > String:'[2016-08-26T12:00:12Z'", "code": 400 }}
>> >  SRK
>> >
>> >On Tuesday, September 20, 2016 5:34 PM, David Smiley <
>> > david.w.smi...@gmail.com> wrote:
>> >
>> >
>> >  It should, I think... what happens? Can you ascertain the nature of the
>> > results?
>> > ~ David
>> >
>> > On Tue, Sep 20, 2016 at 5:35 AM Sandeep Khanzode
>> >  wrote:
>> >
>> > > For Solr 6.1.0
>> > > This works .. -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z
>> > >
>> > > This works .. {!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
>> > > 2016-08-26T15:00:12Z]
>> > >
>> > >
>> > > Why does this not work?-{!field f=schedule
>> > > op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]
>> > >  SRK
>> >
>> > --
>> > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>> > LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>> > http://www.solrenterprisesearchserver.com
>> >
>> >
>> >
>>
>> --
>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>> http://www.solrenterprisesearchserver.com
>>
>>
>>
>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: Removing SOLR fields from schema

2016-09-22 Thread David Santamauro



On 09/22/2016 08:55 AM, Shawn Heisey wrote:

On 9/21/2016 11:46 PM, Selvam wrote:

We use SOLR 5.x in cloud mode and have huge set of fields. We now want
to remove some 50 fields from Index/schema itself so that indexing &
querying will be faster. Is there a way to do that without losing
existing data on other fields? We don't want to do full re-indexing.


When you remove fields from your schema, you can continue to use Solr
with no problems even without a reindex.  But you won't see any benefit
to your query performance until you DO reindex.  Until the reindex is
done (ideally wiping the index first), all the data from the removed
fields will remain in the index and affect your query speeds.


Will an optimize remove those fields and corresponding data?





Re: how to remove duplicate from search result

2016-09-27 Thread David Santamauro

Have a look at

https://cwiki.apache.org/confluence/display/solr/Result+Grouping


On 09/27/2016 11:03 AM, googoo wrote:

hi,

We want to provide remove duplicate from search result function.

like we have below documents.
id(uniqueKey)   guid
doc1G1
doc2G2
doc3G3
doc4G1

user run one query and hit doc1, doc2 and doc4.
user want to remove duplicate from search result based on guid field.
since doc1 and doc4 has same guid, one of them should be drop from search
result.

how we can address this requirement?

Thanks,
Yongtao





--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-remove-duplicate-from-search-result-tp4298272.html
Sent from the Solr - User mailing list archive at Nabble.com.
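The Result Grouping page linked above covers the classic `group.*` parameters; the CollapsingQParser filter is another common way to keep one document per `guid`. A sketch of both parameter sets:

```python
from urllib.parse import urlencode

# Option 1: classic result grouping, one group per guid, one doc per group.
grouping = urlencode({
    "q": "*:*",
    "group": "true",
    "group.field": "guid",
    "group.limit": "1",
})

# Option 2: CollapsingQParser as a filter query; keeps one doc per guid
# and leaves the response in the normal flat format.
collapse = urlencode({
    "q": "*:*",
    "fq": "{!collapse field=guid}",
})

print(grouping)
print(collapse)
```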



Re: Migrating to Solr 6.1.0 from 5.5.0

2016-09-29 Thread David Smiley
Arjun,

Your input is a POLYGON -- as seen in the error message.  The "Try JTS" was
hopefully a clue -- on
https://cwiki.apache.org/confluence/display/solr/Spatial+Search search for
"JTS" and you should see how to set the spatialContextFactory to JTS, and a
mention of needing JTS jar.  I'll try and add a bit more info on suggesting
exactly where to put it and a download link.  I'll also mention a shortcut
so you don't have to type out the classname -- a recent feature in 6.2.
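For context, a field type along these lines is the usual JTS setup; the field type name and tuning values here are illustrative, not taken from the configuration being discussed:

```xml
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
           spatialContextFactory="org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory"
           geo="true" distErrPct="0.025" maxDistErr="0.001"
           distanceUnits="kilometers"/>
<!-- Requires the JTS jar on Solr's classpath, e.g. under
     server/solr-webapp/webapp/WEB-INF/lib/ -->
```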

Since you said you were upgrading... presumably your spatialContextFactory
attribute was already set for this to work at all in 5.5?  The package
reference changed for this value -- I imagine you would have seen a
warning/error to this effect in Solr's logs.  Do you?

~ David

On Tue, Sep 27, 2016 at 10:29 AM William Bell  wrote:

> the documentation is not good on this. Not sure how to fix it either.
>
> On Tue, Sep 27, 2016 at 3:41 AM, M, Arjun (Nokia - IN/Bangalore) <
> arju...@nokia.com> wrote:
>
> > Hi,
> >
> > We are getting the below errors when migrating Solr from 5.5.0 to
> > 6.1.0. Could anyone help in resolving the issue, if you have come across
> > this?
> >
> >
>  org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> > Error from server at http://127.0.0.1:41569/solr/collection1: Unable to
> > parse shape given formats "lat,lon", "x y" or as WKT because
> > java.text.ParseException: java.lang.UnsupportedOperationException:
> > Unsupported shape of this SpatialContext. Try JTS or Geo3D. input:
> > POLYGON((-10 30, -40 40, -10 -20, 0 0, -10 30))
> >
> > Thanks in advance..
> >
> > Thanks & Regards,
> >Arjun M
> >
> >
> >
> >
>
>
> --
> Bill Bell
> billnb...@gmail.com
> cell 720-256-8076
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: Heatmap in JSON facet API

2016-11-01 Thread David Smiley
I plan on adding this in the near future... hopefully for Solr 6.4.

On Mon, Oct 31, 2016 at 7:06 AM Никита Веневитин 
wrote:

> I've built a query as described in Heatmap Faceting
> (https://cwiki.apache.org/confluence/x/ZYDxAQ),
> but I would like to get the same results using the JSON facet API
>
> 2016-10-30 15:24 GMT+03:00 GW :
>
> > If we are talking about the same kind of heat maps you might want to look
> > at the TomTom map API for a quick and dirty yet solid solution. Just
> supply
> > a whack of coordinates and let TomTom do the work. The Heat maps will
> zoom
> > in and de-cluster.
> >
> > Example below.
> >
> > http://www.frogclassifieds.com/tomtom/markers-clustering.html
> >
> >
> > On 28 October 2016 at 09:05, Никита Веневитин  >
> > wrote:
> >
> > > Hi!
> > > Is it possible to use JSON facet API to get heatmaps?
> > >
> >
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


How-To: Secure Solr by IP Address

2016-11-04 Thread David Smiley
I was just researching how to secure Solr by IP address and I finally
figured it out.  Perhaps this might go in the ref guide but I'd like to
share it here anyhow.  The scenario is where only "localhost" should have
full unfettered access to Solr, whereas everyone else (notably web clients)
can only access some whitelisted paths.  This setup is intended for a
single instance of Solr (not a member of a cluster); the particular config
below would probably need adaptations for a cluster of Solr instances.  The
technique here uses a utility with Jetty called IPAccessHandler --
http://download.eclipse.org/jetty/stable-9/apidocs/org/eclipse/jetty/server/handler/IPAccessHandler.html
For reasons I don't know (and I did search), it was recently deprecated and
there's another InetAccessHandler (not in Solr's current version of Jetty)
but it doesn't support constraints incorporating paths, so it's a
non-option for my needs.

First, Java must be told to insist on its IPv4 stack. This is because
Jetty's IPAccessHandler simply doesn't support IPv6 IP matching; it throws
NPEs in my experience. In recent versions of Solr, this can be easily done
just by adding -Djava.net.preferIPv4Stack=true at the Solr start
invocation.  Alternatively put it into SOLR_OPTS perhaps in solr.in.sh.
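The SOLR_OPTS route would be a line like this in solr.in.sh (variable name per a standard install; the path may differ on your setup):

```shell
# Append the IPv4-only flag to whatever options are already set.
SOLR_OPTS="$SOLR_OPTS -Djava.net.preferIPv4Stack=true"
```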

Edit server/etc/jetty.xml, and replace the line
mentioning ContextHandlerCollection with this:

<Set name="handler">
  <New id="IPAccessHandler" class="org.eclipse.jetty.server.handler.IPAccessHandler">
    <Set name="white">
      <Array type="String">
        <Item>127.0.0.1</Item>
        <Item>-.-.-.-|/solr/techproducts/select</Item>
      </Array>
    </Set>
    <Set name="whiteListByPath">false</Set>
    <Set name="handler">
      <New id="Contexts" class="org.eclipse.jetty.server.handler.ContextHandlerCollection"/>
    </Set>
  </New>
</Set>

This mechanism wraps ContextHandlerCollection (which ultimately serves
Solr) with this handler that adds the constraints.  These constraints above
allow localhost to do anything; other IP addresses can only access
/solr/techproducts/select.  That line could be duplicated for other
white-listed paths -- I recommend creating request handlers for your use,
possibly with invariants to further constrain what someone can do.
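A whitelisted handler of that sort might look like this in solrconfig.xml; the handler name and invariant values are illustrative:

```xml
<requestHandler name="/public-select" class="solr.SearchHandler">
  <lst name="invariants">
    <!-- Invariants cannot be overridden by request parameters -->
    <str name="fl">id,title,score</str>
    <str name="rows">10</str>
  </lst>
</requestHandler>
```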

note: I originally tried inserting the IPAccessHandler in
server/contexts/solr-jetty-context.xml but found that there's a bug in
IPAccessHandler that fails to consider when HttpServletRequest.getPathInfo
is null.  And it wound up letting everything through (if I recall).  But I
like it up in jetty.xml anyway as it intercepts everything.

~ David

-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: How-To: Secure Solr by IP Address

2016-11-04 Thread David Smiley
Not to knock the other suggestions, but a benefit to securing Jetty like
this is that *everyone* can do this approach.

On Fri, Nov 4, 2016 at 9:54 AM john saylor  wrote:

> hi
>
> any firewall worth its name should be able to do this. in fact, that is
> one of several things that a firewall was designed to do.
>
> also, you are stopping this traffic at the application, which is good;
> but you'd prolly be better off stopping it at the network interface
> [using a firewall, for instance].
>
> of course, firewalls have their own complexity ...
>
> good luck!
>
> --
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: Aggregate Values Inside a Facet Range

2016-11-04 Thread David Santamauro


I believe your answer is in the subject
  => facet.range
https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-RangeFaceting
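A range facet on its own only counts documents; summing the count field inside each bucket is what the JSON Facet API's nested aggregations do. A sketch for this case — field names from the example below, syntax assuming the JSON Facet API is available in your Solr version:

```json
{
  "days": {
    "type": "range",
    "field": "timestamp",
    "start": "NOW/DAY-3DAYS",
    "end": "NOW/DAY+1DAY",
    "gap": "+1DAY",
    "facet": { "total": "sum(count)" }
  }
}
```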

//

On 11/04/2016 02:25 PM, Furkan KAMACI wrote:

I have documents like that

id:5
timestamp:NOW //pseudo date representation
count:13

id:4
timestamp:NOW //pseudo date representation
count:3

id:3
timestamp:NOW-1DAY //pseudo date representation
count:21

id:2
timestamp:NOW-1DAY //pseudo date representation
count:29

id:1
timestamp:NOW-3DAY //pseudo date representation
count:4

When I want to facet the last 3 days' data by timestamp it's OK. However my need
is that:

facets:
 TODAY: 16 //pseudo representation
 TODAY - 1: 50 //pseudo date representation
 TODAY - 2: 0 //pseudo date representation
 TODAY - 3: 4 //pseudo date representation

I mean, I have to facet by dates and aggregate values inside that facet
range. Is it possible to do that without multiple queries at Solr?

Kind Regards,
Furkan KAMACI



Re: Overlapped Gap Facets

2016-11-17 Thread David Santamauro


I had a similar question a while back but it was regarding date 
differences. Perhaps that might give you some ideas.


http://lucene.472066.n3.nabble.com/date-difference-faceting-td4249364.html
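Overlapping buckets generally can't be expressed as a single facet.range (range gaps don't overlap), but a set of facet.query parameters can produce them — a sketch, with an illustrative field name:

```
facet=true
&facet.query={!key="Last 1 Day"}timestamp:[NOW/DAY-1DAY TO *]
&facet.query={!key="Last 1 Week"}timestamp:[NOW/DAY-7DAYS TO *]
&facet.query={!key="Last 1 Month"}timestamp:[NOW/DAY-1MONTH TO *]
&facet.query={!key="Last 6 Month"}timestamp:[NOW/DAY-6MONTHS TO *]
&facet.query={!key="Last 1 Year"}timestamp:[NOW/DAY-1YEAR TO *]
&facet.query={!key="Older than 1 Year"}timestamp:[* TO NOW/DAY-1YEAR]
```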

//



On 11/17/2016 09:49 AM, Furkan KAMACI wrote:

Is it possible to do such a facet on a date field:

  Last 1 Day
  Last 1 Week
  Last 1 Month
  Last 6 Month
  Last 1 Year
  Older than 1 Year

which has overlapped facet gaps?

Kind Regards,
Furkan KAMACI



Adding a Basic Authentication user fails with 404

2017-06-06 Thread David Parker
Hello,

I am running a stand-alone instance of Solr 6.5 (without ZooKeeper).  I am
attempting to implement Basic Authentication per the documentation, but
when I try to use the API to add a user, I get a 404 error.  It seems the
/admin/authentication API entry point isn't there:

$ curl --user solr:SolrRocks http://localhost:8983/solr/admin/authentication
-H 'Content-type:application/json' -d '{"set-user": {"myuser" :
"mypasswd"}}'



Error 404 Not Found

HTTP ERROR 404
Problem accessing /solr/admin/authentication. Reason:
Not Found



But according to the documentation, the API entry point is
admin/authentication, and it states the following:

"This endpoint is not collection-specific, so users are created for the
entire Solr cluster. If users need to be restricted to a specific
collection, that can be done with the authorization rules."

The only thing which stands out to me is "users are created for the entire
Solr cluster."  Is this entry point missing because I'm running Solr
stand-alone?
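For reference: the /admin/authentication endpoint only exists once the Basic Auth plugin is enabled, which in standalone mode is done by placing a security.json next to solr.xml in SOLR_HOME. A minimal sketch; the credential hash below is the stock solr/SolrRocks example from the ref guide, not a value to keep in production:

```json
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "credentials": {
      "solr": "IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="
    }
  }
}
```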

Any help is greatly appreciated!

- Dave

-- 
Dave Parker
Database & Systems Administrator
Utica College
Integrated Information Technology Services
(315) 792-3229
Registered Linux User #408177


Re: Score higher if multiple terms match

2017-06-07 Thread David Hastings
well, short answer, use the analyzer to see what's happening.
long answer
 theres a difference between
name:tv promotion   -->  name:tv default_field:promotion
name:"tv promotion"   -->  name:"tv promotion"
name:tv AND name:promotion --> name:tv AND name:promotion


since your default field most likely isn't name, it's going to search only
the default field for it.  You can alter this behavior using the qf parameter:



qf='name^5 text'


for example would apply a boost of 5 if it matched the field 'name', and
only 1 for 'text'
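Putting that together as a request — qf is an eDisMax parameter, so defType=edismax is assumed here:

```
q=tv promotion&defType=edismax&qf=name^5 text&fl=*,score
```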

On Wed, Jun 7, 2017 at 4:35 PM, OTH  wrote:

> Hello,
>
> I have what I would think to be a fairly simple problem to solve, however
> I'm not sure how it's done in Solr and couldn't find an answer on Google.
>
> Say I have two documents, "TV" and "TV promotion".  If the search query is
> "TV promotion", then, obviously, I would like the document "TV promotion"
> to score higher.  However, that is not the case right now.
>
> My syntax is something like this:
> http://localhost:8983/solr/sales/select?indent=on&wt=json&fl=*,score&q=name:tv
> promotion
> (I tried "q=name:tv+promotion" (added the '+'), but it made no difference.)
>
> It's not scoring the document "TV promotion" higher than "TV"; in fact it's
> scoring it lower.
>
> Thanks
>


Re: Score higher if multiple terms match

2017-06-07 Thread David Hastings
sorry, i meant debug query where you would get output like this:

"debug": {
"rawquerystring": "name:tv promotion",
"querystring": "name:tv promotion",
"parsedquery": "+name:tv +text:promotion",


On Wed, Jun 7, 2017 at 4:41 PM, David Hastings  wrote:

> well, short answer, use the analyzer to see whats happening.
> long answer
>  theres a difference between
> name:tv promotion   -->  name:tv default_field:promotion
> name:"tv promotion"   -->  name:"tv promotion"
> name:tv AND name:promotion --> name:tv AND name:promotion
>
>
> since your default field most likely isnt name, its going to search only
> the default field for it.  you can alter this behavior using qf parameters:
>
>
>
> qf='name^5 text'
>
>
> for example would apply a boost of 5 if it matched the field 'name', and
> only 1 for 'text'
>
> On Wed, Jun 7, 2017 at 4:35 PM, OTH  wrote:
>
>> Hello,
>>
>> I have what I would think to be a fairly simple problem to solve, however
>> I'm not sure how it's done in Solr and couldn't find an answer on Google.
>>
>> Say I have two documents, "TV" and "TV promotion".  If the search query is
>> "TV promotion", then, obviously, I would like the document "TV promotion"
>> to score higher.  However, that is not the case right now.
>>
>> My syntax is something like this:
>> http://localhost:8983/solr/sales/select?indent=on&wt=json&fl=*,score&q=name:tv
>> promotion
>> (I tried "q=name:tv+promotion" (added the '+'), but it made no difference.)
>>
>> It's not scoring the document "TV promotion" higher than "TV"; in fact
>> it's
>> scoring it lower.
>>
>> Thanks
>>
>
>


Re: Score higher if multiple terms match

2017-06-08 Thread David Hastings
Agreed, you need to show the debug query info from your original query:


My syntax is something like this:
>> >>> >> http://localhost:8983/solr/sales/select?indent=on&wt=json&fl=*,score&q=name:tv
>> >>> >> promotion

and could probably help you get the results you want


On Thu, Jun 8, 2017 at 10:54 AM, Erick Erickson 
wrote:

> bq: I hope that clears the confusion.
>
> Nope, doesn't clear it up at all. It's not clear which query you're
> talking about at least to me.
>
> If you're searching for
> name:tv AND name:promotion
>
> and getting back a document that has only "tv" in the name field
> that's simply wrong and you need to find out why.
>
> If you're saying that searching for
> name:tv OR name:promotion
>
> returns both and that docs with both terms score higher, that's likely
> true although it'll be fuzzy. I'm guessing that the name field is
> fairly short so the length norm will be the sam and this will be
> fairly reliable. If the field could have a widely varying number of
> terms it's less reliable.
>
> Best,
> Erick
>
> On Thu, Jun 8, 2017 at 1:41 AM, OTH  wrote:
> > Hi - Sorry it was very late at night for me and I think I didn't pick my
> > wordings right.
> > bq: it is indeed returning documents with only either one of the two
> query
> > terms
> > What I meant was:  Initially, I thought it was only returning documents
> > which contained both 'tv' and 'promotion'.  Then I realized I was
> mistaken;
> > it was also returning documents which contained either 'tv' or
> 'promotion'
> > (as well as documents which contained both, which were scored higher).
> > I hope that clears the confusion.
> > Thanks
> >
> > On Thu, Jun 8, 2017 at 9:04 AM, Erick Erickson 
> > wrote:
> >
> >> bq: it is indeed returning documents with only either one of the two
> query
> >> terms
> >>
> >> Uhm, this should not be true. What's the output of adding debug=query?
> >> And are you totally sure the above is true and you're just not seeing
> >> the other term in the return? Or that you have a synonyms file that is
> >> somehow making docs match? Or ???
> >>
> >> So you're saying you get the exact same number of hits for
> >> name:tv OR name:promotion
> >> and
> >> name:tv AND name:promotion
> >> ??? Definitely not expected unless all docs happen to have both these
> >> terms in the name field either through normal input or synonyms etc.
> >>
> >> You should need something like:
> >> name:tv OR name:promotion OR (name:tv AND name:promotion)^100
> >> to score all the docs with both terms in the name field higher than just
> >> one.
> >>
> >> Best,
> >> Erick
> >>
> >> On Wed, Jun 7, 2017 at 3:05 PM, OTH  wrote:
> >> > I'm sorry, there was a mistake.
> >> >
> >> > I previously wrote:
> >> >
> >> > However, these are returning only those documents which have both the
> >> terms
> >> >> 'tv promotion' in them (there are a few).  It's not returning any
> >> >> document which have only 'tv' or only 'promotion' in them.
> >> >
> >> >
> >> > That's not true at all; it is indeed returning documents with only
> either
> >> > one of the two query terms (so, documents with only 'tv' or only
> >> > 'promotion' in them).  Sorry.  You can disregard my question in the
> last
> >> > email.
> >> >
> >> > Thanks
> >> >
> >> > On Thu, Jun 8, 2017 at 2:03 AM, OTH  wrote:
> >> >
> >> >> Thanks.
> >> >> Both of these are working in my case:
> >> >> name:"tv promotion"   -->  name:"tv promotion"
> >> >> name:tv AND name:promotion --> name:tv AND name:promotion
> >> >> (Although I'm assuming, the first might not have worked if my
> document
> >> had
> >> >> been say 'promotion tv' or 'tv xyz promotion')
> >> >>
> >> >> However, these are returning only those documents which have both the
> >> >> terms 'tv promotion' in them (there are a few).  It's not returning
> any
> >> >> document which have only 'tv' or only 'promotion' in them.

Re: Highlighter not working on some documents

2017-06-11 Thread David Smiley
Probably the most common reason is the default hl.maxAnalyzedChars -- thus
your highlightable text might not be in the first 51200 chars of text.  The
first Solr release with the unified highlighter had an even lower default
of 10k chars.
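So the first thing to try is raising that limit on the request; the parameter values here are illustrative:

```
hl=true&hl.method=unified&hl.fl=_text_&hl.maxAnalyzedChars=1000000
```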

On Fri, Jun 9, 2017 at 9:58 PM Phil Scadden  wrote:

> Tried hard to find difference between pdfs returning no highlighter and
> ones that do for same search term.  Includes pdfs that have been OCRed and
> ones that were text to begin with. Head scratching to me.
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Saturday, 10 June 2017 6:22 a.m.
> To: solr-user 
> Subject: Re: Highlighter not working on some documents
>
> Need lots more information. I.e. schema definitions, query you use,
> handler configuration and the like. Note that highlighted fields must have
> stored="true" set and likely the _text_ field doesn't. At least in the
> default schemas stored is set to false for the catch-all field.
> And you don't want to store that information anyway since it's usually the
> destination of copyField directives and you'd highlight _those_ fields.
>
> Best,
> Erick
>
> On Thu, Jun 8, 2017 at 8:37 PM, Phil Scadden  wrote:
> > Do a search with:
> > fl=id,title,datasource&hl=true&hl.method=unified&limit=50&page=1&q=pre
> > ssure+AND+testing&rows=50&start=0&wt=json
> >
> > and I get back a good list of documents. However, some documents are
> returning empty fields in the highlighter. Eg, in the highlight array have:
> > "W:\\Reports\\OCR\\4272.pdf":{"_text_":[]}
> >
> > Getting this well up the list of results with good highlighted matchers
> above and below this entry. Why would the highlighter be failing?
> >
> > Notice: This email and any attachments are confidential and may not be
> used, published or redistributed without the prior written consent of the
> Institute of Geological and Nuclear Sciences Limited (GNS Science). If
> received in error please destroy and immediately notify GNS Science. Do not
> copy or disclose the contents.
> Notice: This email and any attachments are confidential and may not be
> used, published or redistributed without the prior written consent of the
> Institute of Geological and Nuclear Sciences Limited (GNS Science). If
> received in error please destroy and immediately notify GNS Science. Do not
> copy or disclose the contents.
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: Swapping indexes on disk

2017-06-14 Thread David Hastings
I don't have an answer to why the folder got cleared; however, I am wondering
why you aren't using basic replication to do this exact same thing, since
solr will natively take care of all this for you with no interruption to
the user and no stop/start routines etc.

On Wed, Jun 14, 2017 at 2:26 PM, Mike Lissner <
mliss...@michaeljaylissner.com> wrote:

> We are replacing a drive mounted at /old with one mounted at /new. Our
> index currently lives on /old, and our plan was to:
>
> 1. Create a new index on /new
> 2. Reindex from our database so that the new index on /new is properly
> populated.
> 3. Stop solr.
> 4. Symlink /old to /new (Solr now looks for the index at /old/solr, which
> redirects to /new/solr)
> 5. Start solr
> 6. (Later) Stop solr, swap the drives (old for new), and start solr. (Solr
> now looks for the index at /old/solr again, and finds it there.)
> 7. Delete the index pointing to /new created in step 1.
>
> The idea was that this would create a new index for solr, would populate it
> with the right content, and would avoid having to touch our existing solr
> configurations aside from creating one new index, which we could soon
> delete.
>
> I just did steps 1-5, but I got null pointer exceptions when starting solr,
> and it appears that the index on /new has been almost completely deleted by
> Solr (this is a bummer, since it takes days to populate).
>
> Is this expected? Am I terribly crazy to try to swap indexes on disk? As
> far as I know, the only difference between the indexes is their name.
>
> We're using Solr version 4.10.4.
>
> Thank you,
>
> Mike
>


Re: Issue with highlighter

2017-06-14 Thread David Smiley
> Beware of NOT plus OR in a search. That will certainly produce no
highlights. (eg test -results when default op is OR)

Seems like a bug to me; the default operator shouldn't matter in that case
I think since there is only one clause that has no BooleanQuery.Occur
operator and thus the OR/AND shouldn't matter.  The end effect is "test" is
effectively required and should definitely be highlighted.

Note to Ali: Phil's comment implies use of hl.method=unified which is not
the default.

On Wed, Jun 14, 2017 at 10:22 PM Phil Scadden  wrote:

> Just had similar issue - works for some, not others. First thing to look
> at is hl.maxAnalyzedChars in the query. The default is quite small.
> Since many of my documents are large PDF files, I opted to use
> storeOffsetsWithPositions="true" termVectors="true" on the field I was
> searching on.
> This certainly did increase my index size but not too bad and certainly
> fast.
> https://cwiki.apache.org/confluence/display/solr/Highlighting
>
> Beware of NOT plus OR in a search. That will certainly produce no
> highlights. (eg test -results when default op is OR)
>
>
> -Original Message-
> From: Ali Husain [mailto:alihus...@outlook.com]
> Sent: Thursday, 15 June 2017 11:11 a.m.
> To: solr-user@lucene.apache.org
> Subject: Issue with highlighter
>
> Hi,
>
>
> I think I've found a bug with the highlighter. I search for the word
> "something" and I get an empty highlighting response for all the documents
> that are returned shown below. The fields that I am searching over are
> text_en, the highlighter works for a lot of queries. I have no
> stopwords.txt list that could be messing this up either.
>
>
>  "highlighting":{
> "310":{},
> "103":{},
> "406":{},
> "1189":{},
> "54":{},
> "292":{},
> "309":{}}}
>
>
> Just changing the search term to "something like" I get back this:
>
>
> "highlighting":{
> "310":{},
> "309":{
>   "content":["1949 Convention, like those"]},
> "103":{},
> "406":{},
> "1189":{},
> "54":{},
> "292":{},
> "286":{
>   "content":["persons in these classes are treated like
> combatants, but in other respects"]},
> "336":{
>   "content":["   be treated like engagement"]}}}
>
>
> So I know that I have it setup correctly, but I can't figure this out.
> I've searched through JIRA/Google and haven't been able to find a similar
> issue.
>
>
> Any ideas?
>
>
> Thanks,
>
> Ali
> Notice: This email and any attachments are confidential and may not be
> used, published or redistributed without the prior written consent of the
> Institute of Geological and Nuclear Sciences Limited (GNS Science). If
> received in error please destroy and immediately notify GNS Science. Do not
> copy or disclose the contents.
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


how to leave the mailing list? eof

2017-06-19 Thread david fernandes



Re: How are people using the ICUTokenizer?

2017-06-20 Thread David Hastings
Have you successfully used the shingles with the MoreLikeThis query?
Really curious about whether this would return the "interesting phrases"
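For context, the shingle-on-ICU setup Daniel describes below might look something like this in a schema — a sketch, with the field type name and shingle sizes as assumptions:

```xml
<fieldType name="text_shingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Normalize to an ICU normal form before tokenizing -->
    <charFilter class="solr.ICUNormalizer2CharFilterFactory"/>
    <tokenizer class="solr.ICUTokenizerFactory"/>
    <!-- Emit 2- and 3-word shingles alongside single terms -->
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="3"
            outputUnigrams="true"/>
  </analyzer>
</fieldType>
```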

On Tue, Jun 20, 2017 at 12:01 PM, Davis, Daniel (NIH/NLM) [C] <
daniel.da...@nih.gov> wrote:

> Joel,
>
> I think the issue is doing word-breaking according to ICU rules.   So, if
> you are trying to make sure your index breaks words properly on eastern
> languages, just use ICU Tokenizer.   Unless your text is already in an ICU
> normal form, you should always use the ICUNormalizer character filter along
> with this:
>
> https://cwiki.apache.org/confluence/display/solr/CharFilterFactories#
> CharFilterFactories-solr.ICUNormalizer2CharFilterFactory
>
> I think that this would be good with Shingles when you are not removing
> stop words, maybe in an alternate analysis of the same content.
>
> I'm using it in this way, with shingles for phrase recognition and only
> doc freq and term freq - my possibly naïve idea is that I do not need
> positions and offsets if I'm using shingles, and my main goal is to do a
> MoreLikeThis query using the shingled versions of fields.
>
> -Original Message-
> From: Joel Bernstein [mailto:joels...@gmail.com]
> Sent: Tuesday, June 20, 2017 11:52 AM
> To: solr-user@lucene.apache.org
> Subject: How are people using the ICUTokenizer?
>
> It seems that there are some powerful capabilities in the ICUTokenizer. I
> was wondering how the community is making use of it.
>
> Does anyone have experience working with the ICUTokenizer that they can
> share?
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>


Re: Polygon search query working but NOT Multipolygon

2017-06-28 Thread David Smiley
Hi Puneeta,

So what does your field type definition look like?  I'd imagine you're using 
RptWithGeometrySpatialField.  And what is your Solr version?

BTW note the settings here 
https://locationtech.github.io/spatial4j/apidocs/org/locationtech/spatial4j/context/jts/JtsSpatialContextFactory.html
are reflected as attributes on the field type, thus you can set, say,
useJtsMulti="false" to change the 'multi' implementation.

~ David

> On Jun 28, 2017, at 6:44 AM, puneeta  wrote:
> 
> Hi,
> I am new to Solr Geospatial data and have set up JTS within solr. I have
> geo spatial data with Multipolygons. I am passing the coordinates and trying
> to find out which multipolygon contains those coordinates.However, The
> search query is working fine if I insert the data as a polygon. The same is
> not working if my data is inserted as a Multipolygon. I am unable to figure
> out what am I missing. Can anyone suggest where am I going wrong?
> 
> Data as Polygon:
> { "parcel_id":"6",
>"geo":["POLYGON((-86.452970463 32.449739005, 
>  -86.452889912 32.4494390510001, 
>  -86.453365379 32.44942802195, 
>  -86.453514854 32.44942453595))"]
> }
> 
> Data as Multipolygon:
> 
> { "parcel_id":"6",
>"geo":["MULTIPOLYGON(((-86.452970463 32.449739005, 
>  -86.452889912 32.4494390510001, 
>  -86.453365379 32.44942802195, 
>  -86.453514854 32.44942453595)))"]
> }
> 
> My search query:
> fq=geo:"Intersects(-86.453097892 32.449735102)"
> 
> This device surely lies between the polygon (My polygon coordinates are many
> more in the actual data. To reduce the size here I have omited few of the
> coordinates)
> 
> The query is returning only the polygon data. The multipolygon search is not
> happening.
> 
> Any help is highly appreciated.
> 
> Thanks in Advance,
> Puneeta
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Polygon-search-query-working-but-NOT-Multipolygon-tp4343143.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Polygon search query working but NOT Multipolygon

2017-06-28 Thread David Smiley
I suggest using RptWithGeometry field, and with that change remove distErrPct 
and maxDistErr.  See the ref guide, and note the geometry cache option.
BTW spatialContextFactory can simply be "jts".

If this fixes the issue, then the issue was related to grid approximation.

BTW you never quite said what it was about the results that was wrong.  Did you 
get hits you didn't expect (I'm guessing yes) or the inverse?

~ David

> On Jun 28, 2017, at 10:55 AM, puneeta  wrote:
> 
> Hi David,
> Thank you for the prompt reply. My field definition in schema.xml is :
> 
> I commented the existing location_rpt
> 
> 
> 
> And added:
> 
> <fieldType name="location_rpt"
> class="solr.SpatialRecursivePrefixTreeFieldType"
> spatialContextFactory="org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory"
>   autoIndex="true"
>   validationRule="repairBuffer0"
>   distErrPct="0.025"
>   maxDistErr="0.001"
>   distanceUnits="kilometers" />
> 
> My Solr version is 6.2.1
> 
> Thanks,
> Puneeta
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Polygon-search-query-working-but-NOT-Multipolygon-tp4343143p4343162.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr 5.5 - spatial intersects query returns results outside of search box

2017-06-28 Thread David Smiley

> On Jun 27, 2017, at 3:28 AM, Leila Gonzales  wrote:
> 
> {
> 
>"id": "5230",
> 
>"location_geo":
> ["ENVELOPE(-75.0,-75.939723,39.3597224,38.289722)"]
> 
>  }

This is an unusual rectangle.  Remember this is minX, maxX, maxY, minY.  Thus 
this rectangle wraps the entire globe except for nearly a degree.  It matches 
your query rectangle.
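So if the intent was a small rectangle near -75°, the corrected form — with minX before maxX — would be:

```
ENVELOPE(-75.939723, -75.0, 39.3597224, 38.289722)
```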

Re: Solr 5.5 - spatial intersects query returns results outside of search box

2017-06-28 Thread David Smiley
No prob.

BTW you may want to investigate use of BBoxField or 
RptWithGeometrySpatialField; both are also more accurate... but vanilla RPT may 
be just fine (fastest).


> On Jun 28, 2017, at 11:32 AM, Leila Gonzales  wrote:
> 
> Thanks David! I fixed the coordinates and put some error checking in my
> Solr indexing script to trap for this type of coordinate mismatch.
> 
> -Original Message-
> From: David Smiley [mailto:david.w.smi...@gmail.com]
> Sent: Wednesday, June 28, 2017 8:21 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 5.5 - spatial intersects query returns results outside
> of search box
> 
> 
>> On Jun 27, 2017, at 3:28 AM, Leila Gonzales  wrote:
>> 
>> {
>> 
>>   "id": "5230",
>> 
>>   "location_geo":
>> 
> ["ENVELOPE(-75.0,-75.939723,39.3597224,38.289722)"
> ]
>> 
>> }
> 
> This is an unusual rectangle.  Remember this is minX, maxX, maxY, minY.
> Thus this rectangle wraps the entire globe except for nearly a degree.  It
> matches your query rectangle.



Re: Polygon search query working but NOT Multipolygon

2017-06-28 Thread David Smiley
https://lucene.apache.org/solr/guide/6_6/spatial-search.html#SpatialSearch-RptWithGeometrySpatialField


> On Jun 28, 2017, at 11:32 AM, puneeta  wrote:
> 
> Hi David,
> I am sorry, I did not understand what you mean by "I suggest using
> RptWithGeometry field". Should I leave the existing location_rpt definition in
> schema.xml?
> <fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
>   geo="true" distErrPct="0.025" maxDistErr="0.001"
>   distanceUnits="kilometers" />
> This line I have commented. Should I uncomment it?
> 
> 1."remove distErrPct and maxDistErr" - 
> 2.Added usejtsMulti="false"
> 
> I will change the field definition as follows, try to execute and report
> back.
> <fieldType name="location_rpt"
>class="solr.SpatialRecursivePrefixTreeFieldType"
> spatialContextFactory="org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory"
>  autoIndex="true"
>  validationRule="repairBuffer0"
>  distanceUnits="kilometers"
>  useJtsMulti="false" />
> 
> 
> The issue I am facing is that the I am not getting the search result for
> Multipolygon i.e I should get hits.Currently, the numFound = 0, It should
> find atleast 1 record as it does for a Polygon search.
> 
> Thanks,
> Puneeta
> 
> david.w.smi...@gmail.com wrote:
>> I suggest using RptWithGeometry field, and with that change remove
>> distErrPct and maxDistErr.  See the ref guide, and note the geometry cache
>> option.
>> BTW spatialContextFactory can simply be "jts".
>> 
>> If this fixes the issue, then the issue was related to grid approximation.
>> 
>> BTW you never quite said what it was about the results that was wrong. 
>> Did you get hits you didn't expect (I'm guessing yes) or the inverse?
>> 
>> ~ David
>> 
>>> On Jun 28, 2017, at 10:55 AM, puneeta <pverma@...> wrote:
>>> 
>>> Hi David,
>>> Thank you for the prompt reply. My field definition in schema.xml is :
>>> 
>>> I commented the existing location_rpt
>>> 
>>> 
>>> 
>>> And added:
>>> 
>>> <fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
>>>  spatialContextFactory="org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory"
>>>  autoIndex="true"
>>>  validationRule="repairBuffer0"
>>>  distErrPct="0.025"
>>>  maxDistErr="0.001"
>>>  distanceUnits="kilometers" />
>>> 
>>> My Solr version is 6.2.1
>>> 
>>> Thanks,
>>> Puneeta
>>> 
>>> 
>>> 
>>> 
>>> --
>>> View this message in context:
>>> http://lucene.472066.n3.nabble.com/Polygon-search-query-working-but-NOT-Multipolygon-tp4343143p4343162.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Polygon-search-query-working-but-NOT-Multipolygon-tp4343143p4343184.html
>  
> <http://lucene.472066.n3.nabble.com/Polygon-search-query-working-but-NOT-Multipolygon-tp4343143p4343184.html>
> Sent from the Solr - User mailing list archive at Nabble.com 
> <http://nabble.com/>.
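
[Editor's note: combining David's suggestions in this thread (switch the class to RptWithGeometrySpatialField, drop distErrPct and maxDistErr, and use the "JTS" shorthand for spatialContextFactory), the field type he is recommending would look roughly like the sketch below. The name and remaining attributes are carried over from Puneeta's existing definition; adjust to your schema.]

```xml
<!-- Sketch only: RptWithGeometrySpatialField keeps the original geometry
     for exact verification, so the grid-precision attributes
     (distErrPct/maxDistErr) are no longer needed. -->
<fieldType name="location_rpt" class="solr.RptWithGeometrySpatialField"
           spatialContextFactory="JTS"
           autoIndex="true"
           validationRule="repairBuffer0"
           distanceUnits="kilometers"/>
```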



Re: Spatial Search based on the amount of docs, not the distance

2017-06-28 Thread David Smiley
Deniz didn't mention document-to-document distance sorting, but didn't say it
wasn't the case either.

Anyway, FYI: at the Lucene level with LatLonPoint there is some sophisticated
BKD search code to efficiently return the top N distance ordered documents 
(where you supply N).  Although as far as I recall, it also has no filtering 
mechanism, so if you have any other filters (keyword/time/whatever), it 
wouldn't work.

I once did this feature on an RPT index for a client and got permission to
open-source it, but I haven't gotten around to properly adding it to Solr.  I might
approach it a bit differently now.

~ David

> On Jun 22, 2017, at 8:34 PM, Tim Casey  wrote:
> 
> deniz,
> 
> I was going to add something here.  The reason what you want is probably
> hard to do is because you are asking solr, which stores a document, to
> return documents using an attribute of document pairs.  As only a thought
> exercise, if you stored record pairs as a single document, you could
> probably query it directly.  That is, if you have d1 and d2 and you are
> querying  around d1 and ordering by distance, then you could get this
> directly from a document representing a record pair.  I don't think this is
> practical, because it is an n^2 store.
> 
> Since the n^2 problem is there, people are going to suggest some heuristic
> which avoids this problem.  What Erick is suggesting is down this path.
> Query around a point and sort by distance taking the top K results.  The
> result is taking a linear slice of the n^2 distance attribute.
> 
> tim
> 
> 
> 
> On Wed, Jun 21, 2017 at 7:50 PM, Erick Erickson 
> wrote:
> 
>> Would it serve to sort by distance? True, if you matched a zillion
>> documents within a 1km radius you'd still perform the distance calcs, but
>> the result would be a manageable number.
>> 
>> I have to ask "Why do you care?". Is this an efficiency question (i.e. you
>> want to keep Solr from having to do expensive work) or is it a question of
>> having to get hits at all? It's at least possible that the solution for one
>> is not the solution for the other.
>> 
>> Best,
>> Erick
>> 
>> On Wed, Jun 21, 2017 at 5:32 PM, deniz  wrote:
>> 
>>> it is for sure possible to use d value for limiting the distance,
>> however,
>>> it
>>> might not be very efficient, as some of the coords may not have any docs
>>> around for a large value of d... so it is hard to determine a default
>> value
>>> for d.
>>> 
>>> though it sounds like having a default d and gradual increments on its
>> value
>>> might be a work around for top K results...
>>> 
>>> 
>>> 
>>> 
>>> 
>>> -
>>> Zeki ama calismiyor... Calissa yapar...
>>> --
>>> View this message in context: http://lucene.472066.n3.
>>> nabble.com/Spatial-Search-based-on-the-amount-of-docs-not-the-distance-
>>> tp4342108p4342258.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>> 
>> 
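
[Editor's note: Erick's suggestion above (filter to a radius, sort by distance, take the top K) can be sketched as a single request. The field name, point, radius, and row count below are placeholders, not values from the thread.]

```
q=*:*
&fq={!geofilt}
&sfield=geo
&pt=45.15,-93.85
&d=5
&sort=geodist() asc
&rows=10
```

If a radius of d returns fewer than K documents, deniz's workaround of re-issuing the query with a gradually larger d would apply.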



Re: Polygon search query working but NOT Multipolygon

2017-06-28 Thread David Smiley
This polygon is fairly rectangular, with one side having a ton of points.
Nonetheless the query point is clearly far from it (it's much lower, i.e. a
smaller 'y' dimension).

On Wed, Jun 28, 2017 at 10:17 PM puneeta  wrote:

> Hi David,
>   Actually my polygon had too many coordinates, so I just omitted some
> while
> posting my query. Here is my complete multipolygon, where the last point is
> the same as the first one:
>
> 
> MULTIPOLYGON (((-86.477551331 32.490605651,
> -86.477637350 32.4903921820001, -86.478257247 32.4905655910001,
> -86.478250466 32.4905802390001, -86.478243988 32.49059368096,
> -86.47823751 32.490607122, -86.478231749 32.49061910096, -86.478224637
> 32.4906340650001, -86.478218237 32.490647541, -86.478211847
> 32.49066103595, -86.478205478 32.4906745260001, -86.47820210799989
> 32.4906816669, -86.478199132 32.4906880240001, -86.478192825
> 32.490701523, -86.478186533 32.490715047, -86.478183209 32.4907222090001,
> -86.4781802789 32.4907285690001, -86.478174063 32.4907421250001,
> -86.478167851 32.4907556540001, -86.478162558 32.49076723696,
> -86.47815905399989 32.490774513000105, -86.477551331 32.490605651)))
> 
> 
>
> Thanks,
> Puneeta
>
>
>
>
> david.w.smi...@gmail.com wrote
> > I tried your data in the "JTS TestBuilder" GUI.  Firstly, your polygon
> > isn't "closed", but that was easily fixed by repeating the first point at
> > the end.  See the attached screenshot of the GUI for what these shapes
> > look like.  The red dot (the query point) is outside of this
> > triangular-ish shape, and thus not a match.
> >
> >
> >
> >
> >> On Jun 28, 2017, at 12:33 PM, puneeta <
>
> > pverma@
>
> > > wrote:
> >>
> >> Hi David,
> >>  I did the following changes:
> >>
> >> Changed in schema.xml:
> >>
> >  >>
> >
> >>
> spatialContextFactory="org.locationtech.spatial4j.context.jts.JtsSpatialContextFactory"
> >>   autoIndex="true"
> >>   validationRule="repairBuffer0"
> >>   distanceUnits="kilometers"
> >> useJtsMulti="false"
> >> />
> >>
> >>
> >> Added in solrconfig.xml:
> >>
> >  >>
> >class="solr.LRUCache"
> >>   size="256"
> >>   initialSize="0"
> >>   autowarmCount="100%"
> >>   regenerator="solr.NoOpRegenerator"/>
> >>
> >> My fields in the core as defined in the schema is:
> >> <http://lucene.472066.n3.nabble.com/file/n4343221/SolrGeoFieldDefinition.png>
> >>
> >> However, I still face the same issue. No results found for a
> multipolygon
> >> search.
> >>
> >> Not sure what's happening :(
> >>
> >> Puneeta
> >>
> >>
> >>
> >>
> >>
> >>
>
> > david.w.smiley@
>
> >  wrote
> >>> https://lucene.apache.org/solr/guide/6_6/spatial-search.html#SpatialSearch-RptWithGeometrySpatialField
> >>> <https://lucene.apache.org/solr/guide/6_6/spatial-search.html#SpatialSearch-RptWithGeometrySpatialField>
> >>>
> >>>
> >>>> On Jun 28, 2017, at 11:32 AM, puneeta <
> >>
> >>> pverma@
> >>
> >>> > wrote:
> >>>>
> >>>> Hi David,
> >>>> I am sorry ,I did not understand what do you mean by "I suggest using
> >>>> RptWithGeometry field". Should leave the existing location_rpt
> >>>> definition
> >>>> in
> >>>> schema.xml?
> >>>>
> >>>> <fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
> >>>>  geo="true" distErrPct="0.025" maxDistErr="0.001"
> >>>> distanceUnits="kilometers" />
> >>>> This line I have commented. Should I uncomment it?
> >>>>
> >>>> 1."remove distErrPct and maxDistErr" -
> >>>> 2.Added usejtsMulti="false"
> >>>>
> >>>> I will change the  field definition as follows, try to execute and
> >>>> report
> >>>> back.
> >>>>
> &g
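
[Editor's note: the cache entry quoted from solrconfig.xml earlier in this thread lost its opening tag in the archive. Per the ref guide, the cache name is "perSegSpatialFieldCache_" followed by the field name; assuming a field named location_rpt, a complete entry would look like the sketch below. The attribute values match the ones visible in the quoted snippet.]

```xml
<!-- Per-segment geometry cache used by RptWithGeometrySpatialField;
     the name suffix (here "location_rpt") must match the field name. -->
<cache name="perSegSpatialFieldCache_location_rpt"
       class="solr.LRUCache"
       size="256"
       initialSize="0"
       autowarmCount="100%"
       regenerator="solr.NoOpRegenerator"/>
```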

Re: Not highlighting "and" and "or"?

2017-06-28 Thread David Smiley
Hi Walter,
No they are not.  Does debug=query show that these words are in your parsed
query?

On Wed, Jun 28, 2017 at 5:13 PM Walter Underwood 
wrote:

> Is there some special casing in the highlighter to skip query syntax
> words? The words “and” and “or” don’t get highlighted.
>
> This is in 6.5.0.
>
>question
>html
>440
>fastVector
>1
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> --
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
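
[Editor's note: David's debug=query suggestion translates to a request like the following (core name and query are placeholders); the parsedquery section of the response shows whether "and" and "or" survived query parsing and analysis.]

```
http://localhost:8983/solr/mycore/select?q=cats+and+dogs&debug=query
```

One possibility worth checking: a StopFilterFactory in the field's analysis chain would remove "and" and "or" before the highlighter ever sees them.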

