Re: Metadata and HTML ending up in searchable text

2016-05-31 Thread Simon Blandford

Hi Alex,

That sounds similar. I am puzzled by what I am seeing because it looks 
like a major bug and I am following the docs for curl as closely as 
possible, but hardly anyone else seems to have noticed it. To me it is a 
show-stopper.


If I convert the docs to txt with html2text first then I can sort-of 
live with the results, although I'd rather not have the metadata in the 
document, but at least the main text body doesn't have tag content in 
it, as it does with HTML source.


I just want to make sure I'm not missing something really obvious before 
submitting a bug report.


Regards,
Simon


On 27/05/16 20:22, Alexandre Rafalovitch wrote:

I think Solr's layer above Tika was merging in metadata and text all
together without a way (that I could see) to separate them.

That's all I remember from my examination of this issue when I ran into
something similar. Not very helpful, I know.

Regards,
Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 27 May 2016 at 23:48, Simon Blandford  wrote:

Hi Timothy,

Thanks for responding.

java -jar tika-app-1.13.jar -t
"/home/user/Documents/library/UsingMailingLists.txt"
...gives a clean result with no CSS or other nasties in the output, so it
looks like the latest version of Tika itself is OK.

I was basing the test case on this doc page as closely as possible,
including the prefix and content mapping.
https://wiki.apache.org/solr/ExtractingRequestHandler

From the same page, extractFormat=text only applies when extractOnly is
true, which just shows the output from Tika without indexing the document.
Running it in "extractOnly" mode resulted in XML output. The difference
between selecting "text" or "xml" format is that the escaped document in
the tag is either the original HTML (xml mode) or stripped HTML (text
mode). It seems some Javascript creeps into the text version. (See below)

Regards,
Simon

HTML mode sample:

<response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">51</int></lst>
  <str name="UsingMailingLists.html"><?xml version="1.0" encoding="UTF-8"?>
  <link rel="stylesheet" type="text/css" charset="utf-8" media="all" href="/wiki/modernized/css/common.css"/>
  <link rel="stylesheet" type="text/css" media="screen" href="/wiki/modernized/css/screen.css"/>
  <link rel="stylesheet" type="text/css" media="print" href="/wiki/modernized/css/print.css"/>...

TEXT mode (blank lines stripped):

<response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">47</int></lst>
  UsingMailingLists - Solr Wiki
  Search:
  Solr Wiki
  Login

On 27/05/16 13:31, Allison, Timothy B. wrote:

I'm only minimally familiar with Solr Cell, but...

1) It looks like you aren't setting extractFormat=text.  According to
[0]...the default is xhtml which will include a bunch of the metadata.
2) Is there an attr_* dynamic field in your index with type="ignored"?
This would strip out the attr_ fields so they wouldn't even be indexed...if
you don't want them.

As for the HTML file, it looks like Tika is failing to strip out the style
section.  Try running the file alone with tika-app: java -jar tika-app.jar
-t inputfile.html.  If you find the noise there, please open an
issue on our JIRA: https://issues.apache.org/jira/browse/tika


[0]
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika


-Original Message-
From: Simon Blandford [mailto:simon.blandf...@bkconnect.net]
Sent: Thursday, May 26, 2016 9:49 AM
To: solr-user@lucene.apache.org
Subject: Metadata and HTML ending up in searchable text

Hi,

I am using Solr 6.0 on Ubuntu 14.04.

I am ending up with loads of junk in the text body. It starts like,

The JSON entry output of a search result shows the indexed text starting
with...
body_txt_en: " stream_size 36499 X-Parsed-By
org.apache.tika.parser.DefaultParser X-Parsed-By"

And then once it gets to the actual text I get CSS class names appearing
that were in  or  tags etc.
e.g. "the power of calibre3 silence calibre2 and", where
"calibre3" etc are the CSS class names.

All this junk is searchable and is polluting the index.

I would like to index _only_ the actual content I am interested in
searching for.

Steps to reproduce:

1) Solr installed by untaring solr tgz in /opt.

2) Core created by typing "bin/solr create -c mycore"

3) Solr started with bin/solr start

4) TXT document index using the following command curl
"http://localhost:8983/solr/mycore/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=body_txt_en&commit=true";
-F

"content/UsingMailingLists.txt=@/home/user/Documents/library/UsingMailingLists.txt"

5) HTML document index using the following command curl
"http://localhost:8983/solr/mycore/update/extract?literal.id=doc2&uprefix=attr_&fmap.content=body_txt_en&commit=true"
-F
"content/UsingMailingLists.html=@/home/user/Documents/library/UsingMailingLists.html"

help need example code of solrj to get schema of a given core

2016-05-31 Thread Liu, Ming (Ming)
Hello,

I am very new to Solr. I want to write a simple Java program to get a core's
schema information, such as how many fields there are and the details of each
field. I spent some time searching the internet but could not find much
information about this. The SolrJ wiki seems not to have been updated for a
long time. I am using Solr 5.5.0.

I hope there is some example code; otherwise, please give me some advice or a
simple hint, such as which Java class I should look at.

Thanks in advance!
Ming


Re: Can a DocTransformer access the whole results tree?

2016-05-31 Thread Upayavira
I was always under the impression that a search component couldn't
modify the output of a previous search component. If it can, then the
highlight component could add its results to the output of the query
component, and we're done.

Upayavira (who sees the confusion on people's faces often when he shows
them Solr highlighting)

On Fri, 27 May 2016, at 06:27 PM, Erick Erickson wrote:
> Maybe you'd be better off using a custom search component.
> instead of a doc transformer. The intent of a doc transformer
> is, as you've discovered, working on single docs at a time. You
> want to manipulate the whole response which seems to fit more
> naturally into a search component. Make sure to put it after
> the highlight component (i.e. last-components).
> 
> Best,
> Erick
> 
> On Fri, May 27, 2016 at 6:55 AM, Upayavira  wrote:
> > In a JSON response, we get this:
> >
> > {
> >   "responseHeader": {...},
> >   "response": { "docs": [...] },
> >   "highlighting": {...}
> >   ...
> > }
> >
> > I'm assuming that the getProcessedDocuments call would give me the docs:
> > {} element, whereas I'm after the whole response so I can retrieve the
> > "highlighting" element.
> >
> > Make sense?
> >
> > On Fri, 27 May 2016, at 02:45 PM, Mikhail Khludnev wrote:
> >> Upayavira,
> >>
> >> It's not clear what you mean by "results themselves"; perhaps you mean
> >> SolrDocuments?
> >>
> >> public abstract class ResultContext {
> >>  ..
> >>   public Iterator<SolrDocument> getProcessedDocuments() {
> >> return new DocsStreamer(this);
> >>   }
> >>
> >> On Fri, May 27, 2016 at 4:15 PM, Upayavira  wrote:
> >>
> >> > Yes, I've seen that. I can see the getDocList() method will presumably
> >> > give me the results themselves, but I need the full response so I can
> >> > get the highlighting details, but I can't see them anywhere.
> >> >
> >> > On Thu, 26 May 2016, at 09:39 PM, Mikhail Khludnev wrote:
> >> > > public abstract class ResultContext {
> >> > >
> >> > >  /// here are all results
> >> > >   public abstract DocList getDocList();
> >> > >
> >> > >   public abstract ReturnFields getReturnFields();
> >> > >
> >> > >   public abstract SolrIndexSearcher getSearcher();
> >> > >
> >> > >   public abstract Query getQuery();
> >> > >
> >> > >   public abstract SolrQueryRequest getRequest();
> >> > >
> >> > > On Thu, May 26, 2016 at 11:25 PM, Upayavira  wrote:
> >> > >
> >> > > > Hi Mikhail,
> >> > > >
> >> > > > Is there really? If I look at ResultContext, I see it is an abstract
> >> > > > class, completed by BasicResultContext. I don't see any context 
> >> > > > method
> >> > > > there. I can see a getContext() on SolrQueryRequest which just 
> >> > > > returns
> >> > a
> >> > > > hashmap. Will I find the response in there? Is that what you are
> >> > > > suggesting?
> >> > > >
> >> > > > Upayavira
> >> > > >
> >> > > > On Thu, 26 May 2016, at 06:28 PM, Mikhail Khludnev wrote:
> >> > > > > Hello,
> >> > > > >
> >> > > > > There is a protected ResultContext field named context.
> >> > > > >
> >> > > > > On Thu, May 26, 2016 at 5:31 PM, Upayavira  
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Looking at the code for a sample DocTransformer, it seems that a
> >> > > > > > DocTransformer only has access to the document itself, not to the
> >> > whole
> >> > > > > > results. Because of this, it isn't possible to use a
> >> > DocTransformer to
> >> > > > > > merge, for example, the highlighting results into the main
> >> > document.
> >> > > > > >
> >> > > > > > Am I missing something?
> >> > > > > >
> >> > > > > > Upayavira
> >> > > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > Sincerely yours
> >> > > > > Mikhail Khludnev
> >> > > > > Principal Engineer,
> >> > > > > Grid Dynamics
> >> > > > >
> >> > > > > 
> >> > > > > 
> >> > > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Sincerely yours
> >> > > Mikhail Khludnev
> >> > > Principal Engineer,
> >> > > Grid Dynamics
> >> > >
> >> > > 
> >> > > 
> >> >
> >>
> >>
> >>
> >> --
> >> Sincerely yours
> >> Mikhail Khludnev
> >> Principal Engineer,
> >> Grid Dynamics
> >>
> >> 
> >> 
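For reference, wiring a custom component to run after highlighting, as Erick
suggests above, is done in solrconfig.xml via "last-components". A minimal
sketch, with a hypothetical component class name:

  <searchComponent name="mergeHighlights"
                   class="com.example.MergeHighlightsComponent"/>

  <requestHandler name="/select" class="solr.SearchHandler">
    <!-- last-components run after the defaults (query, facet, highlight, ...),
         so this component sees the highlighting output in the response -->
    <arr name="last-components">
      <str>mergeHighlights</str>
    </arr>
  </requestHandler>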


Re: searching in two indices

2016-05-31 Thread Mikhail Khludnev
Hello Bernd,

I recently committed the [subquery] document transformer, which sounds like
pretty much the same thing.
Find the details at
https://cwiki.apache.org/confluence/display/solr/Transforming+Result+Documents
It's not yet released, so I would appreciate it if you took a nightly build from
https://builds.apache.org/job/Solr-Artifacts-6.x/lastSuccessfulBuild/artifact/solr/package/
and checked how it works for your problem.
I would appreciate feedback; I really want to make it effective and convenient
before it's released.
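A hypothetical request using [subquery] could look like the following (the
articles/people core names and the person_id field are made up; each subquery
parameter is namespaced by the field name given in fl):

  curl 'http://localhost:8983/solr/articles/select?q=*:*&fl=*,person:[subquery]&person.q={!terms+f=id+v=$row.person_id}&person.fromIndex=people&person.rows=1'

Here $row.person_id substitutes each result document's person_id value into a
per-row subquery against the people core.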


On Mon, May 30, 2016 at 1:20 PM, Bernd Fehling <
bernd.fehl...@uni-bielefeld.de> wrote:

> Has anyone experience with searching in two indices?
>
> E.g. having one index with nearly static data (like personal data)
> and a second index with articles, which changes quite a lot.
>
> A search would then start with articles and, from the list of results
> (e.g. first page, 10 articles), start a sub-search in the second
> index for personal data to display the results side by side.
>
> Has anyone managed this and how?
>
> If not, how would you try to solve this?
>
>
> Regards,
> Bernd
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics





Re: help need example code of solrj to get schema of a given core

2016-05-31 Thread Georg Sorst
Querying the schema can be done with the Schema API (
https://cwiki.apache.org/confluence/display/solr/Schema+API), which is
fully supported by SolrJ:
http://lucene.apache.org/solr/6_0_0/solr-solrj/org/apache/solr/client/solrj/request/schema/package-summary.html
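A minimal SolrJ sketch of that API (the core name and URL are assumptions):

  import java.util.List;
  import java.util.Map;
  import org.apache.solr.client.solrj.SolrClient;
  import org.apache.solr.client.solrj.impl.HttpSolrClient;
  import org.apache.solr.client.solrj.request.schema.SchemaRequest;
  import org.apache.solr.client.solrj.response.schema.SchemaResponse;

  public class ListSchemaFields {
    public static void main(String[] args) throws Exception {
      // Point the client at the core whose schema we want to inspect
      try (SolrClient client = new HttpSolrClient("http://localhost:8983/solr/mycore")) {
        // Fetch all field definitions via the Schema API
        SchemaResponse.FieldsResponse response = new SchemaRequest.Fields().process(client);
        List<Map<String, Object>> fields = response.getFields();
        System.out.println("Number of fields: " + fields.size());
        for (Map<String, Object> field : fields) {
          // Each field is returned as a map of its schema attributes
          System.out.println(field.get("name") + " -> " + field);
        }
      }
    }
  }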

Liu, Ming (Ming)  wrote on Tue., 31 May 2016 at 09:41:

> Hello,
>
> I am very new to Solr. I want to write a simple Java program to get a
> core's schema information, such as how many fields there are and the
> details of each field. I spent some time searching the internet but could
> not find much information about this. The SolrJ wiki seems not to have been
> updated for a long time. I am using Solr 5.5.0.
>
> I hope there is some example code; otherwise, please give me some advice,
> or a simple hint such as which Java class I should look at.
>
> Thanks in advance!
> Ming
>


RE: Metadata and HTML ending up in searchable text

2016-05-31 Thread Allison, Timothy B.
>>  From the same page, extractFormat=text only applies when extractOnly 
>> is true, which just shows the output from tika without indexing the document.

Yes, sorry. I just looked through the source code. You're right. If you use
DIH (TikaEntityProcessor) instead of Solr Cell (ExtractingDocumentLoader), you
should be able to set the handler type via the "format" attribute, and
"text" is one option there.

>>I just want to make sure I'm not missing something really obvious before 
>>submitting a bug report.
I don't think you are.
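For the DIH route mentioned above, a rough sketch of the configuration (the
file path and field names are made up; format="text" is the relevant
attribute):

  <dataConfig>
    <dataSource type="BinFileDataSource"/>
    <document>
      <entity name="doc" processor="TikaEntityProcessor"
              url="/home/user/Documents/library/UsingMailingLists.html"
              format="text">
        <!-- Tika's extracted text arrives in the "text" column -->
        <field column="text" name="body_txt_en"/>
      </entity>
    </document>
  </dataConfig>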

>>  From the same page, extractFormat=text only applies when extractOnly 
>> is true, which just shows the output from tika without indexing the document.
>> Running it in "extractOnly" mode resulted in XML output. The 
>> difference between selecting "text" or "xml" format is that the 
>> escaped document in the  tag is either the original HTML 
>> (xml mode) or stripped HTML (text mode). It seems some Javascript 
>> creeps into the text version. (See below)
>>
>> Regards,
>> Simon
>>
>> HTML mode sample:
>> <response>
>>   <lst name="responseHeader"><int name="status">0</int><int name="QTime">51</int></lst>
>>   <str name="UsingMailingLists.html"><?xml version="1.0" encoding="UTF-8"?>
>>   <link rel="stylesheet" type="text/css" charset="utf-8" media="all" href="/wiki/modernized/css/common.css"/>
>>   <link rel="stylesheet" type="text/css" media="screen" href="/wiki/modernized/css/screen.css"/>
>>   <link rel="stylesheet" type="text/css" media="print" href="/wiki/modernized/css/print.css"/>...
>>
>> TEXT mode (blank lines stripped):
>> <response>
>>   <lst name="responseHeader"><int name="status">0</int><int name="QTime">47</int></lst>
>>   UsingMailingLists - Solr Wiki
>>   Search:
>>   Solr Wiki
>>   Login
>>
>>
>> On 27/05/16 13:31, Allison, Timothy B. wrote:
>>> I'm only minimally familiar with Solr Cell, but...
>>>
>>> 1) It looks like you aren't setting extractFormat=text.  According 
>>> to [0]...the default is xhtml which will include a bunch of the metadata.
>>> 2) Is there an attr_* dynamic field in your index with type="ignored"?
>>> This would strip out the attr_ fields so they wouldn't even be 
>>> indexed...if you don't want them.
>>>
>>> As for the HTML file, it looks like Tika is failing to strip out the 
>>> style section.  Try running the file alone with tika-app: java -jar 
>>> tika-app.jar -t inputfile.html.  If you find the noise there,
>>> please open an issue on our JIRA: 
>>> https://issues.apache.org/jira/browse/tika
>>>
>>>
>>> [0]
>>> https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with
>>> +Solr+Cell+using+Apache+Tika
>>>
>>>
>>> -Original Message-
>>> From: Simon Blandford [mailto:simon.blandf...@bkconnect.net]
>>> Sent: Thursday, May 26, 2016 9:49 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Metadata and HTML ending up in searchable text
>>>
>>> Hi,
>>>
>>> I am using Solr 6.0 on Ubuntu 14.04.
>>>
>>> I am ending up with loads of junk in the text body. It starts like,
>>>
>>> The JSON entry output of a search result shows the indexed text 
>>> starting with...
>>> body_txt_en: " stream_size 36499 X-Parsed-By 
>>> org.apache.tika.parser.DefaultParser X-Parsed-By"
>>>
>>> And then once it gets to the actual text I get CSS class names 
>>> appearing that were in  or  tags etc.
>>> e.g. "the power of calibre3 silence calibre2 and", where 
>>> "calibre3" etc are the CSS class names.
>>>
>>> All this junk is searchable and is polluting the index.
>>>
>>> I would like to index _only_ the actual content I am interested in 
>>> searching for.
>>>
>>> Steps to reproduce:
>>>
>>> 1) Solr installed by untaring solr tgz in /opt.
>>>
>>> 2) Core created by typing "bin/solr create -c mycore"
>>>
>>> 3) Solr started with bin/solr start
>>>
>>> 4) TXT document index using the following command curl 
>>> "http://localhost:8983/solr/mycore/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=body_txt_en&commit=true";
>>> -F
>>>
>>> "content/UsingMailingLists.txt=@/home/user/Documents/library/UsingMailingLists.txt"
>>>
>>> 5) HTML document index using following command curl 
>>> "http://localhost:8983/solr/mycore/update/extract?literal.id=doc2&uprefix=attr_&fmap.content=body_txt_en&commit=true";
>>> -F
>>>
>>> "content/UsingMailingLists.html=@/home/user/Documents/library/UsingMailingLists.html"
>>>
>>> 6) Query using URL:
>>> http://localhost:8983/solr/mycore/select?q=especially&wt=json
>>>
>>> Result:
>>>
>>> For the txt file, I get the following JSON for the document...
>>>
>>> {
>>>id: "doc1",
>>>attr_stream_size: [
>>>"8107"
>>>],
>>>attr_x_parsed_by: [
>>>"org.apache.tika.parser.

Re: Solr vs JDBC driver

2016-05-31 Thread Vachon , Jean-Sébastien
I am using Java 8 (JDK 1.8.0_91) and it's an application layer on top of
Solr 6 using SolrJ.
Here is the relevant section of my pom.xml:

<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>6.0.0</version>
</dependency>


I had to manually load the driver
(“org.apache.solr.client.solrj.io.sql.DriverImpl")
to make it work (no big deal).

My connection string is as follows:

"jdbc:solr://10.28.213.133:2181?collection=Current1"


Is there something wrong in my setup?

Thanks

On 2016-05-28, 9:42 AM, "Joel Bernstein"  wrote:

>The driver is included in /META-INF/services/java.sql.Driver. So if you're
>using JDBC 4.0, the driver should be autoloaded.
>
>What version of java are you running?
>
>Joel Bernstein
>http://joelsolr.blogspot.com/
>
>On Fri, May 27, 2016 at 8:16 PM, Vachon, Jean-Sébastien <
>jvac...@cebglobal.com> wrote:
>
>> Never mind… I had to load the class just like any database driver:
>>
>>
>>
>> Class.forName("org.apache.solr.client.solrj.io.sql.DriverImpl").newInstance();
>>
>>
>>
>>
>> On 2016-05-27, 2:59 PM, "Vachon, Jean-Sébastien" 
>> wrote:
>>
>> >Hi All,
>> >
>> >
>> >
>> >I am trying to use Solr's JDBC driver in Java and I'm stuck with the
>> >following error message:
>> >
>> >
>> >
>> >
>> >
>> >14:52:37,802 ERROR [consoleLogger] java.sql.SQLException: No suitable
>> >driver found for jdbc:solr://10.28.213.133:2181/solr?collection=Current
>> >
>> >
>> >
>> >My pom.xml contains:
>> >
>> >
>> ><dependency>
>> >  <groupId>org.apache.solr</groupId>
>> >  <artifactId>solr-solrj</artifactId>
>> >  <version>6.0.0</version>
>> ></dependency>
>> >
>> >
>> >I looked at different posts:
>> >
>> >
>> >
>> >Yonik's:
>> >http://yonik.com/solr-6/
>> >
>> >Sematext:
>> >https://sematext.com/blog/2016/04/26/solr-6-as-jdbc-data-source/
>> >
>> >
>> >
>> >And I seem to meet all the requirements
>> >
>> >
>> >
>> >Any idea on what I'm doing wrong?
>> >
>> >
>> >
>> >Thanks
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>>
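A minimal end-to-end sketch based on the details in this thread (the SQL
query itself is a made-up example):

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.ResultSet;
  import java.sql.Statement;

  public class SolrJdbcExample {
    public static void main(String[] args) throws Exception {
      // Manually register the driver, as described above (needed when the
      // JDBC 4.0 service-loader mechanism does not pick it up)
      Class.forName("org.apache.solr.client.solrj.io.sql.DriverImpl");

      // The connection string points at ZooKeeper, not at a Solr node
      String url = "jdbc:solr://10.28.213.133:2181?collection=Current1";
      try (Connection con = DriverManager.getConnection(url);
           Statement stmt = con.createStatement();
           ResultSet rs = stmt.executeQuery("select id from Current1 limit 10")) {
        while (rs.next()) {
          System.out.println(rs.getString("id"));
        }
      }
    }
  }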






Re: searching in two indices

2016-05-31 Thread Bernd Fehling
Hi Mikhail,

I will check that out, thanks.

Regards,
Bernd

On 31.05.2016 at 10:53, Mikhail Khludnev wrote:
> Hello Bernd,
> 
> I recently committed the [subquery] document transformer, which sounds like
> pretty much the same thing.
> Find the details at
> https://cwiki.apache.org/confluence/display/solr/Transforming+Result+Documents
> It's not yet released, so I would appreciate it if you took a nightly build from
> https://builds.apache.org/job/Solr-Artifacts-6.x/lastSuccessfulBuild/artifact/solr/package/
> and checked how it works for your problem.
> I would appreciate feedback; I really want to make it effective and convenient
> before it's released.
> 
> 
> On Mon, May 30, 2016 at 1:20 PM, Bernd Fehling <
> bernd.fehl...@uni-bielefeld.de> wrote:
> 
>> Has anyone experience with searching in two indices?
>>
>> E.g. having one index with nearly static data (like personal data)
>> and a second index with articles, which changes quite a lot.
>>
>> A search would then start with articles and, from the list of results
>> (e.g. first page, 10 articles), start a sub-search in the second
>> index for personal data to display the results side by side.
>>
>> Has anyone managed this and how?
>>
>> If not, how would you try to solve this?
>>
>>
>> Regards,
>> Bernd
>>
> 
> 
> 


Solr leaking references to deleted files

2016-05-31 Thread Gavin Harcourt

Hi All,

I've noticed on some of my solr nodes that the disk usage is increasing 
over time. After checking the output of lsof I found hundreds of 
references to deleted index files being held by solr. This totaled 24GB 
on a 16GB index. A restart of solr can obviously fix this but this is 
not an ideal solution. We are running solr 5.4.0 on OpenJDK  1.8.0_91. 
We are using the Concurrent Mark Sweep GC although I've also seen the 
same problem on nodes using the G1 GC. Our update handler has autoCommit 
and softAutoCommit enabled (at different intervals). We are using solr 
cloud and have multiple shards with 2 nodes each in our collections. 
I've not seen any pattern between this appearing on leaders or replicas. 
Not all my nodes appear to be exhibiting the problem either. Our usage 
pattern does involve a lot of churn in our index with the majority of 
documents being updated/deleted every day.


Searching JIRA and the web in general I could only find references to 
this sort of problem when running solr in tomcat. Can anyone suggest a 
reason why this might be happening or a way I can manage it without 
needing to restart solr?


Example lsof output:
java   1100  s123  DEL   REG 202,3   8919406 
/home/s123/solr/data/uk_shard2_replica1/data/index/_3m9s.fdt
java   1100  s123  DEL   REG 202,3   8919159 
/home/s123/solr/data/uk_shard2_replica1/data/index/_3mnk.tvd
java   1100  s123  DEL   REG 202,3   8919150 
/home/s123/solr/data/uk_shard2_replica1/data/index/_3mnk_Lucene50_0.tim
java   1100  s123  DEL   REG 202,3   8919094 
/home/s123/solr/data/uk_shard2_replica1/data/index/_3mq1_Lucene50_0.tim
java   1100  s123  DEL   REG 202,3   8919103 
/home/s123/solr/data/uk_shard2_replica1/data/index/_3mq1.tvd


Regards,
Gavin.



Re: Clarity on Sharding Concepts.

2016-05-31 Thread Siddhartha Singh Sandhu
Thank you Mugeesh.

On Tue, May 31, 2016 at 12:19 AM, Mugeesh Husain  wrote:

> Hi,
>
> Read this document
>
> https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud
> for a proper understanding.
>
> FYI, with the default compositeId router, documents are distributed across
> shards based on a hashing technique.
>
> If you index 50 documents, they will be divided into two parts: some go to
> shard1 and the rest to shard2, and each document will also go to its
> shard's replicas.
>
>
> Thanks
> Mugeesh
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Clarity-on-Sharding-Concepts-tp4279842p4279856.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Sorting documents in one core based on a field in another core

2016-05-31 Thread Mark Robinson
Hi,

I have a requirement to sort records in one core/collection based on a
field in another core/collection.

Could someone please advise how this can be done in Solr?

I have used !join to restrict documents in one core based on field values
in another core. Is there some way to sort like that?


Thanks!
Mark.


Add a new field dynamically to each of the result docs and sort on it

2016-05-31 Thread Mark Robinson
Hi,

My core does not have a field, say *fieldnew*.

*Case 1:-*
But in my results I would like to have *fieldnew* also, and my results
should be sorted on only this new field.

*Case 2:-*
Adding one more case: suppose I have other fields in the sort criteria and
*fieldnew* is one among them; in that case, how do I realize this
multi-field sort as well?

Could someone suggest a way, please?

Thanks!
Mark.


Re: Add a new field dynamically to each of the result docs and sort on it

2016-05-31 Thread Erick Erickson
I really don't understand this. If you don't have
"fieldnew", where is the value coming from? It's
not in the index, so...

If you mean you're _adding_ a field after the index
already has some docs in it, then the normal
sort rules apply and you can specify sortMissingFirst/Last
to tell Solr where docs without that field should go.

Normal sort rules are '&sort=field1 asc,field2 desc' etc.

Best,
Erick

On Tue, May 31, 2016 at 7:53 AM, Mark Robinson  wrote:
> Hi,
>
> My core does not have a field, say *fieldnew*.
>
> *Case 1:-*
> But in my results I would like to have *fieldnew* also, and my results
> should be sorted on only this new field.
>
> *Case 2:-*
> Adding one more case: suppose I have other fields in the sort criteria and
> *fieldnew* is one among them; in that case, how do I realize this
> multi-field sort as well?
>
> Could someone suggest a way, please?
>
> Thanks!
> Mark.


Re: Sorting documents in one core based on a field in another core

2016-05-31 Thread Erick Erickson
Join doesn't work like that, which is why it's referred
to as "pseudo join". There's no way that I know of
to do what you want here.

I'd strongly recommend you flatten your data at index time.

Best,
Erick

On Tue, May 31, 2016 at 7:41 AM, Mark Robinson  wrote:
> Hi,
>
> I have a requirement to sort records in one core/collection based on a
> field in another core/collection.
>
> Could someone please advise how this can be done in Solr?
>
> I have used !join to restrict documents in one core based on field values
> in another core. Is there some way to sort like that?
>
>
> Thanks!
> Mark.


Re: Solr leaking references to deleted files

2016-05-31 Thread Erick Erickson
Possibly:
SOLR-9116 or SOLR-9117? Note those two require that the core be
reloaded, so you have to be doing something a bit unusual for them to
be the problem.

Best,
Erick

On Tue, May 31, 2016 at 5:41 AM, Gavin Harcourt
 wrote:
> Hi All,
>
> I've noticed on some of my solr nodes that the disk usage is increasing over
> time. After checking the output of lsof I found hundreds of references to
> deleted index files being held by solr. This totaled 24GB on a 16GB index. A
> restart of solr can obviously fix this but this is not an ideal solution. We
> are running solr 5.4.0 on OpenJDK  1.8.0_91. We are using the Concurrent
> Mark Sweep GC although I've also seen the same problem on nodes using the G1
> GC. Our update handler has autoCommit and softAutoCommit enabled (at
> different intervals). We are using solr cloud and have multiple shards with
> 2 nodes each in our collections. I've not seen any pattern between this
> appearing on leaders or replicas. Not all my nodes appear to be exhibiting
> the problem either. Our usage pattern does involve a lot of churn in our
> index with the majority of documents being updated/deleted every day.
>
> Searching JIRA and the web in general I could only find references to this
> sort of problem when running solr in tomcat. Can anyone suggest a reason why
> this might be happening or a way I can manage it without needing to restart
> solr?
>
> Example lsof output:
> java   1100  s123  DEL   REG 202,3   8919406
> /home/s123/solr/data/uk_shard2_replica1/data/index/_3m9s.fdt
> java   1100  s123  DEL   REG 202,3   8919159
> /home/s123/solr/data/uk_shard2_replica1/data/index/_3mnk.tvd
> java   1100  s123  DEL   REG 202,3   8919150
> /home/s123/solr/data/uk_shard2_replica1/data/index/_3mnk_Lucene50_0.tim
> java   1100  s123  DEL   REG 202,3   8919094
> /home/s123/solr/data/uk_shard2_replica1/data/index/_3mq1_Lucene50_0.tim
> java   1100  s123  DEL   REG 202,3   8919103
> /home/s123/solr/data/uk_shard2_replica1/data/index/_3mq1.tvd
>
> Regards,
> Gavin.
>


Re: float or string type for a field with whole number and decimal number values?

2016-05-31 Thread Erick Erickson
First, when changing the topic of the thread, please start a new thread. This
is called "thread hijacking" and makes it difficult to find threads later.

Collection aliasing does not do _anything_ about adding/deleting/whatever.
It's just a way to do exactly what you want. Your clients point to
mycollection.

You use the CREATEALIAS command to point mycollection to mycollection_1.
Thereafter you can do anything you want to mycollection_1 using either name.

That is, you can address mycollection_1 explicitly. You can use mycollection. It
doesn't matter.

Then you can create mycollection_2. So far you can _only_ address mycollection_2
explicitly. You then use the CREATEALIAS to point mycollection at
mycollection_2.
At that point, anybody using mycollection will start working with
mycollection_2.

Meanwhile, mycollection_1 is still addressable (presumably by the back end) by
addressing it explicitly rather than through an alias. It has _not_ been changed
in any way by creating the new alias.

Best,
Erick
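For reference, the alias switch described above is a single Collections API
call; a sketch using the names from this thread:

  # Point the alias "mycollection" at mycollection_1
  curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=mycollection&collections=mycollection_1"

  # Later, re-point the same alias at mycollection_2; clients that query
  # "mycollection" switch over without any change on their side
  curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=mycollection&collections=mycollection_2"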

On Mon, May 30, 2016 at 11:15 PM, Derek Poh  wrote:
> Hi Erick
>
> Thank you for pointing out the sort behaviour of numbers in a string field.
> I did not think of that. Will use float.
>
> I would like to know how you guys would handle the usage of a collection
> alias in my case.
> I have a 'product' collection and I create a new collection 'product_tmp'
> for this field type change and index into it. I create an alias 'product'
> on this new 'product_tmp' collection.
> If I were to index to or delete documents from the 'product' collection,
> SOLR will index on and delete from the 'product_tmp' collection, am I right?
> That means the 'product' collection cannot be used anymore?
> Even if I were to create an alias 'product_old' on the 'product' collection
> and issue a delete-all-documents or index on 'product_old', SOLR will
> delete or index on the 'product_tmp' collection instead?
>
> My intention is to avoid having to update the client servers to point to
> the 'product_tmp' collection.
>
>
> On 5/31/2016 10:57 AM, Erick Erickson wrote:
>>
>> bq: Should I change the field type to "float" or "string"?
>>
>> I'd go with float. Let's assume you want to sort by
>> this field. 10.00 sorts before 9.0 if you
>> just use Strings. Plus floats are generally much more
>> compact.
>>
>> bq: do I need to delete all documents in the index and do a full indexing
>>
>> That's the way I'd do it. You can always index to a _new_ collection
>> (assuming SolrCloud) and use collection aliasing to switch your
>> search all at once
>>
>> Best,
>> Erick
>>
>> On Sun, May 29, 2016 at 12:56 AM, Derek Poh 
>> wrote:
>>>
>>> I am using solr 4.10.4.
>>>
>>>
>>> On 5/29/2016 3:52 PM, Derek Poh wrote:

 Hi

 I have a field that is of "int" type currently and its values are whole
 numbers.

 <field name="..." type="int" indexed="true" stored="true" multiValued="false"/>

 Due to a change in business requirements, this field will need to take in
 decimal numbers as well.
 This field is sorted on and filtered by range (field:[1 TO *]).

 Should I change the field type to "float" or "string"?
 For the change to take effect, do I need to delete all documents in the
 index and do a full indexing? Or can I just do a full indexing without
 the need to delete all documents first?

 Derek

>>>
>>>
>>>
>>
>>
>
>

Re: Solr leaking references to deleted files

2016-05-31 Thread Gavin Harcourt
Those two bugs would make sense as we have been reloading the cores 
quite frequently recently to apply new config and schema changes. I'll 
keep an eye on the situation now our reload spree has ended and see if 
it recurs.


Thanks,
Gavin.

On 31/05/16 16:14, Erick Erickson wrote:

Possibly:
SOLR-9116 or SOLR-9117? Note those two require that the core be
reloaded, so you have to be doing something a bit unusual for them to
be the problem.

Best,
Erick

On Tue, May 31, 2016 at 5:41 AM, Gavin Harcourt
 wrote:

Hi All,

I've noticed on some of my solr nodes that the disk usage is increasing over
time. After checking the output of lsof I found hundreds of references to
deleted index files being held by solr. This totaled 24GB on a 16GB index. A
restart of solr can obviously fix this but this is not an ideal solution. We
are running solr 5.4.0 on OpenJDK  1.8.0_91. We are using the Concurrent
Mark Sweep GC although I've also seen the same problem on nodes using the G1
GC. Our update handler has autoCommit and softAutoCommit enabled (at
different intervals). We are using solr cloud and have multiple shards with
2 nodes each in our collections. I've not seen any pattern between this
appearing on leaders or replicas. Not all my nodes appear to be exhibiting
the problem either. Our usage pattern does involve a lot of churn in our
index with the majority of documents being updated/deleted every day.

Searching JIRA and the web in general I could only find references to this
sort of problem when running solr in tomcat. Can anyone suggest a reason why
this might be happening or a way I can manage it without needing to restart
solr?

Example lsof output:
java   1100  s123  DEL   REG 202,3   8919406
/home/s123/solr/data/uk_shard2_replica1/data/index/_3m9s.fdt
java   1100  s123  DEL   REG 202,3   8919159
/home/s123/solr/data/uk_shard2_replica1/data/index/_3mnk.tvd
java   1100  s123  DEL   REG 202,3   8919150
/home/s123/solr/data/uk_shard2_replica1/data/index/_3mnk_Lucene50_0.tim
java   1100  s123  DEL   REG 202,3   8919094
/home/s123/solr/data/uk_shard2_replica1/data/index/_3mq1_Lucene50_0.tim
java   1100  s123  DEL   REG 202,3   8919103
/home/s123/solr/data/uk_shard2_replica1/data/index/_3mq1.tvd

Regards,
Gavin.





Re: Clarity on Sharding Concepts.

2016-05-31 Thread Siddhartha Singh Sandhu
Hi Mugeesh,

I was speculating whether sharding is done on:
1. index terms, with each shard having the whole document space, or
2. document space, with each shard having num(documents)/num(shards) of the
documents divided between them.

Regards,

Sid.

On Tue, May 31, 2016 at 12:19 AM, Mugeesh Husain  wrote:

> Hi,
>
> Read this document
>
> https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud
> for a proper understanding.
>
> FYI, with the default compositeId router, documents are distributed across
> shards based on a hashing technique.
>
> If you index 50 documents, they will be divided into two parts: some go to
> shard1 and the rest to shard2, and each document will also go to its
> shard's replicas.
>
>
> Thanks
> Mugeesh
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Clarity-on-Sharding-Concepts-tp4279842p4279856.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: SolrCloud Shard console shows roughly same number of documents?

2016-05-31 Thread Siddhartha Singh Sandhu
Hi,

I was speculating whether sharding is done on:
1. index terms, with each shard having the whole document space, or
2. document space, with each shard having num(documents)/num(shards) of the
documents divided between them.

Regards,

Sid.

On Tue, May 31, 2016 at 9:27 AM, Siddhartha Singh Sandhu <
sandhus...@gmail.com> wrote:

> Thank you.
>
> On Mon, May 30, 2016 at 11:15 PM, Erick Erickson 
> wrote:
>
>> You should have:
>> shard1_replica1 + shard2_replica1 = 50 ?
>>
>> On Sat, May 28, 2016 at 9:58 AM, Siddhartha Singh Sandhu
>>  wrote:
>> > Still struggling with this. Bump. :)
>> >
>> > On Thu, May 26, 2016 at 3:53 PM, Siddhartha Singh Sandhu
>> >  wrote:
>> >>
>> >> Hi Erick,
>> >>
>> >> Thank you for the reply. What I meant was suppose I have the config:
>> >>
>> >> 2 shards each with 1 replica.
>> >>
>> >> Hence, on both servers I have
>> >> 1.  shard1_replica1
>> >> 2 . shard2_replica1
>> >>
>> >> Suppose I have 50 documents then,
>> >> shard1_replica1 + shard2_replica1 = 50 ?
>> >>
>> >> or shard2_replica1 = 50 && shard1_replica1 = 50 ?
>> >>
>> >> Regards,
>> >>
>> >> Sid.
>> >>
>> >> On Thu, May 26, 2016 at 2:30 PM, Erick Erickson <
>> erickerick...@gmail.com>
>> >> wrote:
>> >>>
>> >>> Q1: Not quite sure what you mean. Let's say I have 2 shards, 3
>> >>> replicas each, 16 docs on each. I _think_ you're
>> >>> talking about the "core selector", which shows the docs on that
>> >>> particular core, 16 in our case not 48.
>> >>>
>> >>> Q2: Yes, that's how SolrCloud is designed. It has to be for HA/DR.
>> >>> Every replica in a shard has all the docs, 16 as above. Otherwise, if
>> >>> one of your machines went down, there could be no guarantee, even an
>> >>> attempted one, against data loss.
>> >>>
>> >>> Q3: Yes, indexing will be slower when there is more than one replica
>> >>> per shard since the raw document is forwarded from the leader to all
>> >>> followers before acking back. In distributed situations, you will have
>> >>> (potentially) a bunch more machines doing indexing, so total throughput
>> >>> can be faster.
>> >>>
>> >>> Why do you care? Is there a problem or is this just general background
>> >>> info? There are a number of techniques for speeding up indexing, the
>> >>> first is to use SolrJ and CloudSolrClient and send batches of docs at
>> >>> once rather than one-at-a-time.
>> >>>
>> >>> Best,
>> >>> Erick
>> >>>
>> >>> On Wed, May 25, 2016 at 1:54 PM, Siddhartha Singh Sandhu
>> >>>  wrote:
>> >>> > Hi,
>> >>> >
>> >>> > I recently moved to a SolrCloud config. I had a few questions:
>> >>> >
>> >>> > Q1. Does a shard show cumulative number of documents or documents
>> >>> > present
>> >>> > in that particular shard on the admin console of respective shard?
>> >>> >
>> >>> > Q2. If 1's answer is non-cumulative, then my shards (on different
>> >>> > servers)
>> >>> > are indexing all the documents on each instance of shard. Is this
>> >>> > natural?
>> >>> > I created the shards with compositeId.
>> >>> >
>> >>> > Q3. If the answer to 1 is cumulative, then my indexing was slower
>> >>> > than a single-core instance which was on the same machine, of which I
>> >>> > have 2 now (my shards). What could I be missing while configuring Solr?
>> >>> >
>> >>> >
>> >>> > I am using Solr 6.0.0 on Ubuntu 14.04 with external zookeeper.
>> >>> >
>> >>> > Regards,
>> >>> >
>> >>> > Sid.
>> >>
>> >>
>> >
>>
>
>


Re: Sorting documents in one core based on a field in another core

2016-05-31 Thread Mark Robinson
Thanks for the reply Eric!

Can we write a custom sort component to achieve this?...
I am thinking of normalizing as the last option as clear separation of the
cores helps me.

Thanks!
Mark.

On Tue, May 31, 2016 at 11:12 AM, Erick Erickson 
wrote:

> Join doesn't work like that, which is why it's referred
> to as "pseudo join". There's no way that I know of
> to do what you want here.
>
> I'd strongly recommend you flatten your data at index time.
>
> Best,
> Erick
>
> On Tue, May 31, 2016 at 7:41 AM, Mark Robinson 
> wrote:
> > Hi,
> >
> > I have a requirement to sort records in one core/collection based on a
> > field in another core/collection.
> >
> > Could someone please advise how this can be done in Solr?
> >
> > I have used !join to restrict documents in one core based on field values
> > in another core. Is there some way to sort like that?
> >
> >
> > Thanks!
> > Mark.
>


RE: Clarity on Sharding Concepts.

2016-05-31 Thread Garth Grimm
Both.

One shard will have roughly half the documents, and the indices built from 
them; the other shard will have the other half of the documents, and the 
indices built from those.

There won't be one location that contains all the documents, nor all the 
indices.

-Original Message-
From: Siddhartha Singh Sandhu [mailto:sandhus...@gmail.com] 
Sent: Tuesday, May 31, 2016 10:43 AM
To: solr-user@lucene.apache.org; muge...@gmail.com
Subject: Re: Clarity on Sharding Concepts.

Hi Mugeesh,

I was speculating whether sharding is done on:
1. index terms, with each shard having the whole document space, or
2. document space, with each shard having num(documents)/num(shards) of the
documents divided between them.

Regards,

Sid.

On Tue, May 31, 2016 at 12:19 AM, Mugeesh Husain  wrote:

> Hi,
>
> Read this document
>
> https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud
> for a proper understanding.
>
> FYI, with the default compositeId router, documents are distributed
> across shards based on a hashing technique.
>
> If you index 50 documents, they will be divided into two parts: some go
> to shard1 and the rest to shard2, and each document will also go to its
> shard's replicas.
>
>
> Thanks
> Mugeesh
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Clarity-on-Sharding-Concepts-tp4279
> 842p4279856.html Sent from the Solr - User mailing list archive at 
> Nabble.com.
>


Re: Add a new field dynamically to each of the result docs and sort on it

2016-05-31 Thread Mark Robinson
Sorry Erick... I did not phrase it right. What I meant was that the field is
there in the schema, but I do not have values for it when normal indexing
happens.
When a query comes in, I want to populate a value for this field in the
results based on some values passed in the query.
So what goes into the result depends on a parameter in the query, and I would
like to sort the final results on this field as well, even though it is
dynamically populated.

What would be the best way to dynamically add a value to this field based
on a query parameter and also sort on this field?

Would a custom component help, with code in the *process* method to access
the results one by one and plug in this field?
If so, do I need to first index the value inside the *process* method for
each result, or is there a way to just add this value to each of my result
docs (no indexing), iterating through the result set and plugging in this
value for each result?

How will sort be applicable on this dynamically populated field, as I am
already working on the results? Is it too late to specify a sort, and if
so, how could it be possible?

Thanks!
Mark.






On Tue, May 31, 2016 at 11:10 AM, Erick Erickson 
wrote:

> I really don't understand this. If you don't have
> "fieldnew", where is the value coming from? It's
> not in the index so
>
> If you mean you're _adding_ a field after the index
> already has some docs in it, then the normal
> sort rules apply and you can specify sortMisingFirst/Last
> to tell Solr where other docs without that field shold go.
>
> Normal sort rules are '&sort=field1 asc,field2 desc' etc.
>
> Best,
> Erick
>
> On Tue, May 31, 2016 at 7:53 AM, Mark Robinson 
> wrote:
> > Hi,
> >
> > My core does not have a field, say *fieldnew*.
> >
> > *Case 1:-*
> > But in my results I would like to have *fieldnew* also, and my results
> > should be sorted on only this new field.
> >
> > *Case 2:-*
> > Adding one more case: suppose I have other fields in the sort criteria
> > and *fieldnew* is one among them; in that case, how do I realize this
> > multi-field sort as well?
> >
> > Could someone suggest a way, please?
> >
> > Thanks!
> > Mark.
>


ClusterState says we are the leader, but locally we don't think so

2016-05-31 Thread Jon Drews
We have seen the following error on four separate instances of Solr. The
result is that all or most shards go into "Down" state and do not recover
on restart of Solr.

I'm hoping one of you has some insight into what might be causing it as we
haven't been able to track down the issue or reproduce it reliably.

2016-05-26 21:00:09.000 ERROR (qtp1450821318-15) [c:log s:20160526
r:core_node4 x:log_20160526_replica1] o.a.s.c.SolrCore
org.apache.solr.common.SolrException: ClusterState says we are the leader (
https://localhost:8984/solr/log_20160526_replica1), but locally we don't
think so. Request came from
https://localhost:8984/solr/log_20160524_replica1/

We were able to recover by using https://github.com/echoma/zkui/ to
manually edit the /clusterstate.json and /collections/log/state.json to set
shards from "Down" to "Active". After that the error subsided and
functionality was restored.

A few notes:
- All four systems were on either Windows 7 or Windows Server 2012.
- All four systems are on single servers with embedded zookeepers.
- SSL was enabled in Solr, but no authentication
- After the issue, we increased the zkClientTimeout and restarted, however
all shards were still in a Down state and error persisted.
- Migrating the solr instance to a new Windows install did not solve issue.

Please let me know if you have any ideas as to why this is happening and
possible solutions. Thanks!


Re: Add a new field dynamically to each of the result docs and sort on it

2016-05-31 Thread Shawn Heisey
On 5/31/2016 10:16 AM, Mark Robinson wrote:
> Sorry Erick... I did not phrase it right. What I meant was that the field is
> there in the schema, but I do not have values for it when normal indexing
> happens.
> When a query comes in, I want to populate value for this field in the
> results based on some values passed in the query.
> So what needs to be accommodated in the result depends on a parameter in
> the query and I would like to sort the final results on this field also,
> which is dynamically populated.
>
> What would be the best way to dynamically add a value to this field based
> on a query parameter and also sort on this field?

Queries do not normally change the index.  They normally use a Lucene
object that can search the index, not an object that can write to the index.

I do not know whether a custom query component will be able to achieve
write access to the index or not.  If it can, then you *might* be able
to do what you want, but be aware that the index must meet the
requirements for Atomic Update functionality, or the entire idea won't
work at all:

https://wiki.apache.org/solr/Atomic_Updates#Caveats_and_Limitations

I have never written a custom query component, so I cannot say for sure
that this is achievable.
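For context, an atomic update request looks like the following sketch (the
core name, id, and value are hypothetical); "set" rewrites just that one
field, but only when the index meets the caveats linked above:

  curl "http://localhost:8983/solr/mycore/update?commit=true" \
    -H 'Content-Type: application/json' \
    -d '[{"id":"doc1","fieldnew":{"set":42}}]'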

> Will a custom component help, with code in the *process *method to access
> the results one by one and plug in this field help?
> If so do I need to first index the value inside the *process *method for
> reach result or is there a way to just add this value to each of my results
> doc (no indexing) iterating through the result set and plugging in this
> value for each result.

The mention of a "process" method here suggests that you are thinking of
an UpdateProcessor.  This would work if you are indexing ... but above
you said this would happen at *query* time, which is a little bit different.

> How will sort be applicable on this dynamically populated field as I am
> already working on the results and is it too late to specify a sort and if
> so how could it be possible.

I do not know anything about custom code and sorting.

Thanks,
Shawn



Re: SolrCloud Shard console shows roughly same number of documents?

2016-05-31 Thread Shawn Heisey
On 5/31/2016 9:53 AM, Siddhartha Singh Sandhu wrote:
> I was speculating whether sharding is done on: 1. index terms, with
> each shard having the whole document space, or 2. document space, with
> each shard having num(documents)/num(shards) of the documents divided
> between them.

If the router for the collection is "implicit" then sharding is 100
percent manual.  You decide which shard gets the document when you index
it.  There is no automatic shard routing.

If the router is "compositeId" then the shard is determined by doing a
hash on the value of the uniqueKey field, then looking up which shard
handles that hash in the clusterstate.  This choice can be influenced by
using a composite ID value.  If there are plenty of documents and you
don't use composite IDs, the distribution between shards will be mostly
equal.  The following URL contains some information on composite ID routing:

https://lucidworks.com/blog/2014/01/06/multi-level-composite-id-routing-solrcloud/
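To make the compositeId behaviour concrete, a hypothetical example: both
documents below share the "tenant1!" prefix in their uniqueKey, so the hash
of the prefix routes them to the same shard:

  curl "http://localhost:8983/solr/mycollection/update?commit=true" \
    -H 'Content-Type: application/json' \
    -d '[{"id":"tenant1!doc1"},{"id":"tenant1!doc2"}]'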

Thanks,
Shawn



Re: Faceting and Grouping Performance Degradation in Solr 5

2016-05-31 Thread Alessandro Benedetti
Interesting developments :

https://issues.apache.org/jira/browse/SOLR-9176

I think we found why term enum seems slower in recent Solr!
In our case it is likely related to the commit I mention in the Jira.
Have a look, Joel!
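For anyone comparing implementations, a minimal JSON Facet API request of the
kind discussed in this thread (the collection and field names are made up):

  curl http://localhost:8983/solr/mycollection/query -d '
  {
    "query": "*:*",
    "facet": {
      "categories": { "type": "terms", "field": "cat", "limit": 10 }
    }
  }'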

On Wed, May 25, 2016 at 12:30 PM, Alessandro Benedetti <
abenede...@apache.org> wrote:

> I am investigating this scenario right now.
> I can confirm that the enum slowness is in Solr 6.0 as well.
> And I agree with Joel, it seems to be un-related with the famous faceting
> regression :(
>
> Furthermore with the legacy facet approach, if you set docValues for the
> field you are not going to be able to try the enum approach anymore.
>
> org/apache/solr/request/SimpleFacets.java:448
>
> if (method == FacetMethod.ENUM && sf.hasDocValues()) {
>   // only fc can handle docvalues types
>   method = FacetMethod.FC;
> }
>
>
> I got really horrible regressions simply using term enum in both Solr 4
> and Solr 6.
>
> And even the most optimized fcs approach with docValues and
> facet.threads=nCore does not perform as the simple enum in Solr 4 .
>
> i.e.
>
> For some sample queries I have 40 ms vs 160 ms and similar...
> I think we should open an issue if we can confirm it is not related to
> the other.
> A lot of people will continue using the legacy approach for a while...
>
> On Wed, May 18, 2016 at 10:42 PM, Joel Bernstein 
> wrote:
>
>> The enum slowness is interesting. It would appear on the surface to not be
>> related to the FieldCache issue. I don't think the main emphasis of the
>> JSON facet API has been the enum approach. You may find using the JSON
>> facet API and eliminating the use of enum meets your performance needs.
>>
>> With the CollapsingQParserPlugin top_fc is definitely faster during
>> queries. The tradeoff is slower warming times and increased memory usage
>> if
>> the collapse fields are used in faceting, as faceting will load the field
>> into a different cache.
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>> On Wed, May 18, 2016 at 5:28 PM, Solr User  wrote:
>>
>> > Joel,
>> >
>> > Thank you for taking the time to respond to my question.  I tried the
>> JSON
>> > Facet API for one query that uses facet.method=enum (since this one has
>> a
>> > ton of unique values and performed better with enum) but this was way
>> > slower than even the slower Solr 5 times.  I did not try the new API
>> with
>> > the non-enum queries though so I will give that a go.  It looks like
>> Solr
>> > 5.5.1 also has a facet.method=uif which will be interesting to try.
>> >
>> > If these do not prove helpful, it looks like I will need to wait for
>> > SOLR-8096 to be resolved before upgrading.
>> >
>> > Thanks also for your comment on top_fc for the CollapsingQParser.  I use
>> > collapse/expand for some queries but traditional grouping for others
>> due to
>> > performance.  It will be interesting to see if those grouping queries
>> > perform better now using CollapsingQParser with top_fc.
>> >
>> > On Wed, May 18, 2016 at 11:39 AM, Joel Bernstein 
>> > wrote:
>> >
>> > > Yes, SOLR-8096 is the issue here.
>> > >
>> > > I don't believe indexing with docValues is going to help too much with
>> > > this. The enum slowness may not be related, but I'm not positive about
>> > > that.
>> > >
>> > > The major slowdowns are likely due to the removal of the top level
>> > > FieldCache from general use and the removal of the FieldValuesCache
>> which
>> > > was used for multi-value field faceting.
>> > >
>> > > The JSON facet API covers all the functionality in the traditional
>> > > faceting, and it has been developed to be very performant.
>> > >
>> > > You may also want to see if Collapse/Expand can meet your application's
>> > > needs rather than Grouping. It allows you to specify using a top-level
>> > > FieldCache if performance is a blocker without it.
>> > >
>> > >
>> > >
>> > >
>> > > Joel Bernstein
>> > > http://joelsolr.blogspot.com/
>> > >
>> > > On Wed, May 18, 2016 at 10:42 AM, Solr User 
>> wrote:
>> > >
>> > > > Does anyone know the answer to this?
>> > > >
>> > > > On Wed, May 4, 2016 at 2:19 PM, Solr User 
>> wrote:
>> > > >
>> > > > > I recently was attempting to upgrade from Solr 4.8.1 to Solr 5.4.1
>> > but
>> > > > had
>> > > > > to abort due to average response times degraded from a baseline
>> > volume
>> > > > > performance test.  The affected queries involved faceting (both
>> enum
>> > > > method
>> > > > > and default) and grouping.  There is a critical bug
>> > > > > https://issues.apache.org/jira/browse/SOLR-8096 currently open
>> > which I
>> > > > > gather is the cause of the slower response times.  One concern I
>> have
>> > > is
>> > > > > that discussions around the issue offer the suggestion of indexing
>> > with
>> > > > > docValues which alleviated the problem in at least that one
>> reported
>> > > > case.
>> > > > > However, indexing with docValues did not improve the performance
>> in
>> > my
>> > > > case.
>> > > > >
>> > > > > Can someone ple

Re: [Solr 6] Migration from Solr 4.10.2

2016-05-31 Thread Alessandro Benedetti
I think we found our performance killer here :

https://issues.apache.org/jira/browse/SOLR-9176

Basically we were thinking of using term enum, but under the hood
Solr forces you to use FCS with single-valued numeric fields.
In Solr 4 it was not like that.
I checked the related commit; it is not functionally equivalent, and there is
no related message in it.
Let's continue the discussion in Jira.

On Wed, May 25, 2016 at 9:45 AM, Alessandro Benedetti  wrote:

> I was taking a look into the code again :
> org/apache/solr/search/facet/FacetField.java:115 ( branch 6.0 )
>
>> if (!multiToken) {
>>   if (ntype != null) {
>>     // single valued numeric (docvalues or fieldcache)
>>     return new FacetFieldProcessorNumeric(fcontext, this, sf);
>>   } else {
>>     // single valued string...
>>     return new FacetFieldProcessorDV(fcontext, this, sf);
>>   }
>> }
>> // multi-valued after this point
>> if (sf.hasDocValues() || method == FacetMethod.DV) {
>>   // single and multi-valued string docValues
>>   return new FacetFieldProcessorDV(fcontext, this, sf);
>> }
>> // Top-level multi-valued field cache (UIF)
>> return new FacetFieldProcessorUIF(fcontext, this, sf);
>
>
> This part is for the new JSON Facet code ( but when you pass the uif
> method in legacy faceting, we delegate to this code, mocking the JSON API ).
> According to this code, if you have docValues for the field, single-valued
> or multi-valued, you are going to use FacetFieldProcessorDV.
> This seems to be the reason I don't see my fieldValueCache populated, I
> have both single/multi valued fields now, but all of them have docValues!
>
> On Tue, May 24, 2016 at 9:38 PM, Mikhail Khludnev <
> mkhlud...@griddynamics.com> wrote:
>
>> Alessandro,
>>
>> I checked with Solr 6.0 distro on techproducts.
>> Faceting on cat with uif hits fieldValueCache
>>
>> http://localhost:8983/solr/techproducts/select?facet.field=cat&facet.method=uif&facet=on&indent=on&q=*:*&wt=json
>>
>> fieldValueCache
>> - class:org.apache.solr.search.FastLRUCache
>> - description:Concurrent LRU Cache(maxSize=10000, initialSize=10,
>> minSize=9000, acceptableSize=9500, cleanupThread=false)
>> - src:
>> - version:1.0 stats:
>>
>>- cumulative_evictions:0
>>- cumulative_hitratio:0.5
>>- cumulative_hits:1
>>- cumulative_inserts:2
>>- cumulative_lookups:2
>>- evictions:0
>>- hitratio:0.5
>>- hits:1
>>- inserts:2
>>- item_cat:
>>
>>  
>> {field=cat,memSize=4665,tindexSize=46,time=28,phase1=27,nTerms=16,bigTerms=2,termInstances=21,uses=0}
>>- lookups:2
>>- size:1
>>
>> Beware: for example, the field manu_exact doesn't hit the field value
>> cache, because it is single-valued and goes to FacetFieldProcessorDV
>> instead of FacetFieldProcessorUIF.  And cat is multi-valued and hits UIF.
>> See org.apache.solr.search.facet.FacetField.createFacetProcessor(FacetContext);
>> you might need to just debug there.
>>
>> In summary, uif works and you have a chance to hit it. Good Luck!
>>
>> On Tue, May 24, 2016 at 7:43 PM, Alessandro Benedetti <
>> benedetti.ale...@gmail.com> wrote:
>>
>> > Update , it seems clear I incurred in the bad
>> > https://issues.apache.org/jira/browse/SOLR-8096 :
>> >
>> > Just adding some additional information as I just incurred on the issue
>> > with Solr 6.0 :
>> > Static index, around 50 * 10^6 docs, 20 fields to facet, one of them
>> > with high cardinality, on top of grouping.
>> > Grouping was not a factor at all.
>> >
>> > All the symptoms are there, Solr 4.10.2 around 150 ms and Solr 6.0
>> around
>> > 550 ms .
>> > The 'fieldValueCache' seems to be unused (no inserts nor lookups) in
>> Solr
>> > 6.0.
>> > In Solr 4.10 the 'fieldValueCache' is in heavy use with a
>> > cumulative_hitratio of 0.96 .
>> > Switching from enum to fc to fcs to uif did not change that much.
>> >
>> > Moving to DocValues didn't improve the situation much ( but I was on
>> > an optimized index, so I need to try the multi-segmented one, according
>> > to Mikhail Khludnev's contribution in Solr 5.4.0 ).
>> >
>> > Moving to field collapsing brought the query down to 110-120 ms ( but
>> > this is expected, we were faceting on 260 / 1 million original docs ).
>> > Adding facet.threads=NCores brought the query time down to 100 ms; in
>> > combination with field collapsing we reached 80-90 ms when warmed.
>> >
>> > What are the plans for the future related to this ?
>> > Do we want to deprecate the legacy facet implementation and move
>> > everything to JSON facets ( as happened with UIF ) ?
>> > So backward compatible, but a different implementation ?
>> >
>> > I think migration should be a transparent process.
>> >
>> >
>> > Cheers
>> >
>> > On Mon, May 23, 2016 at 6:49 PM, Alessandro Benedetti <
>> > benedetti.ale...@gmail.com> wrote:
>> >
>> > > Furthermore I was checking the internals of the old facet
>> implementation
>> > (
>> > > which comes when using the classic request parameter based,  inst

Re: [Solr 6] Legacy faceting Term Enum method VS DocValues

2016-05-31 Thread Alessandro Benedetti
Further investigations lead to :

https://issues.apache.org/jira/browse/SOLR-9176

On Tue, May 24, 2016 at 12:47 PM, Alessandro Benedetti <
abenede...@apache.org> wrote:

> Hi guys,
> I have been thinking about this for a while, and yesterday I took a look
> at the code:
>
> I was wondering if the termEnum approach is still a valid alternative to
> docValues when we have low-cardinality fields.
>
> The reason I am asking is that yesterday I ran into this piece of
> code:
>
> org/apache/solr/request/SimpleFacets.java:448
>
> if (method == FacetMethod.ENUM && sf.hasDocValues()) {
>   // only fc can handle docvalues types
>   method = FacetMethod.FC;
> }
>
> So it seems that, if you enable docValues in the schema, we are
> always going to use them even if the method selected is term enum.
>
> So does that mean that, as long as we have enough disk space, it is now
> always advisable to use docValues ?
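>
> For reference, enabling docValues is just a flag on the field definition
> in the schema -- a minimal sketch, field name illustrative:
>
> <field name="category_s" type="string" indexed="true" stored="false" docValues="true"/>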
>
> Of course I know that would be great to move as soon as possible to the
> new json facet API approach.
>
> P.S. still verifying the famous legacy facet degradation on latest Solr
> compared to old Solr4.
>
> Cheers
> --
> --
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
>



-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Re: ClusterState says we are the leader, but locally we don't think so

2016-05-31 Thread Jon Drews
I forgot to add that this is Apache Solr 5.3.1.

There are three collections, two of which have one shard and the other
has 3-5 shards. Approximately 200,000 documents across all collections.

Jon Drews
jondrews.com

On Tue, May 31, 2016 at 12:15 PM, Jon Drews  wrote:

> We have seen the following error on four separate instances of Solr. The
> result is that all or most shards go into "Down" state and do not recover
> on restart of Solr.
>
> I'm hoping one of you has some insight into what might be causing it as we
> haven't been able to track down the issue or reproduce it reliably.
>
> 2016-05-26 21:00:09.000 ERROR (qtp1450821318-15) [c:log s:20160526
> r:core_node4 x:log_20160526_replica1] o.a.s.c.SolrCore
> org.apache.solr.common.SolrException: ClusterState says we are the leader (
> https://localhost:8984/solr/log_20160526_replica1), but locally we don't
> think so. Request came from
> https://localhost:8984/solr/log_20160524_replica1/
>
> We were able to recover by using https://github.com/echoma/zkui/ to
> manually edit the /clusterstate.json and /collections/log/state.json to set
> shards from "Down" to "Active". After that the error subsided and
> functionality was restored.
>
> A few notes:
> - All four systems were on either Windows 7 or Windows Server 2012.
> - All four systems are on single servers with embedded zookeepers.
> - SSL was enabled in Solr, but no authentication
> - After the issue, we increased the zkClientTimeout and restarted, however
> all shards were still in a Down state and error persisted.
> - Migrating the solr instance to a new Windows install did not solve the issue.
>
> Please let me know if you have any ideas as to why this is happening and
> possible solutions. Thanks!
>


Re: Solr vs JDBC driver

2016-05-31 Thread Joel Bernstein
You mentioned that you had to use Class.forName() for other drivers as
well. Possibly there is something in your setup that is suppressing the
driver auto loading.
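
In case it helps others, here is a minimal sketch of the pattern being
described (the ZK address and collection name are taken from this thread;
the SQL statement itself is illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SolrJdbcSketch {
    public static void main(String[] args) throws Exception {
        // With JDBC 4.0 auto-loading this call should not be needed, but
        // it works around setups where the ServiceLoader lookup fails.
        Class.forName("org.apache.solr.client.solrj.io.sql.DriverImpl");

        // ZK ensemble plus the collection to query, as in the message above.
        String url = "jdbc:solr://10.28.213.133:2181?collection=Current1";
        try (Connection con = DriverManager.getConnection(url);
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("select id from Current1 limit 10")) {
            while (rs.next()) {
                System.out.println(rs.getString("id"));
            }
        }
    }
}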

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, May 31, 2016 at 8:30 AM, Vachon, Jean-Sébastien <
jvac...@cebglobal.com> wrote:

> I am using Java 8 (JDK 1.8.091) and it’s an application layer on top of
> Solr 6 using SolrJ.
> Here is the section of my pom.xml
>
> <dependency>
>   <groupId>org.apache.solr</groupId>
>   <artifactId>solr-solrj</artifactId>
>   <version>6.0.0</version>
> </dependency>
>
>
> I had to manually load the driver
> (“org.apache.solr.client.solrj.io.sql.DriverImpl")
> to make it work (no big deal).
>
> My connection string is as follows:
>
> "jdbc:solr://10.28.213.133:2181?collection=Current1"
>
>
> Is there something wrong in my setup?
>
> Thanks
>
> On 2016-05-28, 9:42 AM, "Joel Bernstein"  wrote:
>
> >The driver is included in /META-INF/services/java.sql.Driver. So if you're
> >using JDBC 4.0, the driver should be autoloaded.
> >
> >What version of java are you running?
> >
> >Joel Bernstein
> >
> >http://joelsolr.blogspot.com/
> >
> >On Fri, May 27, 2016 at 8:16 PM, Vachon, Jean-Sébastien <
> >jvac...@cebglobal.com> wrote:
> >
> >> Never mind... I had to load the class just like any database driver:
> >>
> >>
> >>
> >>Class.forName("org.apache.solr.client.solrj.io.sql.DriverImpl").newInstance();
> >>
> >>
> >>
> >>
> >> On 2016-05-27, 2:59 PM, "Vachon, Jean-Sébastien"  >
> >> wrote:
> >>
> >> >Hi All,
> >> >
> >> >
> >> >
> >> >I am trying to use Solr's JDBC driver in Java and I'm stuck with the
> >> >following error message:
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >14:52:37,802 ERROR [consoleLogger] java.sql.SQLException: No suitable
> >> >driver found for jdbc:solr://
> 10.28.213.133:2181/solr?collection=Current
> >> >
> >> >
> >> >
> >> >My pom.xml contains:
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >org.apache.solr
> >> >
> >> >
> >> >
> >> >solr-solrj
> >> >
> >> >
> >> >
> >> >6.0.0
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >I looked at different posts:
> >> >
> >> >
> >> >
> >> >Yonik's:
> >> >
> >> >http://yonik.com/solr-6/
> >> >
> >> >Sematext:
> >> >
> >> >https://sematext.com/blog/2016/04/26/solr-6-as-jdbc-data-source/
> >> >
> >> >
> >> >
> >> >And I seem to meet all the requirements
> >> >
> >> >
> >> >
> >> >Any idea on what I'm doing wrong?
> >> >
> >> >
> >> >
> >> >Thanks
> >> >

Alternate Port Not Working for Solr 6.0.0

2016-05-31 Thread Teague James
Hello,

I am trying to install Solr 6.0.0 and have been successful with the default
installation, following the instructions provided on the Apache Solr
website. However, I do not want Solr running on port 8983, I want it to run
on port 80. I started a new Ubuntu 14.04 VM, installed open JDK 8, then
installed Solr with the following commands:

Command: tar xzf solr-6.0.0.tgz solr-6.0.0/bin/install_solr_service.sh
--strip-components=2
Response: None, which is good.

Command: ./install_solr_service.sh solr-6.0.0.tgz -p 80
Response: Misplaced or Unknown flag -p

So I tried...
Command: ./install_solr_service.sh solr-6.0.0.tgz -i /opt -d /var/solr -u
solr -s solr -p 80
Response: A dump of the log, which is INFO only with no errors or warnings,
at the top of which is "Solr process 4831 from /var/solr/solr-80.pid not
found"

If I look in the /var/solr directory I find a file called solr-80.pid, but
nothing else. What did I miss? Previous versions of Solr, which I deployed
with Tomcat instead of Jetty, allowed me to control this in the server.xml
file in /etc/tomcat7/, but obviously this no longer applies. I like the ease
of the installation script; I just want to be able to control the port
assignment. Any help is appreciated! Thanks!

-Teague

PS - Please resist the urge to ask me why I want it on port 80. I am well
aware of the security implications, etc., but regardless I still need to
make this operational on port 80. Cheers!



Re: Alternate Port Not Working for Solr 6.0.0

2016-05-31 Thread John Bickerstaff
This may be no help at all, but my first thought is to wonder if anything
else is already running on port 80?

That might explain the somewhat silent "fail"...

Nicely said by the way - resisting the urge 

On Tue, May 31, 2016 at 2:02 PM, Teague James 
wrote:

> Hello,
>
> I am trying to install Solr 6.0.0 and have been successful with the default
> installation, following the instructions provided on the Apache Solr
> website. However, I do not want Solr running on port 8983, I want it to run
> on port 80. I started a new Ubuntu 14.04 VM, installed open JDK 8, then
> installed Solr with the following commands:
>
> Command: tar xzf solr-6.0.0.tgz solr-6.0.0/bin/install_solr_service.sh
> --strip-components=2
> Response: None, which is good.
>
> Command: ./install_solr_service.sh solr-6.0.0.tgz -p 80
> Response: Misplaced or Unknown flag -p
>
> So I tried...
> Command: ./install_solr_service.sh solr-6.0.0.tgz -i /opt -d /var/solr -u
> solr -s solr -p 80
> Response: A dump of the log, which is INFO only with no errors or warnings,
> at the top of which is "Solr process 4831 from /var/solr/solr-80.pid not
> found"
>
> If I look in the /var/solr directory I find a file called solr-80.pid, but
> nothing else. What did I miss? Previous versions of Solr, which I deployed
> with Tomcat instead of Jetty, allowed me to control this in the server.xml
> file in /etc/tomcat7/, but obviously this no longer applies. I like the
> ease
> of the installation script; I just want to be able to control the port
> assignment. Any help is appreciated! Thanks!
>
> -Teague
>
> PS - Please resist the urge to ask me why I want it on port 80. I am well
> aware of the security implications, etc., but regardless I still need to
> make this operational on port 80. Cheers!
>
>


Re: Solr leaking references to deleted files

2016-05-31 Thread Erick Erickson
Cool, please let us know what you find out.

On Tue, May 31, 2016 at 8:34 AM, Gavin Harcourt
 wrote:
> Those two bugs would make sense as we have been reloading the cores quite
> frequently recently to apply new config and schema changes. I'll keep an eye
> on the situation now our reload spree has ended and see if it recurs.
>
> Thanks,
> Gavin.
>
> On 31/05/16 16:14, Erick Erickson wrote:
>>
>> Possibly:
>> SOLR-9116 or SOLR-9117? Note those two require that the core be
>> reloaded, so you have to be doing something a bit unusual for them to
>> be the problem.
>>
>> Best,
>> Erick
>>
>> On Tue, May 31, 2016 at 5:41 AM, Gavin Harcourt
>>  wrote:
>>>
>>> Hi All,
>>>
>>> I've noticed on some of my solr nodes that the disk usage is increasing
>>> over
>>> time. After checking the output of lsof I found hundreds of references to
>>> deleted index files being held by solr. This totaled 24GB on a 16GB
>>> index. A
>>> restart of solr can obviously fix this but this is not an ideal solution.
>>> We
>>> are running solr 5.4.0 on OpenJDK  1.8.0_91. We are using the Concurrent
>>> Mark Sweep GC although I've also seen the same problem on nodes using the
>>> G1
>>> GC. Our update handler has autoCommit and softAutoCommit enabled (at
>>> different intervals). We are using solr cloud and have multiple shards
>>> with
>>> 2 nodes each in our collections. I've not seen any pattern between this
>>> appearing on leaders or replicas. Not all my nodes appear to be
>>> exhibiting
>>> the problem either. Our usage pattern does involve a lot of churn in our
>>> index with the majority of documents being updated/deleted every day.
>>>
>>> Searching JIRA and the web in general I could only find references to
>>> this
>>> sort of problem when running solr in tomcat. Can anyone suggest a reason
>>> why
>>> this might be happening or a way I can manage it without needing to
>>> restart
>>> solr?
>>>
>>> Example lsof output:
>>> java   1100  s123  DEL   REG 202,3   8919406
>>> /home/s123/solr/data/uk_shard2_replica1/data/index/_3m9s.fdt
>>> java   1100  s123  DEL   REG 202,3   8919159
>>> /home/s123/solr/data/uk_shard2_replica1/data/index/_3mnk.tvd
>>> java   1100  s123  DEL   REG 202,3   8919150
>>> /home/s123/solr/data/uk_shard2_replica1/data/index/_3mnk_Lucene50_0.tim
>>> java   1100  s123  DEL   REG 202,3   8919094
>>> /home/s123/solr/data/uk_shard2_replica1/data/index/_3mq1_Lucene50_0.tim
>>> java   1100  s123  DEL   REG 202,3   8919103
>>> /home/s123/solr/data/uk_shard2_replica1/data/index/_3mq1.tvd
>>>
>>> Regards,
>>> Gavin.
>>>
>


Re: Alternate Port Not Working for Solr 6.0.0

2016-05-31 Thread Shawn Heisey
On 5/31/2016 2:02 PM, Teague James wrote:
> Hello,
>
> I am trying to install Solr 6.0.0 and have been successful with the
> default installation, following the instructions provided on the Apache
> Solr website. However, I do not want Solr running on port 8983, I want
> it to run on port 80. I started a new Ubuntu 14.04 VM, installed open
> JDK 8, then installed Solr with the following commands:
>
> Command: tar xzf solr-6.0.0.tgz solr-6.0.0/bin/install_solr_service.sh
> --strip-components=2
> Response: None, which is good.
>
> Command: ./install_solr_service.sh solr-6.0.0.tgz -p 80
> Response: Misplaced or Unknown flag -p
>
> So I tried...
> Command: ./install_solr_service.sh solr-6.0.0.tgz -i /opt -d /var/solr
> -u solr -s solr -p 80
> Response: A dump of the log, which is INFO only with no errors or
> warnings, at the top of which is "Solr process 4831 from
> /var/solr/solr-80.pid not found"
>
> If I look in the /var/solr directory I find a file called solr-80.pid,
> but nothing else. What did I miss? Previous versions of Solr, which I
> deployed with Tomcat instead of Jetty, allowed me to control this in
> the server.xml file in /etc/tomcat7/, but obviously this no longer
> applies. I like the ease of the installation script; I just want to be
> able to control the port assignment. Any help is appreciated! Thanks!

The port can be changed after install, although I have been also able to
change the port during install with the -p parameter.  Check
/etc/default/solr.in.sh and look for a line setting SOLR_PORT.  On my
dev server, it looks like this:

SOLR_PORT=8982

Before making any changes in that file, make sure that Solr is not
running at all, or you may be forced to manually kill it.
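
A sketch of the full sequence, assuming the default service name from the
install script (adjust if you passed -s):

sudo service solr stop
# edit /etc/default/solr.in.sh and set SOLR_PORT=80
sudo service solr start

One caveat: binding a port below 1024 normally requires root privileges,
so the solr user may need help (e.g. a redirect from port 80), and that
may well be part of the trouble here.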

Thanks,
Shawn



Why Doesn't Solr Really Quit on Zookeeper Exceptions?

2016-05-31 Thread jimtronic
When I try to launch Solr 6.0 in cloud mode and connect it to a specific
chroot in zookeeper that doesn't exist, I get an error in my solr.log.
That's expected, but the solr process continues to launch and succeeds.

Why wouldn't we want the start process simply to fail and exit?

There's no mechanism to trigger a retry, so Solr just sits there like a
zombie.







Re: Add a new field dynamically to each of the result docs and sort on it

2016-05-31 Thread Erick Erickson
To have Lucene/Solr do the sorting, your value must be in the
docs at search time. Consider the clause "&sort=my_field asc".
If rows=10, then only the top 10 docs are kept. So if a doc's score
is non-zero, its value is compared against the 10 docs in the list and
either replaces one or is discarded.

You can't really do this dynamically. You don't know ahead of time
what docs will have non-zero score. Even if you did, you'd potentially
re-index your entire corpus (i.e. q=*:*&sort=something_dynamic) which
would be horrible.

Now, if what you want to do is just re-sort the topN docs, then you
can either create a custom component that re-sorts them or
use the ReRankingQParserPlugin.

But it sounds like this could really use some more thought. Have you
considered sorting by function? That can take the pre-indexed value in
the doc (or none) and sort the result by some set of calculations. You
haven't told us _what_ kinds of sorting you want to do so I have no
idea whether it applies or not.
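
For instance, here is a sketch of a function sort driven by a query-time
parameter (the field and parameter names are purely illustrative):

&sort=sum(popularity_i,$userBoost) desc&userBoost=10

And re-ranking looks something like this, re-sorting just the top 1000
docs of the main query by a second query:

&rq={!rerank reRankQuery=$rqq reRankDocs=1000 reRankWeight=3}&rqq=(hi OR hello)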

Best,
Erick

On Tue, May 31, 2016 at 9:33 AM, Shawn Heisey  wrote:
> On 5/31/2016 10:16 AM, Mark Robinson wrote:
>> sorry Eric... I did not phrase it right ... what I meant was the field is
>> there in the schema, but I do not have values for it when normal indexing
>> happens.
>> When a query comes in, I want to populate value for this field in the
>> results based on some values passed in the query.
>> So what needs to be accommodated in the result depends on a parameter in
>> the query and I would like to sort the final results on this field also,
>> which is dynamically populated.
>>
>> What could be the best way to dynamically add value to this field based on
>> a query parameter and sort on this field also.
>
> Queries do not normally change the index.  They normally use a Lucene
> object that can search the index, not an object that can write to the index.
>
> I do not know whether a custom query component will be able to achieve
> write access to the index or not.  If it can, then you *might* be able
> to do what you want, but be aware that the index must meet the
> requirements for Atomic Update functionality, or the entire idea won't
> work at all:
>
> https://wiki.apache.org/solr/Atomic_Updates#Caveats_and_Limitations
>
> I have never written a custom query component, so I cannot say for sure
> that this is achievable.
>
>> Will a custom component help, with code in the *process* method to access
>> the results one by one and plug in this field?
>> If so, do I need to first index the value inside the *process* method for
>> each result, or is there a way to just add this value to each of my result
>> docs (no indexing), iterating through the result set and plugging in this
>> value for each result?
>
> The mention of a "process" method here suggests that you are thinking of
> an UpdateProcessor.  This would work if you are indexing ... but above
> you said this would happen at *query* time, which is a little bit different.
>
>> How will sort be applicable on this dynamically populated field as I am
>> already working on the results and is it too late to specify a sort and if
>> so how could it be possible.
>
> I do not know anything about custom code and sorting.
>
> Thanks,
> Shawn
>


Re: Why Doesn't Solr Really Quit on Zookeeper Exceptions?

2016-05-31 Thread Shawn Heisey
On 5/31/2016 2:34 PM, jimtronic wrote:
> When I try to launch Solr 6.0 in cloud mode and connect it to a specific
> chroot in zookeeper that doesn't exist, I get an error in my solr.log.
> That's expected, but the solr process continues to launch and succeeds.
>
> Why wouldn't we want the start process simply to fail and exit?
>
> There's no mechanism to trigger a retry, so Solr just sits there like a
> zombie.

I can think of two ways to handle this:  Keep retrying the zkHost values
to get an initial connection, on a configurable interval that probably
should default to between 120 and 500 seconds, or die gracefully.  I
would prefer to see the retry, myself.

I thought I saw an issue about this in the past, but now I can't find it.

Thanks,
Shawn



Re: Sorting documents in one core based on a field in another core

2016-05-31 Thread Mikhail Khludnev
Hello Mark,

Is it sounds like what's described at
http://blog-archive.griddynamics.com/2015/08/scoring-join-party-in-solr-53.html
?
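
The short version: the score join lets a value computed on the "from" side
become the score of the "to"-side document, which you can then sort on.
A sketch, with core and field names illustrative:

q={!join from=id fromIndex=otherCore to=other_id score=max}price:[0 TO 100]&sort=score desc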

On Tue, May 31, 2016 at 5:41 PM, Mark Robinson 
wrote:

> Hi,
>
> I have a requirement to sort records in one core/ collection based on a
> field in
> another core/collection.
>
> Could some one please advise how it can be done in SOLR.
>
> I have used {!join} to restrict documents in one core based on field values
> in another core. Is there some way to sort like that?
>
>
> Thanks!
> Mark.
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics





Re: DIH Delete with Full Import

2016-05-31 Thread nikosmarinos
Thank you Kiran. Simple and nice. I lost a day today trying to make the
delta-import work.





Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread John Bickerstaff
All --

I'm now attempting to use the hon_lucene_synonyms project from github.

I found the documents that were implied by the dead links in the readme in
the repository -- however, given that I'm using Solr 5.4.x, I no longer
need to integrate into a war file (as far as I can see).

The suggestion on the readme is that I can drop the hon_lucene_synonyms jar
file into the $SOLR_HOME directory, but this does not seem to be working -
I'm getting class not found exceptions.

Does anyone on this list have direct experience with getting this plugin to
work in Solr 5.x?

Thanks in advance...

On Mon, May 30, 2016 at 6:57 PM, MaryJo Sminkey  wrote:

> It's been awhile since I installed it so I really can't say. I'm more of a
> code monkey than a server gal (particularly Linux... I'm amazed I got Solr
> installed in the first place, LOL!) So I had asked our network guy to look
> it over recently and see if it looked like I did it okay. He said that since
> it shows up in the list of jars in the Solr admin, it's installed. If
> that's not necessarily true, I probably need to point him in the right
> direction for what else to do, since he really doesn't know Solr well
> either.
>
> Mary Jo
>
>
>
>
> On Mon, May 30, 2016 at 7:49 PM, John Bickerstaff <
> j...@johnbickerstaff.com>
> wrote:
>
> > Thanks for the comment Mary Jo...
> >
> > The error loading the class rings a bell - did you find and follow
> > instructions for adding that to the WAR file?  I vaguely remember seeing
> > something about that.
> >
> > I'm going to try my own tests on the auto phrasing one..  If I'm
> > successful, I'll post back.
> >
> > On Mon, May 30, 2016 at 3:45 PM, MaryJo Sminkey 
> > wrote:
> >
> > > This is a very timely discussion for me as well, as we're trying to
> > > tackle the multi-term synonym issue too and have not been able to get
> > > the hon-lucene plugin to work. The jar shows up as installed, but when
> > > we set up the sample request handler it throws this error:
> > >
> > >
> >
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > > Error loading class
> > >
> >
> 'com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin'
> > >
> > > I have tried the auto-phrasing one as well (I did set up a field using
> > copy
> > > to configure it on) but when testing it didn't seem to return the
> > synonyms
> > > as expected. So gave up on that one too (am willing to give it another
> > try
> > > though, that was awhile ago). Would definitely like to hear what other
> > > people have found works on the latest versions of Solr 5.x and/or 6.
> Just
> > > sucks that this issue has never been fixed in the core product such
> that
> > > you still need to mess with plugins and patches to get such a basic
> > > functionality working properly.
> > >
> > >
> > > *Mary Jo Sminkey*
> > > *Senior ColdFusion Developer*
> > >
> > > *CF Webtools*
> > > You Dream It... We Build It. 
> > > 11204 Davenport Suite 100
> > > Omaha, Nebraska 68154
> > > O: 402.408.3733 x128
> > > E:  maryjo.smin...@cfwebtools.com
> > > Skype: maryjos.cfwebtools
> > >
> > >
> > > On Mon, May 30, 2016 at 5:02 PM, John Bickerstaff <
> > > j...@johnbickerstaff.com>
> > > wrote:
> > >
> > > > So I'm looking at the solution mentioned here:
> > > >
> > > >
> > >
> >
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> > > >
> > > > The thing that's troubling me slightly is that the way it's
> documented
> > it
> > > > seems to be missing a small but important link...
> > > >
> > > > What exactly causes the results listed to be returned?
> > > >
> > > > Here's my thought process:
> > > >
> > > > 1. The entry for /autophrase searchHandler does not specify a default
> > > > search field.
> > > > 2. The field type "text_autophrase" is set up as the one with the
> > > > AutoPhrasingFilterFactory as part of it's indexing
> > > >
> > > > There isn't any mention (perhaps because it's too obvious) of the
> need
> > to
> > > > copy or otherwise get data into the "text_autophrase" field at index
> > > time.
> > > >
> > > > There isn't any explicit listing of "text_autophrase" as the default
> > > search
> > > > field in the /autophrase search handler
> > > >
> > > > There isn't any explicit statement of "df=text_autophrase" in the
> query
> > > > statment: [/autophrase?q=New+York]
> > > >
> > > > Therefore it seems to me that if someone tries to implement this,
> > they're
> > > > going to be disappointed in the results unless they:
> > > > a. copy or otherwise get ALL the text they're interested in -- into
> the
> > > > "text_autophrase" field as part of the schema.xml setup (to happen at
> > > index
> > > > time)
> > > > b. somehow explicitly declare "text_autophrase" as the default search
> > > field
> > > > - either in the searchHandler or wherever else the default field is
> > > > configured.
> > > >
> > > > If anyone out t

Re: Why Doesn't Solr Really Quit on Zookeeper Exceptions?

2016-05-31 Thread Dennis Gove
The retry logic for errors in construction of SolrZooKeeper was added in
https://issues.apache.org/jira/browse/SOLR-8599 and is in 5.5.1 and 6.0. I
wonder if either that is not working as expected during startup or if
startup is following a different code path.

- Dennis

On Tue, May 31, 2016 at 4:40 PM, Shawn Heisey  wrote:

> On 5/31/2016 2:34 PM, jimtronic wrote:
> > When I try to launch Solr 6.0 in cloud mode and connect it to a specific
> > chroot in zookeeper that doesn't exist, I get an error in my solr.log.
> > That's expected, but the solr process continues to launch and succeeds.
> >
> > Why wouldn't we want the start process simply to fail and exit?
> >
> > There's no mechanism to trigger a retry, so Solr just sits there like a
> > zombie.
>
> I can think of two ways to handle this:  Keep retrying the zkHost values
> to get an initial connection, on a configurable interval that probably
> should default to between 120 and 500 seconds, or die gracefully.  I
> would prefer to see the retry, myself.
>
> I thought I saw an issue about this in the past, but now I can't find it.
>
> Thanks,
> Shawn
>
>


Re: Why Doesn't Solr Really Quit on Zookeeper Exceptions?

2016-05-31 Thread jimtronic
Thanks Shawn. I'm leaning towards a retry as well.

So, there's no mechanism that currently exists within Solr that would allow
me to automatically retry the zookeeper connection on launch?

My options then would be:

1. Externally monitor the status of Solr (e.g.
/solr/admin/collections?action=CLUSTERSTATUS or bin/solr status) and force a
restart (a rough sketch follows below).

2. Write a patch to retry Zookeeper connections based on some configuration
values that specify attempts and wait times.
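
A rough sketch of option 1 (the URL and service name are illustrative, and
the health test is deliberately naive):

curl -s 'http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json' \
  | grep -q '"state":"active"' || sudo service solr restart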







Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread Joe Lawson
The docs are out of date for the synonym_edismax but it does work. Check
out the tests for working examples. I'll try to update it soon. I've run
the plugin on Solr 5 and 6, solrcloud and standalone. For running in
SolrCloud make sure you follow
https://cwiki.apache.org/confluence/display/solr/Adding+Custom+Plugins+in+SolrCloud+Mode
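
The short version of that page, as a sketch (the blob name, port and
collection name are illustrative):

# upload the jar to the .system collection's blob store
curl -X POST -H 'Content-Type: application/octet-stream' \
  --data-binary @hon-lucene-synonyms.jar \
  'http://localhost:8983/solr/.system/blob/hon-lucene-synonyms'

# register it for a collection
curl 'http://localhost:8983/solr/mycollection/config' \
  -H 'Content-type:application/json' \
  -d '{"add-runtimelib": {"name":"hon-lucene-synonyms", "version":1}}'

After that, the plugin's entry in solrconfig.xml needs runtimeLib="true"
and a version attribute, per the page above.
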
On May 31, 2016 5:13 PM, "John Bickerstaff" 
wrote:

> All --
>
> I'm now attempting to use the hon_lucene_synonyms project from github.
>
> I found the documents that were infered by the dead links on the readme in
> the repository -- however, given that I'm using Solr 5.4.x, I no longer
> have the need to integrate into a war file (as far as I can see).
>
> The suggestion on the readme is that I can drop the hon_lucene_synonyms jar
> file into the $SOLR_HOME directory, but this does not seem to be working -
> I'm getting class not found exceptions.
>
> Does anyone on this list have direct experience with getting this plugin to
> work in Solr 5.x?
>
> Thanks in advance...
>
> On Mon, May 30, 2016 at 6:57 PM, MaryJo Sminkey 
> wrote:
>
> > It's been awhile since I installed it so I really can't say. I'm more of
> a
> > code monkey than a server gal (particularly Linux... I'm amazed I got
> Solr
> > installed in the first place, LOL!) So I had asked our network guy to
> look
> > it over recently and see if it looked like I did it okay. He said since
> it
> > shows up in the list of jars in the Solr admin that it's installed if
> > that's not necessarily true, I probably need to point him in the right
> > direction for what else to do since he really doesn't know Solr well
> > either.
> >
> > Mary Jo
> >
> >
> >
> >
> > On Mon, May 30, 2016 at 7:49 PM, John Bickerstaff <
> > j...@johnbickerstaff.com>
> > wrote:
> >
> > > Thanks for the comment Mary Jo...
> > >
> > > The error loading the class rings a bell - did you find and follow
> > > instructions for adding that to the WAR file?  I vaguely remember
> seeing
> > > something about that.
> > >
> > > I'm going to try my own tests on the auto phrasing one..  If I'm
> > > successful, I'll post back.
> > >
> > > On Mon, May 30, 2016 at 3:45 PM, MaryJo Sminkey 
> > > wrote:
> > >
> > > > This is a very timely discussion for me as well as we're trying to
> > tackle
> > > > the multi term synonym issue as well and have not been able to
> > hon-lucene
> > > > plugin to work, the jar shows up as installed but when we set up the
> > > sample
> > > > request handler it throws this error:
> > > >
> > > >
> > >
> >
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > > > Error loading class
> > > >
> > >
> >
> 'com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin'
> > > >
> > > > I have tried the auto-phrasing one as well (I did set up a field
> using
> > > copy
> > > > to configure it on) but when testing it didn't seem to return the
> > > synonyms
> > > > as expected. So gave up on that one too (am willing to give it
> another
> > > try
> > > > though, that was awhile ago). Would definitely like to hear what
> other
> > > > people have found works on the latest versions of Solr 5.x and/or 6.
> > Just
> > > > sucks that this issue has never been fixed in the core product such
> > that
> > > > you still need to mess with plugins and patches to get such a basic
> > > > functionality working properly.
> > > >
> > > >
> > > > *Mary Jo Sminkey*
> > > > *Senior ColdFusion Developer*
> > > >
> > > > *CF Webtools*
> > > > You Dream It... We Build It. 
> > > > 11204 Davenport Suite 100
> > > > Omaha, Nebraska 68154
> > > > O: 402.408.3733 x128
> > > > E:  maryjo.smin...@cfwebtools.com
> > > > Skype: maryjos.cfwebtools
> > > >
> > > >
> > > > On Mon, May 30, 2016 at 5:02 PM, John Bickerstaff <
> > > > j...@johnbickerstaff.com>
> > > > wrote:
> > > >
> > > > > So I'm looking at the solution mentioned here:
> > > > >
> > > > >
> > > >
> > >
> >
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> > > > >
> > > > > The thing that's troubling me slightly is that the way it's
> > documented
> > > it
> > > > > seems to be missing a small but important link...
> > > > >
> > > > > What exactly causes the results listed to be returned?
> > > > >
> > > > > Here's my thought process:
> > > > >
> > > > > 1. The entry for /autophrase searchHandler does not specify a
> default
> > > > > search field.
> > > > > 2. The field type "text_autophrase" is set up as the one with the
> > > > > AutoPhrasingFilterFactory as part of it's indexing
> > > > >
> > > > > There isn't any mention (perhaps because it's too obvious) of the
> > need
> > > to
> > > > > copy or otherwise get data into the "text_autophrase" field at
> index
> > > > time.
> > > > >
> > > > > There isn't any explicit listing of "text_autophrase" as the
> default
> > > > search
> > > > > field in the /autophrase search handler
> > > > >
> > > > > There is

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread John Bickerstaff
Many thanks Joe!  I'll follow the instructions on the linked webpage.

On Tue, May 31, 2016 at 4:05 PM, Joe Lawson <
jlaw...@opensourceconnections.com> wrote:

> The docs are out of date for the synonym_edismax but it does work. Check
> out the tests for working examples. I'll try to update it soon. I've run
> the plugin on Solr 5 and 6, solrcloud and standalone. For running in
> SolrCloud make sure you follow
>
> https://cwiki.apache.org/confluence/display/solr/Adding+Custom+Plugins+in+SolrCloud+Mode
> On May 31, 2016 5:13 PM, "John Bickerstaff" 
> wrote:
>
> > All --
> >
> > I'm now attempting to use the hon_lucene_synonyms project from github.
> >
> > I found the documents that were infered by the dead links on the readme
> in
> > the repository -- however, given that I'm using Solr 5.4.x, I no longer
> > have the need to integrate into a war file (as far as I can see).
> >
> > The suggestion on the readme is that I can drop the hon_lucene_synonyms
> jar
> > file into the $SOLR_HOME directory, but this does not seem to be working
> -
> > I'm getting class not found exceptions.
> >
> > Does anyone on this list have direct experience with getting this plugin
> to
> > work in Solr 5.x?
> >
> > Thanks in advance...
> >
> > On Mon, May 30, 2016 at 6:57 PM, MaryJo Sminkey 
> > wrote:
> >
> > > It's been awhile since I installed it so I really can't say. I'm more
> of
> > a
> > > code monkey than a server gal (particularly Linux... I'm amazed I got
> > Solr
> > > installed in the first place, LOL!) So I had asked our network guy to
> > look
> > > it over recently and see if it looked like I did it okay. He said since
> > it
> > > shows up in the list of jars in the Solr admin that it's installed
> if
> > > that's not necessarily true, I probably need to point him in the right
> > > direction for what else to do since he really doesn't know Solr well
> > > either.
> > >
> > > Mary Jo
> > >
> > >
> > >
> > >
> > > On Mon, May 30, 2016 at 7:49 PM, John Bickerstaff <
> > > j...@johnbickerstaff.com>
> > > wrote:
> > >
> > > > Thanks for the comment Mary Jo...
> > > >
> > > > The error loading the class rings a bell - did you find and follow
> > > > instructions for adding that to the WAR file?  I vaguely remember
> > seeing
> > > > something about that.
> > > >
> > > > I'm going to try my own tests on the auto phrasing one..  If I'm
> > > > successful, I'll post back.
> > > >
> > > > On Mon, May 30, 2016 at 3:45 PM, MaryJo Sminkey  >
> > > > wrote:
> > > >
> > > > > This is a very timely discussion for me as well as we're trying to
> > > tackle
> > > > > the multi term synonym issue as well and have not been able to
> > > hon-lucene
> > > > > plugin to work, the jar shows up as installed but when we set up
> the
> > > > sample
> > > > > request handler it throws this error:
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > > > > Error loading class
> > > > >
> > > >
> > >
> >
> 'com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin'
> > > > >
> > > > > I have tried the auto-phrasing one as well (I did set up a field
> > using
> > > > copy
> > > > > to configure it on) but when testing it didn't seem to return the
> > > > synonyms
> > > > > as expected. So gave up on that one too (am willing to give it
> > another
> > > > try
> > > > > though, that was awhile ago). Would definitely like to hear what
> > other
> > > > > people have found works on the latest versions of Solr 5.x and/or
> 6.
> > > Just
> > > > > sucks that this issue has never been fixed in the core product such
> > > that
> > > > > you still need to mess with plugins and patches to get such a basic
> > > > > functionality working properly.
> > > > >
> > > > >
> > > > > *Mary Jo Sminkey*
> > > > > *Senior ColdFusion Developer*
> > > > >
> > > > > *CF Webtools*
> > > > > You Dream It... We Build It. 
> > > > > 11204 Davenport Suite 100
> > > > > Omaha, Nebraska 68154
> > > > > O: 402.408.3733 x128
> > > > > E:  maryjo.smin...@cfwebtools.com
> > > > > Skype: maryjos.cfwebtools
> > > > >
> > > > >
> > > > > On Mon, May 30, 2016 at 5:02 PM, John Bickerstaff <
> > > > > j...@johnbickerstaff.com>
> > > > > wrote:
> > > > >
> > > > > > So I'm looking at the solution mentioned here:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> > > > > >
> > > > > > The thing that's troubling me slightly is that the way it's
> > > documented
> > > > it
> > > > > > seems to be missing a small but important link...
> > > > > >
> > > > > > What exactly causes the results listed to be returned?
> > > > > >
> > > > > > Here's my thought process:
> > > > > >
> > > > > > 1. The entry for /autophrase searchHandler does not specify a
> > default
> > > > > > search field.
> > > > > > 2. The field type "text_autophrase" is set up as the 

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread Jeff Wartes
I’ve generally been dropping foreign plugin jars in this dir:
server/solr-webapp/webapp/WEB-INF/lib/
This is because it then gets loaded by the same classloader as Solr itself, 
which can be useful if you’re, say, overriding some solr-protected-space method.

If you don’t care about the classloader, I believe you can use whatever dir you 
want, with the appropriate bit of solrconfig.xml to load it. Something like:

<lib dir="/path/to/plugin/dir" regex=".*\.jar" />

On 5/31/16, 2:13 PM, "John Bickerstaff"  wrote:

>All --
>
>I'm now attempting to use the hon_lucene_synonyms project from github.
>
>I found the documents that were infered by the dead links on the readme in
>the repository -- however, given that I'm using Solr 5.4.x, I no longer
>have the need to integrate into a war file (as far as I can see).
>
>The suggestion on the readme is that I can drop the hon_lucene_synonyms jar
>file into the $SOLR_HOME directory, but this does not seem to be working -
>I'm getting class not found exceptions.
>
>Does anyone on this list have direct experience with getting this plugin to
>work in Solr 5.x?
>
>Thanks in advance...
>
>On Mon, May 30, 2016 at 6:57 PM, MaryJo Sminkey  wrote:
>
>> It's been awhile since I installed it so I really can't say. I'm more of a
>> code monkey than a server gal (particularly Linux... I'm amazed I got Solr
>> installed in the first place, LOL!) So I had asked our network guy to look
>> it over recently and see if it looked like I did it okay. He said since it
>> shows up in the list of jars in the Solr admin that it's installed if
>> that's not necessarily true, I probably need to point him in the right
>> direction for what else to do since he really doesn't know Solr well
>> either.
>>
>> Mary Jo
>>
>>
>>
>>
>> On Mon, May 30, 2016 at 7:49 PM, John Bickerstaff <
>> j...@johnbickerstaff.com>
>> wrote:
>>
>> > Thanks for the comment Mary Jo...
>> >
>> > The error loading the class rings a bell - did you find and follow
>> > instructions for adding that to the WAR file?  I vaguely remember seeing
>> > something about that.
>> >
>> > I'm going to try my own tests on the auto phrasing one..  If I'm
>> > successful, I'll post back.
>> >
>> > On Mon, May 30, 2016 at 3:45 PM, MaryJo Sminkey 
>> > wrote:
>> >
>> > > This is a very timely discussion for me as well as we're trying to
>> tackle
>> > > the multi term synonym issue as well and have not been able to
>> hon-lucene
>> > > plugin to work, the jar shows up as installed but when we set up the
>> > sample
>> > > request handler it throws this error:
>> > >
>> > >
>> >
>> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
>> > > Error loading class
>> > >
>> >
>> 'com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin'
>> > >
>> > > I have tried the auto-phrasing one as well (I did set up a field using
>> > copy
>> > > to configure it on) but when testing it didn't seem to return the
>> > synonyms
>> > > as expected. So gave up on that one too (am willing to give it another
>> > try
>> > > though, that was awhile ago). Would definitely like to hear what other
>> > > people have found works on the latest versions of Solr 5.x and/or 6.
>> Just
>> > > sucks that this issue has never been fixed in the core product such
>> that
>> > > you still need to mess with plugins and patches to get such a basic
>> > > functionality working properly.
>> > >
>> > >
>> > > *Mary Jo Sminkey*
>> > > *Senior ColdFusion Developer*
>> > >
>> > > *CF Webtools*
>> > > You Dream It... We Build It. 
>> > > 11204 Davenport Suite 100
>> > > Omaha, Nebraska 68154
>> > > O: 402.408.3733 x128
>> > > E:  maryjo.smin...@cfwebtools.com
>> > > Skype: maryjos.cfwebtools
>> > >
>> > >
>> > > On Mon, May 30, 2016 at 5:02 PM, John Bickerstaff <
>> > > j...@johnbickerstaff.com>
>> > > wrote:
>> > >
>> > > > So I'm looking at the solution mentioned here:
>> > > >
>> > > >
>> > >
>> >
>> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
>> > > >
>> > > > The thing that's troubling me slightly is that the way it's
>> documented
>> > it
>> > > > seems to be missing a small but important link...
>> > > >
>> > > > What exactly causes the results listed to be returned?
>> > > >
>> > > > Here's my thought process:
>> > > >
>> > > > 1. The entry for /autophrase searchHandler does not specify a default
>> > > > search field.
>> > > > 2. The field type "text_autophrase" is set up as the one with the
>> > > > AutoPhrasingFilterFactory as part of it's indexing
>> > > >
>> > > > There isn't any mention (perhaps because it's too obvious) of the
>> need
>> > to
>> > > > copy or otherwise get data into the "text_autophrase" field at index
>> > > time.
>> > > >
>> > > > There isn't any explicit listing of "text_autophrase" as the default
>> > > search
>> > > > field in the /autophrase search handler
>> > > >
>> > > > There isn't any explicit statement of "df=text_autophrase" in the
>> query
>>

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread John Bickerstaff
Thanks Jeff,

I believe I tried that, and it still refused to load...  But I'd sure love
it to work, since the other process is a bit convoluted - although I see
its value in a large Solr installation.

When I "locate" the jar on the linux command line I get:

/opt/solr-5.4.0/server/solr-webapp/webapp/WEB-INF/lib/hon-lucene-synonyms-2.0.0.jar

But the log file is still carrying class not found exceptions when I
restart...

Are you in "Cloud" mode?  What version of Solr are you using?

On Tue, May 31, 2016 at 4:08 PM, Jeff Wartes  wrote:

> I’ve generally been dropping foreign plugin jars in this dir:
> server/solr-webapp/webapp/WEB-INF/lib/
> This is because it then gets loaded by the same classloader as Solr
> itself, which can be useful if you’re, say, overriding some
> solr-protected-space method.
>
> If you don’t care about the classloader, I believe you can use whatever
> dir you want, with the appropriate bit of solrconfig.xml to load it.
> Something like:
> 
>
>
> On 5/31/16, 2:13 PM, "John Bickerstaff"  wrote:
>
> >All --
> >
> >I'm now attempting to use the hon_lucene_synonyms project from github.
> >
> >I found the documents that were infered by the dead links on the readme in
> >the repository -- however, given that I'm using Solr 5.4.x, I no longer
> >have the need to integrate into a war file (as far as I can see).
> >
> >The suggestion on the readme is that I can drop the hon_lucene_synonyms
> jar
> >file into the $SOLR_HOME directory, but this does not seem to be working -
> >I'm getting class not found exceptions.
> >
> >Does anyone on this list have direct experience with getting this plugin
> to
> >work in Solr 5.x?
> >
> >Thanks in advance...
> >
> >On Mon, May 30, 2016 at 6:57 PM, MaryJo Sminkey 
> wrote:
> >
> >> It's been awhile since I installed it so I really can't say. I'm more
> of a
> >> code monkey than a server gal (particularly Linux... I'm amazed I got
> Solr
> >> installed in the first place, LOL!) So I had asked our network guy to
> look
> >> it over recently and see if it looked like I did it okay. He said since
> it
> >> shows up in the list of jars in the Solr admin that it's installed
> if
> >> that's not necessarily true, I probably need to point him in the right
> >> direction for what else to do since he really doesn't know Solr well
> >> either.
> >>
> >> Mary Jo
> >>
> >>
> >>
> >>
> >> On Mon, May 30, 2016 at 7:49 PM, John Bickerstaff <
> >> j...@johnbickerstaff.com>
> >> wrote:
> >>
> >> > Thanks for the comment Mary Jo...
> >> >
> >> > The error loading the class rings a bell - did you find and follow
> >> > instructions for adding that to the WAR file?  I vaguely remember
> seeing
> >> > something about that.
> >> >
> >> > I'm going to try my own tests on the auto phrasing one..  If I'm
> >> > successful, I'll post back.
> >> >
> >> > On Mon, May 30, 2016 at 3:45 PM, MaryJo Sminkey 
> >> > wrote:
> >> >
> >> > > This is a very timely discussion for me as well as we're trying to
> >> tackle
> >> > > the multi term synonym issue as well and have not been able to
> >> hon-lucene
> >> > > plugin to work, the jar shows up as installed but when we set up the
> >> > sample
> >> > > request handler it throws this error:
> >> > >
> >> > >
> >> >
> >>
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> >> > > Error loading class
> >> > >
> >> >
> >>
> 'com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin'
> >> > >
> >> > > I have tried the auto-phrasing one as well (I did set up a field
> using
> >> > copy
> >> > > to configure it on) but when testing it didn't seem to return the
> >> > synonyms
> >> > > as expected. So gave up on that one too (am willing to give it
> another
> >> > try
> >> > > though, that was awhile ago). Would definitely like to hear what
> other
> >> > > people have found works on the latest versions of Solr 5.x and/or 6.
> >> Just
> >> > > sucks that this issue has never been fixed in the core product such
> >> that
> >> > > you still need to mess with plugins and patches to get such a basic
> >> > > functionality working properly.
> >> > >
> >> > >
> >> > > *Mary Jo Sminkey*
> >> > > *Senior ColdFusion Developer*
> >> > >
> >> > > *CF Webtools*
> >> > > You Dream It... We Build It. 
> >> > > 11204 Davenport Suite 100
> >> > > Omaha, Nebraska 68154
> >> > > O: 402.408.3733 x128
> >> > > E:  maryjo.smin...@cfwebtools.com
> >> > > Skype: maryjos.cfwebtools
> >> > >
> >> > >
> >> > > On Mon, May 30, 2016 at 5:02 PM, John Bickerstaff <
> >> > > j...@johnbickerstaff.com>
> >> > > wrote:
> >> > >
> >> > > > So I'm looking at the solution mentioned here:
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://lucidworks.com/blog/2014/07/12/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
> >> > > >
> >> > > > The thing that's troubling me slightly is that the way it's
> >> documented
> >> > it
> >> > > > seems to be missing a small but important

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread John Bickerstaff
Jeff - Looking at the page, I'm unclear exactly how to set things up.  I
get using the blob api and I get adding the blob/jar to the collection, but
the bit about  runtimeLib=true  is confusing.

Does that go on the entry in the solrconfig.xml file like this?  Is
anything else required?  (The bit about the valueSourceParser is a bit
confusing)

<queryParser name="synonym_edismax" class="com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin" runtimeLib="true" version="1"/>

Thanks

On Tue, May 31, 2016 at 5:02 PM, John Bickerstaff 
wrote:

> Thanks Jeff,
>
> I believe I tried that, and it still refused to load..  But I'd sure love
> it to work since the other process is a bit convoluted - although I see
> it's value in a large Solr installation.
>
> When I "locate" the jar on the linux command line I get:
>
>
> /opt/solr-5.4.0/server/solr-webapp/webapp/WEB-INF/lib/hon-lucene-synonyms-2.0.0.jar
>
> But the log file is still carrying class not found exceptions when I
> restart...
>
> Are you in "Cloud" mode?  What version of Solr are you using?
>
> On Tue, May 31, 2016 at 4:08 PM, Jeff Wartes 
> wrote:
>
>> I’ve generally been dropping foreign plugin jars in this dir:
>> server/solr-webapp/webapp/WEB-INF/lib/
>> This is because it then gets loaded by the same classloader as Solr
>> itself, which can be useful if you’re, say, overriding some
>> solr-protected-space method.
>>
>> If you don’t care about the classloader, I believe you can use whatever
>> dir you want, with the appropriate bit of solrconfig.xml to load it.
>> Something like:
>> 
>>
>>
>> On 5/31/16, 2:13 PM, "John Bickerstaff"  wrote:
>>
>> >All --
>> >
>> >I'm now attempting to use the hon_lucene_synonyms project from github.
>> >
>> >I found the documents that were infered by the dead links on the readme
>> in
>> >the repository -- however, given that I'm using Solr 5.4.x, I no longer
>> >have the need to integrate into a war file (as far as I can see).
>> >
>> >The suggestion on the readme is that I can drop the hon_lucene_synonyms
>> jar
>> >file into the $SOLR_HOME directory, but this does not seem to be working
>> -
>> >I'm getting class not found exceptions.
>> >
>> >Does anyone on this list have direct experience with getting this plugin
>> to
>> >work in Solr 5.x?
>> >
>> >Thanks in advance...
>> >
>> >On Mon, May 30, 2016 at 6:57 PM, MaryJo Sminkey 
>> wrote:
>> >
>> >> It's been awhile since I installed it so I really can't say. I'm more
>> of a
>> >> code monkey than a server gal (particularly Linux... I'm amazed I got
>> Solr
>> >> installed in the first place, LOL!) So I had asked our network guy to
>> look
>> >> it over recently and see if it looked like I did it okay. He said
>> since it
>> >> shows up in the list of jars in the Solr admin that it's installed
>> if
>> >> that's not necessarily true, I probably need to point him in the right
>> >> direction for what else to do since he really doesn't know Solr well
>> >> either.
>> >>
>> >> Mary Jo
>> >>
>> >>
>> >>
>> >>
>> >> On Mon, May 30, 2016 at 7:49 PM, John Bickerstaff <
>> >> j...@johnbickerstaff.com>
>> >> wrote:
>> >>
>> >> > Thanks for the comment Mary Jo...
>> >> >
>> >> > The error loading the class rings a bell - did you find and follow
>> >> > instructions for adding that to the WAR file?  I vaguely remember
>> >> > seeing something about that.
>> >> >
>> >> > I'm going to try my own tests on the auto-phrasing one... If I'm
>> >> > successful, I'll post back.
>> >> >
>> >> > On Mon, May 30, 2016 at 3:45 PM, MaryJo Sminkey wrote:
>> >> >
>> >> > > This is a very timely discussion for me as well, as we're trying
>> >> > > to tackle the multi-term synonym issue too and have not been able
>> >> > > to get the hon-lucene plugin to work; the jar shows up as
>> >> > > installed, but when we set up the sample request handler it
>> >> > > throws this error:
>> >> > >
>> >> > > org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
>> >> > > Error loading class
>> >> > > 'com.github.healthonnet.search.SynonymExpandingExtendedDismaxQParserPlugin'
>> >> > >
>> >> > > I have tried the auto-phrasing one as well (I did set up a field
>> >> > > using copy to configure it on) but when testing it didn't seem to
>> >> > > return the synonyms as expected. So gave up on that one too (am
>> >> > > willing to give it another try though, that was a while ago).
>> >> > > Would definitely like to hear what other people have found works
>> >> > > on the latest versions of Solr 5.x and/or 6. It just sucks that
>> >> > > this issue has never been fixed in the core product, such that
>> >> > > you still need to mess with plugins and patches to get such basic
>> >> > > functionality working properly.
>> >> > >
>> >> > >
>> >> > > *Mary Jo Sminkey*
>> >> > > *Senior ColdFusion Developer*
>> >> > >
>> >> > > *CF Webtools*
>> >> > > You Dream It... We Build It. 
>> >> > > 11204 Davenport Suite 100
>> >> > > Omaha, Nebraska 68154
>> >> > > O: 402.408.3733 x128
>> >> > > E:  maryjo.smi

Re: Add a new field dynamically to each of the result docs and sort on it

2016-05-31 Thread Chris Hostetter

: When a query comes in, I want to populate value for this field in the
: results based on some values passed in the query.
: So what needs to be accommodated in the result depends on a parameter in
: the query and I would like to sort the final results on this field also,
: which is dynamically populated.

populated how? ... what exactly do you want to provide at query time, and 
how exactly do you want it to affect your query results / sorting?

The details of what you *think* you mean matter, because based on the 
information you've provided we have no way of guessing what your goal 
is -- and if we can't guess what you mean, then there's no way to imagine 
Solr can figure it out ... software doesn't have an imagination.

We need to know what your documents are going to look like at index 
time (with *real* details, and specific example docs) and what your 
queries are going to look like (again: with *real* details on the "some 
values passed in the query") and a detailed explanation of how what 
results you want to see and why -- describe in words how the final sorting 
of the docs you should have already described to use would be determined 
acording to the info pased in at query time which you should have also 
already described to us.


In general I think I smell an XY Problem...

https://people.apache.org/~hossman/#xyproblem
XY Problem

Your question appears to be an "XY Problem" ... that is: you are dealing
with "X", you are assuming "Y" will help you, and you are asking about "Y"
without giving more details about the "X" so that we can understand the
full issue.  Perhaps the best solution doesn't involve "Y" at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341
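
For example: if the underlying need is just "sort by a value computed from
something the caller passes in", a plain function sort may already cover
it -- a minimal sketch, assuming a numeric "price" field and a
caller-supplied "rate" parameter (both names are illustrative):

http://localhost:8983/solr/collection1/select?q=*:*&rate=1.1&sort=product(price,$rate)+desc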


-Hoss
http://www.lucidworks.com/


Re: After Solr 5.5, mm parameter doesn't work properly

2016-05-31 Thread Greg Pendlebury
I don't think it is 8812. q.op was completely ignored by edismax prior to
5.5, so it is not mm that changed.

If you do the same 5.4 query with q.op=OR I suspect it will not change the
debug query at all.
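
For instance, something along these lines (debugQuery=true just exposes
the parsed query; the other params match the original request):

http://localhost:8983/solr/collection1/select?defType=edismax&q.op=OR&mm=2&q=solar&debugQuery=true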

On 30 May 2016 at 21:07, Jan Høydahl  wrote:

> Hi,
>
> This may be related to SOLR-8812, but still different. Please file a JIRA
> issue for this.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > > On 29 May 2016 at 18:20, Issei Nishigata wrote:
> >
> > Hi,
> >
> > "mm" parameter does not work properly when I set "q.op=AND" after Solr
> > 5.5.
> > In Solr 5.4, mm parameter works expectedly with the following setting.
> >
> > ---
> > [schema]
> > <fieldType name="text" class="solr.TextField">
> >   <analyzer>
> >     <tokenizer class="solr.NGramTokenizerFactory" minGramSize="2"
> >                maxGramSize="2"/>
> >   </analyzer>
> > </fieldType>
> >
> >
> > [request]
> >
> http://localhost:8983/solr/collection1/select?defType=edismax&q.op=AND&mm=2&q=solar
> > —
> >
> > After Solr 5.5, the result is not the same as in Solr 5.4.
> > Has the spec of the mm parameter, or the relevant config setting,
> > changed?
> >
> >
> > [Solr 5.4]
> >
> > ...
> > <lst name="params">
> >   <str name="mm">2</str>
> >   <str name="q">solar</str>
> >   <str name="defType">edismax</str>
> >   <str name="q.op">AND</str>
> > </lst>
> > ...
> > <lst name="debug">
> >   <str name="rawquerystring">solar</str>
> >   <str name="querystring">solar</str>
> >   <str name="parsedquery">(+DisjunctionMaxQuery(((text:so text:ol
> > text:la text:ar)~2)))/no_coord</str>
> >   <str name="parsedquery_toString">+(((text:so text:ol text:la
> > text:ar)~2))</str>
> >   ...
> > </lst>
> >
> >
> >
> > [Solr 6.0.1]
> >
> > ...
> > <lst name="params">
> >   <str name="mm">2</str>
> >   <str name="q">solar</str>
> >   <str name="defType">edismax</str>
> >   <str name="q.op">AND</str>
> > </lst>
> > ...
> > <lst name="debug">
> >   <str name="rawquerystring">solar</str>
> >   <str name="querystring">solar</str>
> >   <str name="parsedquery">(+DisjunctionMaxQuery(((+text:so +text:ol
> > +text:la +text:ar))))/no_coord</str>
> >   <str name="parsedquery_toString">+((+text:so +text:ol +text:la
> > +text:ar))</str>
> >   ...
> > </lst>
> >
> >
> > As shown above, the parsedquery also differs between Solr 5.4 and Solr
> > 6.0.1 (i.e., after Solr 5.5).
> >
> >
> > —
> > Thanks
> > Issei Nishigata
>
>


Re: float or string type for a field with whole number and decimal number values?

2016-05-31 Thread Derek Poh

Sorry about that.

Thank you for your explanation. I still have some questions on using and 
setting up a collection alias for my current situation. I will start a new 
thread on this.


On 5/31/2016 11:21 PM, Erick Erickson wrote:

First, when changing the topic of the thread, please start a new thread. This
is called "thread hijacking" and makes it difficult to find threads later.

Collection aliasing does not do _anything_ about adding/deleting/whatever.
It's just a way to do exactly what you want. Your clients point to
mycollection.

You use the CREATEALIAS command to point mycollection to mycollection_1.
Thereafter you can do anything you want to mycollection_1 using either name.

That is, you can address mycollection_1 explicitly. You can use mycollection. It
doesn't matter.

Then you can create mycollection_2. So far you can _only_ address mycollection_2
explicitly. You then use the CREATEALIAS to point mycollection at
mycollection_2.
At that point, anybody using mycollection will start working with
mycollection_2.

Meanwhile, mycollection_1 is still addressable (presumably by the back end) by
addressing it explicitly rather than through an alias. It has _not_ been changed
in any way by creating the new alias.
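
For the record, the alias swap itself is just one Collections API call,
along these lines (host and names illustrative):

curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=mycollection&collections=mycollection_2"

Re-pointing the alias later is the same call with a different
"collections" value; CREATEALIAS simply overwrites an existing alias of
the same name.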

Best,
Erick

On Mon, May 30, 2016 at 11:15 PM, Derek Poh  wrote:

Hi Erick

Thank you for pointing out the sort behaviour of numbers in a string field.
I did not think of that. Will use float.

Would like to know how would you guys handle the usage of collection alias
in my case.
I have a 'product' collection and I create a new collection 'product_tmp'
for this field type change and index into it. I create an alias 'product'
on this new 'product_tmp' collection.
If I were to index to or delete documents from the 'product' collection,
SOLR will index on and delete from the 'product_tmp' collection, am I
right?
That means the 'product' collection cannot be used anymore?
Even if I were to create an alias 'product_old' on the 'product'
collection and issue a delete of all documents, or index on 'product_old',
SOLR will delete or index on the 'product_tmp' collection instead?

My intention is to avoid having to update the client servers to point to
the 'product_tmp' collection.


On 5/31/2016 10:57 AM, Erick Erickson wrote:

bq: Should I change the field type to "float" or "string"?

I'd go with float. Let's assume you want to sort by
this field. 10.00 sorts before 9.0 if you
just use Strings. Plus floats are generally much more
compact.
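
To make that concrete: sorted lexicographically as strings you'd get
"1.5" < "10.00" < "9.0" (compared character by character), while sorted as
floats you'd get 1.5 < 9.0 < 10.0.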

bq: do I need to delete all documents in the index and do a full indexing

That's the way I'd do it. You can always index to a _new_ collection
(assuming SolrCloud) and use collection aliasing to switch your
search all at once.

Best,
Erick

On Sun, May 29, 2016 at 12:56 AM, Derek Poh 
wrote:

I am using solr 4.10.4.


On 5/29/2016 3:52 PM, Derek Poh wrote:

Hi

I have a field that is of "int" type currentlyand it's values are whole
numbers.



Due to a change in business requirements, this field will need to take in
decimal numbers as well.
This field is sorted on and filtered by range (field:[1 TO *]).

Should I change the field type to "float" or "string"?
For the change to take effect, do I need to delete all documents in the
index and do a full indexing? Or can I just do a full indexing without the
need to delete all documents first?

Derek









Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread Shawn Heisey
On 5/31/2016 3:13 PM, John Bickerstaff wrote:
> The suggestion on the readme is that I can drop the
> hon_lucene_synonyms jar file into the $SOLR_HOME directory, but this
> does not seem to be working - I'm getting class not found exceptions. 

What I typically do with *all* extra jars (dataimport, mysql, ICU jars,
etc) is put them into $SOLR_HOME/lib ... a directory that you will
usually need to create.  If the installer script is used with default
options, that directory will be /var/solr/data/lib.

Any jar that you place in that directory will be loaded once at Solr
startup and available to all cores.  The best thing about this directory
is that it requires zero configuration.
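
In concrete terms, under the installer defaults that would be something
like this (restart Solr however you normally manage the service, since the
jars are only picked up at startup):

mkdir -p /var/solr/data/lib
cp hon-lucene-synonyms-2.0.0.jar /var/solr/data/lib/
service solr restart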

For 5.3 and later, loading jars into
server/solr-webapp/webapp/WEB-INF/lib should also work, but then you are
modifying the actual Solr install, which I normally avoid because it
makes it a little bit harder to upgrade Solr.

> Does anyone on this list have direct experience with getting this
> plugin to work in Solr 5.x? 

I don't have any experience with that specific plugin, but I have
successfully used other plugin jars with the lib directory mentioned above.

Thanks,
Shawn