Taking Solr to production with docker

2015-09-23 Thread aurelien . mazoyer

Hi Solr community,

I can find many blog posts on how to deploy Solr with docker but I am 
wondering if Solr/Docker is really ready for production.

Has anybody ever run Solr in production with Docker?

Thank you for your feedback,

Aurélien



Re: Passivate core in Solr Cloud

2014-07-28 Thread aurelien . mazoyer

Thank you Erick,

Ok, I will probably perform some tests. It seems to be a good candidate 
for a future blog post...


Regards,

Aurelien

On 27.07.2014 20:20, Erick Erickson wrote:

"Does not play nice" really means it was designed to run in a
non-distributed mode. There has been no work done to verify that it
does work in cloud mode; I fully expect some "interesting" problems in
that mode. If/when we get to it, that is.

About replication: I haven't heard of any problems, but I also haven't
heard of it working in that environment. I expect that it'll only try
to replicate when it's loaded, so that might be interesting.

Best,
Erick


On Thu, Jul 24, 2014 at 6:49 AM, Aurélien MAZOYER <
aurelien.mazo...@francelabs.com> wrote:


Thank you Erick and Alex for your answers. The "lots of cores" stuff seems to
meet my requirements, but it is a problem if it does not work with Solr
Cloud. Is there an open issue for this problem?
If I understand correctly, the only solution for me is to use multiple
standalone Solr instances with transient cores and to distribute the
cores for my tenants manually (I assume the LRU mechanism will be less
effective, as it will be applied per Solr instance).
When you say "does NOT play nice with distributed mode", does that also
include the standard replication mechanism?
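
To make the LRU point concrete, here is a minimal sketch of how a per-instance cache of loaded cores behaves. It is a toy model only, not Solr's implementation: the class name `TransientCoreCache`, the `capacity` parameter and the tenant names are all hypothetical.

```python
from collections import OrderedDict


class TransientCoreCache:
    """Toy LRU cache of loaded cores (illustration only, not Solr code)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._loaded = OrderedDict()  # core name -> placeholder for a loaded core
        self.evicted = []             # names of unloaded cores, in eviction order

    def get(self, core_name):
        """Return the core, loading it first and evicting the LRU core if full."""
        if core_name in self._loaded:
            self._loaded.move_to_end(core_name)  # mark as most recently used
            return self._loaded[core_name]
        if len(self._loaded) >= self.capacity:
            oldest, _ = self._loaded.popitem(last=False)  # least recently used
            self.evicted.append(oldest)
        self._loaded[core_name] = "loaded:" + core_name
        return self._loaded[core_name]

    def loaded_cores(self):
        """Names of the loaded cores, least recently used first."""
        return list(self._loaded)


cache = TransientCoreCache(capacity=2)
for tenant in ["tenant_a", "tenant_b", "tenant_a", "tenant_c"]:
    cache.get(tenant)
print(cache.loaded_cores())
print(cache.evicted)
```

With one such cache per Solr instance, eviction decisions only consider the cores hosted on that instance, which is why the mechanism is expected to be less effective than a single cluster-wide LRU.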

Thanks,

Regards,

Aurelien



On 23/07/2014 17:21, Erick Erickson wrote:

Do note that the lots of cores stuff does NOT play nice with
distributed mode (yet).

Best,
Erick


On Wed, Jul 23, 2014 at 6:00 AM, Alexandre Rafalovitch wrote:

Solr has some support for a large number of cores, including transient
cores: http://wiki.apache.org/solr/LotsOfCores

Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853



On Wed, Jul 23, 2014 at 7:55 PM, Aurélien MAZOYER wrote:


Hello,

We want to set up a Solr Cloud cluster in order to handle a high volume of
documents with a multi-tenant architecture. The problem is that
application-level isolation per tenant (using a shared index with a
"customer" field) is not enough to meet our requirements. As a result, we
need one collection per customer. There are more than a thousand customers,
and it seems unreasonable to create thousands of collections in Solr Cloud...
But as we know that there is less than one query per customer per day, we are
currently looking for a way to passivate collections when they are not in
use. Is that a good idea? If so, are there best practices for implementing
it? What side effects can we expect? Do we need to put some application-level
logic on top of the Solr Cloud cluster to choose which collections to unload
(and maybe there is something smarter, and quicker, than simply loading and
unloading the core when it is not in use)?


Thank you for your answer(s),

Aurelien






Re: Character encoding problems

2014-07-29 Thread aurelien . mazoyer

Hi,

If you use Solr 4.8.1, you no longer have to add URIEncoding="UTF-8" to the
Tomcat configuration file:

https://wiki.apache.org/solr/SolrTomcat


Regards,

Aurélien MAZOYER

On 29.07.2014 14:22, Gulliver Smith wrote:

I have solr 4.8.1 under Tomcat 7 on Debian Linux. The connector in
Tomcat's server.xml has been changed to include character encoding
UTF-8:

 


I am posting to the server from PHP 5.5 curl. The extract POST was
intercepted, and I confirmed that everything is being encoded in UTF-8.

However, the responses to query commands, whether XML or JSON are
returning field values such as title_fr in something that looks like
latin1 or iso-8859-1 when displayed in a browser or editor.

E.g.: "title_fr":[" appelé au téléphone"]

The highlights in the query response do have correctly displaying
character codes.

E.g. "text_fr":[" \n \n  \n  \n  \n  \n  \n  \n  \n \n \nappelé au
téléphone\nappelé au téléphone\n

PHP's utf8_decode doesn't make sense of the title_fr.

Is there something to configure to fix this and get proper UTF8
results for everything?
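
One common cause of output that looks like this is UTF-8 bytes being decoded as ISO-8859-1 somewhere between the server and the display. The sketch below only reproduces that effect in isolation, and shows that one round of it is reversible; it does not diagnose this particular Tomcat/PHP setup, and the function names are made up for the illustration.

```python
def misread_utf8_as_latin1(text):
    """Encode text as UTF-8, then decode those bytes as ISO-8859-1 (the bug)."""
    return text.encode("utf-8").decode("latin-1")


def repair_mojibake(garbled):
    """Reverse one round of the misreading above."""
    return garbled.encode("latin-1").decode("utf-8")


original = "appel\u00e9 au t\u00e9l\u00e9phone"  # "appelé au téléphone"
garbled = misread_utf8_as_latin1(original)

# ascii() shows the code points without depending on the console encoding.
print(ascii(garbled))
print(repair_mojibake(garbled) == original)
```

If a value has been misread twice, applying the repair twice recovers it, which can help to locate how many layers are decoding with the wrong charset.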

Thanks
Gulliver


Re : Re: Multipart documents with different update cycles

2014-07-29 Thread aurelien . mazoyer
Yes, that is the point: I have to handle complex queries that perform
full-text search on both the user metadata and the main part of the
documents :-(...


Aurélien


Do you search the frequently changing user-metadata? If not, maybe the
external file field is helpful.
https://cwiki.apache.org/confluence/display/solr/Working+with+External+Files+and+Processes

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On Fri, Jul 25, 2014 at 12:04 AM, Aurélien MAZOYER wrote:

Hello,

I have to index a dataset containing multipart documents. The "main" part
and the "user metadata" part have different update cycles: we want to
update the "user metadata" part frequently without having to refetch the
main part from the data source or store every field in order to use
atomic updates. As there is no true field-level update in Solr yet, I am
afraid that I have to build an index for each part and perform a
query-time join, with all the well-known performance limitations. I have
also heard of sidecar indexes. Is that a solution that can meet my
requirements? Is it stable enough to be usable in production? Does the
community plan to make it part of the trunk code?

Thanks,

Aurelien





Re: Solr on AWS ubuntu12.04 instance

2014-07-29 Thread aurelien . mazoyer

Hi Pusakar,

Did you try to ping your Solr instance from localhost in your SSH console?

curl http://localhost:8983/solr/collection1/admin/ping

(Use port 8984 instead if you changed the Jetty port.)
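
The same check can be scripted. This is a minimal sketch assuming the default core name `collection1` and a Solr that answers the ping handler with a JSON body containing a `status` field; the helper names `solr_ping_url` and `ping` are hypothetical.

```python
import json
import urllib.request


def solr_ping_url(host="localhost", port=8983, core="collection1"):
    """Build the ping URL for one core."""
    return "http://{}:{}/solr/{}/admin/ping".format(host, port, core)


def ping(url, timeout=5):
    """Return the status reported by the ping handler (needs a running Solr)."""
    with urllib.request.urlopen(url + "?wt=json", timeout=timeout) as response:
        return json.load(response).get("status")


print(solr_ping_url())
print(solr_ping_url(port=8984))
```

Calling `ping(solr_ping_url(port=8984))` on the server itself separates "Solr is not running" from "Solr is running but the port is not reachable from outside".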

Aurélien

On 29.07.2014 15:15, pushkar sawant wrote:

Hi Team,
I have set up Solr 4.9.0 on an Ubuntu 12.04 instance on AWS
with Java 7. When I start Solr with "java -jar start.jar",
it starts with the attached output.
It says:
5460 [main] INFO  org.eclipse.jetty.server.AbstractConnector  – Started SocketConnector@0.0.0.0:8984

When I try to open it in a browser, the web interface does not open;
the error is attached.

Please suggest a fix if anyone has come across the same issue and resolved it.

Thanks
Pusakar


Re: Solr on AWS ubuntu12.04 instance

2014-07-29 Thread aurelien . mazoyer

Oops, I didn't see Andrew's answer: sorry for my redundant answer :-)

Aurélien

On 29.07.2014 15:47, aurelien.mazo...@francelabs.com wrote:

Hi Pusakar,

Did you try to ping your Solr instance from localhost in your SSH console?

curl http://localhost:8983/solr/collection1/admin/ping

(Use port 8984 instead if you changed the Jetty port.)

Aurélien

On 29.07.2014 15:15, pushkar sawant wrote:

Hi Team,
I have set up Solr 4.9.0 on an Ubuntu 12.04 instance on AWS
with Java 7. When I start Solr with "java -jar start.jar",
it starts with the attached output.
It says:
5460 [main] INFO  org.eclipse.jetty.server.AbstractConnector  – Started SocketConnector@0.0.0.0:8984

When I try to open it in a browser, the web interface does not open;
the error is attached.

Please suggest a fix if anyone has come across the same issue and resolved it.

Thanks
Pusakar


Re: Searching and highlighting ten's of fields

2014-07-30 Thread aurelien . mazoyer

Hello,

Do you use the classic highlighter or the fast vector highlighter?

Aurélien

On 30.07.2014 09:36, Manuel Le Normand wrote:

Hello,
I need to expose search and highlighting capabilities over a few tens of
fields. The edismax qf param makes this possible, but the performance
of searching tens of words over tens of fields is problematic.

I made a copyField (indexed, not stored) for these fields, which gives much
better search performance but does not allow highlighting the original
fields, which are stored.

Is there any way to search this copyField and highlight other fields
with any of the highlight components?

BTW, I need to keep the field structure, so storing the copyField is not an
alternative.
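
One thing that may be worth testing is a request that searches the copyField and lists the stored fields in `hl.fl`, with `hl.requireFieldMatch` left at `false` so that query terms are highlighted regardless of the field they were searched in. The sketch below only builds such a query string; the field names `all_text`, `title` and `body` are hypothetical, and whether the highlights are satisfactory depends on how the fields are analyzed.

```python
from urllib.parse import urlencode


def highlight_query_params(user_query, search_field, highlight_fields):
    """Build a query string that searches one field and highlights others."""
    params = [
        ("q", user_query),
        ("df", search_field),                   # default field: the copyField
        ("hl", "true"),
        ("hl.fl", ",".join(highlight_fields)),  # stored fields to highlight
        ("hl.requireFieldMatch", "false"),      # highlight terms from any field
    ]
    return urlencode(params)


print(highlight_query_params("quick fox", "all_text", ["title", "body"]))
```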


Re: Multiple shards in the same Solr instance server

2014-10-10 Thread aurelien . mazoyer

Hi Nabil,

Does this blog post answer your question?
http://solr.pl/en/2013/01/07/solr-4-1-solrcloud-multiple-shards-on-the-same-solr-node/


Regards,

Aurélien

On 10.10.2014 11:48, nabil Kouici wrote:

Hi All,

in Solr, is it possible to create for the same index multiple shards
in the same Solr instance (or server)?

Regards,
Nabil.
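
For reference, in SolrCloud this is usually expressed through the Collections API, where `maxShardsPerNode` allows more than one shard of a collection on the same node. The sketch below only builds such a CREATE URL; it assumes a SolrCloud setup, the base URL and the collection name `books` are hypothetical, and other parameters (such as the config set) may be needed in practice.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, max_shards_per_node):
    """Build a Collections API CREATE request URL (nothing is sent)."""
    params = [
        ("action", "CREATE"),
        ("name", name),
        ("numShards", num_shards),
        ("replicationFactor", 1),
        ("maxShardsPerNode", max_shards_per_node),
    ]
    return base_url + "/admin/collections?" + urlencode(params)


print(create_collection_url("http://localhost:8983/solr", "books", 2, 2))
```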


Nested documents in Solr

2014-10-20 Thread aurelien . mazoyer

Hi,

I have some questions regarding nested document queries.

For example, let’s say that I have many books, one of which is the
following:

Book_Title: Nested documents for dummies
Chapter1_Title: Introduction
Chapter1_Content: Nested documents are fun.
Chapter2_Title: Which technology should I use?
Chapter2_Content: Lucene of course!

First, I want to find books that contain an introduction and that are
about Lucene. So I decide to flatten my data and use 3 multivalued
fields (Book_Title, Chapter_Title and Chapter_Content). I index my
document, and then I get what I want when I run the following query:
“chapter_title:Introduction AND chapter_title:Lucene”

But now I want to find books that contain “fun” in a chapter whose name
is “introduction”. My model is no longer valid (Chapter2_content is no
longer linked with Chapter2_title). That is why I change my data model and
use nested documents: I now have a parent with a single-valued field
Book_title and several children with single-valued fields Chapter_title
and Chapter_Content. Now, when I run the query
“chapter_title:Introduction AND chapter_content:fun”, I also get what I
want… But what do I have to do if I want to use these two kinds of query
with a single data model?

Maybe the only way to do this is to use nested documents and to index the
data both in child documents and in a flattened form in the parent
document. Then we will be able to run the two different queries.
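
The difference between the two models can be shown with a small in-memory sketch. It does not use Solr at all: the dictionary layout and the helper names are hypothetical, and a case-insensitive substring test stands in for a real term match.

```python
book = {
    "title": "Nested documents for dummies",
    "chapters": [
        {"title": "Introduction", "content": "Nested documents are fun."},
        {"title": "Which technology should I use?", "content": "Lucene of course!"},
    ],
}


def contains(text, term):
    """Case-insensitive substring test standing in for a term match."""
    return term.lower() in text.lower()


def flattened_match(book, title_term, content_term):
    """Multivalued fields: any chapter title and any chapter content may match."""
    titles = [c["title"] for c in book["chapters"]]
    contents = [c["content"] for c in book["chapters"]]
    return (any(contains(t, title_term) for t in titles)
            and any(contains(c, content_term) for c in contents))


def nested_match(book, title_term, content_term):
    """Child documents: both terms must match within the same chapter."""
    return any(contains(c["title"], title_term) and contains(c["content"], content_term)
               for c in book["chapters"])


# Title from chapter 1, content from chapter 2: only the flattened model matches.
print(flattened_match(book, "Introduction", "Lucene"))
print(nested_match(book, "Introduction", "Lucene"))
# Both terms in chapter 1: the nested model matches.
print(nested_match(book, "Introduction", "fun"))
```

In Solr itself, the same-chapter condition is what the block join parent query parser ({!parent which=...}) expresses over child documents; the filter given in `which` depends on how parent documents are marked in the schema.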


Do you have any other (better) idea?

Thank you,

Regards,

Aurélien


Re: Nested documents in Solr

2014-10-22 Thread aurelien . mazoyer

Hi Ramzi,

Thank you, but I am not sure I understand your answer. In your
example, I suppose that the indexed docs are flattened. If I want an AND
query instead of an OR query (for example, 'chapter_title:Lucene
AND chapter_content:fun'), how can I be sure that the terms "Lucene" and
"fun" will be matched in the same chapter of the book (since in this
case chapter_content and chapter_title are multivalued fields)?


Regards,

Aurélien

On 21.10.2014 19:59, Ramzi Alqrainy wrote:
I think, if I have your question right, you can use custom query
syntax. Unless you explicitly specify an alternative query parser such as
DisMax or eDisMax, you're using the standard Lucene query parser by default.

In your case, I think I can solve it by using this query:
chapter_title:Introduction ( chapter_title:Lucene OR chapter_content:fun )

Here are some query examples demonstrating the query syntax.

*Keyword matching*

Search for word "foo" in the title field.

title:foo

Search for phrase "foo bar" in the title field.

title:"foo bar"

Search for phrase "foo bar" in the title field AND the phrase "quick fox" in
the body field.

title:"foo bar" AND body:"quick fox"

Search for either the phrase "foo bar" in the title field AND the phrase
"quick fox" in the body field, or the word "fox" in the title field.

(title:"foo bar" AND body:"quick fox") OR title:fox

Search for word "foo" and not "bar" in the title field.

title:foo -title:bar

*Wildcard matching*

Search for any word that starts with "foo" in the title field.

title:foo*

Search for any word that starts with "foo" and ends with "bar" in the title
field.

title:foo*bar

Note that Lucene doesn't support using a * symbol as the first character of
a search.










Hi,

I have a question regarding nested document queries.
For example, let’s say that I have the following book:
Book_Title: Nested documents for dummies
Chapter1_Title: Introduction
Chapter1_Content: Nested documents are fun.
Chapter2_Title: Which technology should I use?
Chapter2_Content: Lucene of course!

First, I want to find books that contain an introduction and that are
about Lucene. So I decide to flatten my data and use 3 multivalued fields
(Book_Title, Chapter_Title and Chapter_Content). I index my document, and
then I get what I want when I use the following query:
“chapter_title:Introduction AND chapter_title:Lucene”
Now I want to find books that contain “fun” in a chapter called
“introduction”. My model is no longer valid (Chapter2_content is no longer
linked with Chapter2_title). That is why I change my data model and use
nested documents:
I now have a parent with a single-valued field Book_title and several
children with single-valued fields Chapter_title and Chapter_Content. Now,
when I run the query “chapter_title:Introduction AND chapter_content:fun”,
I also get what I want… But what do I have to do if I want to use these two
kinds of query with a single data model?

Thank you,


Regards,

Aurélien MAZOYER
