RE: Unable to perform search query after changing uniqueKey

2015-04-01 Thread steve
Gently walking into rough waters here, but if you use any API with GET, you're 
sending a URI, which must be properly encoded. This has nothing to do with 
the programming language that generates key and store pairs on the browser or 
the one(s) used on the server. Lots and lots of good folks have tripped over 
this one: http://www.w3schools.com/tags/ref_urlencode.asp
Play hard, but play safe!
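
As a rough illustration of the encoding point above (a sketch, not anything taken from Solr itself), Python's standard urllib.parse can percent-encode a field name that contains a space before it goes into a GET URL. The field name "Item No" comes from this thread; the value "A123" is a made-up placeholder.

```python
from urllib.parse import quote, urlencode

# Field name taken from the thread; the value is a made-up placeholder.
field_name = "Item No"
value = "A123"

# quote() percent-encodes the space as %20 ...
encoded_name = quote(field_name)

# ... while urlencode() builds a whole query string, encoding ':' and
# turning spaces into '+'.
query_string = urlencode({"q": f"{field_name}:{value}"})

print(encoded_name)
print(query_string)
```

Encoding only gets the request to the server intact. It does not, by itself, make a field name with a space safe inside the query syntax.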

> Date: Wed, 1 Apr 2015 13:58:55 +0800
> Subject: Re: Unable to perform search query after changing uniqueKey
> From: edwinye...@gmail.com
> To: solr-user@lucene.apache.org
> 
> Thanks Erick.
> 
> Yes, it is able to work correct if I do not use spaces for the field names,
> especially for the uniqueKey.
> 
> Regards,
> Edwin
> 
> 
> On 31 March 2015 at 13:58, Erick Erickson  wrote:
> 
> > I would never put spaces in my field names! Frankly I have no clue
> > what Solr does with that, but it can't be good. Solr explicitly
> > supports Java naming conventions, camel case, underscores and numbers.
> > Special symbols are frowned upon, I never use anything but upper case,
> > lower case and underscores. Actually, I don't use upper case either
> > but that's a personal preference. Other things might work, but only by
> > chance.
> >
> > Best,
> > Erick
> >
> > On Mon, Mar 30, 2015 at 8:59 PM, Zheng Lin Edwin Yeo
> >  wrote:
> > > Latest information that I've found for this is that the error only occurs
> > > for shard2.
> > >
> > > If I do a search for just shard1, those records that are assigned to
> > shard1
> > > will be able to be displayed. Only when I search for shard2 will the
> > > NullPointerException error occurs. Previously I was doing a search for
> > both
> > > shards.
> > >
> > > Is there any settings that I required to do for shard2 in order to solve
> > > this issue? Currently I have not made any changes to the shards since I
> > > created it using
> > >
> > http://localhost:8983/solr/admin/collections?action=CREATE&name=nps1&numShards=2&collection.configName=collection1
> > >
> > >
> > > Regards,
> > > Edwin
> > >
> > > On 31 March 2015 at 09:42, Zheng Lin Edwin Yeo 
> > wrote:
> > >
> > >> Hi Erick,
> > >>
> > >> I've changed the uniqueKey from id to Item No.
> > >>
> > >> <uniqueKey>Item No</uniqueKey>
> > >>
> > >>
> > >> Below are my definitions for both the id and Item No.
> > >>
> > >>  > >> required="false" multiValued="false" />
> > >> 
> > >>
> > >> Regards,
> > >> Edwin
> > >>
> > >>
> > >> On 30 March 2015 at 23:05, Erick Erickson 
> > wrote:
> > >>
> > >>> Well, let's see the definition of your ID field, 'cause I'm puzzled.
> > >>>
> > >>> It's definitely A Bad Thing to have it be any kind of tokenized field
> > >>> though, but that's a shot in the dark.
> > >>>
> > >>> Best,
> > >>> Erick
> > >>>
> > >>> On Mon, Mar 30, 2015 at 2:17 AM, Zheng Lin Edwin Yeo
> > >>>  wrote:
> > >>> > Hi Mostafa,
> > >>> >
> > >>> > Yes, I've defined all the fields in schema.xml. It is able to work on
> > >>> the
> > >>> > version without SolrCloud, but it is not working for the one with
> > >>> SolrCloud.
> > >>> > Both of them are using the same schema.xml.
> > >>> >
> > >>> > Regards,
> > >>> > Edwin
> > >>> >
> > >>> >
> > >>> >
> > >>> > On 30 March 2015 at 14:34, Mostafa Gomaa 
> > >>> wrote:
> > >>> >
> > >>> >> Hi Zheng,
> > >>> >>
> > >>> >> It's possible that there's a problem with your schema.xml. Are all
> > >>> fields
> > >>> >> defined and have appropriate options enabled?
> > >>> >>
> > >>> >> Regards,
> > >>> >>
> > >>> >> Mostafa.
> > >>> >>
> > >>> >> On Mon, Mar 30, 2015 at 7:49 AM, Zheng Lin Edwin Yeo <
> > >>> edwinye...@gmail.com
> > >>> >> >
> > >>> >> wrote:
> > >>> >>
> > >>> >> > Hi Erick,
> > >>> >> >
> > >>> >> > I've tried that, and removed the data directory from both the
> > >>> shards. But
> > >>> >> > the same problem still occurs, so we probably can rule out the
> > >>> "memory"
> > >>> >> > issue.
> > >>> >> >
> > >>> >> > Regards,
> > >>> >> > Edwin
> > >>> >> >
> > >>> >> > On 30 March 2015 at 12:39, Erick Erickson <
> > erickerick...@gmail.com>
> > >>> >> wrote:
> > >>> >> >
> > >>> >> > > I meant shut down Solr and physically remove the entire data
> > >>> >> > > directory. Not saying this is the cure, but it can't hurt to
> > rule
> > >>> out
> > >>> >> > > the index having "memory"...
> > >>> >> > >
> > >>> >> > > Best,
> > >>> >> > > Erick
> > >>> >> > >
> > >>> >> > > On Sun, Mar 29, 2015 at 6:35 PM, Zheng Lin Edwin Yeo
> > >>> >> > >  wrote:
> > >>> >> > > > Hi Erick,
> > >>> >> > > >
> > >>> >> > > > I used the following query to delete all the index.
> > >>> >> > > >
> > >>> >> > > > http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete>
> > >>> >> > > > http://localhost:8983/solr/update?stream.body=<commit/>
> > >>> >> > > >
> > >>> >> > > >
> > >>> >> > > > Or is it better to physically delete the entire data
> > directory?
> > >>> >> > > >
> > >>> >> > > >
> > >>> >> > > > Regards,
> > >>> >> > > > Edwin
> > >>> >> > > >
> > >>> >> > > >
> > >>> >> > > > On 28 March 2015 at 02:27, E

Re: Collapse and Expand behaviour on result with 1 document.

2015-04-01 Thread Derek Poh

Hi Joel

Correct me if my understanding is wrong.
Using supplier id as the field to collapse on.

- If the collapse group heads in the main result set have only 1 document in 
each group, the expanded section will be empty since there are no 
documents to expand for each collapse group.
- To render the page, I need to iterate the main result set. For each 
document I have to check if there is an expanded group with the same 
supplier id.
- The facet counts are based on the number of collapse groups in the main 
result set (<result ... start="0">)


-Derek

On 3/31/2015 7:43 PM, Joel Bernstein wrote:

The way that collapse/expand is designed to be used is as follows:

The main result set will contain the collapsed group heads.

The expanded section will contain the expanded groups for the page of
results.

To render the page you iterate the main result set. For each document check
to see if there is an expanded group.
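
The rendering loop described above could be sketched roughly as follows. This assumes the response was requested as JSON and that the expanded section is a map keyed by the collapse field value; the helper name render_rows and the sample documents are invented for illustration.

```python
def render_rows(solr_response, collapse_field):
    """Pair each collapsed group head with its expanded documents, if any."""
    expanded = solr_response.get("expanded", {})
    rows = []
    for head in solr_response["response"]["docs"]:
        group_key = str(head[collapse_field])
        # A group head with no other members has no entry in "expanded".
        members = expanded.get(group_key, {}).get("docs", [])
        rows.append((head, members))
    return rows


# Hand-written sample shaped like a Solr JSON response; the document ids
# and supplier ids are invented for illustration.
sample = {
    "response": {"numFound": 2, "start": 0, "docs": [
        {"id": "p1", "P_SupplierId": 100},
        {"id": "p4", "P_SupplierId": 200},
    ]},
    "expanded": {
        "100": {"numFound": 2, "start": 0, "docs": [
            {"id": "p2", "P_SupplierId": 100},
            {"id": "p3", "P_SupplierId": 100},
        ]},
    },
}

rows = render_rows(sample, "P_SupplierId")
for head, members in rows:
    print(head["id"], [m["id"] for m in members])
```

Supplier 200 has a single document, so it appears as a group head with an empty member list, which matches the empty expanded section discussed in this thread.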




Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Mar 31, 2015 at 7:37 AM, Joel Bernstein  wrote:


You should be able to use collapse/expand with one result.

Does the document in the main result set have group members that aren't
being expanded?



Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Mar 31, 2015 at 2:00 AM, Derek Poh  wrote:


If I want to group the results (by a certain field) even if there is only
1 document, I should use the group parameter instead?
The requirement is to group the result of product documents by their
supplier id.
"&group=true&group.field=P_SupplierId&group.limit=5"

Is it true that the performance of collapse is better than group
parameter on large data set, say 10-20 million documents?

-Derek


On 3/31/2015 10:03 AM, Joel Bernstein wrote:


The expanded section will only include groups that have expanded
documents.

So, if the document in the main result set has no documents to expand,
then this is working as expected.



Joel Bernstein
http://joelsolr.blogspot.com/

On Mon, Mar 30, 2015 at 8:43 PM, Derek Poh 
wrote:

  Hi

I have a query which returns 1 document.
When I add the collapse and expand parameters to it,
"&expand=true&expand.rows=5&fq={!collapse%20field=P_SupplierId}", the
expanded section is empty (<lst name="expanded"/>).

Is this the behaviour of the collapse and expand parameters on a result which
contains only 1 document?

-Derek








Customzing Solr Dedupe

2015-04-01 Thread thakkar.aayush
I'm facing a challenge using de-duplication of Solr documents.

De-duplication is done using TextProfileSignature with the following parameters: 
field1, field2, field3 
0.5
3

Here Field3 is normal text with a few lines of data.
Field1 and Field2 can contain up to 5 or 6 words of data. 

I want to de-duplicate when the data in field1 and field2 are exactly the same
and 90% of the lines in field3 match those in another document. 

Is there any way to achieve this?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Customzing-Solr-Dedupe-tp4196879.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Customzing Solr Dedupe

2015-04-01 Thread Jack Krupansky
Solr dedupe is based on the concept of a signature - some fields and rules
that reduce a document into a discrete signature, and then checking if that
signature exists as a document key that can be looked up quickly in the
index. That's the conceptual basis. It is not based on any kind of field by
field comparison to all existing documents.
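
A toy sketch of that idea, assuming an exact MD5 hash over the chosen fields and a plain dict standing in for the index. This is not TextProfileSignature (which is a fuzzy signature), and the helper names are invented for illustration.

```python
import hashlib


def signature(doc, fields):
    """Reduce the chosen fields of a document to one fixed-size hex digest."""
    joined = "\x1f".join(str(doc.get(name, "")) for name in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def add_document(index, doc, fields):
    """Store doc under its signature; return True if it replaced a duplicate."""
    key = signature(doc, fields)
    is_duplicate = key in index  # a single key lookup, no document scan
    index[key] = doc
    return is_duplicate


index = {}
fields = ["field1", "field2", "field3"]
first = add_document(index, {"field1": "a", "field2": "b", "field3": "some text"}, fields)
second = add_document(index, {"field1": "a", "field2": "b", "field3": "some text"}, fields)
third = add_document(index, {"field1": "a", "field2": "b", "field3": "some text!"}, fields)
print(first, second, third, len(index))
```

Because the check is one key lookup, a rule such as "exact match on two fields plus 90% line overlap on a third" cannot be expressed directly; it would have to be folded into how the signature itself is computed.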

-- Jack Krupansky

On Wed, Apr 1, 2015 at 6:35 AM, thakkar.aayush 
wrote:

> I'm facing a challenge using de-duplication of Solr documents.
>
> De-duplication is done using TextProfileSignature with the following parameters:
> field1, field2, field3
> 0.5
> 3
>
> Here Field3 is normal text with a few lines of data.
> Field1 and Field2 can contain up to 5 or 6 words of data.
>
> I want to de-duplicate when the data in field1 and field2 are exactly the same
> and 90% of the lines in field3 match those in another document.
>
> Is there any way to achieve this?


Re: Solr 3.6, Highlight and multi words?

2015-04-01 Thread Bruno Mannina
Sorry to bring this up again, but does nobody use, or have a problem with, 
multi-term queries and highlighting?


regards,

Le 29/03/2015 21:15, Bruno Mannina a écrit :

Dear Solr User,

I am trying to work with highlighting. It works well, but only if I have a 
single keyword in my query.

If my request is plastic AND bicycle, then only plastic is highlighted.

My request is:

./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5 

Could you please help me understand? I have read the docs and searched Google, 
without success, so I am posting here.

My result is:

  (EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal 
body (10) made from <em>plastic</em> material
  , particularly for touring bike. #CMT#ADVANTAGE : #/CMT# 
The bicycle pedal has a pedal body made 
from <em>plastic</em>

   between <em>plastic</em> tapes 3 and 3 having 
two heat fusion layers, and the two <em>plastic</em> tapes 
3 and 3 are stuck

elements. A connecting element is formed as a hinge, a 
flexible foil or a flexible <em>plastic</em> part. 
#CMT#USE

  A bicycle handlebar grip includes an inner fiber layer and 
an outer <em>plastic</em> layer. Thus, the fiber
handlebar grip, while the <em>plastic</em> 
layer is soft and has an adjustable thickness to provide a 
comfortable
sensation to a user. In addition, 
the <em>plastic</em> layer includes a holding portion 
coated on the outer surface
layer to enhance the combination strength between the 
fiber layer and the <em>plastic</em> layer and to 
enhance







Re: Collapse and Expand behaviour on result with 1 document.

2015-04-01 Thread Joel Bernstein
Exactly correct.

Joel Bernstein
http://joelsolr.blogspot.com/

On Wed, Apr 1, 2015 at 5:44 AM, Derek Poh  wrote:

> Hi Joel
>
> Correct me if my understanding is wrong.
> Using supplier id as the field to collapse on.
>
> - If the collapse group heads in the main result set have only 1 document in
> each group, the expanded section will be empty since there are no documents
> to expand for each collapse group.
> - To render the page, I need to iterate the main result set. For each
> document I have to check if there is an expanded group with the same
> supplier id.
> - The facet counts are based on the number of collapse groups in the main
> result set (<result ... start="0">)
>
> -Derek
>
>
> On 3/31/2015 7:43 PM, Joel Bernstein wrote:
>
>> The way that collapse/expand is designed to be used is as follows:
>>
>> The main result set will contain the collapsed group heads.
>>
>> The expanded section will contain the expanded groups for the page of
>> results.
>>
>> To render the page you iterate the main result set. For each document
>> check
>> to see if there is an expanded group.
>>
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>> On Tue, Mar 31, 2015 at 7:37 AM, Joel Bernstein 
>> wrote:
>>
>>  You should be able to use collapse/expand with one result.
>>>
>>> Does the document in the main result set have group members that aren't
>>> being expanded?
>>>
>>>
>>>
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>>
>>> On Tue, Mar 31, 2015 at 2:00 AM, Derek Poh 
>>> wrote:
>>>
>>>  If I want to group the results (by a certain field) even if there is
 only
 1 document, I should use the group parameter instead?
 The requirement is to group the result of product documents by their
 supplier id.
 "&group=true&group.field=P_SupplierId&group.limit=5"

 Is it true that the performance of collapse is better than group
 parameter on large data set, say 10-20 million documents?

 -Derek


 On 3/31/2015 10:03 AM, Joel Bernstein wrote:

  The expanded section will only include groups that have expanded
> documents.
>
> So, if the document that in the main result set has no documents to
> expand,
> then this is working as expected.
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Mon, Mar 30, 2015 at 8:43 PM, Derek Poh 
> wrote:
>
>   Hi
>
>> I have a query which return 1 document.
>> When I add the collapse and expand parameters to it,
>> "&expand=true&expand.rows=5&fq={!collapse%20field=P_SupplierId}", the
>> expanded section is empty ().
>>
>> Is this the behaviour of collapse and expand parameters on result
>> which
>> contain only 1 document?
>>
>> -Derek
>>
>>
>>
>>
>>
>


shard splitting (solr 4.4.0)

2015-04-01 Thread Ashwin Kumar
 Hello Solr Community,
 
Greetings ! This is my first post to this group.
 
I am very new to solr, so please do not mind if some of my questions below 
sound dumb :)
 
Let me explain my present setup:
 
Solr version : Solr_4.4.0 
Zookeeper version: zookeeper-3.4.5
-
 
Present Setup
Unix_box_1
One Solr instance (Collection 1 : contains around 24 million indexed documents) 
running on port 8983
 

 
Target setup
 
Now, as the number of users is going to increase and we are also looking for 
high availability, I am thinking of setting up SolrCloud with the following 
setup: 
 
Unix box 1
zookeeper 1(master)
Solr instance 1(Shard 1 - leader node)

 
Unix_box_2
zookeeper 2
Solr instance 2  (Shard 2)

 
Unix_box_3
zookeeper 3
Solr instance 3  (Replica for Shard 1)

 
Unix_box_4
Solr instance 4 (Replica for Shard 2)

 

 
Now following are my queries:
 
1) Is it possible for me to split the present Solr instance, running on one node 
with 24 million docs under Collection1, into 2 shards as shown above?
2) If yes, how can I achieve this, and approximately how long does it take?
3) For my application to fetch results from Solr, I need to give it one Solr 
URL, e.g. http://Unix_box_1:8983/solr. In this case, if I have some docs on 
shard2 (which is on Unix_box_2) and some on shard1 (Unix_box_1), will a search 
from the application fetch docs from both shards and combine the 
results? 
 
=
 
 
Thank you for your patience and time.
 
Regards,
Ashwin
  

solr 4.10.3 and index.xxxxxxxxxxx directory

2015-04-01 Thread Dominique Bejean
Hi,

Is it normal with Solr 4.10.3 that the data directory of replicas still
contains directories like

index.3636365667474747
index.999080980976

and files

index.properties
replica.properties

If yes, why and in which circumstances ?

Regards

Dominique


Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread avinash09

  
  
 




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread Alexandre Rafalovitch
Solr actually has a CSV update handler. You could send the file to that directly.

Have you tried that?
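
For illustration only, here is a minimal sketch of posting raw CSV to the update handler with Python's standard library. The core URL, the "id" column and the sample values are placeholders (daily_spend_limit is borrowed from the column names in this thread), and the request is only built, not sent.

```python
import urllib.request


def build_csv_update_request(solr_core_url, csv_bytes):
    """Build (but do not send) a POST of raw CSV to Solr's update handler."""
    return urllib.request.Request(
        url=solr_core_url.rstrip("/") + "/update?commit=true",
        data=csv_bytes,  # first line is the header row naming the fields
        headers={"Content-Type": "application/csv"},
        method="POST",
    )


# Placeholder core URL and a two-column sample; a real file would carry
# all 28 columns.
csv_bytes = b"id,daily_spend_limit\n1,100\n2,250\n"
request = build_csv_update_request("http://localhost:8983/solr/collection1", csv_bytes)
print(request.get_method(), request.full_url)

# To actually send it to a running Solr instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

With this approach Solr parses the columns itself, so no 28-group regular expression is needed to split each line.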

Regards,
Alex
On 1 Apr 2015 11:56 pm, "avinash09"  wrote:

>
>processor="LineEntityProcessor"
> dataSource="fds"
> url="test.csv"
> rootEntity="true"
> transformer="RegexTransformer,TemplateTransformer" >
>   
>
> regex="^(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*)$"
>  groupNames="test,,
>
> ,,,is_frequency_cap_enabled,,,daily_spend_limit,,," />
>  
> 


Suspicious message with attachment

2015-04-01 Thread help
The following message addressed to you was quarantined because it likely 
contains a virus:

Subject: Error while reading index
From: Moshe Recanati 

However, if you know the sender and are expecting an attachment, please reply 
to this message, and we will forward the quarantined message to you.


RE: Error while reading index

2015-04-01 Thread Moshe Recanati
Hi,
I uploaded the log to drive.
https://drive.google.com/file/d/0B0GR0M-lL5QHX1B2a2NZZXh3a1E/view?usp=sharing



Regards,
Moshe Recanati
SVP Engineering
Office + 972-73-2617564
Mobile  + 972-52-6194481
Skype:  recanati
[KMS2]
More at:  www.kmslh.com | 
LinkedIn | 
FB


From: Moshe Recanati [mailto:mos...@kmslh.com]
Sent: Wednesday, April 01, 2015 5:22 PM
To: solr-user@lucene.apache.org
Subject: Error while reading index

Hi,
We're running in a production environment with Solr 4.7.1, master and slave, with 
replication every 1 minute.
During regular activity and an index delta build we got the following error:
ERROR - 2015-03-30 04:06:12.318; java.lang.RuntimeException: [was class 
java.net.SocketException] Connection reset
at 
com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
at 
com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)

After an additional 2 minutes we got the following error:
ERROR - 2015-03-30 04:07:39.875; Unable to get file names for indexCommit 
generation: 638
java.io.FileNotFoundException: _tu.fdt
at 
org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:261)
at 
org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:178)

Since then Solr did not recover until we did a full rebuild of all documents.
Detailed log attached.

Let me know if you are familiar with such an issue, 
and what could cause an issue that prevents recovery and requires rebuilding 
the index. This is a major issue for us.

Thank you in advance,


Regards,
Moshe Recanati
SVP Engineering
Office + 972-73-2617564
Mobile  + 972-52-6194481
Skype:  recanati
[KMS2]
More at:  www.kmslh.com | 
LinkedIn | 
FB




Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread avinash09
No, could you please share an example?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196928.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread Alexandre Rafalovitch
Well, I believe the tutorial has an example. Always a good thing -
going through the tutorial.

And the reference guide has the details:
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-CSVFormattedIndexUpdates
.

Regards,
   Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 2 April 2015 at 01:37, avinash09  wrote:
> no could you please share an example


Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread avinash09
Sir, a silly question: I'm confused here. What is the difference between the 
Data Import Handler and the CSV update handler?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196940.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Solr 3.6, Highlight and multi words?

2015-04-01 Thread Reitzel, Charles
I haven't used Solr 3.x in a long time, but with 4.10.x I haven't had any 
trouble with multiple terms. I'd look at a few things.

1. Do you have a typo in your query? Shouldn't the field be aben rather than ab, 
i.e. q=aben:(plastic and bicycle)?

2. Try removing the word "and" from the query. There may be some interaction 
with a stop word filter. If you want a phrase query, wrap it in quotes.

3. Also, be sure that the query and indexing analyzers for the aben field are 
compatible with each other.
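
As a sketch of suggestions 1 and 2 combined, the request from the original post could be rebuilt with the query field changed to aben and the word "and" dropped. The parameters are copied from that request; whether both terms are then required depends on the default operator configured on the server, which is not shown in this thread.

```python
from urllib.parse import urlencode

# Parameters copied from the request in this thread, with the query field
# changed from ab to aben and the word "and" dropped.
params = {
    "q": "aben:(plastic bicycle)",
    "start": 0,
    "rows": 10,
    "hl": "true",
    "hl.fl": "tien,aben",
    "fl": "pn",
    "f.aben.hl.snippets": 5,
}

# urlencode() percent-encodes ':', '(', ')' and ',' and turns spaces into '+'.
select_url = "./select/?" + urlencode(params)
print(select_url)
```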

-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr] 
Sent: Wednesday, April 01, 2015 7:05 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

Sorry to disturb you with the renew but nobody use or have problem with 
multi-terms and highlight ?

regards,

Le 29/03/2015 21:15, Bruno Mannina a écrit :
> Dear Solr User,
>
> I try to work with highlight, it works well but only if I have only 
> one keyword in my query?!
> If my request is plastic AND bicycle then only plastic is highlight.
>
> my request is:
>
> ./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&row
> s=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5
>
>
> Could you help me please to understand ? I read doc, google, without 
> success...
> so I post here...
>
> my result is:
>
> 
>
>  
> 
>   (EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal 
> body (10) made fromplastic material
>   , particularly for touring bike. #CMT#ADVANTAGE : #/CMT# 
> The bicycle pedal has a pedal body made 
> fromplastic
> 
>   
>   
> 
>betweenplastic  tapes 3 and 3 having 
> two heat fusion layers, and the twoplastic  tapes
> 3 and 3 are stuck
> 
>   
>   
> 
> elements. A connecting element is formed as a hinge, a 
> flexible foil or a flexibleplastic  part.
> #CMT#USE
> 
>   
>   
> 
>   A bicycle handlebar grip includes an inner fiber layer and 
> an outerplastic layer. Thus, the fiber
> handlebar grip, while theplastic 
> layer is soft and has an adjustable thickness to provide a 
> comfortable
> sensation to a user. In addition, 
> theplastic  layer includes a holding portion 
> coated on the outer surface
> layer to enhance the combination strength between the 
> fiber layer and theplastic  layer and to 
> enhance
> 
>   




Information regarding "This conf directory is not valid" SolrException.

2015-04-01 Thread Bar Weiner
Hi,

I'm working on upgrading a project from solr-4.10.3 to solr-5.0.0.
As part of our JUnit tests we have a few tests for deleting/creating
collections. Each test creates and deletes a collection with a different name,
but they all share the same config in ZK.
When running these tests in Eclipse everything works fine, but when running
the same tests through Maven we get the following error, so I suspect this
is a timing-related issue:

INFO  org.apache.solr.rest.ManagedResourceStorage  – Setting up
ZooKeeper-based storage for the RestManager with znodeBase:
/configs/SIMPLE_CONFIG
INFO  org.apache.solr.rest.ManagedResourceStorage  – Configured
ZooKeeperStorageIO with znodeBase: /configs/SIMPLE_CONFIG
INFO  org.apache.solr.rest.RestManager  – Initializing RestManager with
initArgs: {}
INFO  org.apache.solr.rest.ManagedResourceStorage  – Reading
_rest_managed.json using ZooKeeperStorageIO:path=/configs/SIMPLE_CONFIG
INFO  org.apache.solr.rest.ManagedResourceStorage  – No data found for
znode /configs/SIMPLE_CONFIG/_rest_managed.json
INFO  org.apache.solr.rest.ManagedResourceStorage  – Loaded null at path
_rest_managed.json using ZooKeeperStorageIO:path=/configs/SIMPLE_CONFIG
INFO  org.apache.solr.rest.RestManager  – Initializing 0 registered
ManagedResources
INFO  org.apache.solr.handler.ReplicationHandler  – Commits will be
reserved for  1
INFO  org.apache.solr.core.SolrCore  – [mycollection1] Registered new
searcher Searcher@3208a6c4[mycollection1]
main{ExitableDirectoryReader(UninvertingDirectoryReader())}
ERROR org.apache.solr.core.CoreContainer  – Error creating core
[mycollection1]: This conf directory is not valid
org.apache.solr.common.SolrException: This conf directory is not valid
at
org.apache.solr.cloud.ZkController.registerConfListenerForCore(ZkController.java:2229)
at
org.apache.solr.core.SolrCore.registerConfListener(SolrCore.java:2633)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:936)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:662)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:513)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:488)
at
org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:573)
at
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:197)
at
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:186)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144)
at
org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:736)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:261)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:204)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at
org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
at
org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
at
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued

Re: Unable to perform search query after changing uniqueKey

2015-04-01 Thread Erick Erickson
Steve:

Totally agree. Even if you _do_ correctly escape the URL though,
there's no guarantee that Solr will "do the right thing" with field
names with spaces. Plus there are endless chances for you to get it wrong
when constructing the URL.
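
One way to sidestep the problem is to check field names before they ever reach schema.xml. The helper below is hypothetical, not a Solr API; its pattern simply follows the conservative advice earlier in this thread (letters, digits and underscores), with the extra assumption that a name should not start with a digit.

```python
import re

# Letters, digits and underscores only, not starting with a digit.
SAFE_FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_safe_field_name(name):
    """Return True if name needs no escaping in a URL or a query string."""
    return SAFE_FIELD_NAME.fullmatch(name) is not None


candidates = ["id", "Item No", "item_no", "ItemNo", "2nd_item"]
results = {name: is_safe_field_name(name) for name in candidates}
print(results)
```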

Best,
Erick

On Wed, Apr 1, 2015 at 1:01 AM, steve  wrote:
> Gently walking into rough waters here, but if you use any API with GET, 
> you're sending a URI, which must be properly encoded. This has nothing to do 
> with the programming language that generates key and store pairs on the 
> browser or the one(s) used on the server. Lots and lots of good folks have 
> tripped over this one: http://www.w3schools.com/tags/ref_urlencode.asp
> Play hard, but play safe!
>
>> Date: Wed, 1 Apr 2015 13:58:55 +0800
>> Subject: Re: Unable to perform search query after changing uniqueKey
>> From: edwinye...@gmail.com
>> To: solr-user@lucene.apache.org
>>
>> Thanks Erick.
>>
>> Yes, it is able to work correct if I do not use spaces for the field names,
>> especially for the uniqueKey.
>>
>> Regards,
>> Edwin
>>
>>
>> On 31 March 2015 at 13:58, Erick Erickson  wrote:
>>
>> > I would never put spaces in my field names! Frankly I have no clue
>> > what Solr does with that, but it can't be good. Solr explicitly
>> > supports Java naming conventions, camel case, underscores and numbers.
>> > Special symbols are frowned upon, I never use anything but upper case,
>> > lower case and underscores. Actually, I don't use upper case either
>> > but that's a personal preference. Other things might work, but only by
>> > chance.
>> >
>> > Best,
>> > Erick
>> >
>> > On Mon, Mar 30, 2015 at 8:59 PM, Zheng Lin Edwin Yeo
>> >  wrote:
>> > > Latest information that I've found for this is that the error only occurs
>> > > for shard2.
>> > >
>> > > If I do a search for just shard1, those records that are assigned to
>> > shard1
>> > > will be able to be displayed. Only when I search for shard2 will the
>> > > NullPointerException error occurs. Previously I was doing a search for
>> > both
>> > > shards.
>> > >
>> > > Is there any settings that I required to do for shard2 in order to solve
>> > > this issue? Currently I have not made any changes to the shards since I
>> > > created it using
>> > >
>> > http://localhost:8983/solr/admin/collections?action=CREATE&name=nps1&numShards=2&collection.configName=collection1
>> > >
>> > >
>> > > Regards,
>> > > Edwin
>> > >
>> > > On 31 March 2015 at 09:42, Zheng Lin Edwin Yeo 
>> > wrote:
>> > >
>> > >> Hi Erick,
>> > >>
>> > >> I've changed the uniqueKey from id to Item No.
>> > >>
>> > >> <uniqueKey>Item No</uniqueKey>
>> > >>
>> > >>
>> > >> Below are my definitions for both the id and Item No.
>> > >>
>> > >> > > >> required="false" multiValued="false" />
>> > >> 
>> > >>
>> > >> Regards,
>> > >> Edwin
>> > >>
>> > >>
>> > >> On 30 March 2015 at 23:05, Erick Erickson 
>> > wrote:
>> > >>
>> > >>> Well, let's see the definition of your ID field, 'cause I'm puzzled.
>> > >>>
>> > >>> It's definitely A Bad Thing to have it be any kind of tokenized field
>> > >>> though, but that's a shot in the dark.
>> > >>>
>> > >>> Best,
>> > >>> Erick
>> > >>>
>> > >>> On Mon, Mar 30, 2015 at 2:17 AM, Zheng Lin Edwin Yeo
>> > >>>  wrote:
>> > >>> > Hi Mostafa,
>> > >>> >
>> > >>> > Yes, I've defined all the fields in schema.xml. It is able to work on
>> > >>> the
>> > >>> > version without SolrCloud, but it is not working for the one with
>> > >>> SolrCloud.
>> > >>> > Both of them are using the same schema.xml.
>> > >>> >
>> > >>> > Regards,
>> > >>> > Edwin
>> > >>> >
>> > >>> >
>> > >>> >
>> > >>> > On 30 March 2015 at 14:34, Mostafa Gomaa 
>> > >>> wrote:
>> > >>> >
>> > >>> >> Hi Zheng,
>> > >>> >>
>> > >>> >> It's possible that there's a problem with your schema.xml. Are all
>> > >>> fields
>> > >>> >> defined and have appropriate options enabled?
>> > >>> >>
>> > >>> >> Regards,
>> > >>> >>
>> > >>> >> Mostafa.
>> > >>> >>
>> > >>> >> On Mon, Mar 30, 2015 at 7:49 AM, Zheng Lin Edwin Yeo <
>> > >>> edwinye...@gmail.com
>> > >>> >> >
>> > >>> >> wrote:
>> > >>> >>
>> > >>> >> > Hi Erick,
>> > >>> >> >
>> > >>> >> > I've tried that, and removed the data directory from both the
>> > >>> shards. But
>> > >>> >> > the same problem still occurs, so we probably can rule out the
>> > >>> "memory"
>> > >>> >> > issue.
>> > >>> >> >
>> > >>> >> > Regards,
>> > >>> >> > Edwin
>> > >>> >> >
>> > >>> >> > On 30 March 2015 at 12:39, Erick Erickson <
>> > erickerick...@gmail.com>
>> > >>> >> wrote:
>> > >>> >> >
>> > >>> >> > > I meant shut down Solr and physically remove the entire data
>> > >>> >> > > directory. Not saying this is the cure, but it can't hurt to
>> > rule
>> > >>> out
>> > >>> >> > > the index having "memory"...
>> > >>> >> > >
>> > >>> >> > > Best,
>> > >>> >> > > Erick
>> > >>> >> > >
>> > >>> >> > > On Sun, Mar 29, 2015 at 6:35 PM, Zheng Lin Edwin Yeo
>> > >>> >> > >  wrote:
>> > >>> >> > > > Hi Erick,
>> > >>> >> > > >
>> > >>> >> > > > I used the following query to delete all the index.

Re: shard splitting (solr 4.4.0)

2015-04-01 Thread Erick Erickson
Ashwin:

First, if at all possible I would simply set up my new SolrCloud
structure (2 shards, a leader and follower each) and re-index the
entire corpus. 24M docs isn't really very many, and you'll have to
have this capability sometime since someone, somewhere will want to
change the schema in ways that require it.

But to answer your questions:
1: Certainly. There's the SPLITSHARD command, see:
https://cwiki.apache.org/confluence/display/solr/Collections+API. That
said, Solr 4.4 used a relatively early version of SPLITSHARD and there
have been many improvements so make sure and back up first.
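
For illustration only, a SPLITSHARD request is a plain Collections API URL. The sketch below just builds that URL and does not send it. The host, collection, and shard names are assumptions borrowed from the setup described in this thread, not values verified against a real cluster.

```python
from urllib.parse import urlencode

def splitshard_url(base_url, collection, shard):
    """Build a Collections API SPLITSHARD request URL (does not send it)."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)

# Hypothetical host and names; adjust them to the real cluster.
url = splitshard_url("http://Unix_box_1:8983/solr", "collection1", "shard1")
print(url)
```

Keeping URL construction separate from the HTTP call makes it easy to review the request before running it against an index you have not backed up yet.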

2: Not quite sure how long it takes, but I wouldn't expect it to take
hours. A lot depends on what the docs are like.

3: Yes, sending a query (or update for that matter) to any node in the
cluster will "do the right thing". In a production environment, and
assuming you're not using SolrJ, I'd put a load balancer in front of
the cluster for queries. If you _are_ querying through SolrJ from the
application, you only need to use the CloudSolrServer class as it
includes a software load balancer by default. Otherwise, if you
hard-code a single machine that machine becomes a single point of
failure.
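
If you are not using SolrJ, the idea behind a client-side load balancer can be sketched roughly as below. This is not how CloudSolrServer is implemented; it is a minimal round-robin with failover, and the class name and the send callback are hypothetical.

```python
from itertools import cycle

class RoundRobinNodes:
    """Rotate over Solr node URLs and skip nodes that fail."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._cycle = cycle(self._nodes)

    def query(self, send):
        """Call send(node_url) on successive nodes until one succeeds."""
        last_error = None
        for _ in range(len(self._nodes)):
            node = next(self._cycle)
            try:
                return send(node)
            except OSError as err:  # connection-level failure; try the next node
                last_error = err
        raise ConnectionError("no Solr node reachable") from last_error
```

Here send is whatever function performs the HTTP request for one node. Because every node in the cluster can answer the query, losing one machine only costs a retry.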

Best,
Erick

On Wed, Apr 1, 2015 at 4:55 AM, Ashwin Kumar  wrote:
>  Hello Solr Community,
>
> Greetings ! This is my first post to this group.
>
> I am very new to solr, so please do not mind if some of my questions below 
> sound dumb :)
>
> Let me explain my present setup:
>
> Solr version : Solr_4.4.0
> Zookeeper version: zookeeper-3.4.5
> -
>
> Present Setup
> Unix_box_1
> One Solr instance (Collection 1 : contains around 24 million indexed 
> documents) running on port 8983
>
> 
>
> Target setup
>
> Now as the number of users are going to increase and also we are looking for 
> high availability, I am thinking of setting up solr cloud with the following 
> setup:
>
> Unix box 1
> zookeeper 1(master)
> Solr instance 1(Shard 1 - leader node)
> 
>
> Unix_box_2
> zookeeper 2
> Solr instance 2  (Shard 2)
> 
>
> Unix_box_3
> zookeeper 3
> Solr instance 3  (Replica for Shard 1)
> 
>
> Unix_box_4
> Solr instance 4 (Replica for Shard 2)
> 
>
> 
>
> Now following are my queries:
>
> 1) Is it possible for me to split the present solr running on one node with 
> 24 million docs under Collection1 into 2 shards as shown above ?
> 2) If yes how can I achieve this, and approximately how long does it take ?
> 3) For my application to fetch the result from solr, I need to give one solr 
> url meaning http://Unix_box_1:8983/solr   . In this case if I have some docs 
> on shard2 (which is on Unix_box_2) and some on shard1 (Unix_box_1), will my 
> search result in the application fetch docs from both the shards and combine 
> the result ?
>
> =
>
>
> Thank you for your patience and time.
>
> Regards,
> Ashwin
>


Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread Erick Erickson
Data Import Handler is a process in Solr that reaches out, grabs
"something external" and indexes it. "Something external" can be a
database, files on the server etc. Along the way, you can do many
transformations of the data. The point is that the source can be
anything.

The update handler is an end-point in Solr that expects certain
specific formats and puts them in the index. For instance, if you
index XML, it _must_ be in a very specific form to throw at the update
handler, something like

   
 
 
   
   
 
 
   


The csv update handler is just an update handler that expects CSV
files. The headers are usually the field names although you can map
them from the column header in your csv file to your Solr schema.
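
As a small sketch of the XML shape the update handler expects (an add element wrapping doc elements, each holding named field elements), the Python below serializes plain dictionaries into such a message. The field names are hypothetical and would have to exist in your schema.

```python
import xml.etree.ElementTree as ET

def build_add_message(docs):
    """Serialize a list of {field: value} dicts as a Solr <add> message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")

# Hypothetical field names, for illustration only.
message = build_add_message([{"id": "1", "title": "first doc"}])
print(message)
```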

Importing CSV files should be very fast. I suspect your regex is costly.

As Alexandre says, though, it would be a good idea to go through the
CSV import tutorial. The Solr reference guide has the details:
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-CSVFormattedIndexUpdates

Best,
Erick

On Wed, Apr 1, 2015 at 8:04 AM, avinash09  wrote:
> Sir, a silly question: I'm confused about the difference between the Data
> Import Handler and the CSV update handler.
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196940.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr 4.10.3 and index.xxxxxxxxxxx directory

2015-04-01 Thread Shawn Heisey
On 4/1/2015 6:35 AM, Dominique Bejean wrote:
> Is it normal with Solr 4.10.3 that the data directory of replicas still
> contains directories like
>
> index.3636365667474747
> index.999080980976
>
> and files
>
> index.properties
> replica.properties
>
> If yes, why and in which circumstances ?

The index.xxxxxxxxxxx directories are created during master/slave
index replication.  If you're running SolrCloud, then replication is
only used for index recovery.  Index recovery is only required in
situations where the replicas are so far behind that the transaction log
cannot be used to synchronize them, and sometimes happens when a Solr
node is restarted.  If SolrCloud index recovery is actually required
when you are NOT restarting Solr instances, your index might be having
problems.

Regardless of whether you're running SolrCloud or not, normally when one
of those directories with a numeric suffix is created, it will be
changed to "index" with no suffix after the replication is complete, but
if Solr is unable to change the directories for some reason, it will
simply keep and use the new directory with the suffix.  Do you see any
ERROR or WARN entries in your solr logfile that would indicate why Solr
cannot change the directory name?  Are you on Windows?  Problems like
this are more common on Windows, because Windows prevents a lot of file
operations when files/directories are open.

The long-term existence of directories with this naming convention
indicates that *something* went wrong, but you would need to consult
your logs to find out what happened.  There have been several bugs over
Solr's history that cause this problem.
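
To see which directory is in use before digging through the logs, something like the sketch below may help. It assumes index.properties, when present, is a Java-style properties file whose "index" key names the active directory. The function names are made up for this example, and it only reports; it never deletes anything.

```python
import os

def active_index_dir(data_dir):
    """Return the index directory name Solr is expected to use in data_dir.

    Assumes index.properties, when present, has an "index" key naming the
    active directory; otherwise falls back to plain "index".
    """
    props_path = os.path.join(data_dir, "index.properties")
    if os.path.isfile(props_path):
        with open(props_path, encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line.startswith("#") or "=" not in line:
                    continue
                key, value = line.split("=", 1)
                if key.strip() == "index":
                    return value.strip()
    return "index"

def leftover_index_dirs(data_dir):
    """List index.* directories that are not the active one."""
    active = active_index_dir(data_dir)
    return sorted(
        name for name in os.listdir(data_dir)
        if name.startswith("index.")
        and os.path.isdir(os.path.join(data_dir, name))
        and name != active
    )
```

Anything returned by leftover_index_dirs is only a candidate for cleanup; check the logs and stop Solr before touching it.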

Thanks,
Shawn



Re: Customzing Solr Dedupe

2015-04-01 Thread Dan Davis
But you can potentially still use Solr dedupe if you do the upfront work
(in RDBMS or NoSQL pre-index processing) to assign some sort of "Group ID".
  See OCLC's FRBR Work-Set Algorithm,
http://www.oclc.org/content/dam/research/activities/frbralgorithm/2009-08.pdf?urlm=161376
, for some details on one such algorithm.

If the job is too big for RDBMS, and/or you don't want to use/have a
suitable NoSQL, you can have two Solr indexes (collection/core/whatever) -
one for classification with only id, field1, field2, field3, and another
for production query.   Then, you put stuff into the classification index,
use queries and your own algorithm to do classification, assigning a
groupId, and then put the document with groupId assigned into the
production database.

A key question is whether you want to preserve the groupId.   In some
cases, you do, and in some cases, it is just an internal signature.   In
both cases, a non-deterministic up-front algorithm can work, but if the
groupId needs to be preserved, you need to work harder to make sure it all
hangs together.
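
A minimal sketch of such an up-front grouping step, under the assumption that "duplicate" means field1 and field2 match exactly and at least 90% of the lines of field3 also appear in an earlier document. This is not the OCLC algorithm and not what TextProfileSignature does; the function names and the groupId format are invented for illustration.

```python
import hashlib

def exact_key(doc):
    """Hash of field1 and field2, which must match exactly."""
    raw = doc["field1"] + "\x1f" + doc["field2"]
    return hashlib.md5(raw.encode("utf-8")).hexdigest()

def line_overlap(text_a, text_b):
    """Fraction of the non-empty lines in text_a that also appear in text_b."""
    lines_a = [line.strip() for line in text_a.splitlines() if line.strip()]
    lines_b = {line.strip() for line in text_b.splitlines() if line.strip()}
    if not lines_a:
        return 0.0
    return sum(1 for line in lines_a if line in lines_b) / len(lines_a)

def assign_group_ids(docs, threshold=0.9):
    """Set doc["groupId"]; documents sharing a groupId count as duplicates."""
    representatives = {}  # exact_key -> list of (groupId, field3 text)
    for doc in docs:
        key = exact_key(doc)
        candidates = representatives.setdefault(key, [])
        for group_id, text in candidates:
            if line_overlap(doc["field3"], text) >= threshold:
                doc["groupId"] = group_id
                break
        else:
            group_id = f"{key}-{len(candidates)}"
            doc["groupId"] = group_id
            candidates.append((group_id, doc["field3"]))
    return docs
```

The resulting groupId could then serve as the signature field for Solr dedupe. Note that the result depends on document order, which is one reason a preserved groupId needs extra care.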

Hope this helps,

-Dan

On Wed, Apr 1, 2015 at 7:05 AM, Jack Krupansky 
wrote:

> Solr dedupe is based on the concept of a signature - some fields and rules
> that reduce a document into a discrete signature, and then checking if that
> signature exists as a document key that can be looked up quickly in the
> index. That's the conceptual basis. It is not based on any kind of field by
> field comparison to all existing documents.
>
> -- Jack Krupansky
>
> On Wed, Apr 1, 2015 at 6:35 AM, thakkar.aayush 
> wrote:
>
> > I'm facing a challenge using de-duplication of Solr documents.
> >
> > De-duplication is done using TextProfileSignature with the following
> parameters:
> > fields: field1, field2, field3
> > quantRate: 0.5
> > minTokenLen: 3
> >
> > Here Field3 is normal text with few lines of data.
> > Field1 and Field2 can contain upto 5 or 6 words of data.
> >
> > I want to de-duplicate when data in field1 and field2 are exactly the
> same
> > and 90% of the lines in field3 is matched to that in another document.
> >
> > Is there anyway to achieve this?
> >
> >
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Customzing-Solr-Dedupe-tp4196879.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
>


Re: solr 4.10.3 and index.xxxxxxxxxxx directory

2015-04-01 Thread Dominique Bejean
Hi Shawn,

Thank you for your response.

This is a SolrCloud installation on CentOS.

There are 5 servers with 128 GB of RAM each.
The collection contains 650 million small documents.
There are 3 shards with replicationFactor = 2 (so 9 cores).
The JVM Xmx parameter was set to 96 GB. We changed it yesterday to 32 GB in
order to stay under the CompressedOops limit and free up memory for
MMapDirectory.

I will have access to the full Solr and Tomcat logs tomorrow.

What I know is that there are some ZooKeeper timeouts in the Solr logs,
and that replication occurs on some nodes after some commits (after a DIH
import) and when nodes restart.

So, I will have more precise log messages tomorrow.

Thank you for your response.

Dominique



2015-04-01 18:29 GMT+02:00 Shawn Heisey :

> On 4/1/2015 6:35 AM, Dominique Bejean wrote:
> > Is it normal with Solr 4.10.3 that the data directory of replicas still
> > contains directories like
> >
> > index.3636365667474747
> > index.999080980976
> >
> > and files
> >
> > index.properties
> > replica.properties
> >
> > If yes, why and in which circumstances ?
>
> The index. directories are created during master/slave
> index replication.  If you're running SolrCloud, then replication is
> only used for index recovery.  Index recovery is only required in
> situations where the replicas are so far behind that the transaction log
> cannot be used to synchronize them, and sometimes happens when a Solr
> node is restarted.  If SolrCloud index recovery is actually required
> when you are NOT restarting Solr instances, your index might be having
> problems.
>
> Regardless of whether you're running SolrCloud or not, normally when one
> of those directories with a numeric suffix is created, it will be
> changed to "index" with no suffix after the replication is complete, but
> if Solr is unable to change the directories for some reason, it will
> simply keep and use the new directory with the suffix.  Do you see any
> ERROR or WARN entries in your solr logfile that would indicate why Solr
> cannot change the directory name?  Are you on Windows?  Problems like
> this are more common on Windows, because Windows prevents a lot of file
> operations when files/directories are open.
>
> The long-term existence of directories with this naming convention
> indicates that *something* went wrong, but you would need to consult
> your logs to find out what happened.  There have been several bugs over
> Solr's history that cause this problem.
>
> Thanks,
> Shawn
>
>


Re: solr 4.10.3 and index.xxxxxxxxxxx directory

2015-04-01 Thread Erick Erickson
I _really_ suspect that with the huge JVM heaps you had, you were hitting long
GC pauses that exceeded the Zookeeper timeout, causing ZK to believe the
node had gone away thus throwing it into recovery mode.

You can enable GC logging to see whether you see such long pauses, but with 96G
it's almost certain that you did.

Reducing the JVM allocation should help, but if you continue to see
nodes go into recovery for no apparent reason, enabling GC logging is a
good idea so you have a record.

See "Getting a view into garbage collection" here:
https://lucidworks.com/blog/garbage-collection-bootcamp-1-0/
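
Once GC logging is on, a quick scan for pauses longer than the ZooKeeper timeout can be sketched as below. The log line shape is an assumption: it is what HotSpot of that era printed with -XX:+PrintGCApplicationStoppedTime. The timeout value you pass in should be your own zkClientTimeout, not the placeholder used here.

```python
import re

# Assumed log line shape (HotSpot, -XX:+PrintGCApplicationStoppedTime):
#   Total time for which application threads were stopped: 12.3456789 seconds
STOPPED = re.compile(
    r"Total time for which application threads were stopped: "
    r"([0-9]+\.[0-9]+) seconds"
)

def long_pauses(log_lines, zk_timeout_seconds):
    """Return stop-the-world pauses (in seconds) at or above the ZK timeout."""
    pauses = []
    for line in log_lines:
        match = STOPPED.search(line)
        if match and float(match.group(1)) >= zk_timeout_seconds:
            pauses.append(float(match.group(1)))
    return pauses

# Placeholder data and timeout, for illustration only.
sample = [
    "Total time for which application threads were stopped: 0.0012345 seconds",
    "Total time for which application threads were stopped: 21.5000000 seconds",
]
print(long_pauses(sample, zk_timeout_seconds=15.0))
```

Any pause this reports is long enough for ZooKeeper to consider the node gone, which would match the recoveries described above.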

Best
Erick

On Wed, Apr 1, 2015 at 10:35 AM, Dominique Bejean
 wrote:
> Hi Shawn,
>
> Thank you for your response.
>
> This is a Solrcloud installation on Centos.
>
> There are 5 servers with 128 Gb ram each.
> The collection contains 650 millions of small documents.
> There are 3 shards with replicationfactor = 2 (so 9 cores).
> The JVM Xmx parameter was set to 96 Gb. We changed it yesterday to 32 Gb in
> order to be under the CompressedOops limit and free the direct memory for
> MMapDirectory.
>
> I will have access to both full solr and tomcat logs tomorrow.
>
> What I know, is that there are some zookeeper time out in solr logs.
> And the replications occur on some nodes after some commits (after DIH
> import) and when nodes restart.
>
> So, I will have more precise log messages tomorrow.
>
> Thank you for your response.
>
> Dominique
>
>
>
> 2015-04-01 18:29 GMT+02:00 Shawn Heisey :
>
>> On 4/1/2015 6:35 AM, Dominique Bejean wrote:
>> > Is it normal with Solr 4.10.3 that the data directory of replicas still
>> > contains directories like
>> >
>> > index.3636365667474747
>> > index.999080980976
>> >
>> > and files
>> >
>> > index.properties
>> > replica.properties
>> >
>> > If yes, why and in which circumstances ?
>>
>> The index. directories are created during master/slave
>> index replication.  If you're running SolrCloud, then replication is
>> only used for index recovery.  Index recovery is only required in
>> situations where the replicas are so far behind that the transaction log
>> cannot be used to synchronize them, and sometimes happens when a Solr
>> node is restarted.  If SolrCloud index recovery is actually required
>> when you are NOT restarting Solr instances, your index might be having
>> problems.
>>
>> Regardless of whether you're running SolrCloud or not, normally when one
>> of those directories with a numeric suffix is created, it will be
>> changed to "index" with no suffix after the replication is complete, but
>> if Solr is unable to change the directories for some reason, it will
>> simply keep and use the new directory with the suffix.  Do you see any
>> ERROR or WARN entries in your solr logfile that would indicate why Solr
>> cannot change the directory name?  Are you on Windows?  Problems like
>> this are more common on Windows, because Windows prevents a lot of file
>> operations when files/directories are open.
>>
>> The long-term existence of directories with this naming convention
>> indicates that *something* went wrong, but you would need to consult
>> your logs to find out what happened.  There have been several bugs over
>> Solr's history that cause this problem.
>>
>> Thanks,
>> Shawn
>>
>>


Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread avinash09
thanks Erick and Alexandre Rafalovitch R

One more question: how do I pass Ctrl-A (^A) as the separator during CSV upload?




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196998.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 3.6, Highlight and multi words?

2015-04-01 Thread Bruno Mannina

Dear Charles,

Thanks for your answer, please find below my answers.

OK, it works if I use "aben" as the field in my query, as you say in answer 1.
It doesn't work if I use "ab", maybe because the "ab" field is a copyField
for abfr, aben, abit, abpt.


Concerning point 2, yes, you are right: it should be AND, not and.

I have this result:



  Bicycle  frame comprises holder, particularly for 
water bottle, where holder is connected


  #CMT# #/CMT# Thebicycle  frame (7) comprises a holder 
(1), particularly for a water bottle
  . The holder is connected with thebicycle  frame by a 
screw (5), where a mounting element has a compensation
section which is made of an elastic material, particularly 
aplastic  material. The compensation section

  


So my last question is: why do I get the default highlight tags instead of the colored ones?
How can I tell Solr to use the colored ones?

Thanks a lot,
Bruno


On 01/04/2015 17:15, Reitzel, Charles wrote:

Haven't used Solr 3.x in a long time.  But with 4.10.x, I haven't had any 
trouble with multiple terms.  I'd look at a few things.

1.  Do you have a typo in your query?  Shouldn't it be q=aben:(plastic and 
bicycle)?

2. Try removing the word "and" from the query.  There may be some interaction 
with a stop word filter.  If you want a phrase query, wrap it in quotes.

3.  Also, be sure that the query and indexing analyzers for the aben field are 
compatible with each other.

-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, April 01, 2015 7:05 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

Sorry to bother you again, but does nobody use, or have problems with, 
multi-term queries and highlighting?

regards,

On 29/03/2015 21:15, Bruno Mannina wrote:

Dear Solr User,

I'm trying to use highlighting. It works well, but only when I have a
single keyword in my query.
If my request is plastic AND bicycle, then only plastic is highlighted.

my request is:

./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5


Could you please help me understand? I've read the docs and searched Google
without success, so I'm posting here.

my result is:



  
 
   (EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal
body (10) made fromplastic material
   , particularly for touring bike. #CMT#ADVANTAGE : #/CMT#
The bicycle pedal has a pedal body made
fromplastic
 
   
   
 
betweenplastic  tapes 3 and 3 having
two heat fusion layers, and the twoplastic  tapes
3 and 3 are stuck
 
   
   
 
 elements. A connecting element is formed as a hinge, a
flexible foil or a flexibleplastic  part.
#CMT#USE
 
   
   
 
   A bicycle handlebar grip includes an inner fiber layer and
an outerplastic layer. Thus, the fiber
 handlebar grip, while theplastic
layer is soft and has an adjustable thickness to provide a
comfortable
 sensation to a user. In addition,
theplastic  layer includes a holding portion
coated on the outer surface
 layer to enhance the combination strength between the
fiber layer and theplastic  layer and to
enhance
 
   


*
This e-mail may contain confidential or privileged information.
If you are not the intended recipient, please notify the sender immediately and 
then delete it.

TIAA-CREF
*




RE: Solr 3.6, Highlight and multi words?

2015-04-01 Thread Reitzel, Charles
If you want to query on the field ab, you'll probably need to add it to the qf 
parameter.

To control the highlighting markup, with the standard highlighter, use 
hl.simple.pre and hl.simple.post.   

https://cwiki.apache.org/confluence/display/solr/Standard+Highlighter
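
As a rough sketch, the request from this thread with custom markup might be built as follows. The yellow bold tag is just an example value, and the field names are the ones quoted earlier in the thread. The point is that hl.simple.pre and hl.simple.post must be URL-encoded like any other parameter.

```python
from urllib.parse import urlencode

params = [
    ("q", "aben:(plastic AND bicycle)"),
    ("rows", "10"),
    ("fl", "pn"),
    ("hl", "true"),
    ("hl.fl", "tien,aben"),
    # Example markup only; any opening/closing pair can be used.
    ("hl.simple.pre", '<b style="background:yellow">'),
    ("hl.simple.post", "</b>"),
]
query_string = urlencode(params)
print("./select/?" + query_string)
```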


-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr] 
Sent: Wednesday, April 01, 2015 2:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

Dear Charles,

Thanks for your answer, please find below my answers.

ok it works if I use "aben" as field in my query as you say in Answer 1.
it doesn't work if I use "ab" may be because "ab" field is a copyField for 
abfr, aben, abit, abpt

Concerning the 2., yes you have right it's not and but AND

I have this result:


 
   Bicycle  frame comprises holder, particularly 
for water bottle, where holder is connected
 
 
   #CMT# #/CMT# Thebicycle  frame (7) comprises a 
holder (1), particularly for a water bottle
   . The holder is connected with thebicycle  
frame by a screw (5), where a mounting element has a compensation
 section which is made of an elastic material, particularly 
aplastic  material. The compensation section
 
   


So my last question is why I haven't  instead having colored ?
How can I tell to solr to use the colored ?

Thanks a lot,
Bruno


On 01/04/2015 17:15, Reitzel, Charles wrote:
> Haven't used Solr 3.x in a long time.  But with 4.10.x, I haven't had any 
> trouble with multiple terms.  I'd look at a few things.
>
> 1.  Do you have a typo in your query?  Shouldn't it be q=aben:(plastic and 
> bicycle)?
>
> 2. Try removing the word "and" from the query.  There may be some 
> interaction with a stop word filter.  If you want a phrase query, wrap it in 
> quotes.
>
> 3.  Also, be sure that the query and indexing analyzers for the aben field 
> are compatible with each other.
>
> -Original Message-
> From: Bruno Mannina [mailto:bmann...@free.fr]
> Sent: Wednesday, April 01, 2015 7:05 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 3.6, Highlight and multi words?
>
> Sorry to disturb you with the renew but nobody use or have problem with 
> multi-terms and highlight ?
>
> regards,
>
> On 29/03/2015 21:15, Bruno Mannina wrote:
>> Dear Solr User,
>>
>> I try to work with highlight, it works well but only if I have only 
>> one keyword in my query?!
>> If my request is plastic AND bicycle then only plastic is highlight.
>>
>> my request is:
>>
>> ./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5
>>
>>
>> Could you help me please to understand ? I read doc, google, without 
>> success...
>> so I post here...
>>
>> my result is:
>>
>> 
>>
>>   
>>  
>>(EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal 
>> body (10) made fromplastic material
>>, particularly for touring bike. #CMT#ADVANTAGE : #/CMT# 
>> The bicycle pedal has a pedal body made 
>> fromplastic
>>  
>>
>>
>>  
>> betweenplastic  tapes 3 and 3 
>> having two heat fusion layers, and the 
>> twoplastic  tapes
>> 3 and 3 are stuck
>>  
>>
>>
>>  
>>  elements. A connecting element is formed as a hinge, a 
>> flexible foil or a flexibleplastic  part.
>> #CMT#USE
>>  
>>
>>
>>  
>>A bicycle handlebar grip includes an inner fiber layer 
>> and an outerplastic layer. Thus, the fiber
>>  handlebar grip, while theplastic 
>> layer is soft and has an adjustable thickness to provide a 
>> comfortable
>>  sensation to a user. In addition, 
>> theplastic  layer includes a holding portion 
>> coated on the outer surface
>>  layer to enhance the combination strength between the 
>> fiber layer and theplastic  layer and to 
>> enhance
>>  
>>
>




Re: Solr 3.6, Highlight and multi words?

2015-04-01 Thread Bruno Mannina

OK for qf (I can't test it now).

But with hl.simple.pre and hl.simple.post I can define only one color, no?

In the sample solrconfig.xml there are several colors:


  

  
  

  

How can I tell Solr to use these colors instead of hl.simple.pre/post?
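
One direction that may be worth trying, although nobody in this thread has confirmed it: the multi-color tags in the sample solrconfig.xml belong to a fragments builder for the FastVectorHighlighter, not to the standard highlighter. The sketch below assumes that a fragments builder named "colored" is defined in your solrconfig.xml and that the highlighted fields are indexed with termVectors, termPositions and termOffsets. It only builds the query string.

```python
from urllib.parse import urlencode

def colored_highlight_query(query, highlight_fields):
    """Build a query string that asks for the "colored" fragments builder."""
    params = [
        ("q", query),
        ("hl", "true"),
        ("hl.fl", ",".join(highlight_fields)),
        # Assumption: the fields carry term vectors, positions and offsets.
        ("hl.useFastVectorHighlighter", "true"),
        # Assumption: a fragmentsBuilder named "colored" exists in solrconfig.xml.
        ("hl.fragmentsBuilder", "colored"),
    ]
    return urlencode(params)

print("./select/?" + colored_highlight_query("aben:(plastic AND bicycle)", ["tien", "aben"]))
```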



On 01/04/2015 20:58, Reitzel, Charles wrote:

If you want to query on the field ab, you'll probably need to add it the qf 
parameter.

To control the highlighting markup, with the standard highlighter, use 
hl.simple.pre and hl.simple.post.

https://cwiki.apache.org/confluence/display/solr/Standard+Highlighter


-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, April 01, 2015 2:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

Dear Charles,

Thanks for your answer, please find below my answers.

ok it works if I use "aben" as field in my query as you say in Answer 1.
it doesn't work if I use "ab" may be because "ab" field is a copyField for 
abfr, aben, abit, abpt

Concerning the 2., yes you have right it's not and but AND

I have this result:


  
Bicycle  frame comprises holder, particularly for 
water bottle, where holder is connected
  
  
#CMT# #/CMT# Thebicycle  frame (7) comprises a 
holder (1), particularly for a water bottle
. The holder is connected with thebicycle  frame by 
a screw (5), where a mounting element has a compensation
  section which is made of an elastic material, particularly 
aplastic  material. The compensation section
  



So my last question is why I haven't  instead having colored ?
How can I tell to solr to use the colored ?

Thanks a lot,
Bruno


On 01/04/2015 17:15, Reitzel, Charles wrote:

Haven't used Solr 3.x in a long time.  But with 4.10.x, I haven't had any 
trouble with multiple terms.  I'd look at a few things.

1.  Do you have a typo in your query?  Shouldn't it be q=aben:(plastic and 
bicycle)?

2. Try removing the word "and" from the query.  There may be some interaction 
with a stop word filter.  If you want a phrase query, wrap it in quotes.


3.  Also, be sure that the query and indexing analyzers for the aben field are 
compatible with each other.

-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, April 01, 2015 7:05 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

Sorry to disturb you with the renew but nobody use or have problem with 
multi-terms and highlight ?

regards,

On 29/03/2015 21:15, Bruno Mannina wrote:

Dear Solr User,

I try to work with highlight, it works well but only if I have only
one keyword in my query?!
If my request is plastic AND bicycle then only plastic is highlight.

my request is:

./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5


Could you help me please to understand ? I read doc, google, without
success...
so I post here...

my result is:



   
  
(EP2423092A1) #CMT# #/CMT# The bicycle pedal has a pedal
body (10) made fromplastic material
, particularly for touring bike. #CMT#ADVANTAGE : #/CMT#
The bicycle pedal has a pedal body made
fromplastic
  


  
 betweenplastic  tapes 3 and 3
having two heat fusion layers, and the
twoplastic  tapes
3 and 3 are stuck
  


  
  elements. A connecting element is formed as a hinge, a
flexible foil or a flexibleplastic  part.
#CMT#USE
  


  
A bicycle handlebar grip includes an inner fiber layer
and an outerplastic layer. Thus, the fiber
  handlebar grip, while theplastic
layer is soft and has an adjustable thickness to provide a
comfortable
  sensation to a user. In addition,
theplastic  layer includes a holding portion
coated on the outer surface
  layer to enhance the combination strength between the
fiber layer and theplastic  layer and to
enhance
  








RE: Solr 3.6, Highlight and multi words?

2015-04-01 Thread Reitzel, Charles
Sorry, I've never tried highlighting in multiple colors...

-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr] 
Sent: Wednesday, April 01, 2015 3:43 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

ok for qf (i can't test now)

but concerning hl.simple.pre hl.simple.post I can define only one color no ?

in the sample solrconfig.xml there are several color,


   
 
   
   
 
   

How can I tell to solr to use these color instead of hl.simple.pre/post ?



On 01/04/2015 20:58, Reitzel, Charles wrote:
> If you want to query on the field ab, you'll probably need to add it the qf 
> parameter.
>
> To control the highlighting markup, with the standard highlighter, use 
> hl.simple.pre and hl.simple.post.
>
> https://cwiki.apache.org/confluence/display/solr/Standard+Highlighter
>
>
> -Original Message-
> From: Bruno Mannina [mailto:bmann...@free.fr]
> Sent: Wednesday, April 01, 2015 2:24 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 3.6, Highlight and multi words?
>
> Dear Charles,
>
> Thanks for your answer, please find below my answers.
>
> ok it works if I use "aben" as field in my query as you say in Answer 1.
> it doesn't work if I use "ab" may be because "ab" field is a copyField 
> for abfr, aben, abit, abpt
>
> Concerning the 2., yes you have right it's not and but AND
>
> I have this result:
>
> 
>   
> Bicycle  frame comprises holder, 
> particularly for water bottle, where holder is connected
>   
>   
> #CMT# #/CMT# Thebicycle  frame (7) 
> comprises a holder (1), particularly for a water bottle
> . The holder is connected with thebicycle  
> frame by a screw (5), where a mounting element has a compensation
>   section which is made of an elastic material, particularly 
> aplastic  material. The compensation section
>   
> 
>
>
> So my last question is why I haven't  instead having colored ?
> How can I tell to solr to use the colored ?
>
> Thanks a lot,
> Bruno
>
>
> On 01/04/2015 17:15, Reitzel, Charles wrote:
>> Haven't used Solr 3.x in a long time.  But with 4.10.x, I haven't had any 
>> trouble with multiple terms.  I'd look at a few things.
>>
>> 1.  Do you have a typo in your query?  Shouldn't it be q=aben:(plastic and 
>> bicycle)?
>>
>> 2. Try removing the word "and" from the query.  There may be some 
>> interaction with a stop word filter.  If you want a phrase query, wrap it in 
>> quotes.
>>
>> 3.  Also, be sure that the query and indexing analyzers for the aben field 
>> are compatible with each other.
>>
>> -Original Message-
>> From: Bruno Mannina [mailto:bmann...@free.fr]
>> Sent: Wednesday, April 01, 2015 7:05 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr 3.6, Highlight and multi words?
>>
>> Sorry to disturb you with the renew but nobody use or have problem with 
>> multi-terms and highlight ?
>>
>> regards,
>>
>> On 29/03/2015 21:15, Bruno Mannina wrote:
>>> Dear Solr User,
>>>
>>> I try to work with highlight, it works well but only if I have only 
>>> one keyword in my query?!
>>> If my request is plastic AND bicycle then only plastic is highlight.
>>>
>>> my request is:
>>>
>>> ./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5
>>>
>>>
>>> Could you help me please to understand ? I read doc, google, without 
>>> success...
>>> so I post here...
>>>
>>> my result is:
>>>
>>> 
>>>
>>>
>>>   
>>> (EP2423092A1) #CMT# #/CMT# The bicycle pedal has a 
>>> pedal body (10) made fromplastic material
>>> , particularly for touring bike. #CMT#ADVANTAGE : 
>>> #/CMT# The bicycle pedal has a pedal body made 
>>> fromplastic
>>>   
>>> 
>>> 
>>>   
>>>  betweenplastic  tapes 3 and 3 
>>> having two heat fusion layers, and the 
>>> twoplastic  tapes
>>> 3 and 3 are stuck
>>>   
>>> 
>>> 
>>>   
>>>   elements. A connecting element is formed as a hinge, 
>>> a flexible foil or a flexibleplastic  part.
>>> #CMT#USE
>>>   
>>> 
>>> 
>>>   
>>> A bicycle handlebar grip includes an inner fiber layer 
>>> and an outerplastic layer. Thus, the fiber
>>>   handlebar grip, while theplastic 
>>> layer is soft and has an adjustable thickness to provide a 
>>> comfortable
>>>   sensation to a user. In addition, 
>>> theplastic  layer includes a holding portion 
>>> coated on the outer surface
>>>   layer to enhance the combination strength between the 
>>> fiber layer and theplastic  layer and to 
>>> enhance
>>>   
>>> 

SolrCloud 5.0 cluster RAM requirements

2015-04-01 Thread Ryan Steele
Does a SolrCloud 5.0 cluster need enough RAM across the cluster to load 
all the collections into RAM at all times?


I'm building a SolrCloud cluster that may have approximately 1 TB of 
data spread across the collections.


Thanks,
Ryan



Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread Alexandre Rafalovitch
That's an interesting question. The reference shows you how to set a
separator, but ^A is a special case. You may need to pass it in as a
URL escape character or similar.

But I would first get a sample working with a more conventional
separator and then worry about ^A, so that you are not debugging
several problems at once.
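
A minimal sketch of the URL-escape idea above, assuming a local Solr with a core named collection1 and the CSV update handler at /update/csv (the host, core name and commit parameter are illustrative assumptions, not taken from this thread). It only builds the request URL; Ctrl-A is the byte 0x01, which percent-encodes to %01.

```python
from urllib.parse import urlencode

# Hypothetical host and core name; adjust to your installation.
SOLR_CSV_URL = "http://localhost:8983/solr/collection1/update/csv"


def csv_update_url(separator: str, base_url: str = SOLR_CSV_URL) -> str:
    """Build a CSV update URL with the separator percent-encoded."""
    params = {"separator": separator, "commit": "true"}
    # urlencode percent-encodes control characters such as 0x01.
    return base_url + "?" + urlencode(params)


# Ctrl-A (^A) is the single byte 0x01.
url = csv_update_url("\x01")
print(url)
```

Testing the same function with a comma first, as suggested above, keeps the encoding question separate from any other indexing problem.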

Regards,
   Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 2 April 2015 at 05:05, avinash09  wrote:
> thanks Erick and Alexandre Rafalovitch R
>
> One more question: how do I pass Ctrl-A (^A) as the separator during a CSV upload?
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196998.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 3.6, Highlight and multi words?

2015-04-01 Thread Bruno Mannina

Of course, no problem Charles, you have already helped me!

On 01/04/2015 21:54, Reitzel, Charles wrote:

Sorry, I've never tried highlighting in multiple colors...

-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, April 01, 2015 3:43 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

OK for qf (I can't test it now).

But with hl.simple.pre and hl.simple.post I can define only one color, can't I?

In the sample solrconfig.xml there are several colors:



  


  


How can I tell Solr to use these colors instead of hl.simple.pre/post?



On 01/04/2015 20:58, Reitzel, Charles wrote:

If you want to query on the field ab, you'll probably need to add it to the qf 
parameter.

To control the highlighting markup, with the standard highlighter, use 
hl.simple.pre and hl.simple.post.

https://cwiki.apache.org/confluence/display/solr/Standard+Highlighter
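
As a rough illustration of the two parameters named above, here is a sketch that builds a select URL with hl.simple.pre and hl.simple.post set. The host, the handler path and the <b> markup are assumptions chosen for the example; the query and field names come from earlier in this thread.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host and core to your installation.
SELECT_URL = "http://localhost:8983/solr/select"


def highlight_query_url(query, fields, pre, post, base_url=SELECT_URL):
    """Build a select URL that asks the standard highlighter for custom markup."""
    params = {
        "q": query,
        "hl": "true",
        "hl.fl": ",".join(fields),   # fields to highlight
        "hl.simple.pre": pre,        # markup inserted before each match
        "hl.simple.post": post,      # markup inserted after each match
    }
    return base_url + "?" + urlencode(params)


url = highlight_query_url("aben:(plastic AND bicycle)", ["tien", "aben"], "<b>", "</b>")
print(url)
```

Because urlencode escapes the angle brackets, the markup survives the trip through the URL unchanged.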


-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, April 01, 2015 2:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

Dear Charles,

Thanks for your answer, please find below my answers.

OK, it works if I use "aben" as the field in my query, as you say in answer 1.
It doesn't work if I use "ab", maybe because the "ab" field is a copyField
for abfr, aben, abit, abpt.

Concerning 2., yes, you are right: it's not "and" but "AND".

I have this result:


   
 Bicycle  frame comprises holder, particularly for 
water bottle, where holder is connected
   
   
 #CMT# #/CMT# Thebicycle  frame (7) comprises a 
holder (1), particularly for a water bottle
 . The holder is connected with thebicycle  frame 
by a screw (5), where a mounting element has a compensation
   section which is made of an elastic material, particularly 
aplastic  material. The compensation section
   
 


So my last question is why I haven't  instead having colored ?
How can I tell to solr to use the colored ?

Thanks a lot,
Bruno


On 01/04/2015 17:15, Reitzel, Charles wrote:

Haven't used Solr 3.x in a long time.  But with 4.10.x, I haven't had any 
trouble with multiple terms.  I'd look at a few things.

1.  Do you have a typo in your query?  Shouldn't it be q=aben:(plastic and 
bicycle)?
  
2.  Try removing the word "and" from the query.  There may be some interaction with a stop word filter.  If you want a phrase query, wrap it in quotes.


3.  Also, be sure that the query and indexing analyzers for the aben field are 
compatible with each other.

-Original Message-
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Wednesday, April 01, 2015 7:05 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 3.6, Highlight and multi words?

Sorry to bother you by reviving this thread, but does nobody use, or have 
problems with, multi-term queries and highlighting?

regards,

On 29/03/2015 21:15, Bruno Mannina wrote:

Dear Solr User,

I am trying to use highlighting. It works well, but only if I have a
single keyword in my query.
If my request is plastic AND bicycle, then only plastic is highlighted.

my request is:

./select/?q=ab%3A%28plastic+and+bicycle%29&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=tien,aben&fl=pn&f.aben.hl.snippets=5


Could you please help me understand? I have read the docs and searched
Google without success, so I am posting here.

my result is:




   
 (EP2423092A1) #CMT# #/CMT# The bicycle pedal has a
pedal body (10) made fromplastic material
 , particularly for touring bike. #CMT#ADVANTAGE :
#/CMT# The bicycle pedal has a pedal body made
fromplastic
   
 
 
   
  betweenplastic  tapes 3 and 3
having two heat fusion layers, and the
twoplastic  tapes
3 and 3 are stuck
   
 
 
   
   elements. A connecting element is formed as a hinge,
a flexible foil or a flexibleplastic  part.
#CMT#USE
   
 
 
   
 A bicycle handlebar grip includes an inner fiber layer
and an outerplastic layer. Thus, the fiber
   handlebar grip, while theplastic
layer is soft and has an adjustable thickness to provide a
comfortable
   sensation to a user. In addition,
theplastic  layer includes a holding portion
coated on the outer surface
   layer to enhance the combination strength between the
fiber layer and theplastic  layer and to
enhance
   
 


Re: SolrCloud 5.0 cluster RAM requirements

2015-04-01 Thread Shawn Heisey

On 4/1/2015 3:22 PM, Ryan Steele wrote:
> Does a SolrCloud 5.0 cluster need enough RAM across the cluster to 
> load all the collections into RAM at all times?


"Need" is too strong a word.  If you want the best possible performance, 
then you would have enough RAM across the cluster to cache the entire 
index.  That's not required for a *functional* system, ignoring 
performance.  For an index on that scale, caching the entire index is 
usually an unrealistically expensive goal.
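
To make the trade-off concrete, here is a back-of-envelope sketch of how much of an index the OS disk cache could hold. Only the 1 TB figure comes from this thread; the node count, RAM per node and memory reserved for the Java heap and OS are made-up numbers, and replicas are ignored.

```python
def cache_coverage(index_bytes, nodes, ram_per_node, reserved_per_node):
    """Fraction of the index that the OS disk cache could hold, capped at 1.0."""
    # RAM left for caching on each node after the Java heap and the OS itself.
    cache_per_node = max(ram_per_node - reserved_per_node, 0)
    return min(nodes * cache_per_node / index_bytes, 1.0)


GB = 1024 ** 3
TB = 1024 ** 4

# Hypothetical cluster: 8 nodes, 64 GB RAM each, 16 GB reserved per node.
coverage = cache_coverage(1 * TB, nodes=8, ram_per_node=64 * GB,
                          reserved_per_node=16 * GB)
print(f"{coverage:.1%} of the index fits in the disk cache")
```

A figure well below 100% does not mean the system will not work; it only hints at how much disk I/O to expect, which is why prototyping is still needed.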


Are you the person who mentioned, on the #solr IRC channel, a terabyte-scale 
SolrCloud index hosted on Amazon?


Here's a general wiki page on performance problems with Solr that has a 
large amount of focus on RAM:


http://wiki.apache.org/solr/SolrPerformanceProblems

The unfortunate fact about this is that the only way you'll figure out 
what you actually need is to prototype, and prototyping on the scale of 
your index is difficult and expensive.


https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Thanks,
Shawn



Re: Unable to perform search query after changing uniqueKey

2015-04-01 Thread Zheng Lin Edwin Yeo
Hi Steve,

Thanks for the link and the information.
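
For reference, the encoding point can be sketched with the Python standard library. The query value below is a hypothetical example built on the "Item No" field name from this thread. As noted in the quoted replies, correct encoding does not guarantee that Solr handles field names containing spaces.

```python
from urllib.parse import quote


def encode_query_value(value: str) -> str:
    """Percent-encode a value so it is safe inside a URL query string."""
    # safe="" also encodes "/" which quote() leaves alone by default.
    return quote(value, safe="")


# Hypothetical query against a field whose name contains a space.
encoded = encode_query_value("Item No:123")
print(encoded)
```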

Regards,
Edwin


On 1 April 2015 at 23:17, Erick Erickson  wrote:

> Steve:
>
> Totally agree. Even if you _do_ correctly escape the URL though,
> there's no guarantee that Solr will "do the right thing" with field
> names with spaces. Plus endless chances for you to get it wrong when
> constructing the URL
>
> Best,
> Erick
>
> On Wed, Apr 1, 2015 at 1:01 AM, steve  wrote:
> > Gently walking into rough waters here, but if you use any API with GET,
> you're sending a URI which must be properly encoded. This has nothing to do
> with the programming language that generates key and store pairs on
> the browser or the one(s) used on the server. Lots and lots of good folks
> have tripped over this one. http://www.w3schools.com/tags/ref_urlencode.asp
> > Play hard, but play safe!
> >
> >> Date: Wed, 1 Apr 2015 13:58:55 +0800
> >> Subject: Re: Unable to perform search query after changing uniqueKey
> >> From: edwinye...@gmail.com
> >> To: solr-user@lucene.apache.org
> >>
> >> Thanks Erick.
> >>
> >> Yes, it works correctly if I do not use spaces in the field names,
> >> especially for the uniqueKey.
> >>
> >> Regards,
> >> Edwin
> >>
> >>
> >> On 31 March 2015 at 13:58, Erick Erickson 
> wrote:
> >>
> >> > I would never put spaces in my field names! Frankly I have no clue
> >> > what Solr does with that, but it can't be good. Solr explicitly
> >> > supports Java naming conventions, camel case, underscores and numbers.
> >> > Special symbols are frowned upon, I never use anything but upper case,
> >> > lower case and underscores. Actually, I don't use upper case either
> >> > but that's a personal preference. Other things might work, but only by
> >> > chance.
> >> >
> >> > Best,
> >> > Erick
> >> >
> >> > On Mon, Mar 30, 2015 at 8:59 PM, Zheng Lin Edwin Yeo
> >> >  wrote:
> >> > > Latest information that I've found for this is that the error only
> occurs
> >> > > for shard2.
> >> > >
> >> > > If I do a search for just shard1, those records that are assigned to
> >> > shard1
> >> > > will be able to be displayed. Only when I search for shard2 will the
> >> > > NullPointerException error occurs. Previously I was doing a search
> for
> >> > both
> >> > > shards.
> >> > >
> >> > > Is there any settings that I required to do for shard2 in order to
> solve
> >> > > this issue? Currently I have not made any changes to the shards
> since I
> >> > > created it using
> >> > >
> >> >
> http://localhost:8983/solr/admin/collections?action=CREATE&name=nps1&numShards=2&collection.configName=collection1
> >> > >
> >> > >
> >> > > Regards,
> >> > > Edwin
> >> > >
> >> > > On 31 March 2015 at 09:42, Zheng Lin Edwin Yeo <
> edwinye...@gmail.com>
> >> > wrote:
> >> > >
> >> > >> Hi Erick,
> >> > >>
> >> > >> I've changed the uniqueKey from id to Item No.
> >> > >>
> >> > >> Item No
> >> > >>
> >> > >>
> >> > >> Below are my definitions for both the id and Item No.
> >> > >>
> >> > >>  >> > >> required="false" multiValued="false" />
> >> > >>  stored="true"/>
> >> > >>
> >> > >> Regards,
> >> > >> Edwin
> >> > >>
> >> > >>
> >> > >> On 30 March 2015 at 23:05, Erick Erickson  >
> >> > wrote:
> >> > >>
> >> > >>> Well, let's see the definition of your ID field, 'cause I'm
> puzzled.
> >> > >>>
> >> > >>> It's definitely A Bad Thing to have it be any kind of tokenized
> field
> >> > >>> though, but that's a shot in the dark.
> >> > >>>
> >> > >>> Best,
> >> > >>> Erick
> >> > >>>
> >> > >>> On Mon, Mar 30, 2015 at 2:17 AM, Zheng Lin Edwin Yeo
> >> > >>>  wrote:
> >> > >>> > Hi Mostafa,
> >> > >>> >
> >> > >>> > Yes, I've defined all the fields in schema.xml. It is able to
> work on
> >> > >>> the
> >> > >>> > version without SolrCloud, but it is not working for the one
> with
> >> > >>> SolrCloud.
> >> > >>> > Both of them are using the same schema.xml.
> >> > >>> >
> >> > >>> > Regards,
> >> > >>> > Edwin
> >> > >>> >
> >> > >>> >
> >> > >>> >
> >> > >>> > On 30 March 2015 at 14:34, Mostafa Gomaa <
> mostafa.goma...@gmail.com>
> >> > >>> wrote:
> >> > >>> >
> >> > >>> >> Hi Zheng,
> >> > >>> >>
> >> > >>> >> It's possible that there's a problem with your schema.xml. Are
> all
> >> > >>> fields
> >> > >>> >> defined and have appropriate options enabled?
> >> > >>> >>
> >> > >>> >> Regards,
> >> > >>> >>
> >> > >>> >> Mostafa.
> >> > >>> >>
> >> > >>> >> On Mon, Mar 30, 2015 at 7:49 AM, Zheng Lin Edwin Yeo <
> >> > >>> edwinye...@gmail.com
> >> > >>> >> >
> >> > >>> >> wrote:
> >> > >>> >>
> >> > >>> >> > Hi Erick,
> >> > >>> >> >
> >> > >>> >> > I've tried that, and removed the data directory from both the
> >> > >>> shards. But
> >> > >>> >> > the same problem still occurs, so we probably can rule out
> the
> >> > >>> "memory"
> >> > >>> >> > issue.
> >> > >>> >> >
> >> > >>> >> > Regards,
> >> > >>> >> > Edwin
> >> > >>> >> >
> >> > >>> >> > On 30 March 2015 at 12:39, Erick Erickson <
> >> > erickerick...@gmail.com>
> >> > >>> >> wrote:
> >> > >>> >> >
> >> > >>>

How to recover a Shard

2015-04-01 Thread Matt Kuiper
Hello,

I have a SolrCloud (4.10.1) where for one of the shards, both replicas are in a 
"Recovery Failed" state per the Solr Admin Cloud page.  The logs contain the 
following type of entries for the two Solr nodes involved, including statements 
that it will retry.

Is there a way to recover from this state?

Maybe bring down one replica, and then somehow declare that the remaining 
replica is to be the leader?  Understand this would not be ideal as the new 
leader may be missing documents that were sent its way to be indexed while it 
was down, but would be better than having to rebuild the whole cloud.

Any tips or suggestions would be appreciated.

Thanks,
Matt

Solr node .65

Error while trying to recover. core=kla_collection_shard6_replica5:org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: kla_collection slice: shard6
 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
 at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
 at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)

Solr node .64

Error while trying to recover. core=kla_collection_shard6_replica2:org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: kla_collection slice: shard6
 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
 at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
 at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)



RE: How to recover a Shard

2015-04-01 Thread Matt Kuiper
Maybe I have been working too many long hours as I missed the obvious solution 
of bringing down/up one of the Solr nodes backing one of the replicas, and then 
the same for the second node.  This did the trick.

Since I brought this topic up, I will narrow the question a bit:  Would there 
be a way to recover without restarting the Solr node?  Basically to delete one 
replica and then somehow declare the other replica the leader and break it out 
of its recovery process?
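
This does not answer the question, but as a diagnostic aid, here is a small sketch that scans cluster state for shards with no active leader. It assumes the JSON layout returned by the Collections API CLUSTERSTATUS action as I recall it for 4.x (cluster, collections, shards, replicas, with "leader" and "state" keys on each replica); check it against your version. The sample data is hand-written, not real output.

```python
def shards_without_active_leader(cluster_status, collection):
    """Return the names of shards that have no replica flagged as an active leader."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    leaderless = []
    for shard_name, shard in shards.items():
        replicas = shard.get("replicas", {}).values()
        # A healthy shard has one replica with leader == "true" and state "active".
        if not any(r.get("leader") == "true" and r.get("state") == "active"
                   for r in replicas):
            leaderless.append(shard_name)
    return sorted(leaderless)


# Hand-written sample shaped like a CLUSTERSTATUS response; not real output.
sample_status = {
    "cluster": {
        "collections": {
            "kla_collection": {
                "shards": {
                    "shard5": {"replicas": {
                        "core_node1": {"state": "active", "leader": "true"},
                        "core_node2": {"state": "active"},
                    }},
                    "shard6": {"replicas": {
                        "core_node3": {"state": "recovery_failed"},
                        "core_node4": {"state": "recovery_failed"},
                    }},
                }
            }
        }
    }
}

print(shards_without_active_leader(sample_status, "kla_collection"))
```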

Thanks,
Matt


From: Matt Kuiper
Sent: Wednesday, April 01, 2015 8:43 PM
To: solr-user@lucene.apache.org
Subject: How to recover a Shard

Hello,

I have a SolrCloud (4.10.1) where for one of the shards, both replicas are in a 
"Recovery Failed" state per the Solr Admin Cloud page.  The logs contain the 
following type of entries for the two Solr nodes involved, including statements 
that it will retry.

Is there a way to recover from this state?

Maybe bring down one replica, and then somehow declare that the remaining 
replica is to be the leader?  Understand this would not be ideal as the new 
leader may be missing documents that were sent its way to be indexed while it 
was down, but would be better than having to rebuild the whole cloud.

Any tips or suggestions would be appreciated.

Thanks,
Matt

Solr node .65

Error while trying to recover. core=kla_collection_shard6_replica5:org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: kla_collection slice: shard6
 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
 at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
 at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)

Solr node .64

Error while trying to recover. core=kla_collection_shard6_replica2:org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: kla_collection slice: shard6
 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
 at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
 at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)



Re: Solr went on recovery multiple time.

2015-04-01 Thread William Bell
I would give it 32GB of RAM. And try to use SSD.

On Tue, Mar 31, 2015 at 12:50 AM, sthita  wrote:

> Hi Bill, My index size is around 48GB and contains around 8 million
> documents.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-went-on-recovery-multiple-time-tp4196249p4196504.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread avinash09
Alex,
It finally worked for me. I found how to pass the Ctrl-A separator: separator=%01&escape=\

Thanks for your help



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4197143.html
Sent from the Solr - User mailing list archive at Nabble.com.