Known memory leaks in 4.0?

2013-03-15 Thread Per Steffensen

Hi

We have a problem that seems to be due to memory leaks during search on 
Solr 4.0. I haven't dived into it yet, so I am certainly not sure, but I just 
wanted to ask upfront: does 4.0 contain any known memory leaks? And if so, 
have they been fixed?


Regards, Per Steffensen


Re: Known memory leaks in 4.0?

2013-03-15 Thread Bernd Fehling
How do you know that it is Solr and nothing else?

Have you checked with MemoryAnalyzer?
http://wiki.eclipse.org/index.php/MemoryAnalyzer

As we are always using the most recent released version we
have never seen any memory leaks with Solr so far.

Regards
Bernd

Am 15.03.2013 08:21, schrieb Per Steffensen:
> Hi
> 
> We have a problem that seems to be due to memory leaks during search on Solr 
> 4.0. I haven't dived into it yet, so I am certainly not sure, but I just
> wanted to ask upfront: does 4.0 contain any known memory leaks? And if so, have 
> they been fixed?
> 
> Regards, Per Steffensen



Re: Out of Memory doing a query Solr 4.2

2013-03-15 Thread Bernd Fehling
We are currently using
Oracle Corporation Java HotSpot(TM) 64-Bit Server VM (1.7.0_07 23.3-b01)

Runs excellently, and no memory parameter tweaking is necessary.
Give it enough physical and JVM memory, use "-XX:+UseG1GC", and that's it.

Also no "saw tooth" pattern and no GC timeouts from the JVM as with earlier versions.

Regards
Bernd


Am 15.03.2013 09:09, schrieb raulgrande83:
> Why? Could this be the cause of the problem? This was working ok for Solr
> 3.5.
> 
> Could you recommend me one ?
> 
> Thanks.
> 
> 
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Out-of-Memory-doing-a-query-Solr-4-2-tp4047394p4047621.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 


Re: Advice: solrCloud + DIH

2013-03-15 Thread roySolr
Thanks for the support so far!

I was running the dataimport on a replica! Now I start it on the leader and
it runs at 590 docs/s. I think all docs were going to another node and then
coming back.

Is there a way to get the leader? If there is, I can detect the leader with
a script and start the DIH every night on the right server.

Roy





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Advice-solrCloud-DIH-tp4047339p4047627.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr 3.6.1 and facet query regular expression

2013-03-15 Thread xpow
Hi, I have a question regarding how to filter a facet query on a particular field using a
regular expression. I have data like below:

{
  "responseHeader":{
"status":0,
"QTime":10,
"params":{
  "hl.fragsize":"500",
  "facet":"true",
  "sort":"score desc",
  "indent":"on",
  "facet.limit":"-1",
  "hl.fl":"exactText",
  "wt":"json",
  "hl":"true",
  "omitNorms":"true",
  "rows":"3",
  "fl":"title,id,mimetype",
  "start":"0",
  "hl.maxAlternateFieldLength":"500",
  "q":"exactText:(*:*)",
  "hl.alternateField":"exactText",
  "facet.field":["mimetype",
"date"]}},
  "response":{"numFound":3,"start":0,"docs":[
  {
"mimetype":"application/xhtml+xml",
"title":"title 1",
"id":"http://www.abbc.go.id/1"},
  {
"mimetype":"application/xhtml+xml",
"title":"title 2",
"id":"http://www.abbc.go.id/2"},
  {
"id":"http://www.defg.go.id/3";,
"title":"title 3",
"mimetype":"application/xhtml+xml"}]
  },
  "facet_counts":{
"facet_queries":{},
"facet_fields":{
  "mimetype":[
"application/pdf",5,
"application/xhtml+xml",48,
"image/jpeg",7,
"text/html",15],
  "date":[
"2011-07-13",1,
"2012-11-09",1,
"2012-11-12",1,
"2013-02-04",2,
"2013-03-04",1,
"2013-03-05",3,
"2013-03-11",3,
"2013-03-15",63]},
"facet_dates":{},
"facet_ranges":{}},
  "highlighting":{
"http://www.abbc.go.id/1":{
  "exactText":["content 1"]},
"http://www.abbc.go.id/2":{
  "exactText":["content 2"]},
"http://www.defg.go.id/3":{
  "exactText":["content 3"]}}}

My end goal would be to query based on the content of exactText but within a
subset of the id field, for example:
- the keyword is "content"
- the regular expression that is applied to id is "http://www.abbc.go.id/*"
- there will be 2 results:
==> 1) http://www.abbc.go.id/1
==> 2) http://www.abbc.go.id/2
- the query would be q=exactText:(content)&fq=id:"http://www.abbc.go.id/*" (see the note below)
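
(Note: if a plain prefix wildcard is enough, I am guessing the escaped form
for the standard query parser would be something like
q=exactText:(content)&fq=id:http\://www.abbc.go.id/*
with id as an indexed string field, but I am not sure about the syntax.)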

So my question: is it possible to query this using Solr 3.6.1?

Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-3-6-1-and-facet-query-regular-expression-tp4047628.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-15 Thread Chantal Ackermann
Hi,

@Lance - thanks, it's a pleasure to give something back to the community, even 
if it is comparatively small. :-)

@Paul - it's definitely not 15 min but rather 2 min. Actually, the testing part 
of this setup is very regular compared to other Maven projects. The copying of 
the WAR file and repackaging is not that time consuming. (This is still Maven, 
widely used and proven; it wouldn't be if it were not practical.)


Cheers,
Chantal

how to get term vector information of specific word/position in field

2013-03-15 Thread vrparekh
Hello,

Currently, when we set qt=tvrh&tv.all=true, it returns all the words
that are in the text of the field.

Is there any way to get term vector information for a specific word
only, i.e. I pass the word and it returns the term position and
frequency for that word only?

And also, can I pass a position range, e.g. startPosition=5 and endPosition=10,
so that it returns the terms, positions, and frequencies of the words that
occur between the start and end position?
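
For reference, the closest I can get with the stock parameters (URL, core,
and field names illustrative) returns all terms of a field with positions
and frequencies, which then has to be filtered client-side:

http://localhost:8983/solr/select?q=id:doc1&qt=tvrh&tv=true&tv.fl=text&tv.tf=true&tv.positions=true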





--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-get-term-vector-information-of-sepcific-word-position-in-field-tp4047637.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR - Dynamic Schema Design

2013-03-15 Thread Upayavira
The purpose of the schema is to associate a type with a field name.
That's it.

A dynamic field associates a type with a range of names.

An empty field in a Lucene index doesn't take any space, so having 450
fields doesn't in itself cause a problem. The point at which you may
have a problem is when you want to search across those dynamic fields,
as search across multiple fields in Solr/Lucene is not as efficient as
searching across a single field.

If in your application you will know the name of the field you want to
search against at query time, then your scenario seems quite reasonable.
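
For example, a typical dynamic field declaration in schema.xml looks
something like this (names and types illustrative):

<dynamicField name="*_s"  type="string" indexed="true" stored="true"/>
<dynamicField name="*_dt" type="date"   indexed="true" stored="true"/>

A field sent in as manufacturer_s then picks up the string type automatically.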

Upayavira 

On Fri, Mar 15, 2013, at 09:51 AM, kobe.free.wo...@gmail.com wrote:
> Hello All,
> 
> Scenario:
> 
> We trying to define the schema structure for our application search
> feature,
> based on SOLR search server. In our scenario the total number of fields
> is
> 450 (quiet huge) and we will be using the features like faceting, sorting
> etc. This field set will be dynamic (not permanent) and will be modified
> (like removing/ adding some fields) on regular basis, based on the
> business
> needs. We are planning to use the concept of 'Dynamic Fields" for most of
> the fields from the original fields set of 450.
> 
> Queries:
> 
> 1. What are the pros/ cons of using dynamic fields?
> 2. What is the best way to achieve dynamic schema, is the above mentioned
> method proper?
> 3. What will be the best approach to update the schema and re-index the
> data?
> 4. Is there a major index size difference when using dynamic fields?
> 
> 
> 
> 
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SOLR-Dynamic-Schema-Design-tp4047638.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Storage solutions recommendations

2013-03-15 Thread Christian von Wendt-Jensen
Hi,

I want to find out what your experiences are with different storage setups.

We tried running a master/slave setup on the SAN but quickly realized that the 
master did not index fast enough. We didn't run with soft commit though – maybe 
that would change the conclusion?
The slaves seemed to run OK with data on the SAN, but as soon as replication 
was enabled, it died. Replication took hours and drained resources preventing 
good performance on the replicas. The cache warmup time took forever.

What does YOUR setup look like, and what storage solutions would YOU recommend? 
SAN? Local disk? Local SSD? Soft commit?





Med venlig hilsen / Best Regards

Christian von Wendt-Jensen
IT Team Lead, Customer Solutions

Infopaq International A/S
Kgs. Nytorv 22
DK-1050 København K

Phone +45 36 99 00 00
Mobile +45 31 17 10 07
Email  
christian.sonne.jen...@infopaq.com
Webwww.infopaq.com










Re: SOLR Num Docs vs NumFound

2013-03-15 Thread Santoash Rajaram
I don't have an answer, but I have seen this before too. I assumed it was an 
issue with the admin UI. In my case the number returned by the query looked 
closer to the truth than the one in the UI. I even tried a hard commit and 
optimize via the admin UI. It didn't help.

If you want to try hard commits, they can be done either via configuration in 
solrconfig.xml (specify a frequency) or solrj API or /update URL. All of these 
are explained here:

http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22
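
For example (a sketch; host, core, and values illustrative):

curl "http://localhost:8983/solr/update?commit=true"

or, in solrconfig.xml:

<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>true</openSearcher>
</autoCommit>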

-santoash

On Mar 14, 2013, at 9:48 PM, Nathan Findley  wrote:

> On my solr 4 setup a query returns a higher "NumFound" value during a *:* 
> query than the "Num Docs" value reported on the statistics page of 
> collection1. Why is that? My data is split across 3 data import handlers 
> where each handler has the same type of data but the ids are guaranteed to be 
> different.
> 
> Are some of my documents not hard commited? If so, how do I hard commit. 
> Otherwise, why are these numbers different?
> 
> -- 
> CTO
> Zenlok株式会社
> 


Query on Solr Data-db-config.xml

2013-03-15 Thread Ravi_Mandala
Hi,

My web service has all the DB-related information (like username, password,
entity names, fields, etc.). I want to pass this data to the Solr
DataImportHandler to do the importing (full-import or delta-import).

Is it possible to pass the DB information in and do the data import from
Solr? (I want the db-data-config.xml data to be dynamic.)

If yes, please let me know the solution.

Thanks in Advance.
Ravi



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Query-on-Solr-Data-db-config-xml-tp4047599.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-15 Thread Rafał Radecki
I use HTTP GET on the /solr/replication?command=indexversion URLs to get the
index versions on the master and slave. Replication works fine, but the
index versions from /solr/replication?command=indexversion differ.

Best regards,
Rafal.

2013/3/14 Mark Miller :
> What calls are you using to get the versions? Or is it the admin UI?
>
> Also can you add any details about your setup - if this is a problem, we need 
> to duplicate it in one of our unit tests.
>
> Also, is it affecting proper replication in any way that you can tell.
>
> - Mark
>
> On Mar 14, 2013, at 11:12 AM, richardg  wrote:
>
>> I believe this is the same issue as described, I'm running 4.2 and as you can
>> see my slave is a couple versions ahead of the master (all three slaves show
>> the same behavior).  This was never the case until I upgraded from 4.0 to
>> 4.2.
>>
>> Master:
>>   indexVersion: 1363272681951
>>   generation: 93
>>   size: 1,022.31 MB
>> Slave:
>>   indexVersion: 1363273274085
>>   generation: 95
>>   size: 1,022.31 MB
>>
>>
>>
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/Solr-4-1-monitoring-with-solr-replication-command-details-indexVersion-tp4047329p4047380.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Out of Memory doing a query Solr 4.2

2013-03-15 Thread Robert Muir
On Fri, Mar 15, 2013 at 6:46 AM, raulgrande83  wrote:
> Thank you for your help. I'm afraid it won't be so easy to change de jvm
> version, because it is required at the moment.
>
> It seems that Solr 4.2 supports Java 1.6 at least. Is that correct?
>
> Could you find any clue of what is happening in the attached traces? It
> would be great to know why it is happening now, because it was working for
> Solr 3.5.

It's probably not an OOM at all. Instead, it's more likely that the IBM JVM is
miscompiling our code and producing large integers, as it does quite
often. For example, we recently had to disable testing on it completely
for this reason. If someone were to report a JIRA issue that
mentioned IBM, I'd make the same comment there, and in general not take
it seriously at all, given the kinds of bugs I've seen from that JVM.

The fact that the IBM JVM didn't miscompile 3.5's code is irrelevant.


Re: Known memory leaks in 4.0?

2013-03-15 Thread Per Steffensen

On 3/15/13 9:13 AM, Bernd Fehling wrote:

> How do you know that it is Solr and nothing else?

It is memory usage inside the Jetty/Solr JVM we monitor, so by
definition it is Solr (or Jetty, but I couldn't imagine that). The lower
border (after full GC) of memory usage is increasing.

> Have you checked with MemoryAnalyzer?
> http://wiki.eclipse.org/index.php/MemoryAnalyzer

Nope, not that particular tool, but other tools, and no in-depth
analysis yet. We might use MemoryAnalyzer though. But we will manage
diving into the problem ourselves; the question was more about whether
or not there were any known issues already.

> As we are always using the most recent released version we
> have never seen any memory leaks with Solr so far.

We haven't seen any either, but we are searching across more and more
documents (currently about 2 billion) and it might be
#docs/#shards/#replicas related. We are not that concerned if Solr just
needs, or would like (if available), to use more and more memory the more
docs it potentially has to visit to calculate a response for a request;
such a property is kind of expected. The concern is more about actual
leaks, e.g. that the lower border (after full GC) does not go down if you
stop all searching.

We have just decided to dive into it for a few days in order to
understand what actually happens.

Regards, Per Steffensen


Re: Known memory leaks in 4.0?

2013-03-15 Thread Bernd Fehling


Am 15.03.2013 12:24, schrieb Per Steffensen:
> On 3/15/13 9:13 AM, Bernd Fehling wrote:
>> How do you know that it is Solr and nothing else?
> It is memory usage inside the Jetty/Solr JVM we monitor, so by definition it 
> is Solr (or Jetty, but I couldnt imagine). The lower border (after
> full GC) of memory usage is increasing.
>>
>> Have you check with MemoryAnalyzer?
>> http://wiki.eclipse.org/index.php/MemoryAnalyzer
> Nope, not that particular tool, but other tools, but no in-depth analysis 
> yet. We might use MemoryAnalyzer though. But we will manage diving
> into the problem ourselves, the question was more about whether or not there 
> was any known issues already.
>>
>> As we are always using the most recent released version we
>> have never seen any memory leaks with Solr so far.
> We havnt seen any neither, but we are searching across more and more 
> documents (currently about 2 billion) and it might be
> #docs/#shards/#replica related. We are not that concerned if Solr just needs 
> or would like (if available) to use more and more memory the more
> docs it potentially have to visit to calculate a response for a request - 
> such a property is kinda expected. Concern is more on actual leaks -
> e.g. that the lower border (after full GC) does not go down if you stop all 
> searching.

Then this will be a good starting point.
- stop all searching
- force a full GC
- make heap dump from JVM
- use MemoryAnalyzer to inspect the heap dump and see what is left over
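
For the dump itself, something like this works on a HotSpot-style JVM
(the PID is illustrative):

jmap -dump:live,format=b,file=/tmp/solr-heap.hprof 12345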

Regards
Bernd


Re: Query on Solr Data-db-config.xml

2013-03-15 Thread Alexandre Rafalovitch
You can either have those values stored in variables (${varname}) and have
those configured somewhere else in Solr (there are several options).

Or, if you have a data source in your servlet contains, you can use
jndiName of that instead of configuring it in Solr itself.
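
For example (a sketch; the driver, parameter names, and JNDI name are
illustrative):

<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
    url="${dataimporter.request.dburl}"
    user="${dataimporter.request.dbuser}"
    password="${dataimporter.request.dbpassword}"/>

(the values then come in as request parameters on the full-import call)

or, with a container-managed data source:

<dataSource type="JdbcDataSource" jndiName="java:comp/env/jdbc/mydb"/>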

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Fri, Mar 15, 2013 at 1:56 AM, Ravi_Mandala  wrote:

> Hi,
>
> My web service has the all DB related information(like
> username,password,entity names,fields etc).I want to pass this data to Solr
> dataimport handler to do
>
> the importing(fullimport or deltaimport).
>
> Is it possible, passing the DB information and doing the data import from
> the solr?.(i want db-data-config.xml data should be dynamic)
>
>
> If yes, let me know the solution .
>
> Thanks in Advance.
> Ravi
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Query-on-Solr-Data-db-config-xml-tp4047599.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: SOLR - Dynamic Schema Design

2013-03-15 Thread Alexandre Rafalovitch
The main issue with dynamic fields is that because you have one definition,
you can have only one treatment.

So, all of your fields (covered by one dynField definition) will have to be
of the same type. They will all have to be single- or multi- valued. They
will all have to be stored or not. And so on.

If that's not a problem, you should be ok.

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Fri, Mar 15, 2013 at 6:14 AM, Upayavira  wrote:

> The purpose of the schema is to associate a type with a field name.
> That's it.
>
> A dynamic field associates a type with a range of names.
>
> An empty field in a Lucene index doesn't take any space, so having 450
> fields doesn't in itself cause a problem. The point at which you may
> have a problem is when you want to search across those dynamic fields,
> as search across multiple fields in Solr/Lucene is not as efficient as
> searching across a single field.
>
> If in your application you will know the name of the field you want to
> search against at query time, then your scenario seems quite reasonable.
>
> Upayavira
>
> On Fri, Mar 15, 2013, at 09:51 AM, kobe.free.wo...@gmail.com wrote:
> > Hello All,
> >
> > Scenario:
> >
> > We trying to define the schema structure for our application search
> > feature,
> > based on SOLR search server. In our scenario the total number of fields
> > is
> > 450 (quiet huge) and we will be using the features like faceting, sorting
> > etc. This field set will be dynamic (not permanent) and will be modified
> > (like removing/ adding some fields) on regular basis, based on the
> > business
> > needs. We are planning to use the concept of 'Dynamic Fields" for most of
> > the fields from the original fields set of 450.
> >
> > Queries:
> >
> > 1. What are the pros/ cons of using dynamic fields?
> > 2. What is the best way to achieve dynamic schema, is the above mentioned
> > method proper?
> > 3. What will be the best approach to update the schema and re-index the
> > data?
> > 4. Is there a major index size difference when using dynamic fields?
> >
> >
> >
> >
> >
> > --
> > View this message in context:
> >
> http://lucene.472066.n3.nabble.com/SOLR-Dynamic-Schema-Design-tp4047638.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Advice: solrCloud + DIH

2013-03-15 Thread rulinma
Yes, you can find that out; you need to understand how shard partitioning
and leader election work.
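
For example, in Solr 4.x you can read clusterstate.json and look for the
replica flagged as leader (a sketch; host and paths illustrative):

curl "http://solr01:8983/solr/zookeeper?detail=true&path=/clusterstate.json"

In the JSON, the leader replica of each shard carries "leader":"true". The
same file can also be read with ZooKeeper's zkCli.sh.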



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Advice-solrCloud-DIH-tp4047339p4047673.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Storage solutions recommendations

2013-03-15 Thread Otis Gospodnetic
Hi,

Most of our clients/customers use local storage. Some use SSDs and some
SANs, and those with extra cash use SANs with SSDs.

But what you wrote needs more detail, because sources of poor performance
can come from many places, and there are a lot of very different setups out
there that work in one situation but not in another.

Otis
Solr & ElasticSearch Support
http://sematext.com/
On Mar 15, 2013 6:37 AM, "Christian von Wendt-Jensen" <
christian.vonwendt-jen...@infopaq.com> wrote:

> Hi,
>
> I want to find what your experiences are with different storage setups.
>
> We tried running a master/slave setup on the SAN but quickly realized that
> the master did not index fast enough. We didn't run with soft commit though
> – maybe that would change the conclusion?
> The slaves seemed to run OK with data on the SAN, but as soon as
> replication was enabled, it died. Replication took hours and drained
> resources preventing good performance on the replicas. The cache warmup
> time took forever.
>
> How does YOUR setup look like, and what storage solutions could YOU
> recommend? SAN? Local disc? Local SSD? Softcommit?
>
>
>
>
>
> Med venlig hilsen / Best Regards
>
> Christian von Wendt-Jensen
> IT Team Lead, Customer Solutions
>
> Infopaq International A/S
> Kgs. Nytorv 22
> DK-1050 København K
>
> Phone +45 36 99 00 00
> Mobile +45 31 17 10 07
> Email  christian.sonne.jen...@infopaq.com christian.sonne.jen...@infopaq.com>
> Webwww.infopaq.com
>
>
>
>
>
>
>
>
>


Re: How should I configure Solr to support multi-word synonyms?

2013-03-15 Thread rulinma
Mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-should-I-configure-Solr-to-support-multi-word-synonyms-tp4044578p4047678.html
Sent from the Solr - User mailing list archive at Nabble.com.


NRT - softCommit

2013-03-15 Thread Arkadi Colson
NRT seems not to work in my case when doing a soft commit every 2 
seconds. My config looks like this:


<autoCommit>
   <maxDocs>1</maxDocs>
   <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
   <maxTime>2000</maxTime>
</autoSoftCommit>


No result from Solr when searching for a word in the file. When doing a 
hard commit manually, it works.

Did I forget to configure something?


Logging:

Mar 15, 2013 11:56:20 AM org.apache.solr.core.SolrCore execute
INFO: [intradesk] webapp=/solr path=/update 
params={distrib.from=http://solr03:8983/solr/intradesk/&update.distrib=FROMLEADER&wt=javabin&version=2} 
status=0 QTime=74

Mar 15, 2013 11:56:35 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start 
commit{flags=0,_version_=0,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false}

Mar 15, 2013 11:56:36 AM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2
commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/solr/intradesk/data/index 
lockFactory=org.apache.lucene.store.NativeFSLockFactory@6c0081fb; 
maxCacheMB=48.0 
maxMergeSizeMB=4.0),segFN=segments_7v,generation=283,filenames=[_e7.fnm, 
segments_7v, _ec_Lucene40_0.tim, _eg.fdt, _ee_Lucene40_0.frq, 
_ef_Lucene40_0.frq, _e9_Lucene40_0.frq, _eg.fnm, _eg.fdx, 
_ec_Lucene40_0.tip, _dz_Lucene40_0.frq, _e9.fnm, _e0_Lucene40_0.tip, 
_e0_Lucene40_0.tim, _e4.fdt, _e3.si, _ec.fnm, _eh_Lucene40_0.prx, 
_ef.si, _e8_Lucene40_0.frq, _e7_Lucene40_0.frq, _e4.fdx, 
_e8_Lucene40_0.prx, _dz.fdx, _ef_Lucene40_0.tip, _e2.fnm, 
_ef_Lucene40_0.tim, _ec_Lucene40_0.frq, _e8.si, _ef_Lucene40_0.prx, 
_eh.si, _e3_Lucene40_0.prx, _eh_Lucene40_0.tip, _e2_Lucene40_0.tim, 
_e2_Lucene40_0.tip, _ec.fdt, _ec.fdx, _ed_Lucene40_0.frq, 
_e6_Lucene40_0.frq, _e6_Lucene40_0.tim, _eh_Lucene40_0.tim, 
_e6_Lucene40_0.tip, _e9_Lucene40_0.prx, _ee.fnm, _e8_nrm.cfs, _e3.fdx, 
_ea_nrm.cfe, _eg_Lucene40_0.tip, _e3.fdt, _e9.fdt, _eg_Lucene40_0.tim, 
_e8_nrm.cfe, _e9.fdx, _e5_Lucene40_0.frq, _ea_Lucene40_0.tim, _dz.fnm, 
_e5_nrm.cfs, _eh.fnm, _ed.si, _e4.fnm, _e5_nrm.cfe, _e4_Lucene40_0.tip, 
_e2_Lucene40_0.frq, _e0.si, _ec.si, _e4_Lucene40_0.tim, _ee.fdt, 
_eg_Lucene40_0.frq, _ee.si, _dz_Lucene40_0.tim, _e7.fdt, _dz.fdt, 
_ea_Lucene40_0.prx, _e2.si, _dz_Lucene40_0.tip, _e7.fdx, 
_e5_Lucene40_0.prx, _e6.si, _ee.fdx, _eg_Lucene40_0.prx, _e5.si, _eg.si, 
_ea_Lucene40_0.tip, _e2.fdt, _eh.fdt, _dz_Lucene40_0.prx, _eb.fdx, 
_eh.fdx, _eb.fdt, _e7.si, _ea_nrm.cfs, _e2.fdx, _ed_Lucene40_0.prx, 
_ee_Lucene40_0.tim, _e5_Lucene40_0.tip, _dz_nrm.cfe, _e7_nrm.cfe, 
_ee_Lucene40_0.tip, _e5_Lucene40_0.tim, _e6_nrm.cfs, _e8_Lucene40_0.tip, 
_e7_Lucene40_0.prx, _e8_Lucene40_0.tim, _dz.si, _e6.fnm, 
_eh_Lucene40_0.frq, _ea.fnm, _e6_nrm.cfe, _e5.fnm, _e4_Lucene40_0.frq, 
_e7_nrm.cfs, _e7_Lucene40_0.tip, _e8.fdx, _eb.fnm, _e8.fdt, 
_e6_Lucene40_0.prx, _e7_Lucene40_0.tim, _ea.fdt, _dz_nrm.cfs, _ef.fdx, 
_ea.fdx, _ef.fdt, _e3_Lucene40_0.frq, _e0.fdt, _e4_Lucene40_0.prx, 
_ef.fnm, _ea_Lucene40_0.frq, _e0.fdx, _e9_Lucene40_0.tim, _ea.si, 
_e9_Lucene40_0.tip, _eb.si, _e2_Lucene40_0.prx, _e9.si, 
_eb_Lucene40_0.tim, _eb_Lucene40_0.tip, _e5.fdt, _ec_Lucene40_0.prx, 
_ed.fnm, _e3_Lucene40_0.tim, _e3_Lucene40_0.tip, _e5.fdx, _e8.fnm, 
_e0_Lucene40_0.prx, _ee_Lucene40_0.prx, _ed_Lucene40_0.tim, 
_e0_Lucene40_0.frq, _e6.fdt, _ed_Lucene40_0.tip, _e6.fdx, 
_eb_Lucene40_0.frq, _e3.fnm, _ed.fdt, _ed.fdx, _e4.si, 
_eb_Lucene40_0.prx, _e0.fnm]
commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/solr/intradesk/data/index 
lockFactory=org.apache.lucene.store.NativeFSLockFactory@6c0081fb; 
maxCacheMB=48.0 
maxMergeSizeMB=4.0),segFN=segments_7w,generation=284,filenames=[_e7.fnm, 
segments_7w, _ei_Lucene40_0.frq, _ec_Lucene40_0.tim, _eg.fdt, 
_ee_Lucene40_0.frq, _ef_Lucene40_0.frq, _e9_Lucene40_0.frq, _eg.fnm, 
_eg.fdx, _ec_Lucene40_0.tip, _dz_Lucene40_0.frq, _e9.fnm, 
_e0_Lucene40_0.tip, _e0_Lucene40_0.tim, _e4.fdt, _e3.si, _ec.fnm, 
_eh_Lucene40_0.prx, _ef.si, _ei.si, _e8_Lucene40_0.frq, 
_e7_Lucene40_0.frq, _e4.fdx, _e8_Lucene40_0.prx, _dz.fdx, 
_ef_Lucene40_0.tip, _e2.fnm, _ef_Lucene40_0.tim, _ec_Lucene40_0.frq, 
_e8.si, _ef_Lucene40_0.prx, _eh.si, _e3_Lucene40_0.prx, 
_eh_Lucene40_0.tip, _e2_Lucene40_0.tim, _e2_Lucene40_0.tip, _ec.fdt, 
_ec.fdx, _ed_Lucene40_0.frq, _e6_Lucene40_0.frq, _e6_Lucene40_0.tim, 
_eh_Lucene40_0.tim, _e6_Lucene40_0.tip, _e9_Lucene40_0.prx, _ee.fnm, 
_e8_nrm.cfs, _e3.fdx, _ea_nrm.cfe, _eg_Lucene40_0.tip, _e3.fdt, _e9.fdt, 
_eg_Lucene40_0.tim, _e8_nrm.cfe, _e9.fdx, _e5_Lucene40_0.frq, 
_ea_Lucene40_0.tim, _dz.fnm, _e5_nrm.cfs, _eh.fnm, _ed.si, _e4.fnm, 
_e5_nrm.cfe, _e4_Lucene40_0.tip, _ei.fdx, _e2_Lucene40_0.frq, _e0.si, 
_ec.si, _e4_Lucene40_0.tim, _ei.fdt, _ee.fdt, _eg_Lucene40_0.frq, 
_ee.si, _dz_Lucene40_0.tim, _e7.fdt, _dz.fdt, _ea_Lucene40_0.prx, 
_e2.si, _dz_Lucene40_0.tip, _e7.fdx, _e5_Lucene40_0.prx, _e6.si, 
_ee.fdx, _eg_Lucene40_0.prx, _e5.si, _eg.si, _ea_Lucene40_0.tip, 
_e2.fdt, _eh.fdt, _dz_Lucene40_0.prx, _eb.fdx, _eh.fdx, _eb.fdt, _e7.si, 
_ea_nrm.cfs, _e2.fdx, _ei.fnm, _ed_Lucene40_0.prx, _e

Re: How can I limit my Solr search to an arbitrary set of 100,000 documents?

2013-03-15 Thread Julián Arocena
Hi Andy,

Maybe you can look at the Scotas products, www.scotas.com/products. They
combine data synchronization in near real time between Oracle and Solr,
and you can also consume the data at SQL query time with new operators
and functions, or go directly to Solr.
Bye!

2013/3/12 Andy Lester 

>
> On Mar 12, 2013, at 1:21 PM, Chris Hostetter 
> wrote:
>
> > How are these sets of flrids created/defined?  (undertsanding the source
> > of the filter information may help inspire alternative suggestsions, ie:
> > XY Problem)
>
>
> It sounds like you're looking for patterns that could potentially
> provide groupings for these FLRIDs.  We've been down that road, too, but
> we don't see how there could be one.  The arbitrariness comes from the fact
> that the lists are maintained by users and can be changed at any time.
>
> Each book in the database has an FLRID.  Each user can create lists of
> books.  These lists can be modified at any time.
>
> That looks like this in Oracle:   USER   1->M   LIST   1->M   LISTDETAIL
>  M <- 1  TITLE
>
> The sizes we're talking about:  tens of thousands of users; hundreds of
> thousands of lists, with up to 100,000 items per list; tens of millions of
> listdetail.
>
> We have a feature that lets the user do a keyword search on books within
> his list.  We can't update the Solr record to keep track of which lists it
> appears on because there may be, say, 20 people every second updating the
> contents of their lists, and those 20 people expect that their next
> search-within-a-list will have those new results.
>
> Andy
>
> --
> Andy Lester => a...@petdance.com => www.petdance.com => AIM:petdance
>
>


Re: Solr indexing binary files

2013-03-15 Thread Luis
Hi Jack, thanks a lot for your reply.  I did that (added a <field ...
type="text" multiValued="true" /> definition).  However, when I run Solr it gives me a
bunch of errors.  It actually displays the content of my files on my command
line and shows some logs like this:

org.apache.solr.common.SolrException: Document is missing mandatory
uniqueKey field: id
at
org.apache.solr.update.AddUpdateCommand.getIndexedId(AddUpdateCommand.java:88)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:468)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:350)
at
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
at
org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:70)
at
org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:234)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:500)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:404)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:319)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:227)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:422)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:487)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:468)
15-Mar-2013 9:56:29 AM org.apache.solr.handler.dataimport.DocBuilder execute

I do have a uniqueKey though.  Any ideas what the problem might be?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-indexing-binary-files-tp4047470p4047690.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Storage solutions recommendations

2013-03-15 Thread Julián Arocena
Hi,

One of our clients provides, for an important Argentine telco, a complete
system to integrate and organize large volumes of data with information
about customers, transactions, security risk, and potential fraud, among
other activities, all in real time. For text searching they use Scotas OLS
(www.scotas.com/products), which is a native integration of Solr into the
Oracle DBMS.

OLS replaces the Lucene inverted index storage, which by default is stored
on the OS file system, with Oracle Secure File BLOBs, resulting in highly
scalable, secure, and transactional storage. The advantages of this
approach, in summary, are:

   - Transactional storage, a parallel process can do insert or optimize
   operations and if they fail simply do a rollback and nothing happens to
   other concurrent sessions.
   - Compression and encryption using Secure File functionality, applicable
   to Lucene Inverted Index storage and Solr configuration files.
   - Shared storage for Lucene Inverted Index, on RAC installations several
   processes across nodes can use the storage transparently.

I hope this information can be useful for you.

Bye,

Julian

2013/3/15 Otis Gospodnetic 

> Hi,
>
> Most of our clients/customers use local storage. Some use SSDs and some
> SANs, and those with extra cash use SANs with SSDs.
>
> But what you wrote needs more detail because sources of poor performance
> can come from many places and there are a lot or very different setups out
> there that work in one situation but not in another.
>
> Otis
> Solr & ElasticSearch Support
> http://sematext.com/
> On Mar 15, 2013 6:37 AM, "Christian von Wendt-Jensen" <
> christian.vonwendt-jen...@infopaq.com> wrote:
>
> > Hi,
> >
> > I want to find what your experiences are with different storage setups.
> >
> > We tried running a master/slave setup on the SAN but quickly realized
> that
> > the master did not index fast enough. We didn't run with soft commit
> though
> > – maybe that would change the conclusion?
> > The slaves seemed to run OK with data on the SAN, but as soon as
> > replication was enabled, it died. Replication took hours and drained
> > resources preventing good performance on the replicas. The cache warmup
> > time took forever.
> >
> > How does YOUR setup look like, and what storage solutions could YOU
> > recommend? SAN? Local disc? Local SSD? Softcommit?
> >
> >
> >
> >
> >
> > Med venlig hilsen / Best Regards
> >
> > Christian von Wendt-Jensen
> > IT Team Lead, Customer Solutions
> >
> > Infopaq International A/S
> > Kgs. Nytorv 22
> > DK-1050 København K
> >
> > Phone +45 36 99 00 00
> > Mobile +45 31 17 10 07
> > Email  christian.sonne.jen...@infopaq.com > christian.sonne.jen...@infopaq.com>
> > Webwww.infopaq.com
> >
> >
> >
> >
> >
> >
> >
> >
> >
>


Re: Solr indexing binary files

2013-03-15 Thread Gora Mohanty
On 15 March 2013 19:28, Luis  wrote:
> Hi Jack, thanks a lot for your reply.  I did that (<field ... type="text" multiValued="true" />).  However, when I run Solr it gives me a
> bunch of errors.  It actually displays the content of my files on my command
> line and shows some logs like this:
>
> org.apache.solr.common.SolrException: Document is missing mandatory
> uniqueKey field: id
[...]
> I do have an uniqueKey though.  Any ideas what the problem might be?

Please share your schema.xml, and details on the exact
command used to index the PDFs. It is possible that you
are not supplying the literal.id=XXX param that is
needed to provide a uniqueKey for the document. Please
see the "Getting Started with the Solr Example" section at
http://wiki.apache.org/solr/ExtractingRequestHandler
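
For example, a direct extract call that supplies the uniqueKey looks like
this (a sketch; URL, core, and file name illustrative):

curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true" -F "myfile=@tutorial.pdf"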

Regards,
Gora


Re: Solr indexing binary files

2013-03-15 Thread Luis
Hi Gora, thank you for your reply.  I am not using any commands; I just go to
the Solr dashboard, db > Dataimport, and execute a full-import.

*My schema.xml looks like this:*

[the field and type definitions were stripped by the list archive]

*My db-data-config.xml looks like this:*

[stripped by the list archive; the surviving parts, quoted in Gora's reply
below, show a MySQL dataSource, a root entity "fileSourcePaths" selecting ID
and urlpath from myposts, and a nested TikaEntityProcessor entity]

*In my solrconfig.xml I have this:*

[stripped by the list archive; the surviving fragments show a
DataImportHandler configured with db-data-config.xml, and an
ExtractingRequestHandler with uprefix "metadata_" and field mappings for
last_modified, text, size, initials, name, subject, company, title,
comments, words, and last_modified_by]
Thank you for your help!




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-indexing-binary-files-tp4047470p4047702.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr cell

2013-03-15 Thread Michael Della Bitta
Niklas,

In Linux, the API for watching for filesystem changes is called
inotify. You'd need to write something to listen to those events and
react accordingly.

Here's a brief discussion about it:
http://stackoverflow.com/questions/4062806/inotify-how-to-use-it-linux
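
If you would rather stay in Java, the JDK 7 WatchService wraps inotify on
Linux. A minimal sketch (the directory path is illustrative; it does not
recurse into subdirectories, and feeding the events to Solr is still up to
you):

import java.nio.file.*;

public class DocWatcher {
    public static void main(String[] args) throws Exception {
        Path dir = Paths.get("/data/docs"); // illustrative path
        WatchService watcher = FileSystems.getDefault().newWatchService();
        dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE,
                              StandardWatchEventKinds.ENTRY_DELETE);
        while (true) {
            WatchKey key = watcher.take(); // blocks until events arrive
            for (WatchEvent<?> event : key.pollEvents()) {
                // react here: post new files to Solr, delete removed ones by id
                System.out.println(event.kind() + ": " + event.context());
            }
            key.reset(); // re-arm the key for further events
        }
    }
}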


Michael Della Bitta


Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Fri, Mar 15, 2013 at 11:10 AM, Niklas Langvig
 wrote:
> We have all our documents (doc, docx, pdf) on a linux file server (~8 000 000 
> documents), is there a good way to update solr with documents that are added 
> to the file server and deleted from the file server?
> In windows you could have a wmi script that would get noticed when a document 
> has been removed or added and then do appropriate update in solr.
>
> Can this be solved somehow?
>
> Thanks
> Niklas


Re: discovery-based core enumeration with embedded solr

2013-03-15 Thread Michael Sokolov
Erick, before I do that - which I'll be happy to - I just want to make 
sure I'm testing the right thing. The wiki seems to indicate this is a 
4.2+ feature, but the ticket marks it as fixed in 4.3.  Maybe that is 
just a documentation bug?


-Mike

On 3/14/13 9:44 PM, Erick Erickson wrote:

H, could you raise a JIRA and assign it to me? Please be sure and
emphasize that it's embedded because I'm pretty sure this is fine for the
regular case.

But I have to admit that the embedded case completely slipped under the
radar.

Even better if you could make a test case, but that might not be
straightforward...

Thanks,
Erick


On Wed, Mar 13, 2013 at 5:28 PM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:


Has the new core enumeration strategy been implemented in the
CoreContainer.Initializer.initialize() code path?  It doesn't seem like
it has.

I get this exception:

Caused by: org.apache.solr.common.SolrException: Could not load config
for solrconfig.xml
 at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:991)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1051)
 ... 10 more
Caused by: java.io.IOException: Can't find resource 'solrconfig.xml' in
classpath or 'solr-multi/collection1/conf/', cwd=/proj/lux
 at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:318)
 at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:283)
 at org.apache.solr.core.Config.<init>(Config.java:103)
 at org.apache.solr.core.Config.<init>(Config.java:73)
 at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:117)
 at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:989)
 ... 11 more

even though I have a solr.properties file in solr-multi (which is my
solr.home), and core.properties in some subdirectories of that

--
Michael Sokolov
Senior Architect
Safari Books Online






Re: solr cell

2013-03-15 Thread Jack Krupansky
Take a look at ManifoldCF, which has a file system crawler that can track 
changed files.


-- Jack Krupansky

-Original Message- 
From: Niklas Langvig

Sent: Friday, March 15, 2013 11:10 AM
To: solr-user@lucene.apache.org
Subject: solr cell

We have all our documents (doc, docx, pdf) on a linux file server (~8 000 
000 documents), is there a good way to update solr with documents that are 
added to the file server and deleted from the file server?
In windows you could have a wmi script that would get noticed when a 
document has been removed or added and then do appropriate update in solr.


Can this be solved somehow?

Thanks
Niklas 



Re: search request handler error

2013-03-15 Thread Jack Krupansky
"java.lang.Float cannot be cast to java.lang.String" typically means you 
wrote n for a parameter which is expected to be a 
string ("

-- Jack Krupansky

-Original Message- 
From: Rohan Thakur

Sent: Friday, March 15, 2013 10:01 AM
To: solr-user@lucene.apache.org
Subject: search request handler error

hi all

Please, someone help me with this. I am getting a search request handler error
after I modified the /select request handler to add spell correction to it.
I am using Solr 4.1.

*error:*
INFO: Initializing spell checkers
Mar 15, 2013 7:21:19 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to
Searcher@6ce7ce4cmain{StandardDirectoryReader(segments_e4:2003
_qp(4.1):C1063)}
Mar 15, 2013 7:21:19 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
   at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
   at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
   at
org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64)
   at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1594)
   at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:679)

Mar 15, 2013 7:21:19 PM org.apache.solr.core.SolrCore close
INFO: [collection1]  CLOSING SolrCore org.apache.solr.core.SolrCore@641cab18
Mar 15, 2013 7:21:19 PM org.apache.solr.update.DirectUpdateHandler2 close
INFO: closing DirectUpdateHandler2{commits=0,autocommit
maxTime=15000ms,autocommits=0,soft
autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=0,adds=0,deletesById=0,deletesByQuery=0,errors=0,cumulative_adds=0,cumulative_deletesById=0,cumulative_deletesByQuery=0,cumulative_errors=0}
Mar 15, 2013 7:21:19 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=null path=null
params={event=firstSearcher&q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false}
status=500 QTime=7
Mar 15, 2013 7:21:19 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
Mar 15, 2013 7:21:19 PM org.apache.solr.core.SolrCore decrefSolrCoreState
INFO: Closing SolrCoreState
Mar 15, 2013 7:21:19 PM org.apache.solr.update.DefaultSolrCoreState
closeIndexWriter
INFO: SolrCoreState ref count has reached 0 - closing IndexWriter
Mar 15, 2013 7:21:19 PM org.apache.solr.update.DefaultSolrCoreState
closeIndexWriter
INFO: closing IndexWriter with IndexWriterCloser
Mar 15, 2013 7:21:19 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [collection1] Registered new searcher
Searcher@6ce7ce4cmain{StandardDirectoryReader(segments_e4:2003
_qp(4.1):C1063)}
Mar 15, 2013 7:21:20 PM org.apache.solr.core.SolrCore closeSearcher
INFO: [collection1] Closing main searcher on request.
Mar 15, 2013 7:21:20 PM org.apache.solr.core.CachingDirectoryFactory close
INFO: Releasing
directory:/root/rohan/solr-4.1.0/example/solr/collection1/data/index
Mar 15, 2013 7:21:20 PM org.apache.solr.core.CachingDirectoryFactory close
INFO: Closing directory when closing
factory:/root/rohan/solr-4.1.0/example/solr/collection1/data
Mar 15, 2013 7:21:20 PM org.apache.solr.core.CachingDirectoryFactory
closeDirectory
INFO: Closing directory:/root/rohan/solr-4.1.0/example/solr/collection1/data
Mar 15, 2013 7:21:20 PM org.apache.solr.core.CachingDirectoryFactory close
INFO: Closing directory when closing
factory:/root/rohan/solr-4.1.0/example/solr/collection1/data/index
Mar 15, 2013 7:21:20 PM org.apache.solr.core.CachingDirectoryFactory
closeDirectory
INFO: Closing
directory:/root/rohan/solr-4.1.0/example/solr/collection1/data/index
Mar 15, 2013 7:21:20 PM org.apache.solr.core.CoreContainer recordAndThrow
SEVERE: Unable to create core: collection1
org.apache.solr.common.SolrException: java.lang.Float cannot be cast to
java.lang.String
   at org.apache.solr.core.SolrCore.(SolrCore.java:794)
   at org.apache.solr.core.SolrCore.(SolrCore.java:607)
   at
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1003)
   at
org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
   at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(Fu

Re: wanted to know can we some how know what are the documents that are exact match in solr

2013-03-15 Thread Jack Krupansky
The "explain" section that is returned if you specify the &debugQuery=true 
parameter will provides the details of what terms matched for each document.


-- Jack Krupansky

-Original Message- 
From: Rohan Thakur

Sent: Friday, March 15, 2013 9:19 AM
To: solr-user@lucene.apache.org
Subject: wanted to know can we some how know what are the documents that are 
exact match in solr


hi all

I need to pass some variable or flag with the exact-match documents to
distinguish them from the others. Say I have 3 terms in the search query: I
need to tell the documents in which all three words are found apart from the
other documents in which only 1 or 2 of the three terms are matched.

any help would be great
thanks
regards
rohan 



faceting multivalued field

2013-03-15 Thread sathish_ix
Hi,

I have a requirement. Below is my dataset:

Client_name      rep_name           acct_name
SUSAN CHILTON    GERARD BUCHANAN    CHILTON S
LARRY CHILTON    GERARD BUCHANAN    CHILTON L

(My schema.xml was stripped by the list archive.)

I need the response as follows. Searching for CHILTON, grouped by client
name, rep_name, and acct_name:

Client name (2)
Rep name (0)
acct_name (2)

http://localhost:8081/solr/select?q=CHILTON&facet=true&facet.field=REP_NM&facet.mincount=1&facet.field=CLIENT_NM&facet.field=ACCT_NAME

In the response I am getting 2 for REP_NM. CHILTON is present in the Client
name and Acct name fields and not present in Rep name, so why does REP_NM
show 2? Why is rep_nm listing values? Can someone explain?

Thanks,



--
View this message in context: 
http://lucene.472066.n3.nabble.com/faceting-multivalued-field-tp4047686.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr3.5 Vs Solr4.1 - Help please

2013-03-15 Thread Chris Hostetter

: Just from this observation, it seems like the code for SOLR 4.1 takes a
: wrong turn somewhere for large responses if it comes across the same query
: with a different fl list again.If the spinning query is pre-cached via

There definitely seems to be a problem with lazy field loading + 
variable fl lists + large multivalued fields in the 4.x releases...

https://issues.apache.org/jira/browse/SOLR-4589


-Hoss


Re: Solr indexing binary files

2013-03-15 Thread Gora Mohanty
On 15 March 2013 20:16, Luis  wrote:
>
> Hi Gora, thank you for your reply.  I am not using any commands, I just go
> on
> the Solr dashboard, db > Dataimport and execute a full-import.

In that case, you are not using the ExtractingRequestHandler, but
using the DataImportHandler, even though you have both handlers
defined.

>
> *My schema.xml looks like this:*
[...]

This cannot be the complete schema.xml, but in any case,
the issue probably does not lie there.

> *My db-data-config.xml looks like this:*
>
> <dataSource ... url="jdbc:mysql://localhost:3306/opspedia"
>   user="username" batchSize="-1" name="mysql" />
>
> <entity name="fileSourcePaths" rootEntity="true"
>   dataSource="mysql" query="select ID, urlpath from myposts"
>   deltaImportQuery="SELECT * FROM myposts WHERE id = '${dataimporter.delta.id}'"
>   deltaQuery="SELECT id FROM myposts WHERE last_modified > '${dataimporter.last_index_time}'">
>
>   <entity ... processor="TikaEntityProcessor" fileName=".*"
>     recursive="true" url="${fileSourcePaths.guid}" format="text"
>     dataSource="bin" >

Your query on the root entity, fileSourcePaths, only selects ID
and urlpath, but the url attribute in the nested TikaEntityProcessor
refers to ${fileSourcePaths.guid} which has never been selected.

Regards,
Gora


structure of solr index

2013-03-15 Thread alxsss
Hi,

I was wondering: does Solr search on the indexed fields only, or on the entire
index? In more detail, let's say I have fields id, title, and content, all
indexed and stored. Will a search load all of these fields into memory, or
only the indexed part of these fields?

Thanks.
Alex.




Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-15 Thread Luis Cappa Banda
And up! :-)

I´ve been wondering if using CloudSolrServer has something to do here. Does
it perform badly when a CloudSolrServer singleton receives
multiple queries? Is it recommended to have a list of CloudSolrServer
instances and select one of them in round-robin fashion?



2013/3/14 Luis Cappa Banda 

> Hello!
>
> Thanks a lot, Erick! I've attached some stack traces during a normal
> 'engine' running.
>
> Cheers,
>
> - Luis Cappa
>
>
> 2013/3/13 Erick Erickson 
>
>> Stack traces..
>>
>> First,
>> jps -l
>>
>> that will give you a the process IDs of your running Java processes. Then:
>>
>> jstack 
>>
>> Usually I pipe the output from jstack into a text file...
>>
>> Best
>> Erick
>>
>>
>> On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda > >wrote:
>>
>> > Uhm, how can I do that... 'cleanly'? I know that with JConsole it´s
>> posible
>> > to output this traces, but with a .war application built on top of
>> Spring I
>> > don´t know how can I do that. In any case, here is my CloudSolrServer
>> > wrapper that is used by other classes. There is no sync method or piece
>> of
>> > code:
>> >
>> >  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> - -
>> > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> >
>> > *public class BinaryLBHttpSolrServer extends LBHttpSolrServer {*
>> >
>> > private static final long serialVersionUID = 3905956120804659445L;
>> > public BinaryLBHttpSolrServer(String[] endpoints) throws
>> > MalformedURLException {
>> > super(endpoints);
>> > }
>> >
>> > @Override
>> > protected HttpSolrServer makeServer(String server) throws
>> > MalformedURLException {
>> > HttpSolrServer solrServer = super.makeServer(server);
>> > solrServer.setRequestWriter(new BinaryRequestWriter());
>> > return solrServer;
>> > }
>> > }
>> >
>> >  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> - -
>> > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> >
>> > *public class CloudSolrHttpServerImpl implements CloudSolrHttpServer {*
>> >  private CloudSolrServer cloudSolrServer;
>> >
>> > private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class);
>> >
>> > public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[]
>> > endpoints, int clientTimeout,
>> > int connectTimeout, String cloudCollection) {
>> >  try {
>> > BinaryLBHttpSolrServer lbSolrServer = new *BinaryLBHttpSolrServer*
>> > (endpoints);
>> > this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints,
>> > lbSolrServer);
>> > this.cloudSolrServer.setZkConnectTimeout(connectTimeout);
>> > this.cloudSolrServer.setZkClientTimeout(clientTimeout);
>> > this.cloudSolrServer.setDefaultCollection(cloudCollection);
>> >  } catch (MalformedURLException e) {
>> > log.error(e);
>> > }
>> > }
>> >
>> > @Override
>> > public QueryResponse *search*(SolrQuery query) throws
>> SolrServerException {
>> > return cloudSolrServer.query(query, METHOD.POST);
>> > }
>> >
>> > @Override
>> > public boolean *index*(DocumentBean user) {
>> > boolean indexed = false;
>> > int retries = 0;
>> >  do {
>> > indexed = addBean(user);
>> > retries++;
>> >  } while(!indexed && retries<4);
>> >  return indexed;
>> > }
>> >  @Override
>> > public boolean *update*(SolrInputDocument updateDoc) {
>> > boolean update = false;
>> > int retries = 0;
>> >
>> > do {
>> > update = addSolrInputDocument(updateDoc);
>> > retries++;
>> >  } while(!update && retries<4);
>> >  return update;
>> > }
>> >  @Override
>> > public void commit() {
>> > try {
>> > cloudSolrServer.commit();
>> > } catch (SolrServerException e) {
>> >  log.error(e);
>> > } catch (IOException e) {
>> >  log.error(e);
>> > }
>> > }
>> >
>> > @Override
>> > public boolean *delete*(String ... ids) {
>> > boolean deleted = false;
>> >  List idList = Arrays.asList(ids);
>> >  try {
>> > this.cloudSolrServer.deleteById(idList);
>> > this.cloudSolrServer.commit(true, true);
>> > deleted = true;
>> >
>> > } catch (SolrServerException e) {
>> > log.error(e);
>> >
>> > } catch (IOException e) {
>> > log.error(e);
>> >  }
>> >  return deleted;
>> > }
>> >
>> > @Override
>> > public void *optimize*() {
>> > try {
>> > this.cloudSolrServer.optimize();
>> >  } catch (SolrServerException e) {
>> > log.error(e);
>> >  } catch (IOException e) {
>> > log.error(e);
>> > }
>> > }
>> >  /*
>> >  * 
>> >  *  Getters & setters *
>> >  * 
>> >  * */
>> >  public CloudSolrServer getSolrServer() {
>> > return cloudSolrServer;
>> > }
>> >
>> > public void setSolrServer(CloudSolrServer solrServer) {
>> > this.cloudSolrServer = solrServer;
>> > }
>> >
>> > private boolean addBean(DocumentBean user) {
>> > boolean added = false;
>> >  try {
>> > this.cloudSolrServer.addBean(user, 100);
>> > this.commit();
>> >
>> > } catch (IOException e) {
>> > log.error(e);
>> >
>> > } catch (SolrServerException e) {
>> > log.error(e);
>> >  }catch(SolrExc

Re: Solr indexing binary files

2013-03-15 Thread Luis
Sorry, Gora.  It is ${fileSourcePaths.urlpath} actually.

*My complete schema.xml is this:*

[the schema.xml content was stripped by the list archive; the surviving
fragments show a uniqueKey of "id" and a default search field of "text"]

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-indexing-binary-files-tp4047470p4047778.html
Sent from the Solr - User mailing list archive at Nabble.com.


Analyzing Suggester and Fuzzy Suggester - configuration and comparison

2013-03-15 Thread Eoghan Ó Carragáin
Hi,
I'm interested in using the new Analyzing Suggester described by Mike
McCandless [1], but I'm not sure how it should be configured.

I've setup my SpellCheckComponent with
  org.apache.solr.spelling.suggest.Suggester
  org.apache.solr.spelling.suggest.fst.AnalyzingLookupFactory

I think I also need to set suggestAnalyzerFieldType
and queryAnalyzerFieldType? I presume these should hold the names of field
types configured in schema.xml, but I'm not sure. I'd appreciate it if someone
could point me to documentation on this (I didn't find anything on the
wiki), or post an example SpellCheckComponent configuration; my current
guess is sketched below.
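
To make it concrete, here is the kind of configuration I am guessing at
(component, field, and type names illustrative):

<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingLookupFactory</str>
    <str name="field">suggest_field</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>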

Also, what is the difference between the Fuzzy Suggester and the Analyzing
Suggester. When would you use one rather than the other?

Many thanks,
Eoghan

[1]
http://blog.mikemccandless.com/2012/09/lucenes-new-analyzing-suggester.html


Query.toString printing binary in the output...

2013-03-15 Thread Andrew Lundgren
We use the toString call on the query in our logs.  For some numeric types, the 
encoded form of the number is being printed instead of the readable form.

This makes tail and some other tools very unhappy...

Here is a partial example of a query.toString() that would have had binary in 
it.  As a short term work around I replaced all non-printable characters in the 
string with an '_'.

(collection_id:`__z_[^0.027 collection_id:`__nB+^0.026 
collection_id:`__Zl_^0.025 collection_id:`__i49^0.024 
collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022 
collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02 collection_id:`__UF&^0.019 
collection_id:`__I2g^0.018 collection_id:`__PP_^0.01699 
collection_id:`__Ysv^0.01599 collection_id:`__Oe_^0.01499 
collection_id:`__Ysw^0.01399 collection_id:`__Wi_^0.01298 
collection_id:`__fLi^0.01198 collection_id:`__XRk^0.01098 
collection_id:`__Uz[^0.00998 collection_id:`__SE_^0.00898 
collection_id:`__Ysx^0.00798 collection_id:`__Ysh^0.006974 
collection_id:`__fLh^0.005973 collection_id:`__f _^0.00497 
collection_id:`__`^C^0.00397 collection_id:`__fKM^0.00297 
collection_id:`__Szo^0.00197 collection_id:`__f ]^9.7E-4)

But, as you can see, that is less than useful...

I spent some time looking at the source and found that Term does not contain 
the type of the embedded data.  Any possible solutions to this short of walking 
the query and getting the type of each field from the schema and creating my 
own print function?

Thanks!

--
Andrew
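
For the archives, a minimal sketch of that schema-driven walk, assuming
Solr/Lucene 4.x APIs (IndexSchema.getFieldTypeNoEx, FieldType.indexedToReadable);
the class name and fallback behaviour are illustrative, not from this thread:

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.solr.schema.FieldType;
import org.apache.solr.schema.IndexSchema;

public class ReadableQueryPrinter {

    // Renders a query with human-readable terms by asking the schema's
    // FieldType to decode each indexed term (e.g. trie-encoded numerics).
    public static String toReadableString(Query q, IndexSchema schema) {
        StringBuilder sb = new StringBuilder();
        if (q instanceof BooleanQuery) {
            sb.append('(');
            boolean first = true;
            for (BooleanClause clause : ((BooleanQuery) q).clauses()) {
                if (!first) sb.append(' ');
                sb.append(clause.getOccur());   // "+", "-" or "" for SHOULD
                sb.append(toReadableString(clause.getQuery(), schema));
                first = false;
            }
            sb.append(')');
        } else if (q instanceof TermQuery) {
            Term t = ((TermQuery) q).getTerm();
            FieldType ft = schema.getFieldTypeNoEx(t.field());
            String text = (ft != null) ? ft.indexedToReadable(t.text()) : t.text();
            sb.append(t.field()).append(':').append(text);
            if (q.getBoost() != 1.0f) sb.append('^').append(q.getBoost());
        } else {
            sb.append(q.toString());            // fall back to the default form
        }
        return sb.toString();
    }
}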






Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-15 Thread Mark Miller
You def have to use multiple threads with it for it to be fast, but 3 or 4 docs 
a second still sounds absurdly slow.

- Mark
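
For the archives, a minimal sketch of that multi-threaded pattern with SolrJ
(the pool size, batching, and error handling are assumptions, not from this
thread):

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ParallelIndexer {

    // Feed batches of documents to one shared CloudSolrServer from several
    // threads; the client can be shared, so a single instance is enough.
    public static void index(final CloudSolrServer server,
                             List<List<SolrInputDocument>> batches)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(8); // assumed size
        for (final List<SolrInputDocument> batch : batches) {
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        server.add(batch);      // one round-trip per batch
                    } catch (Exception e) {
                        e.printStackTrace();    // real code should retry/log
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        // commit once at the end, or rely on autoCommit/autoSoftCommit
    }
}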

On Mar 15, 2013, at 2:58 PM, Luis Cappa Banda  wrote:

> And up! :-)
> 
> I've been wondering if using CloudSolrServer has something to do with it. Does
> it perform badly when a CloudSolrServer singleton receives
> multiple queries? Is it recommended to have a list of CloudSolrServer
> instances and select one of them round-robin?
> 
> 
> 
> 2013/3/14 Luis Cappa Banda 
> 
>> Hello!
>> 
>> Thanks a lot, Erick! I've attached some stack traces during a normal
>> 'engine' running.
>> 
>> Cheers,
>> 
>> - Luis Cappa
>> 
>> 
>> 2013/3/13 Erick Erickson 
>> 
>>> Stack traces..
>>> 
>>> First,
>>> jps -l
>>> 
>>> that will give you a the process IDs of your running Java processes. Then:
>>> 
>>> jstack 
>>> 
>>> Usually I pipe the output from jstack into a text file...
>>> 
>>> Best
>>> Erick
>>> 
>>> 
>>> On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda >>> wrote:
>>> 
 Uhm, how can I do that... 'cleanly'? I know that with JConsole it´s
>>> posible
 to output this traces, but with a .war application built on top of
>>> Spring I
 don´t know how can I do that. In any case, here is my CloudSolrServer
 wrapper that is used by other classes. There is no sync method or piece
>>> of
 code:
 
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>>> - -
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 
 *public class BinaryLBHttpSolrServer extends LBHttpSolrServer {*
 
 private static final long serialVersionUID = 3905956120804659445L;
public BinaryLBHttpSolrServer(String[] endpoints) throws
 MalformedURLException {
super(endpoints);
}
 
@Override
protected HttpSolrServer makeServer(String server) throws
 MalformedURLException {
HttpSolrServer solrServer = super.makeServer(server);
solrServer.setRequestWriter(new BinaryRequestWriter());
return solrServer;
}
 }
 
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>>> - -
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 
 *public class CloudSolrHttpServerImpl implements CloudSolrHttpServer {*
 private CloudSolrServer cloudSolrServer;
 
 private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class);
 
 public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[]
 endpoints, int clientTimeout,
 int connectTimeout, String cloudCollection) {
 try {
 BinaryLBHttpSolrServer lbSolrServer = new *BinaryLBHttpSolrServer*
 (endpoints);
 this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints,
 lbSolrServer);
 this.cloudSolrServer.setZkConnectTimeout(connectTimeout);
 this.cloudSolrServer.setZkClientTimeout(clientTimeout);
 this.cloudSolrServer.setDefaultCollection(cloudCollection);
 } catch (MalformedURLException e) {
 log.error(e);
 }
 }
 
 @Override
 public QueryResponse *search*(SolrQuery query) throws
>>> SolrServerException {
 return cloudSolrServer.query(query, METHOD.POST);
 }
 
 @Override
 public boolean *index*(DocumentBean user) {
 boolean indexed = false;
 int retries = 0;
 do {
 indexed = addBean(user);
 retries++;
 } while(!indexed && retries<4);
 return indexed;
 }
 @Override
 public boolean *update*(SolrInputDocument updateDoc) {
 boolean update = false;
 int retries = 0;
 
 do {
 update = addSolrInputDocument(updateDoc);
 retries++;
 } while(!update && retries<4);
 return update;
 }
 @Override
 public void commit() {
 try {
 cloudSolrServer.commit();
 } catch (SolrServerException e) {
 log.error(e);
 } catch (IOException e) {
 log.error(e);
 }
 }
 
 @Override
 public boolean *delete*(String ... ids) {
 boolean deleted = false;
 List idList = Arrays.asList(ids);
 try {
 this.cloudSolrServer.deleteById(idList);
 this.cloudSolrServer.commit(true, true);
 deleted = true;
 
 } catch (SolrServerException e) {
 log.error(e);
 
 } catch (IOException e) {
 log.error(e);
 }
 return deleted;
 }
 
 @Override
 public void *optimize*() {
 try {
 this.cloudSolrServer.optimize();
 } catch (SolrServerException e) {
 log.error(e);
 } catch (IOException e) {
 log.error(e);
 }
 }
 /*
 * 
 *  Getters & setters *
 * 
 * */
 public CloudSolrServer getSolrServer() {
 return cloudSolrServer;
 }
 
 public void setSolrServer(CloudSolrServer solrServer) {
 this.cloudSolrServer = solrServer;
 }
 
 private boolean addBean(DocumentBean user) {
 boo

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-15 Thread Jack Park
Is there a document that tells how to create multiple threads? Search
returns many hits which orbit this idea, but I haven't spotted one
which tells how.

Thanks
Jack

On Fri, Mar 15, 2013 at 1:01 PM, Mark Miller  wrote:
> You def have to use multiple threads with it for it to be fast, but 3 or 4 
> docs a second still sounds absurdly slow.
>
> - Mark
>
> On Mar 15, 2013, at 2:58 PM, Luis Cappa Banda  wrote:
>
>> And up! :-)
>>
>> I´ve been wondering if using CloudSolrServer has something to do here. Does
>> it have a bad performance when a CloudSolrServer singletong receives
>> multiple queries? Is it recommended to have a CloudSolrServer instances
>> list and select one of them with a Round Robin criteria?
>>
>>
>>
>> 2013/3/14 Luis Cappa Banda 
>>
>>> Hello!
>>>
>>> Thanks a lot, Erick! I've attached some stack traces during a normal
>>> 'engine' running.
>>>
>>> Cheers,
>>>
>>> - Luis Cappa
>>>
>>>
>>> 2013/3/13 Erick Erickson 
>>>
 Stack traces..

 First,
 jps -l

 that will give you a the process IDs of your running Java processes. Then:

 jstack 

 Usually I pipe the output from jstack into a text file...

 Best
 Erick


 On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda  wrote:

> Uhm, how can I do that... 'cleanly'? I know that with JConsole it´s
 posible
> to output this traces, but with a .war application built on top of
 Spring I
> don´t know how can I do that. In any case, here is my CloudSolrServer
> wrapper that is used by other classes. There is no sync method or piece
 of
> code:
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 - -
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>
> *public class BinaryLBHttpSolrServer extends LBHttpSolrServer {*
>
> private static final long serialVersionUID = 3905956120804659445L;
>public BinaryLBHttpSolrServer(String[] endpoints) throws
> MalformedURLException {
>super(endpoints);
>}
>
>@Override
>protected HttpSolrServer makeServer(String server) throws
> MalformedURLException {
>HttpSolrServer solrServer = super.makeServer(server);
>solrServer.setRequestWriter(new BinaryRequestWriter());
>return solrServer;
>}
> }
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 - -
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>
> *public class CloudSolrHttpServerImpl implements CloudSolrHttpServer {*
> private CloudSolrServer cloudSolrServer;
>
> private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class);
>
> public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[]
> endpoints, int clientTimeout,
> int connectTimeout, String cloudCollection) {
> try {
> BinaryLBHttpSolrServer lbSolrServer = new *BinaryLBHttpSolrServer*
> (endpoints);
> this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints,
> lbSolrServer);
> this.cloudSolrServer.setZkConnectTimeout(connectTimeout);
> this.cloudSolrServer.setZkClientTimeout(clientTimeout);
> this.cloudSolrServer.setDefaultCollection(cloudCollection);
> } catch (MalformedURLException e) {
> log.error(e);
> }
> }
>
> @Override
> public QueryResponse *search*(SolrQuery query) throws
 SolrServerException {
> return cloudSolrServer.query(query, METHOD.POST);
> }
>
> @Override
> public boolean *index*(DocumentBean user) {
> boolean indexed = false;
> int retries = 0;
> do {
> indexed = addBean(user);
> retries++;
> } while(!indexed && retries<4);
> return indexed;
> }
> @Override
> public boolean *update*(SolrInputDocument updateDoc) {
> boolean update = false;
> int retries = 0;
>
> do {
> update = addSolrInputDocument(updateDoc);
> retries++;
> } while(!update && retries<4);
> return update;
> }
> @Override
> public void commit() {
> try {
> cloudSolrServer.commit();
> } catch (SolrServerException e) {
> log.error(e);
> } catch (IOException e) {
> log.error(e);
> }
> }
>
> @Override
> public boolean *delete*(String ... ids) {
> boolean deleted = false;
> List idList = Arrays.asList(ids);
> try {
> this.cloudSolrServer.deleteById(idList);
> this.cloudSolrServer.commit(true, true);
> deleted = true;
>
> } catch (SolrServerException e) {
> log.error(e);
>
> } catch (IOException e) {
> log.error(e);
> }
> return deleted;
> }
>
> @Override
> public void *optimize*() {
> try {
> this.cloudSolrServer.optimize();
> } catch (SolrServerException e) {
> log.error(e);
> } catch (IOException e) {
> log.error(e);
> }
> }
>>

Re: Analyzing Suggester and Fuzzy Suggester - configuration and comparison

2013-03-15 Thread Robert Muir
On Fri, Mar 15, 2013 at 3:04 PM, Eoghan Ó Carragáin
 wrote:
> Hi,
> I'm interested in using the new Analyzing Suggester described by Mike
> McCandless [1], but I'm not sure how it should be configured.
>
> I've setup my SpellCheckComponent with
>   org.apache.solr.spelling.suggest.Suggester
>name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingLookupFactory
>
> I think I also need to set suggestAnalyzerFieldType
> and queryAnalyzerFieldType? I presume these should have the names of field
> types configured in schema.xml, but I'm not sure. I'd appreciate if someone
> could point me to documentation on this (I didn't find anything on the
> wiki), or post an example SpellCheckComponent configuration.
>
> Also, what is the difference between the Fuzzy Suggester and the Analyzing
> Suggester. When would you use one rather than the other?
>

The example config used for the test suite also has entries for the
analyzing/fuzzy suggesters.

there is an example schema here:
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/collection1/conf/schema-phrasesuggest.xml
and with solrconfig here:
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/collection1/conf/solrconfig-phrasesuggest.xml

the "two" analyzers are confusing. the queryAnalyzerFieldType is just
a solr thing for all suggester/spellcheckers and unrelated to
analyzing suggester. so e.g. you'd make this type do minimal stuff if
anything (set to keywordtokenizer, or something like that
phrase_suggest that will try to parse out the field names and
operators too but otherwise not do any tokenization or anything).

the suggestAnalyzerFieldType is the one being passed to
analyzing/fuzzy suggester as the analyzer..

fuzzy suggester is just like analyzing suggester, except it also
corrects typos while autosuggesting.
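
For the archives, a sketch of such a component along the lines of that test
config (the field and type names are illustrative; adapt them to your schema):

<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <!-- applied by Solr to the incoming q/spellcheck.q, for all suggesters -->
  <str name="queryAnalyzerFieldType">phrase_suggest</str>
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingLookupFactory</str>
    <!-- handed to the analyzing/fuzzy suggester as its analyzer -->
    <str name="suggestAnalyzerFieldType">text_general</str>
    <str name="field">title</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>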


Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-15 Thread Luis Cappa Banda
Me neither. Please, Mark, can you tell us how?

2013/3/15 Jack Park 

> Is there a document that tells how to create multiple threads? Search
> returns many hits which orbit this idea, but I haven't spotted one
> which tells how.
>
> Thanks
> Jack
>
> On Fri, Mar 15, 2013 at 1:01 PM, Mark Miller 
> wrote:
> > You def have to use multiple threads with it for it to be fast, but 3 or
> 4 docs a second still sounds absurdly slow.
> >
> > - Mark
> >
> > On Mar 15, 2013, at 2:58 PM, Luis Cappa Banda 
> wrote:
> >
> >> And up! :-)
> >>
> >> I´ve been wondering if using CloudSolrServer has something to do here.
> Does
> >> it have a bad performance when a CloudSolrServer singletong receives
> >> multiple queries? Is it recommended to have a CloudSolrServer instances
> >> list and select one of them with a Round Robin criteria?
> >>
> >>
> >>
> >> 2013/3/14 Luis Cappa Banda 
> >>
> >>> Hello!
> >>>
> >>> Thanks a lot, Erick! I've attached some stack traces during a normal
> >>> 'engine' running.
> >>>
> >>> Cheers,
> >>>
> >>> - Luis Cappa
> >>>
> >>>
> >>> 2013/3/13 Erick Erickson 
> >>>
>  Stack traces..
> 
>  First,
>  jps -l
> 
>  that will give you a the process IDs of your running Java processes.
> Then:
> 
>  jstack 
> 
>  Usually I pipe the output from jstack into a text file...
> 
>  Best
>  Erick
> 
> 
>  On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda <
> luisca...@gmail.com
> > wrote:
> 
> > Uhm, how can I do that... 'cleanly'? I know that with JConsole it´s
>  posible
> > to output this traces, but with a .war application built on top of
>  Spring I
> > don´t know how can I do that. In any case, here is my CloudSolrServer
> > wrapper that is used by other classes. There is no sync method or
> piece
>  of
> > code:
> >
> > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>  - -
> > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >
> > *public class BinaryLBHttpSolrServer extends LBHttpSolrServer {*
> >
> > private static final long serialVersionUID = 3905956120804659445L;
> >public BinaryLBHttpSolrServer(String[] endpoints) throws
> > MalformedURLException {
> >super(endpoints);
> >}
> >
> >@Override
> >protected HttpSolrServer makeServer(String server) throws
> > MalformedURLException {
> >HttpSolrServer solrServer = super.makeServer(server);
> >solrServer.setRequestWriter(new BinaryRequestWriter());
> >return solrServer;
> >}
> > }
> >
> > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>  - -
> > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >
> > *public class CloudSolrHttpServerImpl implements CloudSolrHttpServer
> {*
> > private CloudSolrServer cloudSolrServer;
> >
> > private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class);
> >
> > public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[]
> > endpoints, int clientTimeout,
> > int connectTimeout, String cloudCollection) {
> > try {
> > BinaryLBHttpSolrServer lbSolrServer = new *BinaryLBHttpSolrServer*
> > (endpoints);
> > this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints,
> > lbSolrServer);
> > this.cloudSolrServer.setZkConnectTimeout(connectTimeout);
> > this.cloudSolrServer.setZkClientTimeout(clientTimeout);
> > this.cloudSolrServer.setDefaultCollection(cloudCollection);
> > } catch (MalformedURLException e) {
> > log.error(e);
> > }
> > }
> >
> > @Override
> > public QueryResponse *search*(SolrQuery query) throws
>  SolrServerException {
> > return cloudSolrServer.query(query, METHOD.POST);
> > }
> >
> > @Override
> > public boolean *index*(DocumentBean user) {
> > boolean indexed = false;
> > int retries = 0;
> > do {
> > indexed = addBean(user);
> > retries++;
> > } while(!indexed && retries<4);
> > return indexed;
> > }
> > @Override
> > public boolean *update*(SolrInputDocument updateDoc) {
> > boolean update = false;
> > int retries = 0;
> >
> > do {
> > update = addSolrInputDocument(updateDoc);
> > retries++;
> > } while(!update && retries<4);
> > return update;
> > }
> > @Override
> > public void commit() {
> > try {
> > cloudSolrServer.commit();
> > } catch (SolrServerException e) {
> > log.error(e);
> > } catch (IOException e) {
> > log.error(e);
> > }
> > }
> >
> > @Override
> > public boolean *delete*(String ... ids) {
> > boolean deleted = false;
> > List idList = Arrays.asList(ids);
> > try {
> > this.cloudSolrServer.deleteById(idList);
> > this.cloudSolrServer.commit(true, true);
> 

Re: Analyzing Suggester and Fuzzy Suggester - configuration and comparison

2013-03-15 Thread Eoghan Ó Carragáin
Thanks, Robert.

Am I correct in thinking that queryAnalyzerFieldType isn't needed at all
if I'm using spellcheck.q rather than just q?

Eoghan

On 15 March 2013 20:07, Robert Muir  wrote:

> On Fri, Mar 15, 2013 at 3:04 PM, Eoghan Ó Carragáin
>  wrote:
> > Hi,
> > I'm interested in using the new Analyzing Suggester described by Mike
> > McCandless [1], but I'm not sure how it should be configured.
> >
> > I've setup my SpellCheckComponent with
> >name="classname">org.apache.solr.spelling.suggest.Suggester
> >>
> name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingLookupFactory
> >
> > I think I also need to set suggestAnalyzerFieldType
> > and queryAnalyzerFieldType? I presume these should have the names of
> field
> > types configured in schema.xml, but I'm not sure. I'd appreciate if
> someone
> > could point me to documentation on this (I didn't find anything on the
> > wiki), or post an example SpellCheckComponent configuration.
> >
> > Also, what is the difference between the Fuzzy Suggester and the
> Analyzing
> > Suggester. When would you use one rather than the other?
> >
>
> The same example config for the wiki also has entries for
> analyzing/fuzzy suggesters
>
> there is an example schema here:
>
> http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/collection1/conf/schema-phrasesuggest.xml
> and with solrconfig here:
>
> http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/collection1/conf/solrconfig-phrasesuggest.xml
>
> the "two" analyzers are confusing. the queryAnalyzerFieldType is just
> a solr thing for all suggester/spellcheckers and unrelated to
> analyzing suggester. so e.g. you'd make this type do minimal stuff if
> anything (set to keywordtokenizer, or something like that
> phrase_suggest that will try to parse out the field names and
> operators too but otherwise not do any tokenization or anything).
>
> the suggestAnalyzerFieldType is the one being passed to
> analyzing/fuzzy suggester as the analyzer..
>
> fuzzy suggester is just like analyzing suggester, except it also
> corrects typos while autosuggesting.
>


Re: NRT - softCommit

2013-03-15 Thread Prakhar Birla
You have to open searchers for the new data to show up. Try this:


[config XML stripped by the archive; the surviving values are 1, false, 2000,
true. A reconstructed sketch follows.]
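A reconstruction of what such a near-real-time config typically looks like
(the tag names and the hard-commit interval are assumptions; the archive kept
only the bare values):

<autoCommit>
  <maxTime>60000</maxTime>             <!-- hypothetical hard-commit interval -->
  <openSearcher>false</openSearcher>   <!-- hard commits only flush to disk -->
</autoCommit>

<autoSoftCommit>
  <maxTime>2000</maxTime>              <!-- soft commit opens a searcher every 2s -->
</autoSoftCommit>
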
Make sure that you have low autowarm counts; otherwise you need to reduce
the frequency of commits.

On 15 March 2013 18:59, Arkadi Colson  wrote:

> NRT seems not to work in my case when doing a softcommit every 2 seconds.
> My conf looks like this:
> [config XML stripped by the archive; the surviving values are 1, false, 2000]
>
> No result from Solr when searching for a word in the file. When doing a
> hard commit manually, it works.
>
> Did I forget to configure something?
>
>
> Logging:
>
> Mar 15, 2013 11:56:20 AM org.apache.solr.core.SolrCore execute
> INFO: [intradesk] webapp=/solr path=/update params={distrib.from=http://solr03:8983/solr/intradesk/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=74
> Mar 15, 2013 11:56:35 AM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start commit{flags=0,_version_=0,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false}
> Mar 15, 2013 11:56:36 AM org.apache.solr.core.SolrDeletionPolicy onCommit
> INFO: SolrDeletionPolicy.onCommit: commits:num=2
> commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/solr/intradesk/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@6c0081fb; maxCacheMB=48.0
> maxMergeSizeMB=4.0),segFN=segments_7v,generation=283,filenames=[_e7.fnm,
> segments_7v, _ec_Lucene40_0.tim, _eg.fdt, _ee_Lucene40_0.frq,
> _ef_Lucene40_0.frq, _e9_Lucene40_0.frq, _eg.fnm, _eg.fdx,
> _ec_Lucene40_0.tip, _dz_Lucene40_0.frq, _e9.fnm, _e0_Lucene40_0.tip,
> _e0_Lucene40_0.tim, _e4.fdt, _e3.si, _ec.fnm, _eh_Lucene40_0.prx, _ef.si,
> _e8_Lucene40_0.frq, _e7_Lucene40_0.frq, _e4.fdx, _e8_Lucene40_0.prx,
> _dz.fdx, _ef_Lucene40_0.tip, _e2.fnm, _ef_Lucene40_0.tim,
> _ec_Lucene40_0.frq, _e8.si, _ef_Lucene40_0.prx, _eh.si,
> _e3_Lucene40_0.prx, _eh_Lucene40_0.tip, _e2_Lucene40_0.tim,
> _e2_Lucene40_0.tip, _ec.fdt, _ec.fdx, _ed_Lucene40_0.frq,
> _e6_Lucene40_0.frq, _e6_Lucene40_0.tim, _eh_Lucene40_0.tim,
> _e6_Lucene40_0.tip, _e9_Lucene40_0.prx, _ee.fnm, _e8_nrm.cfs, _e3.fdx,
> _ea_nrm.cfe, _eg_Lucene40_0.tip, _e3.fdt, _e9.fdt, _eg_Lucene40_0.tim,
> _e8_nrm.cfe, _e9.fdx, _e5_Lucene40_0.frq, _ea_Lucene40_0.tim, _dz.fnm,
> _e5_nrm.cfs, _eh.fnm, _ed.si, _e4.fnm, _e5_nrm.cfe, _e4_Lucene40_0.tip,
> _e2_Lucene40_0.frq, _e0.si, _ec.si, _e4_Lucene40_0.tim, _ee.fdt,
> _eg_Lucene40_0.frq, _ee.si, _dz_Lucene40_0.tim, _e7.fdt, _dz.fdt,
> _ea_Lucene40_0.prx, _e2.si, _dz_Lucene40_0.tip, _e7.fdx,
> _e5_Lucene40_0.prx, _e6.si, _ee.fdx, _eg_Lucene40_0.prx, _e5.si, _eg.si,
> _ea_Lucene40_0.tip, _e2.fdt, _eh.fdt, _dz_Lucene40_0.prx, _eb.fdx, _eh.fdx,
> _eb.fdt, _e7.si, _ea_nrm.cfs, _e2.fdx, _ed_Lucene40_0.prx,
> _ee_Lucene40_0.tim, _e5_Lucene40_0.tip, _dz_nrm.cfe, _e7_nrm.cfe,
> _ee_Lucene40_0.tip, _e5_Lucene40_0.tim, _e6_nrm.cfs, _e8_Lucene40_0.tip,
> _e7_Lucene40_0.prx, _e8_Lucene40_0.tim, _dz.si, _e6.fnm,
> _eh_Lucene40_0.frq, _ea.fnm, _e6_nrm.cfe, _e5.fnm, _e4_Lucene40_0.frq,
> _e7_nrm.cfs, _e7_Lucene40_0.tip, _e8.fdx, _eb.fnm, _e8.fdt,
> _e6_Lucene40_0.prx, _e7_Lucene40_0.tim, _ea.fdt, _dz_nrm.cfs, _ef.fdx,
> _ea.fdx, _ef.fdt, _e3_Lucene40_0.frq, _e0.fdt, _e4_Lucene40_0.prx, _ef.fnm,
> _ea_Lucene40_0.frq, _e0.fdx, _e9_Lucene40_0.tim, _ea.si,
> _e9_Lucene40_0.tip, _eb.si, _e2_Lucene40_0.prx, _e9.si,
> _eb_Lucene40_0.tim, _eb_Lucene40_0.tip, _e5.fdt, _ec_Lucene40_0.prx,
> _ed.fnm, _e3_Lucene40_0.tim, _e3_Lucene40_0.tip, _e5.fdx, _e8.fnm,
> _e0_Lucene40_0.prx, _ee_Lucene40_0.prx, _ed_Lucene40_0.tim,
> _e0_Lucene40_0.frq, _e6.fdt, _ed_Lucene40_0.tip, _e6.fdx,
> _eb_Lucene40_0.frq, _e3.fnm, _ed.fdt, _ed.fdx, _e4.si,
> _eb_Lucene40_0.prx, _e0.fnm]
> commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/solr/intradesk/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@6c0081fb; maxCacheMB=48.0
> maxMergeSizeMB=4.0),segFN=segments_7w,generation=284,filenames=[_e7.fnm,
> segments_7w, _ei_Lucene40_0.frq, _ec_Lucene40_0.tim, _eg.fdt,
> _ee_Lucene40_0.frq, _ef_Lucene40_0.frq, _e9_Lucene40_0.frq, _eg.fnm,
> _eg.fdx, _ec_Lucene40_0.tip, _dz_Lucene40_0.frq, _e9.fnm,
> _e0_Lucene40_0.tip, _e0_Lucene40_0.tim, _e4.fdt, _e3.si, _ec.fnm,
> _eh_Lucene40_0.prx, _ef.si, _ei.si, _e8_Lucene40_0.frq,
> _e7_Lucene40_0.frq, _e4.fdx, _e8_Lucene40_0.prx, _dz.fdx,
> _ef_Lucene40_0.tip, _e2.fnm, _ef_Lucene40_0.tim, _ec_Lucene40_0.frq, _
> e8.si, _ef_Lucene40_0.prx, _eh.si, _e3_Lucene40_0.prx,
> _eh_Lucene40_0.tip, _e2_Lucene40_0.tim, _e2_Lucene40_0.tip, _ec.fdt,
> _ec.fdx, _ed_Lucene40_0.frq, _e6_Lucene40_0.frq, _e6_Lucene40_0.tim,
> _eh_Lucene40_0.tim, _e6_Lucene40_0.tip, _e9_Lucene40_0.prx, _ee.fnm,
> _e8_nrm.cfs, _e3.fdx, _ea_nrm.cfe, _eg_Lucene40_0.tip, _e3.fdt, _e9.fdt,
> _eg_Lucene40_0.tim, _e8_nrm.cfe, _e9.fdx, _e5_Lucene40_0.frq,
> _ea_Lucene40_0.tim, _dz.fnm, _e5_nrm.cfs, _eh.fn

status 400 on posting json

2013-03-15 Thread Patrice Seyed
Hi all,

Running the solr server:

~/solr-4.1.0/example$ java -jar start.jar

For updating Solr with JSON, I followed the convention at:
example/exampledocs/books.json

which has:
[
  {
"id" : "978-0641723445",^M
"cat" : ["book","hardcover"],^M
"name" : "The Lightning Thief",^M
"author" : "Rick Riordan",^M
"series_t" : "Percy Jackson and the Olympians",^M
"sequence_i" : 1,^M
"genre_s" : "fantasy",^M
"inStock" : true,^M
"price" : 12.50,^M
"pages_i" : 384^M
  },

...

]

My json file is structured as follows:
[
{"id":"doi:10.6085\/AA\/CBLX00_XXXITBDXLSR01_20040221.50.4","datasource":"urn:node:PISCO","abstract":"This
metadata record describes a mix of intertidal seawater and air
temperature data collected at Cape Blanco, Oregon, USA, by PISCO.
Measurements were collected using StowAway TidbiT Temperature Loggers
(Onset Computer Corp. TBI32-05+37) beginning 2004-02-21. Site
temperature loggers are bolted down in a wire cage at three locations
within each site near MLLW.  Temperature is recorded at 1 hour
intervals.","title":"PISCO: Intertidal: site temperature data: Cape
Blanco, Oregon, USA (CBLX00)","project":"Partnership for
Interdisciplinary Studies of Coastal Oceans (PISCO)","author":"Bruce
Menge","contactOrganization":["Partnership for Interdisciplinary
Studies of Coastal Oceans (PISCO)","PISCO"],"keywords":["EARTH SCIENCE
: Oceans : Ocean Temperature : Water
Temperature","Temperature","Integrated Ocean Observing
System","IOOS","Oceanographic Sensor Data","Intertidal Temperature
Data","continental shelf","seawater","temperature","Oregon","United
States of America","PISCO"]},

...

]

Per the documentation at:
http://wiki.apache.org/solr/UpdateJSON

attempted post of json with:
curl 'http://localhost:8983/solr/update/json?commit=true'
--data-binary @datafile.json -H 'Content-type:application/json'

received the following status 400:

{"responseHeader":{"status":400,"QTime":1},"error":{"msg":"ERROR:
[doc=doi:10.6073/AA/knb-lter-bes.14.45] unknown field
'datasource'","code":400}}
/Applications/MAMP/htdocs$

Is there some place I should indicate what fields are included in
the JSON objects I send? I was able to post books.json without the
error.

Thanks in advance,
Patrice


Re: status 400 on posting json

2013-03-15 Thread Erik Hatcher
> Is there some place I should indicate what parameters are including in
> the json objects send? I was able to test books.json without the
> error.

Yes, in Solr's schema.xml (under the conf/ directory).  See [link stripped by the archive] for more details.

Erik


On Mar 15, 2013, at 17:58 , Patrice Seyed wrote:

> Hi all,
> 
> Running the solr server:
> 
> ~/solr-4.1.0/example$ java -jar start.jar
> 
> For updating solr with json, I  followed the convention at:
> example/examplesdocs/books.json
> 
> which has:
> [
>  {
>"id" : "978-0641723445",^M
>"cat" : ["book","hardcover"],^M
>"name" : "The Lightning Thief",^M
>"author" : "Rick Riordan",^M
>"series_t" : "Percy Jackson and the Olympians",^M
>"sequence_i" : 1,^M
>"genre_s" : "fantasy",^M
>"inStock" : true,^M
>"price" : 12.50,^M
>"pages_i" : 384^M
>  },
> 
> ...
> 
> ]
> 
> My json file is structured as follows:
> [
> {"id":"doi:10.6085\/AA\/CBLX00_XXXITBDXLSR01_20040221.50.4","datasource":"urn:node:PISCO","abstract":"This
> metadata record describes a mix of intertidal seawater and air
> temperature data collected at Cape Blanco, Oregon, USA, by PISCO.
> Measurements were collected using StowAway TidbiT Temperature Loggers
> (Onset Computer Corp. TBI32-05+37) beginning 2004-02-21. Site
> temperature loggers are bolted down in a wire cage at three locations
> within each site near MLLW.  Temperature is recorded at 1 hour
> intervals.","title":"PISCO: Intertidal: site temperature data: Cape
> Blanco, Oregon, USA (CBLX00)","project":"Partnership for
> Interdisciplinary Studies of Coastal Oceans (PISCO)","author":"Bruce
> Menge","contactOrganization":["Partnership for Interdisciplinary
> Studies of Coastal Oceans (PISCO)","PISCO"],"keywords":["EARTH SCIENCE
> : Oceans : Ocean Temperature : Water
> Temperature","Temperature","Integrated Ocean Observing
> System","IOOS","Oceanographic Sensor Data","Intertidal Temperature
> Data","continental shelf","seawater","temperature","Oregon","United
> States of America","PISCO"]},
> 
> ...
> 
> ]
> 
> Per the documentation at:
> http://wiki.apache.org/solr/UpdateJSON
> 
> attempted post of json with:
> curl 'http://localhost:8983/solr/update/json?commit=true'
> --data-binary @datafile.json -H 'Content-type:application/json'
> 
> received the following status 400:
> 
> {"responseHeader":{"status":400,"QTime":1},"error":{"msg":"ERROR:
> [doc=doi:10.6073/AA/knb-lter-bes.14.45] unknown field
> 'datasource'","code":400}}
> /Applications/MAMP/htdocs$
> 
> Is there some place I should indicate what parameters are including in
> the json objects send? I was able to test books.json without the
> error.
> 
> Thanks in advance,
> Patrice



Re: status 400 on posting json

2013-03-15 Thread Jack Krupansky
I tried it and I get the same error response! Which is because... I don't 
have a field named "datasource".


You need to check the Solr schema.xml for the available fields and then add 
any fields that your JSON uses that are not already there. Be sure to 
shutdown and restart Solr after editing the schema.


I did notice that there is a "keywords" field, but it is not multivalued, 
while your keywords are multivalued.


Or, you can use dynamic fields, such as "datasource_s" and "keywords_ss" ("s" 
for string and a second "s" for multivalued), etc. for your other fields.
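
For example, a sketch of schema.xml entries that would accept the JSON above
(the type names assume the stock example schema; adapt as needed):

<field name="datasource" type="string" indexed="true" stored="true"/>
<field name="abstract" type="text_general" indexed="true" stored="true"/>
<!-- multivalued, unlike the stock single-valued "keywords" field -->
<field name="keywords" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- or lean on the stock dynamic-field patterns (*_s, *_ss, ...) -->
<dynamicField name="*_ss" type="string" indexed="true" stored="true" multiValued="true"/>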


-- Jack Krupansky

-Original Message- 
From: Patrice Seyed

Sent: Friday, March 15, 2013 5:58 PM
To: solr-user@lucene.apache.org
Subject: status 400 on posting json

Hi all,

Running the solr server:

~/solr-4.1.0/example$ java -jar start.jar

For updating solr with json, I  followed the convention at:
example/examplesdocs/books.json

which has:
[
 {
   "id" : "978-0641723445",^M
   "cat" : ["book","hardcover"],^M
   "name" : "The Lightning Thief",^M
   "author" : "Rick Riordan",^M
   "series_t" : "Percy Jackson and the Olympians",^M
   "sequence_i" : 1,^M
   "genre_s" : "fantasy",^M
   "inStock" : true,^M
   "price" : 12.50,^M
   "pages_i" : 384^M
 },

...

]

My json file is structured as follows:
[
{"id":"doi:10.6085\/AA\/CBLX00_XXXITBDXLSR01_20040221.50.4","datasource":"urn:node:PISCO","abstract":"This
metadata record describes a mix of intertidal seawater and air
temperature data collected at Cape Blanco, Oregon, USA, by PISCO.
Measurements were collected using StowAway TidbiT Temperature Loggers
(Onset Computer Corp. TBI32-05+37) beginning 2004-02-21. Site
temperature loggers are bolted down in a wire cage at three locations
within each site near MLLW.  Temperature is recorded at 1 hour
intervals.","title":"PISCO: Intertidal: site temperature data: Cape
Blanco, Oregon, USA (CBLX00)","project":"Partnership for
Interdisciplinary Studies of Coastal Oceans (PISCO)","author":"Bruce
Menge","contactOrganization":["Partnership for Interdisciplinary
Studies of Coastal Oceans (PISCO)","PISCO"],"keywords":["EARTH SCIENCE
: Oceans : Ocean Temperature : Water
Temperature","Temperature","Integrated Ocean Observing
System","IOOS","Oceanographic Sensor Data","Intertidal Temperature
Data","continental shelf","seawater","temperature","Oregon","United
States of America","PISCO"]},

...

]

Per the documentation at:
http://wiki.apache.org/solr/UpdateJSON

attempted post of json with:
curl 'http://localhost:8983/solr/update/json?commit=true'
--data-binary @datafile.json -H 'Content-type:application/json'

received the following status 400:

{"responseHeader":{"status":400,"QTime":1},"error":{"msg":"ERROR:
[doc=doi:10.6073/AA/knb-lter-bes.14.45] unknown field
'datasource'","code":400}}
/Applications/MAMP/htdocs$

Is there some place I should indicate what parameters are including in
the json objects send? I was able to test books.json without the
error.

Thanks in advance,
Patrice 



Re: overseer queue clogged

2013-03-15 Thread Mark Miller
What Solr version? 4.0, 4.1 4.2?

- Mark

On Mar 15, 2013, at 7:19 PM, Gary Yngve  wrote:

> my solr cloud has been running fine for weeks, but about a week ago, it
> stopped dequeueing from the overseer queue, and now there are thousands of
> tasks on the queue, most of which look like
> 
> {
>  "operation":"state",
>  "numShards":null,
>  "shard":"shard3",
>  "roles":null,
>  "state":"recovering",
>  "core":"production_things_shard3_2",
>  "collection":"production_things",
>  "node_name":"10.31.41.59:8883_solr",
>  "base_url":"http://10.31.41.59:8883/solr"}
> 
> i'm trying to create a new collection through the collection API, and
> obviously, nothing is happening...
> 
> any suggestion on how to fix this?  drop the queue in zk?
> 
> how could it have gotten into this state in the first place?
> 
> thanks,
> gary



Re: solr cell

2013-03-15 Thread Arcadius Ahouansou
Another option similar to this would be the new filesystem
WatchService available in Java 7:
http://docs.oracle.com/javase/tutorial/essential/io/notification.html


Arcadius.
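
For the archives, a minimal sketch of that approach (the watched path is an
assumption, and the Solr calls are left as comments):

import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_DELETE;

public class DocWatcher {
    public static void main(String[] args) throws Exception {
        Path dir = Paths.get("/data/documents");   // assumed path; one key
        WatchService watcher = FileSystems.getDefault().newWatchService();
        dir.register(watcher, ENTRY_CREATE, ENTRY_DELETE);  // per directory
        while (true) {
            WatchKey key = watcher.take();          // blocks until events arrive
            for (WatchEvent<?> event : key.pollEvents()) {
                Path changed = dir.resolve((Path) event.context()); // affected file
                if (event.kind() == ENTRY_CREATE) {
                    // post 'changed' to Solr Cell (/update/extract) here
                } else if (event.kind() == ENTRY_DELETE) {
                    // send a deleteById/deleteByQuery for 'changed' here
                }
            }
            if (!key.reset()) break;   // watched directory no longer accessible
        }
    }
}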

On 15 March 2013 15:22, Michael Della Bitta
 wrote:
> Niklas,
>
> In Linux, the API for watching for filesystem changes is called
> inotify. You'd need to write something to listen to those events and
> react accordingly.
>
> Here's a brief discussion about it:
> http://stackoverflow.com/questions/4062806/inotify-how-to-use-it-linux
>
>
> Michael Della Bitta
>
> 
> Appinions
> 18 East 41st Street, 2nd Floor
> New York, NY 10017-6271
>
> www.appinions.com
>
> Where Influence Isn’t a Game
>
>
> On Fri, Mar 15, 2013 at 11:10 AM, Niklas Langvig
>  wrote:
>> We have all our documents (doc, docx, pdf) on a linux file server (~8 000 
>> 000 documents), is there a good way to update solr with documents that are 
>> added to the file server and deleted from the file server?
>> In Windows you could have a WMI script that would get notified when a 
>> document has been removed or added and then do the appropriate update in solr.
>>
>> Can this be solved somehow?
>>
>> Thanks
>> Niklas


Re: overseer queue clogged

2013-03-15 Thread Gary Yngve
Sorry, should have specified.  4.1




On Fri, Mar 15, 2013 at 4:33 PM, Mark Miller  wrote:

> What Solr version? 4.0, 4.1 4.2?
>
> - Mark
>
> On Mar 15, 2013, at 7:19 PM, Gary Yngve  wrote:
>
> > my solr cloud has been running fine for weeks, but about a week ago, it
> > stopped dequeueing from the overseer queue, and now there are thousands
> of
> > tasks on the queue, most which look like
> >
> > {
> >  "operation":"state",
> >  "numShards":null,
> >  "shard":"shard3",
> >  "roles":null,
> >  "state":"recovering",
> >  "core":"production_things_shard3_2",
> >  "collection":"production_things",
> >  "node_name":"10.31.41.59:8883_solr",
> >  "base_url":"http://10.31.41.59:8883/solr"}
> >
> > i'm trying to create a new collection through collection API, and
> > obviously, nothing is happening...
> >
> > any suggestion on how to fix this?  drop the queue in zk?
> >
> > how could did it have gotten in this state in the first place?
> >
> > thanks,
> > gary
>
>


Re: overseer queue clogged

2013-03-15 Thread Gary Yngve
Also, looking at overseer_elect, everything looks fine.  node is valid and
live.


On Fri, Mar 15, 2013 at 4:47 PM, Gary Yngve  wrote:

> Sorry, should have specified.  4.1
>
>
>
>
> On Fri, Mar 15, 2013 at 4:33 PM, Mark Miller wrote:
>
>> What Solr version? 4.0, 4.1 4.2?
>>
>> - Mark
>>
>> On Mar 15, 2013, at 7:19 PM, Gary Yngve  wrote:
>>
>> > my solr cloud has been running fine for weeks, but about a week ago, it
>> > stopped dequeueing from the overseer queue, and now there are thousands
>> of
>> > tasks on the queue, most which look like
>> >
>> > {
>> >  "operation":"state",
>> >  "numShards":null,
>> >  "shard":"shard3",
>> >  "roles":null,
>> >  "state":"recovering",
>> >  "core":"production_things_shard3_2",
>> >  "collection":"production_things",
>> >  "node_name":"10.31.41.59:8883_solr",
>> >  "base_url":"http://10.31.41.59:8883/solr"}
>> >
>> > i'm trying to create a new collection through collection API, and
>> > obviously, nothing is happening...
>> >
>> > any suggestion on how to fix this?  drop the queue in zk?
>> >
>> > how could did it have gotten in this state in the first place?
>> >
>> > thanks,
>> > gary
>>
>>
>


Re: overseer queue clogged

2013-03-15 Thread Mark Miller
Strange - we hardened that loop in 4.1 - so I'm not sure what happened here.

Can you do a stack dump on the overseer and see if you see an Overseer thread 
running perhaps? Or just post the results?

To recover, you should be able to just restart the Overseer node and have 
someone else take over - they should pick up processing the queue.

Any logs you might be able to share could be useful too.

- Mark

On Mar 15, 2013, at 7:51 PM, Gary Yngve  wrote:

> Also, looking at overseer_elect, everything looks fine.  node is valid and
> live.
> 
> 
> On Fri, Mar 15, 2013 at 4:47 PM, Gary Yngve  wrote:
> 
>> Sorry, should have specified.  4.1
>> 
>> 
>> 
>> 
>> On Fri, Mar 15, 2013 at 4:33 PM, Mark Miller wrote:
>> 
>>> What Solr version? 4.0, 4.1 4.2?
>>> 
>>> - Mark
>>> 
>>> On Mar 15, 2013, at 7:19 PM, Gary Yngve  wrote:
>>> 
 my solr cloud has been running fine for weeks, but about a week ago, it
 stopped dequeueing from the overseer queue, and now there are thousands
>>> of
 tasks on the queue, most which look like
 
 {
 "operation":"state",
 "numShards":null,
 "shard":"shard3",
 "roles":null,
 "state":"recovering",
 "core":"production_things_shard3_2",
 "collection":"production_things",
 "node_name":"10.31.41.59:8883_solr",
 "base_url":"http://10.31.41.59:8883/solr"}
 
 i'm trying to create a new collection through collection API, and
 obviously, nothing is happening...
 
 any suggestion on how to fix this?  drop the queue in zk?
 
 how could did it have gotten in this state in the first place?
 
 thanks,
 gary
>>> 
>>> 
>> 



Re: overseer queue clogged

2013-03-15 Thread Gary Yngve
I restarted the overseer node and another took over, queues are empty now.

the server with core production_things_shard1_2
is having these errors:

shard update error RetryNode:
http://10.104.59.189:8883/solr/production_things_shard11_replica1/:org.apache.solr.client.solrj.SolrServerException:
Server refused connection at:
http://10.104.59.189:8883/solr/production_things_shard11_replica1

  for shard11!!!

I also got some strange errors on the restarted node.  Makes me wonder if
there is a string-matching bug for shard1 vs shard11?

SEVERE: :org.apache.solr.common.SolrException: Error getting leader from zk
  at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:771)
  at org.apache.solr.cloud.ZkController.register(ZkController.java:683)
  at org.apache.solr.cloud.ZkController.register(ZkController.java:634)
  at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:890)
  at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:874)
  at org.apache.solr.core.CoreContainer.register(CoreContainer.java:823)
  at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:633)
  at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.solr.common.SolrException: There is conflicting
information about the leader of shard: shard1 our state says:
http://10.104.59.189:8883/solr/collection1/ but zookeeper says:
http://10.217.55.151:8883/solr/collection1/
  at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:756)

INFO: Releasing directory: /vol/ubuntu/talemetry_match_solr/solr_server/solr/production_things_shard11_replica1/data/index
Mar 15, 2013 5:52:34 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error opening new searcher
  at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1423)
  at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1535)

SEVERE: org.apache.solr.common.SolrException: I was asked to wait on state
recovering for 10.76.31.67:8883_solr but I still do not see the requested
state. I see state: active live:true
  at org.apache.solr.handler.admin.CoreAdminHandler.handleWaitForStateAction(CoreAdminHandler.java:948)




On Fri, Mar 15, 2013 at 5:05 PM, Mark Miller  wrote:

> Strange - we hardened that loop in 4.1 - so I'm not sure what happened
> here.
>
> Can you do a stack dump on the overseer and see if you see an Overseer
> thread running perhaps? Or just post the results?
>
> To recover, you should be able to just restart the Overseer node and have
> someone else take over - they should pick up processing the queue.
>
> Any logs you might be able to share could be useful too.
>
> - Mark
>
> On Mar 15, 2013, at 7:51 PM, Gary Yngve  wrote:
>
> > Also, looking at overseer_elect, everything looks fine.  node is valid
> and
> > live.
> >
> >
> > On Fri, Mar 15, 2013 at 4:47 PM, Gary Yngve 
> wrote:
> >
> >> Sorry, should have specified.  4.1
> >>
> >>
> >>
> >>
> >> On Fri, Mar 15, 2013 at 4:33 PM, Mark Miller  >wrote:
> >>
> >>> What Solr version? 4.0, 4.1 4.2?
> >>>
> >>> - Mark
> >>>
> >>> On Mar 15, 2013, at 7:19 PM, Gary Yngve  wrote:
> >>>
>  my solr cloud has been running fine for weeks, but about a week ago,
> it
>  stopped dequeueing from the overseer queue, and now there are
> thousands
> >>> of
>  tasks on the queue, most which look like
> 
>  {
>  "operation":"state",
>  "numShards":null,
>  "shard":"shard3",
>  "roles":null,
>  "state":"recovering",
>  "core":"production_things_shard3_2",
>  "collection":"production_things",
>  "node_name":"10.31.41.59:8883_solr",
>  "base_url":"http://10.31.41.59:8883/solr"}
> 
>  i'm trying to create a new collection through collection API, and
>  obviously, nothing is happening...
> 
>  any suggestion on how to fix this?  drop the queue in zk?
> 
>  how could did it have gotten in this state in the first place?
> 
>  thanks,
>  gary
> >>>
> >>>
> >>
>
>


Re: overseer queue clogged

2013-03-15 Thread Gary Yngve
it doesn't appear to be a shard1 vs shard11 issue... 60% of my followers
are red now in the solr cloud graph.. trying to figure out what that
means...


On Fri, Mar 15, 2013 at 6:48 PM, Gary Yngve  wrote:

> I restarted the overseer node and another took over, queues are empty now.
>
> the server with core production_things_shard1_2
> is having these errors:
>
> shard update error RetryNode:
> http://10.104.59.189:8883/solr/production_things_shard11_replica1/:org.apache.solr.client.solrj.SolrServerException:
> Server refused connection at:
> http://10.104.59.189:8883/solr/production_things_shard11_replica1
>
>   for shard11!!!
>
> I also got some strange errors on the restarted node.  Makes me wonder if
> there is a string-matching bug for shard1 vs shard11?
>
> SEVERE: :org.apache.solr.common.SolrException: Error getting leader from zk
>   at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:771)
>   at org.apache.solr.cloud.ZkController.register(ZkController.java:683)
>   at org.apache.solr.cloud.ZkController.register(ZkController.java:634)
>   at
> org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:890)
>   at
> org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:874)
>   at org.apache.solr.core.CoreContainer.register(CoreContainer.java:823)
>   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:633)
>   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> Caused by: org.apache.solr.common.SolrException: There is conflicting
> information about the leader
> of shard: shard1 our state says:
> http://10.104.59.189:8883/solr/collection1/ but zookeeper says:http
> ://10.217.55.151:8883/solr/collection1/
>   at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:756)
>
> INFO: Releasing
> directory:/vol/ubuntu/talemetry_match_solr/solr_server/solr/production_things_shar
> d11_replica1/data/index
> Mar 15, 2013 5:52:34 PM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: Error opening new searcher
>   at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1423)
>   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1535)
>
> SEVERE: org.apache.solr.common.SolrException: I was asked to wait on state
> recovering for 10.76.31.
> 67:8883_solr but I still do not see the requested state. I see state:
> active live:true
>   at
> org.apache.solr.handler.admin.CoreAdminHandler.handleWaitForStateAction(CoreAdminHandler
> .java:948)
>
>
>
>
> On Fri, Mar 15, 2013 at 5:05 PM, Mark Miller wrote:
>
>> Strange - we hardened that loop in 4.1 - so I'm not sure what happened
>> here.
>>
>> Can you do a stack dump on the overseer and see if you see an Overseer
>> thread running perhaps? Or just post the results?
>>
>> To recover, you should be able to just restart the Overseer node and have
>> someone else take over - they should pick up processing the queue.
>>
>> Any logs you might be able to share could be useful too.
>>
>> - Mark
>>
>> On Mar 15, 2013, at 7:51 PM, Gary Yngve  wrote:
>>
>> > Also, looking at overseer_elect, everything looks fine.  node is valid
>> and
>> > live.
>> >
>> >
>> > On Fri, Mar 15, 2013 at 4:47 PM, Gary Yngve 
>> wrote:
>> >
>> >> Sorry, should have specified.  4.1
>> >>
>> >>
>> >>
>> >>
>> >> On Fri, Mar 15, 2013 at 4:33 PM, Mark Miller > >wrote:
>> >>
>> >>> What Solr version? 4.0, 4.1 4.2?
>> >>>
>> >>> - Mark
>> >>>
>> >>> On Mar 15, 2013, at 7:19 PM, Gary Yngve  wrote:
>> >>>
>>  my solr cloud has been running fine for weeks, but about a week ago,
>> it
>>  stopped dequeueing from the overseer queue, and now there are
>> thousands
>> >>> of
>>  tasks on the queue, most which look like
>> 
>>  {
>>  "operation":"state",
>>  "numShards":null,
>>  "shard":"shard3",
>>  "roles":null,
>>  "state":"recovering",
>>  "core":"production_things_shard3_2",
>>  "collection":"production_things",
>>  "node_name":"10.31.41.59:8883_solr",
>>  "base_url":"http://10.31.41.59:8883/solr"}
>> 
>>  i'm trying to create a new collection through collection API, and
>>  obviously, nothing is happening...
>> 
>>  any suggestion on how to fix this?  drop the queue in zk?
>> 
>>  how could did it have gotten in this state in the first place?
>> 
>>  thanks,
>>  gary
>> >>>
>> >>>
>> >>
>>
>>
>


Re: overseer queue clogged

2013-03-15 Thread Gary Yngve
i think those followers are red from trying to forward requests to the
overseer while it was being restarted.  i guess i'll see if they become
green over time.  or i guess i can restart them one at a time..


On Fri, Mar 15, 2013 at 6:53 PM, Gary Yngve  wrote:

> it doesn't appear to be a shard1 vs shard11 issue... 60% of my followers
> are red now in the solr cloud graph.. trying to figure out what that
> means...
>
>
> On Fri, Mar 15, 2013 at 6:48 PM, Gary Yngve  wrote:
>
>> I restarted the overseer node and another took over, queues are empty now.
>>
>> the server with core production_things_shard1_2
>> is having these errors:
>>
>> shard update error RetryNode:
>> http://10.104.59.189:8883/solr/production_things_shard11_replica1/:org.apache.solr.client.solrj.SolrServerException:
>> Server refused connection at:
>> http://10.104.59.189:8883/solr/production_things_shard11_replica1
>>
>>   for shard11!!!
>>
>> I also got some strange errors on the restarted node.  Makes me wonder if
>> there is a string-matching bug for shard1 vs shard11?
>>
>> SEVERE: :org.apache.solr.common.SolrException: Error getting leader from
>> zk
>>   at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:771)
>>   at org.apache.solr.cloud.ZkController.register(ZkController.java:683)
>>   at org.apache.solr.cloud.ZkController.register(ZkController.java:634)
>>   at
>> org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:890)
>>   at
>> org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:874)
>>   at org.apache.solr.core.CoreContainer.register(CoreContainer.java:823)
>>   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:633)
>>   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>   at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>   at java.lang.Thread.run(Thread.java:722)
>> Caused by: org.apache.solr.common.SolrException: There is conflicting
>> information about the leader
>> of shard: shard1 our state says:
>> http://10.104.59.189:8883/solr/collection1/ but zookeeper says:http
>> ://10.217.55.151:8883/solr/collection1/
>>   at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:756)
>>
>> INFO: Releasing
>> directory:/vol/ubuntu/talemetry_match_solr/solr_server/solr/production_things_shar
>> d11_replica1/data/index
>> Mar 15, 2013 5:52:34 PM org.apache.solr.common.SolrException log
>> SEVERE: org.apache.solr.common.SolrException: Error opening new searcher
>>   at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1423)
>>   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1535)
>>
>> SEVERE: org.apache.solr.common.SolrException: I was asked to wait on
>> state recovering for 10.76.31.
>> 67:8883_solr but I still do not see the requested state. I see state:
>> active live:true
>>   at
>> org.apache.solr.handler.admin.CoreAdminHandler.handleWaitForStateAction(CoreAdminHandler
>> .java:948)
>>
>>
>>
>>
>> On Fri, Mar 15, 2013 at 5:05 PM, Mark Miller wrote:
>>
>>> Strange - we hardened that loop in 4.1 - so I'm not sure what happened
>>> here.
>>>
>>> Can you do a stack dump on the overseer and see if you see an Overseer
>>> thread running perhaps? Or just post the results?
>>>
>>> To recover, you should be able to just restart the Overseer node and
>>> have someone else take over - they should pick up processing the queue.
>>>
>>> Any logs you might be able to share could be useful too.
>>>
>>> - Mark
>>>
>>> On Mar 15, 2013, at 7:51 PM, Gary Yngve  wrote:
>>>
>>> > Also, looking at overseer_elect, everything looks fine.  node is valid
>>> and
>>> > live.
>>> >
>>> >
>>> > On Fri, Mar 15, 2013 at 4:47 PM, Gary Yngve 
>>> wrote:
>>> >
>>> >> Sorry, should have specified.  4.1
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On Fri, Mar 15, 2013 at 4:33 PM, Mark Miller >> >wrote:
>>> >>
>>> >>> What Solr version? 4.0, 4.1 4.2?
>>> >>>
>>> >>> - Mark
>>> >>>
>>> >>> On Mar 15, 2013, at 7:19 PM, Gary Yngve 
>>> wrote:
>>> >>>
>>>  my solr cloud has been running fine for weeks, but about a week
>>> ago, it
>>>  stopped dequeueing from the overseer queue, and now there are
>>> thousands
>>> >>> of
>>>  tasks on the queue, most which look like
>>> 
>>>  {
>>>  "operation":"state",
>>>  "numShards":null,
>>>  "shard":"shard3",
>>>  "roles":null,
>>>  "state":"recovering",
>>>  "core":"production_things_shard3_2",
>>>  "collection":"production_things",
>>>  "node_name":"10.31.41.59:8883_solr",
>>>  "base_url"

Re: overseer queue clogged

2013-03-15 Thread Mark Miller
It looks like they are not picking up the new leader state for some reason…

That's what it means where it says the local state doesn't match the zookeeper 
state. If the local state doesn't match the zookeeper state within a short 
amount of time after a new leader comes, everything bails because it assumes 
something is wrong.

There are a fair number of SolrCloud bug fixes in 4.2 by the way. We didn't do 
a 4.1.1, but I would recommend you update. I don't know that it solves this 
particular issue. I'm going to continue investigating.

- Mark

On Mar 15, 2013, at 9:53 PM, Gary Yngve  wrote:

> it doesn't appear to be a shard1 vs shard11 issue... 60% of my followers
> are red now in the solr cloud graph.. trying to figure out what that
> means...
> 
> 
> On Fri, Mar 15, 2013 at 6:48 PM, Gary Yngve  wrote:
> 
>> I restarted the overseer node and another took over, queues are empty now.
>> 
>> the server with core production_things_shard1_2
>> is having these errors:
>> 
>> shard update error RetryNode:
>> http://10.104.59.189:8883/solr/production_things_shard11_replica1/:org.apache.solr.client.solrj.SolrServerException:
>> Server refused connection at:
>> http://10.104.59.189:8883/solr/production_things_shard11_replica1
>> 
>>  for shard11!!!
>> 
>> I also got some strange errors on the restarted node.  Makes me wonder if
>> there is a string-matching bug for shard1 vs shard11?
>> 
>> SEVERE: :org.apache.solr.common.SolrException: Error getting leader from zk
>>  at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:771)
>>  at org.apache.solr.cloud.ZkController.register(ZkController.java:683)
>>  at org.apache.solr.cloud.ZkController.register(ZkController.java:634)
>>  at
>> org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:890)
>>  at
>> org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:874)
>>  at org.apache.solr.core.CoreContainer.register(CoreContainer.java:823)
>>  at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:633)
>>  at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
>>  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>  at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>  at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>  at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>  at java.lang.Thread.run(Thread.java:722)
>> Caused by: org.apache.solr.common.SolrException: There is conflicting
>> information about the leader
>> of shard: shard1 our state says:
>> http://10.104.59.189:8883/solr/collection1/ but zookeeper says:http
>> ://10.217.55.151:8883/solr/collection1/
>>  at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:756)
>> 
>> INFO: Releasing
>> directory:/vol/ubuntu/talemetry_match_solr/solr_server/solr/production_things_shar
>> d11_replica1/data/index
>> Mar 15, 2013 5:52:34 PM org.apache.solr.common.SolrException log
>> SEVERE: org.apache.solr.common.SolrException: Error opening new searcher
>>  at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1423)
>>  at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1535)
>> 
>> SEVERE: org.apache.solr.common.SolrException: I was asked to wait on state
>> recovering for 10.76.31.
>> 67:8883_solr but I still do not see the requested state. I see state:
>> active live:true
>>  at
>> org.apache.solr.handler.admin.CoreAdminHandler.handleWaitForStateAction(CoreAdminHandler
>> .java:948)
>> 
>> 
>> 
>> 
>> On Fri, Mar 15, 2013 at 5:05 PM, Mark Miller wrote:
>> 
>>> Strange - we hardened that loop in 4.1 - so I'm not sure what happened
>>> here.
>>> 
>>> Can you do a stack dump on the overseer and see if you see an Overseer
>>> thread running perhaps? Or just post the results?
>>> 
>>> To recover, you should be able to just restart the Overseer node and have
>>> someone else take over - they should pick up processing the queue.
>>> 
>>> Any logs you might be able to share could be useful too.
>>> 
>>> - Mark
>>> 
>>> On Mar 15, 2013, at 7:51 PM, Gary Yngve  wrote:
>>> 
>>>> Also, looking at overseer_elect, everything looks fine.  node is valid
>>>> and live.
>>>> 
>>>> 
>>>> On Fri, Mar 15, 2013 at 4:47 PM, Gary Yngve  wrote:
>>>> 
>>>>> Sorry, should have specified.  4.1
>>>>> 
>>>>> 
>>>>> On Fri, Mar 15, 2013 at 4:33 PM, Mark Miller  wrote:
>>>>> 
>>>>>> What Solr version? 4.0, 4.1, 4.2?
>>>>>> 
>>>>>> - Mark
>>>>>> 
>>>>>> On Mar 15, 2013, at 7:19 PM, Gary Yngve  wrote:
>>>>>> 
>>>>>>> my solr cloud has been running fine for weeks, but about a week ago, it
>>>>>>> stopped dequeueing from the overseer queue, and now there are thousands
>>>>>>> of tasks on the queue, most of which look like

Re: overseer queue clogged

2013-03-15 Thread Mark Miller

On Mar 15, 2013, at 10:04 PM, Gary Yngve  wrote:

> i think those followers are red from trying to forward requests to the
> overseer while it was being restarted.  i guess i'll see if they become
> green over time.  or i guess i can restart them one at a time..

Restarting the cluster should clear things up. It shouldn't take too long for those 
nodes to recover though - they should have been up to date before. The couple 
exceptions you posted def indicate something is out of whack. It's something 
I'd like to get to the bottom of.
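
In case it helps: the thread dump I'm asking about can be taken with the
stock JDK tools. A minimal sketch, assuming the overseer's JVM shows up under
jps (the pid and output file name are just placeholders):

  jps -l                                         # list running JVMs, find the Solr pid
  jstack -l <solr-pid> > overseer-threads.txt    # dump all thread stacks plus lock info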

- Mark

> 
> 
> On Fri, Mar 15, 2013 at 6:53 PM, Gary Yngve  wrote:
> 
>> it doesn't appear to be a shard1 vs shard11 issue... 60% of my followers
>> are red now in the solr cloud graph.. trying to figure out what that
>> means...
>> 
>> 
>> On Fri, Mar 15, 2013 at 6:48 PM, Gary Yngve  wrote:
>> 
>>> I restarted the overseer node and another took over, queues are empty now.
>>> 
>>> the server with core production_things_shard1_2
>>> is having these errors:
>>> 
>>> shard update error RetryNode:
>>> http://10.104.59.189:8883/solr/production_things_shard11_replica1/:org.apache.solr.client.solrj.SolrServerException:
>>> Server refused connection at:
>>> http://10.104.59.189:8883/solr/production_things_shard11_replica1
>>> 
>>>  for shard11!!!
>>> 
>>> I also got some strange errors on the restarted node.  Makes me wonder if
>>> there is a string-matching bug for shard1 vs shard11?
>>> 
>>> SEVERE: :org.apache.solr.common.SolrException: Error getting leader from zk
>>>  at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:771)
>>>  at org.apache.solr.cloud.ZkController.register(ZkController.java:683)
>>>  at org.apache.solr.cloud.ZkController.register(ZkController.java:634)
>>>  at
>>> org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:890)
>>>  at
>>> org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:874)
>>>  at org.apache.solr.core.CoreContainer.register(CoreContainer.java:823)
>>>  at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:633)
>>>  at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
>>>  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>>  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>>  at
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>>  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>>  at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>  at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>  at java.lang.Thread.run(Thread.java:722)
>>> Caused by: org.apache.solr.common.SolrException: There is conflicting
>>> information about the leader
>>> of shard: shard1 our state says:
>>> http://10.104.59.189:8883/solr/collection1/ but zookeeper says:
>>> http://10.217.55.151:8883/solr/collection1/
>>>  at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:756)
>>> 
>>> INFO: Releasing
>>> directory:/vol/ubuntu/talemetry_match_solr/solr_server/solr/production_things_shard11_replica1/data/index
>>> Mar 15, 2013 5:52:34 PM org.apache.solr.common.SolrException log
>>> SEVERE: org.apache.solr.common.SolrException: Error opening new searcher
>>>  at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1423)
>>>  at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1535)
>>> 
>>> SEVERE: org.apache.solr.common.SolrException: I was asked to wait on
>>> state recovering for 10.76.31.67:8883_solr but I still do not see the
>>> requested state. I see state:
>>> active live:true
>>>  at org.apache.solr.handler.admin.CoreAdminHandler.handleWaitForStateAction(CoreAdminHandler.java:948)
>>> 
>>> 
>>> 
>>> 
>>> On Fri, Mar 15, 2013 at 5:05 PM, Mark Miller wrote:
>>> 
>>>> Strange - we hardened that loop in 4.1 - so I'm not sure what happened
>>>> here.
>>>> 
>>>> Can you do a stack dump on the overseer and see if you see an Overseer
>>>> thread running perhaps? Or just post the results?
>>>> 
>>>> To recover, you should be able to just restart the Overseer node and
>>>> have someone else take over - they should pick up processing the queue.
>>>> 
>>>> Any logs you might be able to share could be useful too.
>>>> 
>>>> - Mark
>>>> 
>>>> On Mar 15, 2013, at 7:51 PM, Gary Yngve  wrote:
>>>> 
>>>>> Also, looking at overseer_elect, everything looks fine.  node is valid
>>>>> and live.
>>>>> 
>>>>> 
>>>>> On Fri, Mar 15, 2013 at 4:47 PM, Gary Yngve  wrote:
>>>>> 
>>>>>> Sorry, should have specified.  4.1
>>>>>> 
>>>>>> 
>>>>>> On Fri, Mar 15, 2013 at 4:33 PM, Mark Miller  wrote:
>>>>>> 
>>>>>>> What Solr version? 4.0, 4.1, 4.2?
>>>>>>> 
>>>>>>> - Mark
>>>>>>> 
>>>>>>> On Mar 15, 2013, at 7:19 PM, Gary Yngve  wrote:
>>>>>>> 
>>>>>>>> my solr cloud has been running fine for weeks, but about a week ago, it
>>>>>>>> stopped dequeueing from the overseer queue, and now there are thousands
>>>>>>>> of tasks on the queue, most of which look like

Re: structure of solr index

2013-03-15 Thread Otis Gospodnetic
Hi,

I think you are asking if the original/raw content of those fields will be
read.  No, it won't, not for the search itself.  If you want to
retrieve/return those fields then, of course, they will be read for the
documents being returned.
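
To make that concrete, it's the indexed/stored flags on each field in
schema.xml that control this. A minimal sketch (field names here are just
examples, not from your schema):

  <!-- searchable via the inverted index, and also readable for returned docs -->
  <field name="title"   type="text_general" indexed="true"  stored="true"/>
  <!-- searchable, but the raw content can never be read back -->
  <field name="content" type="text_general" indexed="true"  stored="false"/>
  <!-- not searchable; the raw value is only read for docs being returned -->
  <field name="raw"     type="string"       indexed="false" stored="true"/>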

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Fri, Mar 15, 2013 at 2:41 PM,  wrote:

> Hi,
>
> I wondered if solr searches on indexed fields only or on the entire index? In
> more detail, let's say I have fields id, title and content, all indexed and
> stored. Will a search bring all these fields into memory, or only the indexed
> part of these fields?
>
> Thanks.
> Alex.
>
>
>


Re: Solr indexing binary files

2013-03-15 Thread Gora Mohanty
On 16 March 2013 00:30, Luis  wrote:
> Sorry, Gora.  It is ${fileSourcePaths.urlpath} actually.

Most likely, there is some issue with the selected urlpath
not pointing to a proper http or file source. E.g., urlpath
could be something like http://example.com/myfile.pdf .
Please check that ${fileSourcePaths.urlpath} points to a
proper resource.
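
For binary files fetched over http, the data-config wiring usually looks
something like the sketch below. This is only a guess at your entity, since
you posted the schema rather than the data-config (BinURLDataSource and
TikaEntityProcessor ship in the dataimporthandler-extras contrib; the entity
and field names here are made up):

  <dataConfig>
    <dataSource name="bin" type="BinURLDataSource"/>
    <document>
      <!-- an outer entity, e.g. a DB query, would supply fileSourcePaths.urlpath -->
      <entity name="tika" processor="TikaEntityProcessor"
              url="${fileSourcePaths.urlpath}" dataSource="bin" format="text">
        <field column="text" name="content"/>
      </entity>
    </document>
  </dataConfig>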

> *My complete schema.xml is this:*
[...]

This looks fine.

Regards,
Gora


Re: How should I configure Solr to support multi-word synonyms?

2013-03-15 Thread Felipe Lahti
Hi,

I have also been using that plugin
(https://github.com/healthonnet/hon-lucene-synonyms) in a project and it's
been working pretty well. But I think Solr should handle multi-word synonyms
natively (BTW, there is a story in JIRA for that:
https://issues.apache.org/jira/browse/SOLR-4381). One downside is that the
project doesn't have any unit tests or component tests ensuring its
functionality, so you will need to be more careful and cover it with tests
yourself.
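
For reference, the plugin is exposed as an edismax-style query parser, so
once it is registered in solrconfig.xml a request looks roughly like this.
The parser name "synonym_edismax" and the synonyms=true toggle are what I
remember from the project's README, so double-check them against the version
you download:

  http://localhost:8983/solr/collection1/select?q=phrase+one&defType=synonym_edismax&synonyms=true&qf=text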

Best,


On Mon, Mar 4, 2013 at 7:32 PM, Jan Høydahl  wrote:

> Hi,
>
> I have been using this plugin with success:
> https://github.com/healthonnet/hon-lucene-synonyms
> While it gives you multi-word synonyms, you lose the ability to have
> different synonym dictionaries per field.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 4 March 2013 at 19:40, David Sharpe <david.sha...@seekersolutions.com> wrote:
>
> > Hello Solr mailing list,
> >
> > I have read many posts and run many tests, but still I cannot get
> > multi-word synonyms behaving the way I think they should. I would
> > appreciate your advice.
> >
> > Here is an example of the behaviour I am trying to achieve:
> >
> > # Given synonyms.txt
> > wordOne, phrase one
> >
> >
> >   1. At index time, a document containing "wordOne" should expand to
> >   "wordOne | phrase one". A query for "wordOne" or "phrase one" should
> find
> >   the document, but a query for just "phrase" or "one" should not find
> the
> >   document.
> >
> >   2. Conversely, a document containing "phrase one" should expand to
> >   "phrase one | wordOne". A query for "wordOne" or "phrase one" should
> find
> >   the document. (Depending on field tokenization, I would also expect
> >   "phrase" and "one" to find the document.)
> >
> > To attempt to achieve this behaviour, I have downloaded Solr 4.1.0 and
> > made the following changes to
> > "solr-4.1.0\example\solr\collection1\conf\schema.xml":
> >
> > https://gist.github.com/sharpedavid/5072150
> >
> >
> > (Note that I set SynonymFilterFactory
> > tokenizerFactory="solr.KeywordTokenizerFactory". This is to prevent
> > "wordOne" from being expanded to "wordOne | phrase | one".)
> >
> > Achieving the first behaviour (i.e. number one in the above list) seems
> > difficult. A query for "wordOne" returns the document, but a query for
> > "phrase one" returns nothing. I realized that the query tokenizer
> tokenized
> > my query for "phrase one", so I changed the query tokenizer to
> > KeywordTokenizer, which achieves the desired behaviour, but now queries
> are
> > not tokenized at all, which breaks other desirable behaviour.
> >
> > The second behaviour (i.e. number two in the above list) has similar
> > problems, but no solution that I can see. If the index tokenizer is
> > StandardTokenizer, "phrase one" is tokenized to "phrase | one", so the
> > equivalent synonym is not matched. If I change the index tokenizer to
> > KeywordTokenizer, it does match; however, KeywordTokenizer will treat the
> > entire field as a single token, so a document containing "something
> > phrase one something" will not match the equivalent synonym, and also a
> > query for "phrase" or "one" will not find the document.
> >
> > Thank you for your time.
> >
> > Sincerely,
> > David Sharpe
>
>


-- 
Felipe Lahti
Consultant Developer - ThoughtWorks Porto Alegre


Re: overseer queue clogged

2013-03-15 Thread Gary Yngve
I will upgrade to 4.2 this weekend and see what happens.  We are on ec2 and
have had a few issues with hostnames with both zk and solr. (but in this
case i haven't rebooted any instances either)

it's a relative pain to do the upgrade because we have a query/scorer
fork of lucene along with supplemental jars, and zk cannot distribute
binary jars via the config.

we are also multi-collection per zk... i wish it didn't require a core
always defined up front for the core admin?  i would love to have an
instance have no cores and then just create the core i need..

-g



On Fri, Mar 15, 2013 at 7:14 PM, Mark Miller  wrote:

>
> On Mar 15, 2013, at 10:04 PM, Gary Yngve  wrote:
>
> > i think those followers are red from trying to forward requests to the
> > overseer while it was being restarted.  i guess i'll see if they become
> > green over time.  or i guess i can restart them one at a time..
>
> Restarting the cluster should clear things up. It shouldn't take too long for
> those nodes to recover though - they should have been up to date before.
> The couple exceptions you posted def indicate something is out of whack.
> It's something I'd like to get to the bottom of.
>
> - Mark
>
> >
> >
> > On Fri, Mar 15, 2013 at 6:53 PM, Gary Yngve  wrote:
> >
> >> it doesn't appear to be a shard1 vs shard11 issue... 60% of my followers
> >> are red now in the solr cloud graph.. trying to figure out what that
> >> means...
> >>
> >>
> >> On Fri, Mar 15, 2013 at 6:48 PM, Gary Yngve  wrote:
> >>
> >>> I restarted the overseer node and another took over, queues are empty
> >>> now.
> >>>
> >>> the server with core production_things_shard1_2
> >>> is having these errors:
> >>>
> >>> shard update error RetryNode:
> >>> http://10.104.59.189:8883/solr/production_things_shard11_replica1/:org.apache.solr.client.solrj.SolrServerException:
> >>> Server refused connection at:
> >>> http://10.104.59.189:8883/solr/production_things_shard11_replica1
> >>>
> >>>  for shard11!!!
> >>>
> >>> I also got some strange errors on the restarted node.  Makes me wonder if
> >>> there is a string-matching bug for shard1 vs shard11?
> >>>
> >>> SEVERE: :org.apache.solr.common.SolrException: Error getting leader from zk
> >>>  at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:771)
> >>>  at org.apache.solr.cloud.ZkController.register(ZkController.java:683)
> >>>  at org.apache.solr.cloud.ZkController.register(ZkController.java:634)
> >>>  at
> >>> org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:890)
> >>>  at
> >>> org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:874)
> >>>  at org.apache.solr.core.CoreContainer.register(CoreContainer.java:823)
> >>>  at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:633)
> >>>  at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
> >>>  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> >>>  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> >>>  at
> >>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>>  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> >>>  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> >>>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >>>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>>  at java.lang.Thread.run(Thread.java:722)
> >>> Caused by: org.apache.solr.common.SolrException: There is conflicting
> >>> information about the leader
> >>> of shard: shard1 our state says:
> >>> http://10.104.59.189:8883/solr/collection1/ but zookeeper says:
> >>> http://10.217.55.151:8883/solr/collection1/
> >>>  at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:756)
> >>>
> >>> INFO: Releasing
> >>> directory:/vol/ubuntu/talemetry_match_solr/solr_server/solr/production_things_shard11_replica1/data/index
> >>> Mar 15, 2013 5:52:34 PM org.apache.solr.common.SolrException log
> >>> SEVERE: org.apache.solr.common.SolrException: Error opening new searcher
> >>>  at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1423)
> >>>  at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1535)
> >>>
> >>> SEVERE: org.apache.solr.common.SolrException: I was asked to wait on
> >>> state recovering for 10.76.31.67:8883_solr but I still do not see the
> >>> requested state. I see state:
> >>> active live:true
> >>>  at org.apache.solr.handler.admin.CoreAdminHandler.handleWaitForStateAction(CoreAdminHandler.java:948)
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Mar 15, 2013 at 5:05 PM, Mark Miller  wrote:
> >>>
> >>>> Strange - we hardened that loop in 4.1 - so I'm not sure what happened
> >>>> here.
> >>>> 
> >>>> Can you do a stack dump on the overseer and see if you see an Overseer
> >>>> thread running perhaps? Or just post the results?
> >>>> 
> >>>> To recover, you should be able to just restart the Overseer node and
> >>>> have someone else take over - they should pick up processing the queue.

Re: overseer queue clogged

2013-03-15 Thread Mark Miller

On Mar 16, 2013, at 12:30 AM, Gary Yngve  wrote:

> I will upgrade to 4.2 this weekend and see what happens.  We are on ec2 and
> have had a few issues with hostnames with both zk and solr. (but in this
> case i haven't rebooted any instances either)

There is actually a new feature in 4.2 that lets you specify arbitrary node 
names so that new ips can take over for old nodes. You just have to do this up 
front...

> 
> it's a relative pain to do the upgrade because we have a query/scorer
> fork of lucene along with supplemental jars, and zk cannot distribute
> binary jars via the config.

There is a JIRA issue for this and it's on my list if no one gets it in before 
me.

> 
> we are also multi-collection per zk... i wish it didn't require a core
> always defined up front for the core admin?  i would love to have an
> instance have no cores and then just create the core i need..

You can do this - just modify your starting Solr example to have no cores in 
solr.xml. You won't be able to make use of the admin UI until you create at 
least one core, but the core and collection apis will both work fine.
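
A rough sketch of that - the stock 4.x example solr.xml with the collection1
entry removed (attribute values from memory, so treat it as a starting point
rather than gospel):

  <?xml version="1.0" encoding="UTF-8" ?>
  <solr persistent="true">
    <!-- no <core .../> children: the node starts empty -->
    <cores adminPath="/admin/cores" host="${host:}" hostPort="${jetty.port:}"
           hostContext="${hostContext:}" zkClientTimeout="${zkClientTimeout:15000}">
    </cores>
  </solr>

Then you create what you need remotely, e.g. via the collections API
(numShards/replicationFactor below are just example values):

  http://host:8883/solr/admin/collections?action=CREATE&name=mycollection&numShards=12&replicationFactor=2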

- Mark

> 
> -g
> 
> 
> 
> On Fri, Mar 15, 2013 at 7:14 PM, Mark Miller  wrote:
> 
>> 
>> On Mar 15, 2013, at 10:04 PM, Gary Yngve  wrote:
>> 
>>> i think those followers are red from trying to forward requests to the
>>> overseer while it was being restarted.  i guess i'll see if they become
>>> green over time.  or i guess i can restart them one at a time..
>> 
>> Restarting the cluster should clear things up. It shouldn't take too long for
>> those nodes to recover though - they should have been up to date before.
>> The couple exceptions you posted def indicate something is out of whack.
>> It's something I'd like to get to the bottom of.
>> 
>> - Mark
>> 
>>> 
>>> 
>>> On Fri, Mar 15, 2013 at 6:53 PM, Gary Yngve  wrote:
>>> 
>>>> it doesn't appear to be a shard1 vs shard11 issue... 60% of my followers
>>>> are red now in the solr cloud graph.. trying to figure out what that
>>>> means...
>>>> 
>>>> 
>>>> On Fri, Mar 15, 2013 at 6:48 PM, Gary Yngve  wrote:
>>>> 
> I restarted the overseer node and another took over, queues are empty
> now.
> 
> the server with core production_things_shard1_2
> is having these errors:
> 
> shard update error RetryNode:
> http://10.104.59.189:8883/solr/production_things_shard11_replica1/:org.apache.solr.client.solrj.SolrServerException:
> Server refused connection at:
> http://10.104.59.189:8883/solr/production_things_shard11_replica1
> 
> for shard11!!!
> 
> I also got some strange errors on the restarted node.  Makes me wonder if
> there is a string-matching bug for shard1 vs shard11?
> 
> SEVERE: :org.apache.solr.common.SolrException: Error getting leader from zk
> at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:771)
> at org.apache.solr.cloud.ZkController.register(ZkController.java:683)
> at org.apache.solr.cloud.ZkController.register(ZkController.java:634)
> at
> org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:890)
> at
> org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:874)
> at org.apache.solr.core.CoreContainer.register(CoreContainer.java:823)
> at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:633)
> at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:722)
> Caused by: org.apache.solr.common.SolrException: There is conflicting
> information about the leader
> of shard: shard1 our state says:
> http://10.104.59.189:8883/solr/collection1/ but zookeeper says:
> http://10.217.55.151:8883/solr/collection1/
> at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:756)
> 
> INFO: Releasing
> directory:/vol/ubuntu/talemetry_match_solr/solr_server/solr/production_things_shard11_replica1/data/index
> Mar 15, 2013 5:52:34 PM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1423)
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1535)
> 
> SEVERE: org.apache.solr.common.SolrException: I was asked to wait on
> state recovering for 10.76.31.67:8883_solr but I still do not see the
> requested state. I see state: active live:true

Re: status 400 on posting json

2013-03-15 Thread Patrice Seyed
Hi,

Re:

-
> Is there some place I should indicate what parameters are included in
> the json objects I send? I was able to test books.json without the
> error.

"Yes, in Solr's schema.xml (under the conf/ directory).  See
 for more details.

Erik Hatcher"

and:

-

"I tried it and I get the same error response! Which is because... I
don't have a field named "datasource".

You need to check the Solr schema.xml for the available fields and
then add any fields that your JSON uses that are not already there. Be
sure to shutdown and restart Solr after editing the schema.

I did notice that there is a "keywords" field, but it is not
multivalued, while your keywords are multivalued.

Or, you can use dynamic fields, such as "datasource_s" and "keywords_ss"
("s" for string and a second "s" for multivalued), etc. for your other
fields.

-- Jack Krupansky"

-
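
For concreteness, the two routes described above would look roughly like this
in schema.xml (a sketch against the stock Solr 4.1 example schema; the field
types are assumptions):

  <!-- explicit fields matching the JSON keys; for keywords, the existing
       definition would need multiValued="true" added -->
  <field name="datasource" type="string" indexed="true" stored="true"/>
  <field name="keywords" type="text_general" indexed="true" stored="true"
         multiValued="true"/>

  <!-- or rename the JSON keys to datasource_s / keywords_ss and lean on the
       dynamic fields the example schema already declares: -->
  <dynamicField name="*_s"  type="string" indexed="true" stored="true"/>
  <dynamicField name="*_ss" type="string" indexed="true" stored="true"
                multiValued="true"/>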

Thanks very much for these responses.  I'm still confused by the fact
that there is no schema.xml for the books.json fields though. Was this
compiled into the start.jar? Also, where do I place my crafted
schema.xml? I tried in ~/solr-4.1.0/example/conf where the start.jar
is in/run from ~/solr-4.1.0/example, but so far it is still complaining
about not finding the 'datasource' field. (also tried placing it in
~/solr-4.1.0/example/solr/conf) (my current schema.xml is further
below.)

Also Jack, would it be better to parse this keywords field value
differently for it to be appropriately multivalued?:

"keywords":["EARTH SCIENCE
: Oceans : Ocean Temperature : Water
Temperature","Temperature","Integrated Ocean Observing
System","IOOS","Oceanographic Sensor Data","Intertidal Temperature
Data","continental shelf","seawater","temperature","Oregon","United
States of America","PISCO"]

Thanks,
Patrice