Re: Multi-Select Faceting with delimited field values

2012-10-04 Thread Mikhail Khludnev
The only way to do that is to split your attributes, which are concatenations
of attribute and value. You should have a color attribute with values red, green,
blue; hdmi: yes/no; speaker: yes/no.
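As a rough illustration of the tag/exclude pattern once the fields are split, here
is a minimal SolrJ sketch (the color and hdmi field names and the core URL are
assumptions; HttpSolrServer is the SolrJ 4.x class, on 3.x it would be
CommonsHttpSolrServer):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class MultiSelectFacetExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");

        SolrQuery q = new SolrQuery("category:monitors");
        // Tag the filter so the facet on the same field can exclude it.
        q.addFilterQuery("{!tag=colorTag}color:black");
        q.setFacet(true);
        // Excluding the tagged filter keeps the counts for all colors,
        // which is what multi-select faceting needs.
        q.addFacetField("{!ex=colorTag}color");
        q.addFacetField("hdmi");

        QueryResponse rsp = server.query(q);
        System.out.println(rsp.getFacetField("color").getValues());
    }
}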
On 04.10.2012 at 5:19, "Aaron Bains"  wrote:

> I am trying to set up my query for multi-select faceting, here is my
> attempt at the query:
>
>
> q=category:monitors&fq=attribute:(color-black)&facet.field=attribute&facet=true
>
> The undesired response from the query:
>
> 
> 
> 
> 1019141675
> 
> 
> 
> 
> 
> 
>  1
>  1
>  1
>  1
> 
> 
> 
> 
> 
> 
>
>
>
>
> The desired response:
>
> 
> 
> 
> 1019141675
> 
> 
> 
> 
> 
> 
>  120
>  58
>  13
>  1
>  1
>  1
> 
> 
> 
> 
> 
> 
>
>
> The way I have the attribute and value delimited by a dash has me stumped
> on how to perform the tagging and excluding. If we exclude the entire
> attribute field with facet.field={!ex=dt}attribute it brings an undesired
> result. What I need to do is exclude (attribute:color)
>
> Thanks for the help!!
>


Re: Solr/Lucene courses and training

2012-10-04 Thread Jan Høydahl
Please have a look at the "Support" wiki page:

  http://wiki.apache.org/solr/Support

Search for "Training" and you'll find a number of companies providing 
Solr/Lucene training in Europe.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 3 Oct 2012, at 18:17, Oleg Ruchovets  wrote:

> Hi ,
>   I am looking for courses / training on Solr and Lucene in Europe
> (for English speakers).
> 
> Can you please send me links or organization names.
> 
> Thanks in advance
> Oleg.



Re: Get report of keywords searched.

2012-10-04 Thread Davide Lorenzo Marino
If you need to analyze the whole search queries, it is not very difficult: just
create a search plugin and put them in a DB (see the sketch after the list below).
If you need to analyze the single keywords it is more difficult, and you need
to make some decisions before starting. In particular, take the following
queries and try to answer how you would like to treat them for the keywords:

1) apple OR orange
2) apple AND orange
3) title:apple AND subject:orange
4) apple -orange
5) apple OR (orange AND banana)
6) title:apple OR subject:orange
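A minimal sketch of such a plugin, assuming a custom SearchComponent (the class
name is made up, and the System.out call is just a placeholder for whatever DB
write you choose):

import java.io.IOException;

import org.apache.solr.common.params.CommonParams;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

public class QueryLogComponent extends SearchComponent {

    @Override
    public void prepare(ResponseBuilder rb) throws IOException {
        // Grab the raw user query; persist it wherever you like (file, JDBC, ...).
        String q = rb.req.getParams().get(CommonParams.Q);
        if (q != null) {
            // Replace with an INSERT into your database in a real implementation.
            System.out.println("search query: " + q);
        }
    }

    @Override
    public void process(ResponseBuilder rb) throws IOException {
        // Nothing to do at process time for plain query logging.
    }

    // The exact set of SolrInfoMBean methods varies a bit between Solr versions.
    public String getDescription() { return "Logs incoming search queries"; }
    public String getSource() { return ""; }
    public String getSourceId() { return ""; }
    public String getVersion() { return "1.0"; }
}

The component would then be registered as a searchComponent in solrconfig.xml and
appended to the components of your request handler.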

Ciao

Davide Marino








2012/10/3 Rajani Maski 

> Hi All,
>
>I am using SolrJ. When there is a search query hit, I log the URL
> to a location, and it also gets logged into the Tomcat catalina logs.
>  Now I want to implement functionality to periodically (per week)
> analyze the Solr search logs and find out the keywords searched. Is there
> a way to do it using any of the existing functionality of Solr? If not,
> has anybody tried this implementation with any open source tools?
> Suggestions welcome. Awaiting reply.
>
>
> Thank you.
>


Re: Hierarchical Data

2012-10-04 Thread Maurizio Cucchiara
Thank you Erick for your answer. I read your post and I found it very
interesting.
Unfortunately it is not suitable for my use case:
* security is not an issue, since the DBs will be fully replicated in
the same infrastructure.
* there isn't a bazillion of data (something like 300K HTML documents)
* if I chose the client-side approach, I'd have to write it twice (the Solr
index is a merge of 2 DBs).
* I'd like Solr to pull the data, rather than pushing from a client, unless
it is absolutely impossible (that was the reason I chose Solr over Lucene).
* last but not least, ATM my real issue is to find a reusable
solution for indexing hierarchical data (unless one already exists).


Twitter :http://www.twitter.com/m_cucchiara
G+  :https://plus.google.com/107903711540963855921
Linkedin:http://www.linkedin.com/in/mauriziocucchiara
VisualizeMe: http://vizualize.me/maurizio.cucchiara?r=maurizio.cucchiara

Maurizio Cucchiara


On 3 October 2012 14:06, Erick Erickson  wrote:
> Maurizio:
>
> DIH is great for its intended purpose, but when things get complex I generally
> prefer writing something in SolrJ, it gives much finer-grained control
> over "special circumstances". Plus, you can see everything that
> happens. Here's a blog with a skeletal SolrJ program, you can just
> pull out all the local-tika stuff.
>
> http://searchhub.org/dev/2012/02/14/indexing-with-solrj/
>
> The take-away IMO is that once you've spent some time working with
> DIH without getting what you need, something like using an independent
> client (SolrJ in this example) is worth considering..
>
> Best
> Erick
>
> On Tue, Oct 2, 2012 at 12:59 PM, Maurizio Cucchiara
>  wrote:
>> Hi all,
>> I'm trying to import some hierarchical data (stored in MySQL) on Solr,
>> using DataImportHandler.
>> Unfortunately, as most of you already know, MySQL has no support for
>> recursive queries, so there is no way to retrieve hierarchical data stored
>> as an adjacency list.
>> So I considered writing a custom DIH transformer which, given a
>> specified SQL query (like select * from categories) and a value (e.g.
>> category_id):
>> * fetches all data
>> * builds a hierarchical representation of the fetched data
>> * optionally caches the hierarchical data structure
>> * then returns 2 multi-valued lists which contain the 2 full paths (as
>> String and as Number)
>>
>> Is there something out of the box?
>> Alternatively, does the above approach sound good?
>>
>> TIA
>>
>>
>> Twitter :http://www.twitter.com/m_cucchiara
>> G+  :https://plus.google.com/107903711540963855921
>> Linkedin:http://www.linkedin.com/in/mauriziocucchiara
>> VisualizeMe: http://vizualize.me/maurizio.cucchiara?r=maurizio.cucchiara
>>
>> Maurizio Cucchiara


Re: How to make SOLR manipulate the results?

2012-10-04 Thread Jan Høydahl
Hi,

I would simply split the query into two individual ones. One for the ordinary 
products, where you request 8 rows, and one for the sponsored ones where you 
request 2 rows and sort by a "random" field. "&sort=random_XXX desc"  where XXX 
is your random seed.

However, instead of using Elevate component, you can create a "keywords" field 
in your schema which enforces exact matching only, so that all "S" documents 
having a purchased keyword matching exactly (such as iPhone) will be selected 
for random sorting. See https://github.com/cominvent/exactmatch for an example 
of exactmatch field type. To use it for your "sponsored" query, be sure to 
quote the user query so you don't get any false matches: &q=keywords:"user 
exact query" (instead of &q.df=keywords&q=user exact query).

You could also look into Grouping http://wiki.apache.org/solr/FieldCollapsing 
to try to get this as one single query, perhaps through two distinct 
&group.query requests...

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 4 Oct 2012, at 08:14, Yash Ganthe  wrote:

> Hi,
> 
> For an E-commerce website, we have stored the products as SOLR
> documents with the following fields and weights:
> Title:5
> Description:4
> 
> For some products, we need to ensure that they appear in the top ten
> results even if their relevance in the above two fields does not
> qualify them for being in top 10. For example:
> P1, P2,  P10 are the legitimate products for a given search
> keyword "iPhone". I have S1 ... S100 as sponsored products that want
> to appear in the top 10. My policy is that only 2 of these 100
> sponsored products will be randomly chosen and shown in the top 10 so
> that the results will be: S5, S31, P1, P2, ... P8. In the next
> request, the sponsored products that gets slipped in may be S4, S99.
> 
> The QueryElevationComponent lets us specify the docIDs for keywords
> but does not let us randomize the results such that only 2 of the
> complete set of sponsored docIDs is sent in the results.
> 
> Any suggestions for implementing this would be appreciated.
> 
> Thanks,
> Yash



Solr search

2012-10-04 Thread Tolga

Hi,

I installed Solr and Nutch on a server, crawled with Nutch, and searched 
at http://localhost:8983/solr/, to no avail. I mean it turns up no 
results. What to do?


Regards,


How to make SOLR manipulate the results?

2012-10-04 Thread srilatha
For an E-commerce website, we have stored the products as SOLR documents with
the following fields and weights:
Title:5
Description:4

For some products, we need to ensure that they appear in the top ten results
even if their relevance in the above two fields does not qualify them for
being in top 10. For example:
P1, P2, ... P10 are the legitimate products for a given search keyword
"iPhone". I have S1 ... S100 as sponsored products that want to appear in
the top 10. My policy is that only 2 of these 100 sponsored products will be
randomly chosen and shown in the top 10 so that the results will be: S5,
S31, P1, P2, ... P8. In the next request, the sponsored products that get
slipped in may be S4, S99.

The QueryElevationComponent lets us specify the docIDs for keywords but does
not let us randomize the results such that only 2 of the complete set of
sponsored docIDs is sent in the results.

Any suggestions for implementing this would be appreciated.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-make-SOLR-manipulate-the-results-tp4011739.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr facet !tag on multiple columns

2012-10-04 Thread Erick Erickson
_how_ is it not working? You might review:
http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Thu, Oct 4, 2012 at 4:37 AM, lavesh  wrote:
> select?q=*:*&facet=true&facet.zeros=false&fq=column1:(16 31)&&fq=COLUMN2:(6
> 208)&fq={!tag=COLUMN2,column1}COLUMN2:(6)&facet.field={!ex=COLUMN2,column1}COLUMN2&start=0&rows=0
>
> this is not working
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/solr-facet-tag-on-multiple-columns-tp4011618p4011747.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr search

2012-10-04 Thread Erick Erickson
I'm at a complete loss here, you've provided no
information at all to help diagnose your issues. Please
review:
http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Thu, Oct 4, 2012 at 5:49 AM, Tolga  wrote:
> Hi,
>
> I installed Solr and Nutch on a server, crawled with Nutch, and searched at
> http://localhost:8983/solr/, to no avail. I mean it turns up no results.
> What to do?
>
> Regards,


Re: multivalued field question (FieldCache error)

2012-10-04 Thread giovanni.bricc...@banzai.it

Thank you for the support!

Unfortunately my configuration is very large, but I was able to 
reproduce the error in a new test collection (I have a multicore setup).

So by extracting the attachment you will be able to track down what happens.

This is the query that shows the error; below you can see the latest
stack trace and the qt definition.

I'm using Solr version "4.0.0-BETA 1370099 - rmuir - 2012-08-06 22:50:47"

http://src-eprice-dev:8080/solr/test/select?q=ciao&wt=xml&qt=eprice

SEVERE: org.apache.solr.common.SolrException: can not use FieldCache on 
multivalued field: store_slug
at 
org.apache.solr.schema.SchemaField.checkFieldCacheSource(SchemaField.java:174)

at org.apache.solr.schema.StrField.getValueSource(StrField.java:44)
at 
org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:376)
at 
org.apache.solr.search.FunctionQParser.parse(FunctionQParser.java:70)

at org.apache.solr.search.QParser.getQuery(QParser.java:145)
at org.apache.solr.search.ReturnFields.add(ReturnFields.java:289)
at 
org.apache.solr.search.ReturnFields.parseFieldList(ReturnFields.java:115)

at org.apache.solr.search.ReturnFields.<init>(ReturnFields.java:101)
at org.apache.solr.search.ReturnFields.<init>(ReturnFields.java:77)
at 
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:97)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:185)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)

at org.apache.solr.core.SolrCore.execute(SolrCore.java:1656)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:454)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:275)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)

at java.lang.Thread.run(Thread.java:662)


  

 dismax
 explicit
 1
 
sku^1
 
 
sku^1
 
 
sku,store_slug
 

 
 
 2
 2
 *:*

true
1
0
count
store_slug
false


  

On 03/10/12 at 19:51, Chris Hostetter wrote:

: Here is the stack trace

what exactly does your fl param look like when you get this error?  and
what exactly are the field/fieldType declarations for each of the fields
in your fl?

Because if I'm reading this correctly, Solr thinks you are trying to
include in the response the results of a function on your store_slug
field, i.e...

   fl=foo, bar, baz, somefunction(store_slug)

...it's possible there is a bug in the parsing code -- it includes some
heuristics to deal with the possibility of atypical field names that might
look like function names, but it shouldn't get confused by a field name as
simple as "store_slug", which leads me to believe something earlier in the
fl list is confusing it.

(Details really matter.  When you only give us part of the information
-- ie: "..." in your solrconfig, a one line error message instead of hte
full stack trace -- and we have to ask lots of follow up questions to get
the basic info about what/how you got an error, it really makes it hard to
help diagnose problems)


: Oct 3, 2012 3:07:38 PM org.apache.solr.common.SolrException log
: SEVERE: org.apache.solr.common.SolrException: can not use FieldCache on
: multivalued field: store_slug
: at
: org.apache.solr.schema.SchemaField.checkFieldCacheSource(SchemaField.java:174)
: at org.apache.solr.schema.StrField.getValueSource(StrField.java:44)
: at
: 
org.apache.solr.search.FunctionQParser.parseValueSource(FunctionQParser.java:376)
: at org.apache.solr.search.FunctionQParser.parse(FunctionQParser.java:70)
: at org.apache.solr.search.QParser.getQuery(QParser.java:145)
: at org.apache.solr.search.ReturnFields.add(ReturnFields.java:289)
: at
: org.apache.solr.search.ReturnFields.parseFieldList(ReturnFields.java:115)
: at org.apache.solr.search.ReturnFields.(ReturnFields.java:101)
: at org.apache.solr.s

Re: Solr search

2012-10-04 Thread Otis Gospodnetic
Hi

Search for *:* to retrieve all docs. Got anything?
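For example, http://localhost:8983/solr/select?q=*:*&rows=10 (assuming the default
/select handler and the example port) should return every indexed document if the
Nutch crawl actually reached the index.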

Otis
--
Performance Monitoring - http://sematext.com/spm
On Oct 4, 2012 5:50 AM, "Tolga"  wrote:

> Hi,
>
> I installed Solr and Nutch on a server, crawled with Nutch, and searched
> at http://localhost:8983/solr/, to no avail. I mean it turns up no
> results. What to do?
>
> Regards,
>


Re: solr merge policy

2012-10-04 Thread Otis Gospodnetic
Hi,

Look for the word merge in solrconfig.xml :)

Otis
--
Performance Monitoring - http://sematext.com/spm
On Oct 4, 2012 7:50 AM, "jame vaalet"  wrote:

> Hi,
> I would like to know the different merge policies lucene uses in different
> versions of SOLR. I have got 3.4 and 3.6 versions of solr running but how
> do i point them to use different merge policies?
> Thanks in advance !
>
> --
>
> -JAME
>


Re: solr merge policy

2012-10-04 Thread jame vaalet
That's the first thing I tried, but it had only mergeFactor and
maxMergeDocs in it. We have different merge policies like LogMergePolicy,
NoMergePolicy, TieredMergePolicy, UpgradeIndexMergePolicy.
Finally I found the default policy and values in Lucene 3.4:

   - the default policy is TieredMergePolicy
     (http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/MergePolicy.html)
   - the default constants, unless specified, are listed at
     http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/constant-values.html


On 4 October 2012 18:29, Otis Gospodnetic wrote:

> Hi,
>
> Look for the word merge in solrconfig.xml :)
>
> Otis
> --
> Performance Monitoring - http://sematext.com/spm
> On Oct 4, 2012 7:50 AM, "jame vaalet"  wrote:
>
> > Hi,
> > I would like to know the different merge policies lucene uses in
> different
> > versions of SOLR. I have got 3.4 and 3.6 versions of solr running but how
> > do i point them to use different merge policies?
> > Thanks in advance !
> >
> > --
> >
> > -JAME
> >
>



-- 

-JAME


Re: Can I rely on correct handling of interrupted status of threads?

2012-10-04 Thread Robert Krüger
On Tue, Oct 2, 2012 at 8:50 PM, Mikhail Khludnev
 wrote:
> I remember a bug in EmbeddedSolrServer at 1.4.1 when exception bypasses
> request closing that lead to searcher leak and OOM. It was fixed about two
> years ago.
>
You mean InterruptedException?


QueryElevationComponent not working in Distributed Search

2012-10-04 Thread vasokan
Hi,

I am using the following version of Solr.

Solr Specification Version: 3.6.0.2012.04.06.11.34.07
Solr Implementation Version: 3.6.0 1310449 - rmuir - 2012-04-06 11:34:07
Lucene Specification Version: 3.6.0
Lucene Implementation Version: 3.6.0 1310449 - rmuir - 2012-04-06 
11:31:16

Elevation of documents works fine when tested on a single system, but it
is not working for distributed search.  I found a relevant issue at the
link https://issues.apache.org/jira/browse/SOLR-2949 and it is currently
resolved.

I need to understand:
1.  Whether the fix for the issue I have mentioned above is present in my
version.
2.  Whether the problem of elevating in distributed search still exists.

It would be of great help if anyone can share their ideas with me.

Thank you,
Vinoth Asokan



--
View this message in context: 
http://lucene.472066.n3.nabble.com/QueryElevationComponent-not-working-in-Distributed-Search-tp4011780.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: solr merge policy

2012-10-04 Thread Tomás Fernández Löbbe
TieredMergePolicy is the default in Solr since 3.3. See
https://issues.apache.org/jira/browse/SOLR-2567 It is still the default for
4.0, so you should have the same MergePolicy in 3.4 and 3.6.



On Thu, Oct 4, 2012 at 9:14 AM, jame vaalet  wrote:

> Thats the first thing i tried, but it had only merge factor and
> maxmergedocs in it. We have different merge policies like
> LogMergePolicy<
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/LogMergePolicy.html
> >
> , NoMergePolicy<
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/NoMergePolicy.html
> >
> , TieredMergePolicy<
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/TieredMergePolicy.html
> >
> , UpgradeIndexMergePolicy<
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/UpgradeIndexMergePolicy.html
> >.
> finally i found what is the default policy and values in 3.4 lucene:
>
>- default policy is
> TieredMergePolicy<
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/TieredMergePolicy.html
> >
> (
>
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/MergePolicy.html
>)
>- default constants unless specified are
>
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/constant-values.html
>
>
> On 4 October 2012 18:29, Otis Gospodnetic  >wrote:
>
> > Hi,
> >
> > Look for the word merge in solrconfig.xml :)
> >
> > Otis
> > --
> > Performance Monitoring - http://sematext.com/spm
> > On Oct 4, 2012 7:50 AM, "jame vaalet"  wrote:
> >
> > > Hi,
> > > I would like to know the different merge policies lucene uses in
> > different
> > > versions of SOLR. I have got 3.4 and 3.6 versions of solr running but
> how
> > > do i point them to use different merge policies?
> > > Thanks in advance !
> > >
> > > --
> > >
> > > -JAME
> > >
> >
>
>
>
> --
>
> -JAME
>


problem with hl.mergeContinuous

2012-10-04 Thread Yoni Amir
I am using a configuration roughly as follows (with solr 4 beta):

   true
   true
   4
   true

The fragment/snippet size is 100 by default. I found a strange case as follows:

The word that I search for appears in a field somewhere between the 300th and
400th characters. Solr, instead of returning a snippet of 100 characters,
returns 400 characters, from the beginning of the text up to the highlighted
word and a bit further into the text. This happens even though there is no hit
in the first 300 characters.

I found out that the length of the snippet (400) is proportional to the number
of snippets (in this case, 100 times 4).
This is a problem because I want to show the user only around ~250 characters.

Is it a bug? Is it configurable?

Thanks,
Yoni


Re: Unknown format version: -11

2012-10-04 Thread Otis Gospodnetic
Hi,

I'd have to check the src to see what exactly -11 signifies but
why not paste the Solr version you see in Solr Admin, plus ls -l your
lib directory(-ies).

Also, who indexed docs to those cores?  That same Solr?  Can you
remove the core and reindex?  If so, do you still get -11?

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Thu, Oct 4, 2012 at 9:09 AM, Sushil jain  wrote:
> Greetings,
>
> I have two cores (core0 & core1) in my Solr instance and I am creating
> indexes using Solr.
> I am trying to access (read) the same indexes using EmbeddedSolrServer.
> I get the instance of core1 without any error, but in the case of core0 I am
> getting this error:
>
> Exception in thread "main" java.lang.RuntimeException:
> org.apache.lucene.index.CorruptIndexException: Unknown format version: -11
> at
> org.apache.solr.spelling.AbstractLuceneSpellChecker.init(AbstractLuceneSpellChecker.java:104)
> at
> org.apache.solr.spelling.IndexBasedSpellChecker.init(IndexBasedSpellChecker.java:56)
> at
> org.apache.solr.handler.component.SpellCheckComponent.inform(SpellCheckComponent.java:274)
> at
> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:508)
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:588)
> at TestCaseSolrJ.createSolrCore(TestCaseSolrJ.java:1166)
> at TestCaseSolrJ.main(TestCaseSolrJ.java:234)
> Caused by: org.apache.lucene.index.CorruptIndexException: Unknown format
> version: -11
> at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:247)
> at org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:72)
> at
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683)
> at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:69)
> at org.apache.lucene.index.IndexReader.open(IndexReader.java:476)
> at org.apache.lucene.index.IndexReader.open(IndexReader.java:314)
> at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:102)
> at
> org.apache.lucene.search.spell.SpellChecker.createSearcher(SpellChecker.java:542)
> at
> org.apache.lucene.search.spell.SpellChecker.swapSearcher(SpellChecker.java:519)
> at
> org.apache.lucene.search.spell.SpellChecker.setSpellIndex(SpellChecker.java:146)
> at org.apache.lucene.search.spell.SpellChecker.<init>(SpellChecker.java:110)
> at
> org.apache.solr.spelling.AbstractLuceneSpellChecker.init(AbstractLuceneSpellChecker.java:102)
> ... 6 more
>
>
> I tried a lot but couldn't figure out what's wrong with it. I am using compatible
> versions of the Lucene and Solr jar files.
> Please let me know the possible solution for same.
>
> Thanks & Regards,
> Sushil Jain


Problem with relating values in two multi value fields

2012-10-04 Thread Torben Honigbaum
Hello,

I have a problem with relating values in two multi-value fields. My documents
look like this:


  3
  
A
B
C
D
  
  
200
400
240
310
  


My problem is that I have to search for a set of documents, display only the
value for option A (for example), and use the value field as a facet field. I
need a result like this:


  3
  A
  200

 …

I think that this is a use case which isn't possible, right? So can someone 
show me an alternative way to solve this problem? The documents each have 500 
options with 500 related values.

Thank you
Torben



Question about OR operator

2012-10-04 Thread Jorge Luis Betancourt Gonzalez
Hi:

I'm having an issue with solr 3.6.1 and I'm sensing that it is a lack of
understanding on my part. I'm building a search engine, using of course solr to
store the inverted index; so far so good. When I search for a term, let's say
"java", I get 761 results; then querying the index with the term "php" gives me
3194 results. So if I do a query for java php (without any quotes) I suppose
that solr will interpret this as an OR between the two terms, correct? So the
results should be the union of the two subsets of results? So can anyone
explain why I get fewer results searching for the last query, java php, without
any quotes?

Thanks in advance!!
10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
INFORMATICAS...
CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION

http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci


Re: Problem with relating values in two multi value fields

2012-10-04 Thread Jack Krupansky
Use a field called "option_value_pairs" with values like "A 200" and then 
query with a quoted phrase "A 200".


You could use a special character like equal sign instead of space: "A=200" 
and then you don't have to quote it in the query.
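A minimal SolrJ sketch of that idea (the option_value_pairs field name is an
assumption; it would be declared as a multi-valued string or exact-match field
in the schema):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class OptionValuePairExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");

        // Index each option/value pair as one combined token.
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "3");
        doc.addField("option_value_pairs", "A=200");
        doc.addField("option_value_pairs", "B=400");
        server.add(doc);
        server.commit();

        // "A=200" only matches documents where option A really has value 200.
        SolrQuery q = new SolrQuery("option_value_pairs:A=200");
        System.out.println(server.query(q).getResults().getNumFound());
    }
}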


-- Jack Krupansky

-Original Message- 
From: Torben Honigbaum

Sent: Thursday, October 04, 2012 11:03 AM
To: solr-user@lucene.apache.org
Subject: Problem with relating values in two multi value fields

Hello,

I've a problem with relating values in two multi value fields. My documents 
look like this:



 3
 
   A
   B
   C
   D
 
 
   200
   400
   240
   310
 


My problem is that I've to search for a set of documents and display only 
the value for option A, for example, and use the value field as facet field. 
I need a result like this:



 3
 A
 200

 …

I think that this is a use case which isn't possible, right? So can someone 
show me an alternative way to solve this problem? The documents each have 
500 options with 500 related values.


Thank you
Torben



Re: Question about OR operator

2012-10-04 Thread Torben Honigbaum
Hi Jorge,

maybe you have defined

<solrQueryParser defaultOperator="AND"/>

in your schema.xml

Torben

Am 04.10.2012 um 17:06 schrieb Jorge Luis Betancourt Gonzalez:

> Hi:
> 
> I'm having an issue with solr 3.6.1 and I'm sensing that is a lack  of 
> understanding. I'm building a search engine, using of course solr to store 
> the inverted index, so far so good. When I search for a term, let's say 
> "java" I get 761 results, then querying the index with a "php" term give me 
> 3194 results found. So if a do a query for java php (without any quotas) I 
> suppose that solr will interpret this as an OR between the two terms, 
> correct? so the results should be the JOIN between the two subsets of 
> results? so can anyone  explain why I get less results searching for the last 
> query? java php without any quotes??
> 
> Thanks in advance!!
> 10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
> INFORMATICAS...
> CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION
> 
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci



Re: Question about OR operator

2012-10-04 Thread Jack Krupansky
Maybe the "mm" (minimum match) parameter is set to "100%" which is the same 
as using the "AND" operator. Set "mm" to "1" or "50%" or "25%" or whatever 
makes sense for your app.
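For example, with SolrJ the parameter can be set per request; a minimal sketch
(the qf fields and the core URL are assumptions):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class MinShouldMatchExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");

        SolrQuery q = new SolrQuery("java php");
        q.set("defType", "dismax");
        q.set("qf", "title description");  // hypothetical query fields
        q.set("mm", "1");                  // only one clause has to match, i.e. a plain OR

        System.out.println(server.query(q).getResults().getNumFound());
    }
}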


-- Jack Krupansky

-Original Message- 
From: Jorge Luis Betancourt Gonzalez

Sent: Thursday, October 04, 2012 11:06 AM
To: solr-user@lucene.apache.org
Subject: Question about OR operator

Hi:

I'm having an issue with solr 3.6.1 and I'm sensing that is a lack  of 
understanding. I'm building a search engine, using of course solr to store 
the inverted index, so far so good. When I search for a term, let's say 
"java" I get 761 results, then querying the index with a "php" term give me 
3194 results found. So if a do a query for java php (without any quotas) I 
suppose that solr will interpret this as an OR between the two terms, 
correct? so the results should be the JOIN between the two subsets of 
results? so can anyone  explain why I get less results searching for the 
last query? java php without any quotes??


Thanks in advance!!
10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
INFORMATICAS...

CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION

http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci 



Re: Question about OR operator

2012-10-04 Thread Otis Gospodnetic
Hi Jorge,

Have a look at 
http://wiki.apache.org/solr/ExtendedDisMax#mm_.28Minimum_.27Should.27_Match.29

Have a look at http://wiki.apache.org/solr/CommonQueryParameters#Debugging
for info about &debugQuery=true and friends.

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Thu, Oct 4, 2012 at 11:06 AM, Jorge Luis Betancourt Gonzalez
 wrote:
> Hi:
>
> I'm having an issue with solr 3.6.1 and I'm sensing that is a lack  of 
> understanding. I'm building a search engine, using of course solr to store 
> the inverted index, so far so good. When I search for a term, let's say 
> "java" I get 761 results, then querying the index with a "php" term give me 
> 3194 results found. So if a do a query for java php (without any quotas) I 
> suppose that solr will interpret this as an OR between the two terms, 
> correct? so the results should be the JOIN between the two subsets of 
> results? so can anyone  explain why I get less results searching for the last 
> query? java php without any quotes??
>
> Thanks in advance!!
> 10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
> INFORMATICAS...
> CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION
>
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci


Re: Problem with relating values in two multi value fields

2012-10-04 Thread Torben Honigbaum
Hi Jack, 

thank you for your answer. The problem is that I don't know the value for
option A, that the values are numbers, and that I have to use the values as a
facet. So I need something like this:

Docs: 


  3
  
A
B
...
  
  
200
400 
...
  


  4
  
A
E
...
  
  
300
400 
...
  


  6
  
A
C
...
  
  
200
400 
...
  


Query: …?q=options:A

Facet: 200 (2), 300 (1)

Thank you
Torben

Am 04.10.2012 um 17:10 schrieb Jack Krupansky:

> Use a field called "option_value_pairs" with values like "A 200" and then 
> query with a quoted phrase "A 200".
> 
> You could use a special character like equal sign instead of space: "A=200" 
> and then you don't have to quote it in the query.
> 
> -- Jack Krupansky
> 
> -Original Message- From: Torben Honigbaum
> Sent: Thursday, October 04, 2012 11:03 AM
> To: solr-user@lucene.apache.org
> Subject: Problem with relating values in two multi value fields
> 
> Hello,
> 
> I've a problem with relating values in two multi value fields. My documents 
> look like this:
> 
> 
> 3
> 
>   A
>   B
>   C
>   D
> 
> 
>   200
>   400
>   240
>   310
> 
> 
> 
> My problem is that I've to search for a set of documents and display only the 
> value for option A, for example, and use the value field as facet field. I 
> need a result like this:
> 
> 
> 3
> A
> 200
> 
>  …
> 
> I think that this is a use case which isn't possible, right? So can someone 
> show me an alternative way to solve this problem? The documents each have 500 
> options with 500 related values.
> 
> Thank you
> Torben
> 



Re: Can I rely on correct handling of interrupted status of threads?

2012-10-04 Thread Mikhail Khludnev
it was another exception class.

On Thu, Oct 4, 2012 at 5:19 PM, Robert Krüger  wrote:

> On Tue, Oct 2, 2012 at 8:50 PM, Mikhail Khludnev
>  wrote:
> > I remember a bug in EmbeddedSolrServer at 1.4.1 when exception bypasses
> > request closing that lead to searcher leak and OOM. It was fixed about
> two
> > years ago.
> >
> You mean InterruptedException?
>



-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics


 


Re: Getting list of operators and terms for a query

2012-10-04 Thread Mikhail Khludnev
You've got the ResponseBuilder as the process() or prepare() argument; check
its "query" field. Note that your component should be registered after
QueryComponent in your requestHandler config.

On Thu, Oct 4, 2012 at 6:03 PM, Davide Lorenzo Marino <
davide.mar...@gmail.com> wrote:

> Hi All,
> i'm working in a new searchComponent that analyze the search queries.
> I need to know if given a query string is possible to get the list of
> operators and terms (better in polish notation)?
> I mean if the default field is "country" and the query is the String
>
> "england OR (name:paul AND city:rome)"
>
> to get the List
>
> [ Operator OR, Term country:england, OPERATOR AND, Term name:paul, Term
> city:rome ]
>
> Thanks in advance
>
> Davide Marino
>



-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics


 


Re: Problem with relating values in two multi value fields

2012-10-04 Thread Mikhail Khludnev
It's a typical nested-document problem. There are several approaches. The
out-of-the-box solution, as far as you need facets, is
http://wiki.apache.org/solr/FieldCollapsing .
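One common shape of that approach is to index one document per option/value pair
and group the pair documents back by product; a rough SolrJ sketch, where
product_id, option and value are hypothetical field names:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class PairDocumentFacetExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");

        // Each indexed document holds one (product_id, option, value) triple.
        SolrQuery q = new SolrQuery("option:A");
        q.setFacet(true);
        q.addFacetField("value");           // facet over the values of option A only
        q.set("group", true);               // collapse the pair documents back to products
        q.set("group.field", "product_id");

        QueryResponse rsp = server.query(q);
        System.out.println(rsp.getFacetField("value").getValues());
    }
}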

On Thu, Oct 4, 2012 at 7:19 PM, Torben Honigbaum <
torben.honigb...@neuland-bfi.de> wrote:

> Hi Jack,
>
> thank you for your answer. The problem is, that I don't know the value for
> option A and that the values are numbers and I've to use the values as
> facet. So I need something like this:
>
> Docs:
>
> 
>   3
>   
> A
> B
> ...
>   
>   
> 200
> 400
> ...
>   
> 
> 
>   4
>   
> A
> E
> ...
>   
>   
> 300
> 400
> ...
>   
> 
> 
>   6
>   
> A
> C
> ...
>   
>   
> 200
> 400
> ...
>   
> 
>
> Query: …?q=options:A
>
> Facet: 200 (2), 300 (1)
>
> Thank you
> Torben
>
> Am 04.10.2012 um 17:10 schrieb Jack Krupansky:
>
> > Use a field called "option_value_pairs" with values like "A 200" and
> then query with a quoted phrase "A 200".
> >
> > You could use a special character like equal sign instead of space:
> "A=200" and then you don't have to quote it in the query.
> >
> > -- Jack Krupansky
> >
> > -Original Message- From: Torben Honigbaum
> > Sent: Thursday, October 04, 2012 11:03 AM
> > To: solr-user@lucene.apache.org
> > Subject: Problem with relating values in two multi value fields
> >
> > Hello,
> >
> > I've a problem with relating values in two multi value fields. My
> documents look like this:
> >
> > 
> > 3
> > 
> >   A
> >   B
> >   C
> >   D
> > 
> > 
> >   200
> >   400
> >   240
> >   310
> > 
> > 
> >
> > My problem is that I've to search for a set of documents and display
> only the value for option A, for example, and use the value field as facet
> field. I need a result like this:
> >
> > 
> > 3
> > A
> > 200
> > 
> >  …
> >
> > I think that this is a use case which isn't possible, right? So can
> someone show me an alternative way to solve this problem? The documents
> each have 500 options with 500 related values.
> >
> > Thank you
> > Torben
> >
>
>


-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics


 


Re: Getting list of operators and terms for a query

2012-10-04 Thread Davide Lorenzo Marino
It's OK.. I did that and I took the query string.
The problem is converting the java.lang.String (the query) into a list of terms
and operators, and doing it using the same parser used by Solr to execute the
queries.

2012/10/4 Mikhail Khludnev 

> you've got ResponseBuilder as process() or prepare() argument, check
> "query" field, but your component should be registered after QueryComponent
> in your requestHandler config.
>
> On Thu, Oct 4, 2012 at 6:03 PM, Davide Lorenzo Marino <
> davide.mar...@gmail.com> wrote:
>
> > Hi All,
> > i'm working in a new searchComponent that analyze the search queries.
> > I need to know if given a query string is possible to get the list of
> > operators and terms (better in polish notation)?
> > I mean if the default field is "country" and the query is the String
> >
> > "england OR (name:paul AND city:rome)"
> >
> > to get the List
> >
> > [ Operator OR, Term country:england, OPERATOR AND, Term name:paul, Term
> > city:rome ]
> >
> > Thanks in advance
> >
> > Davide Marino
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Tech Lead
> Grid Dynamics
>
> 
>  
>


Re: Getting list of operators and terms for a query

2012-10-04 Thread Jack Krupansky
I'm not quite following what the issue is here. I mean, the Solr 
QueryComponent generates a Lucene Query structure and you need to write code 
to recursively traverse that Lucene Query structure and generate your 
preferred form of output. There would be no need to look at the original 
query string. So, what exactly are you asking?


Maybe you simply need to read up on Lucene Query and its subclasses to 
understand what that structure looks like.


-- Jack Krupansky

-Original Message- 
From: Davide Lorenzo Marino

Sent: Thursday, October 04, 2012 11:36 AM
To: solr-user@lucene.apache.org
Subject: Re: Getting list of operators and terms for a query

It's ok.. I did it and I took the query string.
The problem is convert the java.lang.string (query) in a list of term and
operators and doing it using the same parser used by Solr to execute the
queries.

2012/10/4 Mikhail Khludnev 


you've got ResponseBuilder as process() or prepare() argument, check
"query" field, but your component should be registered after 
QueryComponent

in your requestHandler config.

On Thu, Oct 4, 2012 at 6:03 PM, Davide Lorenzo Marino <
davide.mar...@gmail.com> wrote:

> Hi All,
> i'm working in a new searchComponent that analyze the search queries.
> I need to know if given a query string is possible to get the list of
> operators and terms (better in polish notation)?
> I mean if the default field is "country" and the query is the String
>
> "england OR (name:paul AND city:rome)"
>
> to get the List
>
> [ Operator OR, Term country:england, OPERATOR AND, Term name:paul, Term
> city:rome ]
>
> Thanks in advance
>
> Davide Marino
>



--
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics


 





SolrCloud AutoSharding?

2012-10-04 Thread Jason Huang
Hello,

I am exploring SolrCloud and have a few questions about SolrCloud's
auto-sharding functionality. I couldn't find any good answer from my
online search - if anyone knows the answer to these questions or can
point me to the right document, that would be great!

(1) Does SolrCloud offer auto-sharding functionality? If we
continuously feed documents to a single index, eventually the shard
will grow to a huge size and the query will be slow. How does
SolrCloud handle this situation?

(2) If SolrCloud auto-splits a big shard to two small shards, then
shard 1 will have part of the index and shard 2 will have some other
part of index. Is this correct? If so, when we perform a query, do we
need to go through both shards in order to get a good response? Will
this be slow (because we need to go through two shards, or more shards
later if we need to split the shards again when the size is too big)?

thanks!

Jason


Solr replication hangs on multiple slave nodes

2012-10-04 Thread Justin Babuscio
After a large index rebuild (16-masters with ~15GB each), some slaves fail
to completely replicate.

We are running Solr v3.5 with 16 masters and 2 slaves each for a total of
48 servers.

4 of the 32 slaves sit in a stalled replication state with similar messages:

Files Downloaded:  254/260
Downloaded: 12.09 GB / 12.09 GB [ 100% ]
Downloading File: _t6.fdt, Downloaded: 3.1 MB / 3.1 MB [ 100 % ]
Time Elapsed: 3215s, Estimated Time Remaining: 0s, Speed: 24.5 MB/s


As you'll notice, all download sizes appear to be complete but the files
downloaded are not.  This also prevents the servers from polling for a new
update from the masters.  When searching, we are occasionally seeing 500
responses from the slaves that fail to replicate.  The errors are

ArrayIndexOutOfBounds - this occurs when writing the HTTP Response (our
container is WebSphere)
NullPointerException - org.apache.lucene.queryParser.QueryParser.parse
(QueryParser.java:203)

We have tried to stop the slave, delete the /data directory, and restart.
 This started downloading the index but stalled as expected.

Thanks,
Justin


SOLR 4 BETA facet.pivot and cloud

2012-10-04 Thread Nick Cotton
Please pardon this if it is an FAQ, but after searching the archives I
cannot get a clear answer.

Does the new facet.pivot work with SolrCloud?  When I run Solr 4 BETA
with ZooKeeper, even if I specify shards=1, pivoting does not seem to
work.  The quickest way to demo this is with the velocity browse page
on the example data.  The pivot facet for "cat,inStock" only appears
if I run without ZooKeeper.

If this is known, can you please let me know whether this is a defect in
the beta that is expected to work in GA, or whether it will remain a
limitation for some time.

regards,

Nick Koton


Re: Getting list of operators and terms for a query

2012-10-04 Thread Davide Lorenzo Marino
I don't really need to start from the query String.
What I need is to obtain a list of terms and operators.
So the real problem is:

How can I access the Lucene Query structure to traverse it?

Davide Marino


2012/10/4 Jack Krupansky 

> I'm not quite following what the issue is here. I mean, the Solr
> QueryComponent generates a Lucene Query structure and you need to write
> code to recursively traverse that Lucene Query structure and generate your
> preferred form of output. There would be no need to look at the original
> query string. So, what exactly are you asking?
>
> Maybe you simply need to read up on Lucene Query and its subclasses to
> understand what that structure looks like.
>
> -- Jack Krupansky
>
> -Original Message- From: Davide Lorenzo Marino
> Sent: Thursday, October 04, 2012 11:36 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Getting list of operators and terms for a query
>
>
> It's ok.. I did it and I took the query string.
> The problem is convert the java.lang.string (query) in a list of term and
> operators and doing it using the same parser used by Solr to execute the
> queries.
>
> 2012/10/4 Mikhail Khludnev 
>
>  you've got ResponseBuilder as process() or prepare() argument, check
>> "query" field, but your component should be registered after
>> QueryComponent
>> in your requestHandler config.
>>
>> On Thu, Oct 4, 2012 at 6:03 PM, Davide Lorenzo Marino <
>> davide.mar...@gmail.com> wrote:
>>
>> > Hi All,
>> > i'm working in a new searchComponent that analyze the search queries.
>> > I need to know if given a query string is possible to get the list of
>> > operators and terms (better in polish notation)?
>> > I mean if the default field is "country" and the query is the String
>> >
>> > "england OR (name:paul AND city:rome)"
>> >
>> > to get the List
>> >
>> > [ Operator OR, Term country:england, OPERATOR AND, Term name:paul, Term
>> > city:rome ]
>> >
>> > Thanks in advance
>> >
>> > Davide Marino
>> >
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> Tech Lead
>> Grid Dynamics
>>
>> 
>>  
>>
>>
>


Solr 4.0 and Maven SNAPSHOT artifacts

2012-10-04 Thread Amit Nithian
Is there a maven repository location that contains the nightly build
Maven artifacts of Solr? Are SNAPSHOT releases being generated by
Jenkins or anything so that when I re-resolve the dependencies I'd get
the latest snapshot jars?

Thanks
Amit


Re: Getting list of operators and terms for a query

2012-10-04 Thread Amit Nithian
I think you'd want to start by looking at rb.getQuery() in
prepare() (or process() if you are trying to do post-results analysis).
This returns a Query object that contains everything in the query, and
I'd then look at the Javadoc to see how to traverse it. I'm sure some
runtime type-casting may be necessary to get at the sub-structures.
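A rough sketch of such a traversal, handling only BooleanQuery and TermQuery
(other Query subclasses such as PhraseQuery would need their own branches):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class QueryWalker {

    // Prints a prefix-style listing of operators and terms for a parsed query.
    public static void walk(Query query, String indent) {
        if (query instanceof BooleanQuery) {
            for (BooleanClause clause : ((BooleanQuery) query).getClauses()) {
                // Occur.MUST roughly maps to AND, Occur.SHOULD to OR, Occur.MUST_NOT to NOT.
                System.out.println(indent + "Operator " + clause.getOccur());
                walk(clause.getQuery(), indent + "  ");
            }
        } else if (query instanceof TermQuery) {
            Term t = ((TermQuery) query).getTerm();
            System.out.println(indent + "Term " + t.field() + ":" + t.text());
        } else {
            // PhraseQuery, wildcard, range, etc. would be handled here.
            System.out.println(indent + query.getClass().getSimpleName() + ": " + query);
        }
    }
}

From a SearchComponent it could be called as walk(rb.getQuery(), "") in process().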

On Thu, Oct 4, 2012 at 9:23 AM, Davide Lorenzo Marino
 wrote:
> I don't need really start from the query String.
> What I need is obtain a list of terms and operators.
> So the real problem is:
>
> How can I access the Lucene Query structure to traverse it?
>
> Davide Marino
>
>
> 2012/10/4 Jack Krupansky 
>
>> I'm not quite following what the issue is here. I mean, the Solr
>> QueryComponent generates a Lucene Query structure and you need to write
>> code to recursively traverse that Lucene Query structure and generate your
>> preferred form of output. There would be no need to look at the original
>> query string. So, what exactly are you asking?
>>
>> Maybe you simply need to read up on Lucene Query and its subclasses to
>> understand what that structure looks like.
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Davide Lorenzo Marino
>> Sent: Thursday, October 04, 2012 11:36 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Getting list of operators and terms for a query
>>
>>
>> It's ok.. I did it and I took the query string.
>> The problem is convert the java.lang.string (query) in a list of term and
>> operators and doing it using the same parser used by Solr to execute the
>> queries.
>>
>> 2012/10/4 Mikhail Khludnev 
>>
>>  you've got ResponseBuilder as process() or prepare() argument, check
>>> "query" field, but your component should be registered after
>>> QueryComponent
>>> in your requestHandler config.
>>>
>>> On Thu, Oct 4, 2012 at 6:03 PM, Davide Lorenzo Marino <
>>> davide.mar...@gmail.com> wrote:
>>>
>>> > Hi All,
>>> > i'm working in a new searchComponent that analyze the search queries.
>>> > I need to know if given a query string is possible to get the list of
>>> > operators and terms (better in polish notation)?
>>> > I mean if the default field is "country" and the query is the String
>>> >
>>> > "england OR (name:paul AND city:rome)"
>>> >
>>> > to get the List
>>> >
>>> > [ Operator OR, Term country:england, OPERATOR AND, Term name:paul, Term
>>> > city:rome ]
>>> >
>>> > Thanks in advance
>>> >
>>> > Davide Marino
>>> >
>>>
>>>
>>>
>>> --
>>> Sincerely yours
>>> Mikhail Khludnev
>>> Tech Lead
>>> Grid Dynamics
>>>
>>> 
>>>  
>>>
>>>
>>


RE: Solr 4.0 and Maven SNAPSHOT artifacts

2012-10-04 Thread Steven A Rowe
http://wiki.apache.org/solr/NightlyBuilds

-Original Message-
From: Amit Nithian [mailto:anith...@gmail.com] 
Sent: Thursday, October 04, 2012 1:22 PM
To: solr-user@lucene.apache.org
Subject: Solr 4.0 and Maven SNAPSHOT artifacts

Is there a maven repository location that contains the nightly build
Maven artifacts of Solr? Are SNAPSHOT releases being generated by
Jenkins or anything so that when I re-resolve the dependencies I'd get
the latest snapshot jars?

Thanks
Amit


Re: Solr 4.0 and Maven SNAPSHOT artifacts

2012-10-04 Thread Maurizio Cucchiara
I could be wrong in the case of Solr, but usually snapshots live on
repository.a.o (and at [1], AFAICU, you will find what you're looking
for).

In order to use it you should add the following repository:


  
<repositories>
  <repository>
    <id>apache.snapshots</id>
    <name>ASF Maven 2 Snapshot</name>
    <url>https://repository.apache.org/content/groups/snapshots-group/</url>
  </repository>
</repositories>


I'm relatively new to Solr, so I'd suggest waiting for confirmation from
someone else.

[1] 
https://repository.apache.org/content/groups/snapshots-group/org/apache/solr/


Twitter :http://www.twitter.com/m_cucchiara
G+  :https://plus.google.com/107903711540963855921
Linkedin:http://www.linkedin.com/in/mauriziocucchiara
VisualizeMe: http://vizualize.me/maurizio.cucchiara?r=maurizio.cucchiara

Maurizio Cucchiara


On 4 October 2012 19:21, Amit Nithian  wrote:
> Solr


Re: Question about OR operator

2012-10-04 Thread Jorge Luis Betancourt Gonzalez
Hi:

Thanks for all the replies. Right now I have this in my mm parameter:

2<-1 5<-2 6<90%

I'm trying to get a straight OR between all the terms in my query; should I
set the mm parameter to 1? Because that gave an error.

Greetings!

On Oct 4, 2012, at 11:06 AM, Jorge Luis Betancourt Gonzalez wrote:

> Hi:
> 
> I'm having an issue with solr 3.6.1 and I'm sensing that is a lack  of 
> understanding. I'm building a search engine, using of course solr to store 
> the inverted index, so far so good. When I search for a term, let's say 
> "java" I get 761 results, then querying the index with a "php" term give me 
> 3194 results found. So if a do a query for java php (without any quotas) I 
> suppose that solr will interpret this as an OR between the two terms, 
> correct? so the results should be the JOIN between the two subsets of 
> results? so can anyone  explain why I get less results searching for the last 
> query? java php without any quotes??
> 
> Thanks in advance!!
> 10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
> INFORMATICAS...
> CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION
> 
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci
> 


10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
INFORMATICAS...
CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION

http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci


Re: SolrCloud AutoSharding?

2012-10-04 Thread Tomás Fernández Löbbe
SolrCloud doesn't auto-shard at this point. It doesn't split indexes either
(there is an open issue for this:
https://issues.apache.org/jira/browse/SOLR-3755 )

At this point you need to specify the number of shards for a collection in
advance, with the numShards parameter. When you have more than one shard
for a collection, SolrCloud automatically distributes the query to one
replica of each shard and joins the results for you.

Most reliable documentation about SolrCloud can be found here:
http://wiki.apache.org/solr/SolrCloud
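For example, querying a multi-shard collection through SolrJ looks just like
querying a single core; a minimal sketch (the ZooKeeper address and the
collection name are assumptions):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class CloudQueryExample {
    public static void main(String[] args) throws Exception {
        // Connects via ZooKeeper; Solr routes the query to one replica of each shard.
        CloudSolrServer server = new CloudSolrServer("localhost:9983");
        server.setDefaultCollection("collection1");

        QueryResponse rsp = server.query(new SolrQuery("*:*"));
        System.out.println("hits: " + rsp.getResults().getNumFound());

        server.shutdown();
    }
}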

Tomás

On Thu, Oct 4, 2012 at 12:02 PM, Jason Huang  wrote:

> Hello,
>
> I am exploring SolrCloud and have a few questions about SolrCloud's
> auto-sharding functionality. I couldn't find any good answer from my
> online search - if anyone knows the answer to these questions or can
> point me to the right document, that would be great!
>
> (1) Does SolrCloud offer auto-sharding functionality? If we
> continuously feed documents to a single index, eventually the shard
> will grow to a huge size and the query will be slow. How does
> SolrCloud handle this situation?
>
> (2) If SolrCloud auto-splits a big shard to two small shards, then
> shard 1 will have part of the index and shard 2 will have some other
> part of index. Is this correct? If so, when we perform a query, do we
> need to go through both shards in order to get a good response? Will
> this be slow (because we need to go through two shards, or more shards
> later if we need to split the shards again when the size is too big)?
>
> thanks!
>
> Jason
>


Re: Getting list of operators and terms for a query

2012-10-04 Thread Davide Lorenzo Marino
From what I saw in the documentation of the class
org.apache.lucene.search.Query,
I can just iterate over the terms using the method extractTerms. How can I
extract the operators?

2012/10/4 Amit Nithian 

> I think you'd want to start by looking at the rb.getQuery() in the
> prepare (or process if you are trying to do post-results analysis).
> This returns a Query object that would contain everything in that and
> I'd then look at the Javadoc to see how to traverse it. I'm sure some
> runtime type-casting may be necessary to get at the sub-structures
>
> On Thu, Oct 4, 2012 at 9:23 AM, Davide Lorenzo Marino
>  wrote:
> > I don't need really start from the query String.
> > What I need is obtain a list of terms and operators.
> > So the real problem is:
> >
> > How can I access the Lucene Query structure to traverse it?
> >
> > Davide Marino
> >
> >
> > 2012/10/4 Jack Krupansky 
> >
> >> I'm not quite following what the issue is here. I mean, the Solr
> >> QueryComponent generates a Lucene Query structure and you need to write
> >> code to recursively traverse that Lucene Query structure and generate
> your
> >> preferred form of output. There would be no need to look at the original
> >> query string. So, what exactly are you asking?
> >>
> >> Maybe you simply need to read up on Lucene Query and its subclasses to
> >> understand what that structure looks like.
> >>
> >> -- Jack Krupansky
> >>
> >> -Original Message- From: Davide Lorenzo Marino
> >> Sent: Thursday, October 04, 2012 11:36 AM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Getting list of operators and terms for a query
> >>
> >>
> >> It's ok.. I did it and I took the query string.
> >> The problem is convert the java.lang.string (query) in a list of term
> and
> >> operators and doing it using the same parser used by Solr to execute the
> >> queries.
> >>
> >> 2012/10/4 Mikhail Khludnev 
> >>
> >>  you've got ResponseBuilder as process() or prepare() argument, check
> >>> "query" field, but your component should be registered after
> >>> QueryComponent
> >>> in your requestHandler config.
> >>>
> >>> On Thu, Oct 4, 2012 at 6:03 PM, Davide Lorenzo Marino <
> >>> davide.mar...@gmail.com> wrote:
> >>>
> >>> > Hi All,
> >>> > i'm working in a new searchComponent that analyze the search queries.
> >>> > I need to know if given a query string is possible to get the list of
> >>> > operators and terms (better in polish notation)?
> >>> > I mean if the default field is "country" and the query is the String
> >>> >
> >>> > "england OR (name:paul AND city:rome)"
> >>> >
> >>> > to get the List
> >>> >
> >>> > [ Operator OR, Term country:england, OPERATOR AND, Term name:paul,
> Term
> >>> > city:rome ]
> >>> >
> >>> > Thanks in advance
> >>> >
> >>> > Davide Marino
> >>> >
> >>>
> >>>
> >>>
> >>> --
> >>> Sincerely yours
> >>> Mikhail Khludnev
> >>> Tech Lead
> >>> Grid Dynamics
> >>>
> >>> 
> >>>  
> >>>
> >>>
> >>
>


Re: Question about OR operator

2012-10-04 Thread Otis Gospodnetic
What's the error Jorge?

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Thu, Oct 4, 2012 at 1:36 PM, Jorge Luis Betancourt Gonzalez
 wrote:
> Hi:
>
> Thanks for all the replies, right now I have this in my mm parameter:
>
> 
> 2<-1 5<-2 6<90%
> 
>
> I'm trying to get an straight OR between all the terms in my query, should I 
> set the mm parameter to 1? because this gave an error.
>
> Greetings!
>
> On Oct 4, 2012, at 11:06 AM, Jorge Luis Betancourt Gonzalez wrote:
>
>> Hi:
>>
>> I'm having an issue with solr 3.6.1 and I'm sensing that is a lack  of 
>> understanding. I'm building a search engine, using of course solr to store 
>> the inverted index, so far so good. When I search for a term, let's say 
>> "java" I get 761 results, then querying the index with a "php" term give me 
>> 3194 results found. So if a do a query for java php (without any quotas) I 
>> suppose that solr will interpret this as an OR between the two terms, 
>> correct? so the results should be the JOIN between the two subsets of 
>> results? so can anyone  explain why I get less results searching for the 
>> last query? java php without any quotes??
>>
>> Thanks in advance!!
>> 10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
>> INFORMATICAS...
>> CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION
>>
>> http://www.uci.cu
>> http://www.facebook.com/universidad.uci
>> http://www.flickr.com/photos/universidad_uci
>
>


Re: Question about OR operator

2012-10-04 Thread Jorge Luis Betancourt Gonzalez
This is the error:

GRAVE: java.lang.NumberFormatException: For input string: "
100
"
at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:470)
at java.lang.Integer.<init>(Integer.java:636)
at 
org.apache.solr.util.SolrPluginUtils.calculateMinShouldMatch(SolrPluginUtils.java:691)
at 
org.apache.solr.util.SolrPluginUtils.setMinShouldMatch(SolrPluginUtils.java:656)
at 
org.apache.solr.search.DisMaxQParser.getUserQuery(DisMaxQParser.java:210)
at 
org.apache.solr.search.DisMaxQParser.addMainQuery(DisMaxQParser.java:166)
at org.apache.solr.search.DisMaxQParser.parse(DisMaxQParser.java:77)
at org.apache.solr.search.QParser.getQuery(QParser.java:143)
at 
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:165)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
at 
org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
at 
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
at 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:579)
at 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:307)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)

This is the parameter in my solrconfig.xml:

<str name="mm">
0
</str>

On Oct 4, 2012, at 1:46 PM, Otis Gospodnetic wrote:

> What's the error Jorge?
> 
> Otis
> --
> Search Analytics - http://sematext.com/search-analytics/index.html
> Performance Monitoring - http://sematext.com/spm/index.html
> 
> 
> On Thu, Oct 4, 2012 at 1:36 PM, Jorge Luis Betancourt Gonzalez
>  wrote:
>> Hi:
>> 
>> Thanks for all the replies, right now I have this in my mm parameter:
>> 
>>
>>2<-1 5<-2 6<90%
>>
>> 
>> I'm trying to get an straight OR between all the terms in my query, should I 
>> set the mm parameter to 1? because this gave an error.
>> 
>> Greetings!
>> 
>> On Oct 4, 2012, at 11:06 AM, Jorge Luis Betancourt Gonzalez wrote:
>> 
>>> Hi:
>>> 
>>> I'm having an issue with solr 3.6.1 and I'm sensing that is a lack  of 
>>> understanding. I'm building a search engine, using of course solr to store 
>>> the inverted index, so far so good. When I search for a term, let's say 
>>> "java" I get 761 results, then querying the index with a "php" term give me 
>>> 3194 results found. So if a do a query for java php (without any quotas) I 
>>> suppose that solr will interpret this as an OR between the two terms, 
>>> correct? so the results should be the JOIN between the two subsets of 
>>> results? so can anyone  explain why I get less results searching for the 
>>> last query? java php without any quotes??
>>> 
>>> Thanks in advance!!
>>> 10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
>>> INFORMATICAS...
>>> CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION
>>> 
>>> http://www.uci.cu
>>> http://www.facebook.com/universidad.uci
>>> http://www.flickr.com/photos/universidad_uci
>>> 
>> 
>> 

Re: Solr replication hangs on multiple slave nodes

2012-10-04 Thread Otis Gospodnetic
Hi,

I haven't seen this error before.

Some questions/suggestions...
Have you tried with 3.6.1?
Is the disk full?
Have you tried watching the network with
http://code.google.com/p/tcpmon/ or tcpdump?

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Thu, Oct 4, 2012 at 12:06 PM, Justin Babuscio
 wrote:
> After a large index rebuild (16-masters with ~15GB each), some slaves fail
> to completely replicate.
>
> We are running Solr v3.5 with 16 masters and 2 slaves each for a total of
> 48 servers.
>
> 4 of the 32 slaves sit in a stalled replication state with similar messages:
>
> Files Downloaded:  254/260
> Downloaded: 12.09 GB / 12.09 GB [ 100% ]
> Downloading File: _t6.fdt, Downloaded: 3.1 MB / 3.1 MB [ 100 % ]
> Time Elapsed: 3215s, EStimated Time REmaining: 0s, Speed: 24.5 MB/s
>
>
> As you'll notice, all download sizes appear to be complete but the files
> downloaded are not.  This also prevents the servers from polling for a new
> update from the masters.  When searching, we are occasionally seeing 500
> responses from the slaves that fail to replicate.  The errors are
>
> ArrayIndexOutOfBounds - this occurs when writing the HTTP Response (our
> container is WebSphere)
> NullPointerExceptions - org.apache.lucene.queryParser.QueryParser.parse
> (QueryParser.java:203 )
>
> We have tried to stop the slave, delete the /data directory, and restart.
>  This started downloading the index but stalled as expected.
>
> Thanks,
> Justin


Re: Question about OR operator

2012-10-04 Thread Otis Gospodnetic
Try 0% instead of just 0.

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Thu, Oct 4, 2012 at 1:50 PM, Jorge Luis Betancourt Gonzalez
 wrote:
> This is the error:
>
> GRAVE: java.lang.NumberFormatException: For input string: "
> 100
> "
> at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Integer.parseInt(Integer.java:470)
> at java.lang.Integer.(Integer.java:636)
> at 
> org.apache.solr.util.SolrPluginUtils.calculateMinShouldMatch(SolrPluginUtils.java:691)
> at 
> org.apache.solr.util.SolrPluginUtils.setMinShouldMatch(SolrPluginUtils.java:656)
> at 
> org.apache.solr.search.DisMaxQParser.getUserQuery(DisMaxQParser.java:210)
> at 
> org.apache.solr.search.DisMaxQParser.addMainQuery(DisMaxQParser.java:166)
> at org.apache.solr.search.DisMaxQParser.parse(DisMaxQParser.java:77)
> at org.apache.solr.search.QParser.getQuery(QParser.java:143)
> at 
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:105)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:165)
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
> at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
> at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
> at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
> at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
> at 
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
> at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
> at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
> at 
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)
> at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
> at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
> at 
> org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
> at 
> org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:579)
> at 
> org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:307)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:679)
>
> This is the parameter in my solrconfig.xml
>
> 
> 0
> 
>
> On Oct 4, 2012, at 1:46 PM, Otis Gospodnetic wrote:
>
>> What's the error Jorge?
>>
>> Otis
>> --
>> Search Analytics - http://sematext.com/search-analytics/index.html
>> Performance Monitoring - http://sematext.com/spm/index.html
>>
>>
>> On Thu, Oct 4, 2012 at 1:36 PM, Jorge Luis Betancourt Gonzalez
>>  wrote:
>>> Hi:
>>>
>>> Thanks for all the replies, right now I have this in my mm parameter:
>>>
>>>
>>>2<-1 5<-2 6<90%
>>>
>>>
>>> I'm trying to get an straight OR between all the terms in my query, should 
>>> I set the mm parameter to 1? because this gave an error.
>>>
>>> Greetings!
>>>
>>> On Oct 4, 2012, at 11:06 AM, Jorge Luis Betancourt Gonzalez wrote:
>>>
 Hi:

 I'm having an issue with solr 3.6.1 and I'm sensing that is a lack  of 
 understanding. I'm building a search engine, using of course solr to store 
 the inverted index, so far so good. When I search for a term, let's say 
 "java" I get 761 results, then querying the index with a "php" term give 
 me 3194 results found. So if a do a query for java php (without any 
 quotas) I suppose that solr will interpret this as an OR between the two 
 terms, correct? so the results should be the JOIN between the two subsets 
 of results? so can anyone  explain why I get less results searching for 
 the last query? java php without any quotes??

 Thanks in advance!!
 10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
 INFORMATICAS...
 CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION

 http://www.uci.cu
 http://www.facebook.com/universidad.uci
 http://www.flickr.com/photos/universidad_uci


Identify exact search in edismax

2012-10-04 Thread rhl4tr
I am using edismax for guessing the category from a user query.

If the user says "I want to buy BMW and Audi car", this query will be fed to
edismax, which will give me results based on phrase matching.

Field contains following values
-BMW => Cars category
-Audi => Cars 
-2 BHK => Real Estate
-need job => jobs category
-Buy 1Bhk - Apartments

I get results with phrase matches on top.

Generally the top result will be a phrase match (if there are any). How can I
know that all of the field's terms have matched the user query?

e.g.
mm => percentage of user query terms that should match field terms

I want the opposite => percentage of field terms that should match the user query,
which in my case is 100% => a phrase match

 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Identify-exact-search-in-edismax-tp4011859.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Question about OR operator

2012-10-04 Thread Jorge Luis Betancourt Gonzalez
Thanks for the quick response. I got the same result; what I'm trying to 
accomplish is a straight OR between all the clauses or terms in my query, so 
the value I should use is 0, right?


10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
INFORMATICAS...
CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION

http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci


Re: QueryElevationComponent not working in Distributed Search

2012-10-04 Thread Erick Erickson
no, it's not. The fix version is 4.0-ALPHA

So you can test this pretty easily for yourself by getting 4.0
alpha/beta or RC1 and giving it a whirl...


Best
Erick

On Thu, Oct 4, 2012 at 10:10 AM, vasokan  wrote:
> Hi,
>
> I am using the following version of Solr.
>
> Solr Specification Version: 3.6.0.2012.04.06.11.34.07
> Solr Implementation Version: 3.6.0 1310449 - rmuir - 2012-04-06
> 11:34:07
> Lucene Specification Version: 3.6.0
> Lucene Implementation Version: 3.6.0 1310449 - rmuir - 2012-04-06
> 11:31:16
>
> Elevation of documents is working fine when tested for single system.  But
> is not working for distributed systems.  I found a relevant issue as in the
> link https://issues.apache.org/jira/browse/SOLR-2949 and it is currently
> resolved.
>
> I am in need to understand
> 1.  Whether the fix for the issue I have mentioned above is present in my
> version.
> 2.  Is the problem of elevating in distributed search still exists.
>
> It will be of great help if anyone can share me your ideas with me.
>
> Thank you,
> Vinoth Asokan
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/QueryElevationComponent-not-working-in-Distributed-Search-tp4011785.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Identify exact search in edismax

2012-10-04 Thread Mikhail Khludnev
The overall task is not clear to me, but if you want to know that "all of the
field's terms have matched the user query", I'd suggest introducing your own Similarity:
 - write the number of terms as a norm value (which is by default a byte per
doc per field), then
 - you'll be able to retrieve this number at search time and use it for
evaluating your own "mm" criteria.
WDYT?
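
A minimal sketch of that idea, assuming the Lucene 3.x Similarity API (the
class name is made up for illustration; note the single norm byte makes the
encoding lossy, so this only works for smallish term counts):

import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.DefaultSimilarity;

// Illustrative only: store the field's term count in the norm instead of
// the usual length normalization factor.
public class TermCountSimilarity extends DefaultSimilarity {
  @Override
  public float computeNorm(String field, FieldInvertState state) {
    return state.getLength(); // number of indexed terms in this field
  }
}

At search time the value can be read back (e.g. via IndexReader.norms(field)
plus decodeNormValue()) and compared with how many query terms actually matched.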

On Thu, Oct 4, 2012 at 9:28 PM, rhl4tr  wrote:

> I am using edismax for guessing category from user query.
>
> If user says "I want to buy BMW and Audi car". This query will be fed to
> edismax which will give me results based on phrase match.
>
> Field contains following values
> -BMW => Cars category
> -Audi => Cars
> -2 BHK => Real Estate
> -need job => jobs category
> -Buy 1Bhk - Apartments
>
> I get results with phrase matches on top.
>
> Generally top result will be a phrase match (if there are any). How can I
> know that field's all terms have matched to user query.
>
> e.g.
> mm => percentage of user query terms should match with field terms
>
> I want opposite => percentage of field values should match with user query.
> which is in my case 100% => phrase match
>
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Identify-exact-search-in-edismax-tp4011859.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics


 


Extract multiple streams into the same document

2012-10-04 Thread Yury Kats
I'm sending streams of data to Solr, using ExtractingRequestHandler to be 
parsed/extracted by Tika and then indexed.

While multiple streams can be passed with a single request to Solr, each stream 
ends up being indexed into a separate document.
Or, if I pass the unique id parameter with the request (as "literal.id" 
parameter), the very last stream ends up overwriting all
other streams within the same request, since each one is being indexed into a 
new document with the same id.

I'm looking for a way to have multiple streams indexed into the same document. 
I have a content field defined for extraction
(using fmap.content parameter) and the field is defined as multiValued in the 
schema. I would like all streams from the request to be
indexed as different values of that multiValued content field in the same 
document.

Any hints or ideas are appreciated.

Thanks,
Yury


Re: SolrCloud AutoSharding?

2012-10-04 Thread Jason Huang
Tomás,

Thanks for the response.

So basically at this point what I could do is to make a "best guess"
of my estimated index size and specify a few shards to start with. I
am guessing if I assigned too many shards, then the "join" between
different shards may be the bottleneck? On the other side, if I assign
only one or two shards, then each shard may become too big and the I/O
within each shard will be the bottleneck?

Then after a while of deployment, if we find out where the bottleneck
is, do we have a way to adjust the number of shards without breaking
the indexing and without requiring any downtime in the production system?
Say I have 4 shards and each of them is 100GB. I found that the I/O is
the bottleneck and I want to use 8 shards instead - is there a good
way to redistribute the whole index from 4 existing shards to 8 shards
without breaking anything (and without a downtime)?

thanks!

Jason



On Thu, Oct 4, 2012 at 1:36 PM, Tomás Fernández Löbbe
 wrote:
> SolrCloud doesn't auto-shard at this point. It doesn't split indexes either
> (there is an open issue for this:
> https://issues.apache.org/jira/browse/SOLR-3755 )
>
> At this point you need to specify the number of shards for a collection in
> advance, with the numShards parameter. When you have more than one shard
> for a collection, SolrCloud automatically distributes the query to one
> replica of each shard and join the results for you.
>
> Most reliable documentation about SolrCloud can be found here:
> http://wiki.apache.org/solr/SolrCloud
>
> Tomás
>
> On Thu, Oct 4, 2012 at 12:02 PM, Jason Huang  wrote:
>
>> Hello,
>>
>> I am exploring SolrCloud and have a few questions about SolrCloud's
>> auto-sharding functionality. I couldn't find any good answer from my
>> online search - if anyone knows the answer to these questions or can
>> point me to the right document, that would be great!
>>
>> (1) Does SolrCloud offer auto-sharding functionality? If we
>> continuously feed documents to a single index, eventually the shard
>> will grow to a huge size and the query will be slow. How does
>> SolrCloud handle this situation?
>>
>> (2) If SolrCloud auto-splits a big shard to two small shards, then
>> shard 1 will have part of the index and shard 2 will have some other
>> part of index. Is this correct? If so, when we perform a query, do we
>> need to go through both shards in order to get a good response? Will
>> this be slow (because we need to go through two shards, or more shards
>> later if we need to split the shards again when the size is too big)?
>>
>> thanks!
>>
>> Jason
>>


Re: SolrCloud AutoSharding?

2012-10-04 Thread Otis Gospodnetic
Hi,

You could start with one node, with # shards
== # CPU cores.
Then, all while running a stress/performance test, observe the latency
and other metrics you care about.
Keep increasing the number of shards and keep observing.

SPM for Solr (see signature) will help with the observing part.
JMeter or SolrMeter (hi Tomás ;)) will help with stress testing part.

You cannot change the number of shards on the fly, reindexing is needed.
The above also doesn't take into account index/shard size, but that is a
dimension to experiment with, too.

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Thu, Oct 4, 2012 at 2:43 PM, Jason Huang  wrote:
> Tomás,
>
> Thanks for the response.
>
> So basically at this point what I could do is to make a "best guess"
> of my estimated index size and specify a few shards to start with. I
> am guessing if I assigned too many shards, then the "join" between
> different shards may be the bottleneck? On the other side, if I assign
> only one or two shards, then each shard may become too big and the I/O
> within each shard will be the bottleneck?
>
> Then after a while of deployment, if we find out where the bottleneck
> is, do we have a way to adjust the number of shards without breaking
> the indexing and without require any downtime in production system?
> Say I have 4 shards and each of them is 100GB. I found that the I/O is
> the bottleneck and I want to use 8 shards instead - is there a good
> way to redistribute the whole index from 4 existing shards to 8 shards
> without breaking anything (and without a downtime)?
>
> thanks!
>
> Jason
>
>
>
> On Thu, Oct 4, 2012 at 1:36 PM, Tomás Fernández Löbbe
>  wrote:
>> SolrCloud doesn't auto-shard at this point. It doesn't split indexes either
>> (there is an open issue for this:
>> https://issues.apache.org/jira/browse/SOLR-3755 )
>>
>> At this point you need to specify the number of shards for a collection in
>> advance, with the numShards parameter. When you have more than one shard
>> for a collection, SolrCloud automatically distributes the query to one
>> replica of each shard and join the results for you.
>>
>> Most reliable documentation about SolrCloud can be found here:
>> http://wiki.apache.org/solr/SolrCloud
>>
>> Tomás
>>
>> On Thu, Oct 4, 2012 at 12:02 PM, Jason Huang  wrote:
>>
>>> Hello,
>>>
>>> I am exploring SolrCloud and have a few questions about SolrCloud's
>>> auto-sharding functionality. I couldn't find any good answer from my
>>> online search - if anyone knows the answer to these questions or can
>>> point me to the right document, that would be great!
>>>
>>> (1) Does SolrCloud offer auto-sharding functionality? If we
>>> continuously feed documents to a single index, eventually the shard
>>> will grow to a huge size and the query will be slow. How does
>>> SolrCloud handle this situation?
>>>
>>> (2) If SolrCloud auto-splits a big shard to two small shards, then
>>> shard 1 will have part of the index and shard 2 will have some other
>>> part of index. Is this correct? If so, when we perform a query, do we
>>> need to go through both shards in order to get a good response? Will
>>> this be slow (because we need to go through two shards, or more shards
>>> later if we need to split the shards again when the size is too big)?
>>>
>>> thanks!
>>>
>>> Jason
>>>


Re: SolrCloud AutoSharding?

2012-10-04 Thread Jason Huang
Thanks Otis.

This starts to make more sense to me. I will go through the links in
your signature and dig into it.

Still learning but this is a good direction.

thanks!

Jason

On Thu, Oct 4, 2012 at 2:55 PM, Otis Gospodnetic
 wrote:
> Hi,
>
> You could start with one node on which you could start with # shards
> == # CPU cores.
> Then, all while running a stress/performance test, observe the latency
> and other metrics you care about.
> Keep increasing the number of shards and keep observing.
>
> SPM for Solr (see signature) will help with the observing part.
> JMeter or SolrMeter (hi Tomás ;)) will help with stress testing part.
>
> You cannot change the number of shards on the fly, reindexing is needed.
> The above also doesn't take into account index/shard size, but that is
> dimension to experiment with, too.
>
> Otis
> --
> Search Analytics - http://sematext.com/search-analytics/index.html
> Performance Monitoring - http://sematext.com/spm/index.html
>
>
> On Thu, Oct 4, 2012 at 2:43 PM, Jason Huang  wrote:
>> Tomás,
>>
>> Thanks for the response.
>>
>> So basically at this point what I could do is to make a "best guess"
>> of my estimated index size and specify a few shards to start with. I
>> am guessing if I assigned too many shards, then the "join" between
>> different shards may be the bottleneck? On the other side, if I assign
>> only one or two shards, then each shard may become too big and the I/O
>> within each shard will be the bottleneck?
>>
>> Then after a while of deployment, if we find out where the bottleneck
>> is, do we have a way to adjust the number of shards without breaking
>> the indexing and without require any downtime in production system?
>> Say I have 4 shards and each of them is 100GB. I found that the I/O is
>> the bottleneck and I want to use 8 shards instead - is there a good
>> way to redistribute the whole index from 4 existing shards to 8 shards
>> without breaking anything (and without a downtime)?
>>
>> thanks!
>>
>> Jason
>>
>>
>>
>> On Thu, Oct 4, 2012 at 1:36 PM, Tomás Fernández Löbbe
>>  wrote:
>>> SolrCloud doesn't auto-shard at this point. It doesn't split indexes either
>>> (there is an open issue for this:
>>> https://issues.apache.org/jira/browse/SOLR-3755 )
>>>
>>> At this point you need to specify the number of shards for a collection in
>>> advance, with the numShards parameter. When you have more than one shard
>>> for a collection, SolrCloud automatically distributes the query to one
>>> replica of each shard and join the results for you.
>>>
>>> Most reliable documentation about SolrCloud can be found here:
>>> http://wiki.apache.org/solr/SolrCloud
>>>
>>> Tomás
>>>
>>> On Thu, Oct 4, 2012 at 12:02 PM, Jason Huang  wrote:
>>>
 Hello,

 I am exploring SolrCloud and have a few questions about SolrCloud's
 auto-sharding functionality. I couldn't find any good answer from my
 online search - if anyone knows the answer to these questions or can
 point me to the right document, that would be great!

 (1) Does SolrCloud offer auto-sharding functionality? If we
 continuously feed documents to a single index, eventually the shard
 will grow to a huge size and the query will be slow. How does
 SolrCloud handle this situation?

 (2) If SolrCloud auto-splits a big shard to two small shards, then
 shard 1 will have part of the index and shard 2 will have some other
 part of index. Is this correct? If so, when we perform a query, do we
 need to go through both shards in order to get a good response? Will
 this be slow (because we need to go through two shards, or more shards
 later if we need to split the shards again when the size is too big)?

 thanks!

 Jason



Re: solr merge policy

2012-10-04 Thread jame vaalet
thanks Tomás.

On 4 October 2012 19:56, Tomás Fernández Löbbe wrote:

> TieredMergePolicy is the default in Solr since 3.3. See
> https://issues.apache.org/jira/browse/SOLR-2567 It is still the default
> for
> 4.0, so you should have the same MergePolicy in 3.4 and 3.6.
>
>
>
> On Thu, Oct 4, 2012 at 9:14 AM, jame vaalet  wrote:
>
> > Thats the first thing i tried, but it had only merge factor and
> > maxmergedocs in it. We have different merge policies like
> > LogMergePolicy<
> >
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/LogMergePolicy.html
> > >
> > , NoMergePolicy<
> >
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/NoMergePolicy.html
> > >
> > , TieredMergePolicy<
> >
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/TieredMergePolicy.html
> > >
> > , UpgradeIndexMergePolicy<
> >
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/UpgradeIndexMergePolicy.html
> > >.
> > finally i found what is the default policy and values in 3.4 lucene:
> >
> >- default policy is
> > TieredMergePolicy<
> >
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/TieredMergePolicy.html
> > >
> > (
> >
> >
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/org/apache/lucene/index/MergePolicy.html
> >)
> >- default constants unless specified are
> >
> >
> http://lucene.apache.org/core/old_versioned_docs/versions/3_4_0/api/core/constant-values.html
> >
> >
> > On 4 October 2012 18:29, Otis Gospodnetic  > >wrote:
> >
> > > Hi,
> > >
> > > Look for the word merge in solrconfig.xml :)
> > >
> > > Otis
> > > --
> > > Performance Monitoring - http://sematext.com/spm
> > > On Oct 4, 2012 7:50 AM, "jame vaalet"  wrote:
> > >
> > > > Hi,
> > > > I would like to know the different merge policies lucene uses in
> > > different
> > > > versions of SOLR. I have got 3.4 and 3.6 versions of solr running but
> > how
> > > > do i point them to use different merge policies?
> > > > Thanks in advance !
> > > >
> > > > --
> > > >
> > > > -JAME
> > > >
> > >
> >
> >
> >
> > --
> >
> > -JAME
> >
>



-- 

-JAME
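
For reference, the merge policy Tomás mentions is chosen in solrconfig.xml. A
rough sketch of what that looks like (treat the exact placement as an
assumption to verify against your version's example config; in 3.x it sits in
the indexDefaults/mainIndex section):

<!-- sketch only: pick the merge policy implementation by class name -->
<mergePolicy class="org.apache.lucene.index.TieredMergePolicy"/>
<!-- expert knobs such as maxMergeAtOnce or segmentsPerTier can be set
     through nested named elements if needed -->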


Re: Getting list of operators and terms for a query

2012-10-04 Thread Amit Nithian
I'm not 100% sure but my guess is that you can get the list of boolean
clauses and their "occur" (must, should, must not) and that would be
your and, or, not equivalents.
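
For example, a minimal sketch of that kind of traversal (assuming the parsed
query is a BooleanQuery at the top level; phrase, wildcard, range and other
Query subclasses would need their own branches):

import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// Illustrative only: recursively walk a parsed Query, printing each clause's
// occur (MUST ~ AND, SHOULD ~ OR, MUST_NOT ~ NOT) and each term as field:text.
public class QueryDumper {
  public static void dump(Query q, String indent) {
    if (q instanceof BooleanQuery) {
      for (BooleanClause clause : ((BooleanQuery) q).clauses()) {
        System.out.println(indent + clause.getOccur());
        dump(clause.getQuery(), indent + "  ");
      }
    } else if (q instanceof TermQuery) {
      System.out.println(indent + ((TermQuery) q).getTerm());
    } else {
      System.out.println(indent + q); // fall back to the query's toString()
    }
  }
}

Calling something like dump(rb.getQuery(), "") from the component's process()
method would print the nested structure.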



On Thu, Oct 4, 2012 at 10:39 AM, Davide Lorenzo Marino
 wrote:
> For what I saw in the documentation from the class
> org.apache.lucene.search.Query
> I can just iterate over the terms using the method extractTerms. How can I
> extract the operators?
>
> 2012/10/4 Amit Nithian 
>
>> I think you'd want to start by looking at the rb.getQuery() in the
>> prepare (or process if you are trying to do post-results analysis).
>> This returns a Query object that would contain everything in that and
>> I'd then look at the Javadoc to see how to traverse it. I'm sure some
>> runtime type-casting may be necessary to get at the sub-structures
>>
>> On Thu, Oct 4, 2012 at 9:23 AM, Davide Lorenzo Marino
>>  wrote:
>> > I don't need really start from the query String.
>> > What I need is obtain a list of terms and operators.
>> > So the real problem is:
>> >
>> > How can I access the Lucene Query structure to traverse it?
>> >
>> > Davide Marino
>> >
>> >
>> > 2012/10/4 Jack Krupansky 
>> >
>> >> I'm not quite following what the issue is here. I mean, the Solr
>> >> QueryComponent generates a Lucene Query structure and you need to write
>> >> code to recursively traverse that Lucene Query structure and generate
>> your
>> >> preferred form of output. There would be no need to look at the original
>> >> query string. So, what exactly are you asking?
>> >>
>> >> Maybe you simply need to read up on Lucene Query and its subclasses to
>> >> understand what that structure looks like.
>> >>
>> >> -- Jack Krupansky
>> >>
>> >> -Original Message- From: Davide Lorenzo Marino
>> >> Sent: Thursday, October 04, 2012 11:36 AM
>> >> To: solr-user@lucene.apache.org
>> >> Subject: Re: Getting list of operators and terms for a query
>> >>
>> >>
>> >> It's ok.. I did it and I took the query string.
>> >> The problem is convert the java.lang.string (query) in a list of term
>> and
>> >> operators and doing it using the same parser used by Solr to execute the
>> >> queries.
>> >>
>> >> 2012/10/4 Mikhail Khludnev 
>> >>
>> >>  you've got ResponseBuilder as process() or prepare() argument, check
>> >>> "query" field, but your component should be registered after
>> >>> QueryComponent
>> >>> in your requestHandler config.
>> >>>
>> >>> On Thu, Oct 4, 2012 at 6:03 PM, Davide Lorenzo Marino <
>> >>> davide.mar...@gmail.com> wrote:
>> >>>
>> >>> > Hi All,
>> >>> > i'm working in a new searchComponent that analyze the search queries.
>> >>> > I need to know if given a query string is possible to get the list of
>> >>> > operators and terms (better in polish notation)?
>> >>> > I mean if the default field is "country" and the query is the String
>> >>> >
>> >>> > "england OR (name:paul AND city:rome)"
>> >>> >
>> >>> > to get the List
>> >>> >
>> >>> > [ Operator OR, Term country:england, OPERATOR AND, Term name:paul,
>> Term
>> >>> > city:rome ]
>> >>> >
>> >>> > Thanks in advance
>> >>> >
>> >>> > Davide Marino
>> >>> >
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Sincerely yours
>> >>> Mikhail Khludnev
>> >>> Tech Lead
>> >>> Grid Dynamics
>> >>>
>> >>> 
>> >>>  
>> >>>
>> >>>
>> >>
>>


Re: How to make SOLR manipulate the results?

2012-10-04 Thread SUJIT PAL
Hi Srilatha,

One way to do this would be to make two calls: one to your sponsored list, 
where you pick two at random, and a Solr call where you get all the search 
results; then stick them together in your client.
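
A rough client-side sketch of that merge (names are made up for illustration):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: put two randomly chosen sponsored ids ahead of the
// organic Solr results, skipping duplicates.
public class SponsoredMixer {
  public static List<String> mix(List<String> sponsoredIds, List<String> solrIds) {
    List<String> shuffled = new ArrayList<String>(sponsoredIds);
    Collections.shuffle(shuffled); // different random pick on every request
    List<String> merged = new ArrayList<String>(
        shuffled.subList(0, Math.min(2, shuffled.size())));
    for (String id : solrIds) {
      if (!merged.contains(id)) {
        merged.add(id);
      }
    }
    return merged;
  }
}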

Sujit

On Oct 4, 2012, at 12:39 AM, srilatha wrote:

> For an E-commerce website, we have stored the products as SOLR documents with
> the following fields and weights:
> Title:5
> Description:4
> 
> For some products, we need to ensure that they appear in the top ten results
> even if their relevance in the above two fields does not qualify them for
> being in top 10. For example:
> P1, P2,  P10 are the legitimate products for a given search keyword
> "iPhone". I have S1 ... S100 as sponsored products that want to appear in
> the top 10. My policy is that only 2 of these 100 sponsored products will be
> randomly chosen and shown in the top 10 so that the results will be: S5,
> S31, P1, P2, ... P8. In the next request, the sponsored products that gets
> slipped in may be S4, S99.
> 
> The QueryElevationComponent lets us specify the docIDs for keywords but does
> not let us randomize the results such that only 2 of the complete set of
> sponsored docIDs is sent in the results.
> 
> Any suggestions for implementing this would be appreciated.
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/How-to-make-SOLR-manipulate-the-results-tp4011739.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: segment number during optimize of index

2012-10-04 Thread Otis Gospodnetic
You can certainly optimize down to just 1 segment.

Note that this is the most expensive option and that when you do that
you may actually hurt performance for a bit because Solr/Lucene may
need to re-read a bunch of data from the index for sorting and
faceting purposes.  You will also invalidate the previously cached
index data in the OS cache.

Finally, if this index is being modified, it will be de-optimized
again.  Note that Lucene periodically merges segments under the hood
as documents are added to the index anyway.
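
If you do go ahead with it, the target segment count can also be passed
explicitly. A sketch with SolrJ (assuming the three-argument
optimize(waitFlush, waitSearcher, maxSegments) overload is available in your
SolrJ version, and a made-up URL):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class OptimizeIndex {
  public static void main(String[] args) throws Exception {
    SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
    server.optimize(true, true, 1); // waitFlush, waitSearcher, maxSegments
  }
}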

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Thu, Oct 4, 2012 at 4:20 PM, jame vaalet  wrote:
> Hi,
> I was about to do optimize on my index which has got around 100 segments
> right now, but am confused about the segment size that has to be chosen.
> would it have any trouble merging all the index into one single segment ?
> thanks in advance.
>
> --
>
> -JAME


Re: segment number during optimize of index

2012-10-04 Thread jame vaalet
So imagine I have merged the 150 GB index into a single segment; this would
make a single segment of 150 GB in memory. When new docs are indexed it
wouldn't alter this 150 GB index unless I update or delete the older docs,
right? Will a 150 GB single segment have problems with memory swapping at the
OS level?

On 5 October 2012 02:28, Otis Gospodnetic wrote:

> You can certainly optimize down to just 1 segment.
>
> Note that this is the most expensive option and that when you do that
> you may actually hurt performance for a bit because Solr/Lucene may
> need to re-read a bunch of data from the index for sorting and
> faceting purposes.  You will also invalidate the previously cached
> index data in the OS cache.
>
> Finally, if this index is being modified, it will be de-optimized
> again.  Note that Lucene periodically merges segments under the hood
> as documents are added to the index anyway.
>
> Otis
> --
> Search Analytics - http://sematext.com/search-analytics/index.html
> Performance Monitoring - http://sematext.com/spm/index.html
>
>
> On Thu, Oct 4, 2012 at 4:20 PM, jame vaalet  wrote:
> > Hi,
> > I was about to do optimize on my index which has got around 100 segments
> > right now, but am confused about the segment size that has to be chosen.
> > would it have any trouble merging all the index into one single segment ?
> > thanks in advance.
> >
> > --
> >
> > -JAME
>



-- 

-JAME


Re: SOLR 4 BETA facet.pivot and cloud

2012-10-04 Thread Chris Hostetter

At the moment, it's easy to know what features will be in 4.0 final by 
trying out one of the release candidates (look for the VOTE 
threads on the dev@lucene list)

As far as distributed pivot faceting support: this has not been committed 
-- there is an open jira with a patch, however skimming the recent 
comments there seems to be some question as to whether it's out of date 
with the current code.

The best way to ensure this makes it into 4.1 would be to try out the 
patch, and report back in Jira your success/failures applying/using it.

https://issues.apache.org/jira/browse/SOLR-2894

: Please pardon this if is and FAQ, but after searching the archives I
: cannot get a clear answer.
: 
: Does the new facet.pivot work with SOLRCloud?  When I run SOLR 4 BETA
: with zookeeper, even if I specify shards=1, pivoting does not seem to
: work.  The quickest way to demo this is with the velocity browse page
: on the example data.  The pivot facet for "cat,inStock" only appears
: if I run without zookeeper.
: 
: If this is known, can you please let me know if this is a defect in
: beta that is expected to be working in GA or whether it will remain a
: limition for some time.
: 
: regards,
: 
: Nick Koton
: 

-Hoss


Re: Question about OR operator

2012-10-04 Thread Chris Hostetter

: GRAVE: java.lang.NumberFormatException: For input string: "
:   100
:   "
:   at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
:   at java.lang.Integer.parseInt(Integer.java:470)
:   at java.lang.Integer.(Integer.java:636)
:   at 
org.apache.solr.util.SolrPluginUtils.calculateMinShouldMatch(SolrPluginUtils.java:691)

What version of Solr are you using?

That looks like a simple parsing bug that seems to have been fixed a while 
back (it's definitely not in the 4.0 branch)

can you try eliminating the whitespace from your XML configured value...

 <str name="mm">100</str>

...that should work around the problem.


-Hoss


StandardTokenizer generation from JFlex grammar

2012-10-04 Thread vempap
Hello,

  I'm trying to generate the standard tokenizer again using the jflex
specification (StandardTokenizerImpl.jflex) but I'm not able to do so due to
some errors (I would like to create my own jflex file based on the standard
tokenizer, which is why I'm trying to first generate that one to get the hang
of things).

I'm using jflex 1.4.3 and I ran into the following error:

Error in file "" (line 64): 
Syntax error.
HangulEx   = (!(!\p{Script:Hangul}|!\p{WB:ALetter})) ({Format} |
{Extend})*


Also, I tried installing an eclipse plugin from
http://cup-lex-eclipse.sourceforge.net/ which I thought would provide
options similar to JavaCC (http://eclipse-javacc.sourceforge.net/) through
which we can generate classes within Eclipse - but had no luck.

Any help would be very helpful.

Regards,
Phani.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/StandardTokenizer-generation-from-JFlex-grammar-tp4011941.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: segment number during optimize of index

2012-10-04 Thread Shawn Heisey

On 10/4/2012 3:22 PM, jame vaalet wrote:

so imagine i have merged the 150 Gb index into single segment, this would
make a single segment of 150 GB in memory. When new docs are indexed it
wouldn't alter this 150 Gb index unless i update or delete the older docs,
right? will 150 Gb single segment have problem with memory swapping at OS
level?


The number of segments involved doesn't matter much at all. The way Solr 
will use memory is the same either way.  With only one segment, the size 
of the index on disk (and the amount of memory required to track those 
segments) will be slightly less than with many 
segments, and searches will be slightly faster with one segment once 
everything is warmed up.  Of course, it takes a lot of I/O and CPU 
cycles to get the index optimized, which can have a strong negative 
effect on searching.


Deleting or indexing docs will never alter the existing segments on your 
disk.  Once a segment is finalized, it never gets changed.  Deleted 
documents are just marked deleted by a separate file and still exist in 
the index, and new documents end up in new segments, until a merge or an 
optimize happens on those segments.


Your index is never actually loaded into application RAM. Later versions 
of Solr (default starting in 3.1 on Windows and 3.3 on Linux), use an OS 
feature called memory mapping (MMapDirectory) which efficiently turns 
the data on disk into a large section of virtual memory.  The 
application makes requests to this memory section and the OS 
automatically turns it into a disk read.  This is not real memory, and 
it's not swap.  Real memory is used by the operating system (not Solr) 
to *cache* this memory map, speeding up access.  If you have more than 
150GB of memory, your entire index can fit into the disk cache, 
otherwise it will determine which parts of the index get used the most 
and try to cache those.  It is a good idea to have a large amount of 
memory that is not allocated to applications.


Your OS will only begin swapping if the actual amount of *real* memory 
used by your applications (including Solr, by virtue of the -Xmx 
parameter to Java) begins to exceed the amount of physical memory available.


Thanks,
Shawn



Re: segment number during optimize of index

2012-10-04 Thread Shawn Heisey

On 10/4/2012 3:22 PM, jame vaalet wrote:

so imagine i have merged the 150 Gb index into single segment, this would
make a single segment of 150 GB in memory. When new docs are indexed it
wouldn't alter this 150 Gb index unless i update or delete the older docs,
right? will 150 Gb single segment have problem with memory swapping at OS
level?


Supplement to my previous reply:  the real memory mentioned in the last 
paragraph does not include the memory that the OS uses to cache disk 
access.  If more memory is needed and all the free memory is being used 
by the disk cache, the OS will throw away part of the disk cache (a 
near-instantaneous operation that should never involve disk I/O) and 
give that memory to the application that requests it.


Here's a very good breakdown of how memory gets used with MMapDirectory 
in Solr.  It's applicable to any program that uses memory mapping, not 
just Solr:


http://java.dzone.com/articles/use-lucene%E2%80%99s-mmapdirectory

Thanks,
Shawn



Re: Question about OR operator

2012-10-04 Thread Jorge Luis Betancourt Gonzalez
Hi Chris:

I'm using solr 3.6.1, is the bug present in this version?

Greetings!

On Oct 4, 2012, at 6:11 PM, Chris Hostetter wrote:

> 
> : GRAVE: java.lang.NumberFormatException: For input string: "
> : 100
> : "
> : at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> : at java.lang.Integer.parseInt(Integer.java:470)
> : at java.lang.Integer.(Integer.java:636)
> : at 
> org.apache.solr.util.SolrPluginUtils.calculateMinShouldMatch(SolrPluginUtils.java:691)
> 
> What version of Solr are you using?
> 
> That looks like a simple parsing bug that seems to have been fixed a while 
> back (it's definitely not in the 4.0 branch)
> 
> can you try eliminating hte whitespace from your XML configured value...
> 
> 100
> 
> ...that should work arround the problem.
> 
> 
> -Hoss
> 
> 10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
> INFORMATICAS...
> CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION
> 
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci


10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS CIENCIAS 
INFORMATICAS...
CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION

http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci


Re: StandardTokenizer generation from JFlex grammar

2012-10-04 Thread Ahmet Arslan
>   I'm trying to generate the standard tokenizer again
> using the jflex
> specification (StandardTokenizerImpl.jflex) but I'm not able
> to do so due to
> some errors (I would like to create my own jflex file using
> the standard
> tokenizer which is why I'm trying to first generate using
> that to get a hang
> of things).
> 
> I'm using jflex 1.4.3 and I ran into the following error:

You need to use trunk. There is info about this in *.flex file. 
jflex-1.5.0-SNAPSHOT.jar


Re: StandardTokenizer generation from JFlex grammar

2012-10-04 Thread Ahmet Arslan

> > I'm using jflex 1.4.3 and I ran into the following
> 
> You need to use trunk. There is info about this in *.flex
> file. jflex-1.5.0-SNAPSHOT.jar
> 

Taken from ClassicTokenizerImpl.jflex :

"WARNING: if you change ClassicTokenizerImpl.jflex and need to regenerate the 
tokenizer, only use the trunk version of JFlex 1.5 at the moment!"


SolrCloud - replication factor

2012-10-04 Thread Sudhakar Maddineni
Hi,

 I'd appreciate it if someone could provide some pointers/docs to find info about
the replication factor.



I see that the replication factor was mentioned in the wiki doc:
http://wiki.apache.org/solr/SolrCloud - Managing collections via the
Collections API -
http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4
.

But, couldn't find much documentation on how it is actually going to work
in a sharded cluster setup.



I have a cluster with 3 solr nodes and 2 shards [numShards=2] with the
following setup and I didn't specify any replicationFactor during the setup.



shard1 <--> solr node1, *node3*

shard2 <--> solr node2



So, when I added "*node3*" to the existing cluster, it was auto-assigned to
"shard1".



Does that mean "*node3*" is acting as a replica of "node1"? And "node2"
didn't have any replica yet?



If yes, what is the replication factor that I should provide in order to get
the documents in node2 replicated to other nodes?


What is the default replication factor if I don't specify any?



Thanks, Sudhakar.


Re: solr facet !tag on multiple columns

2012-10-04 Thread Chris Hostetter

: select?q=*:*&facet=true&facet.zeros=false&fq=column1:(16 31)&&fq=COLUMN2:(6
: 
208)&fq={!tag=COLUMN2,column1}COLUMN2:(6)&facet.field={!ex=COLUMN2,column1}COLUMN2&start=0&rows=0

As Erick said, you need to elaborate more on what you expect, what you 
get, and how they are different.

taking a wild shot in the dark, the problems may be...

1) you have an fq using the field "COLUMN2" that isn't tagged, so it won't 
be excluded
2) you have two fqs that filter results by "COLUMN2" where one is a superset 
of the other, making the first one useless
3) you have an fq using a tag name with a comma in it that looks like 
it could be a cut/paste error from your exclusion list.

maybe something like this is what you really want?...

...&fq={!tag=COL1}column1:(16 31)&&fq={!tag=COL2}COLUMN2:(6 
208)&facet.field={!ex=COL1,COL2}COLUMN2

?

-Hoss


PriorityQueue:initialize consistently showing up as hot spot while profiling

2012-10-04 Thread Aaron Daubman
Greetings,

I've been seeing this call chain come up fairly frequently when
debugging longer-QTime queries under Solr 3.6.1 but have not been able
to understand from the code what is really going on - the call graph
and code follow below.

Would somebody please explain to me:
1) Why this would show up frequently as a hotspot
2) If it is expected to do so
3) If there is anything I should look into that may help performance
where this frequently shows up as the long pole in the QTime tent
4) What the code is doing and why "heap" is being allocated as an
apparently giant object (which also is apparently not unheard of, given
the MAX_VALUE wrapping check)

---call-graph---
Filter - SolrDispatchFilter:doFilter (method time = 12 ms, total time = 487 ms)
 Filter - SolrDispatchFilter:execute:365 (method time = 0 ms, total
time = 109 ms)
  org.apache.solr.core.SolrCore:execute:1376 (method time = 0 ms,
total time = 109 ms)
   org.apache.solr.handler.RequestHandlerBase:handleRequest:129
(method time = 0 ms, total time = 109 ms)
org.apache.solr.handler.component.SearchHandler:handleRequestBody:186
(method time = 0 ms, total time = 109 ms)
 com.echonest.solr.component.EchoArtistGroupingComponent:process:188
(method time = 0 ms, total time = 109 ms)
  org.apache.solr.search.SolrIndexSearcher:search:375 (method time
= 0 ms, total time = 96 ms)
   org.apache.solr.search.SolrIndexSearcher:getDocListC:1176
(method time = 0 ms, total time = 96 ms)
org.apache.solr.search.SolrIndexSearcher:getDocListNC:1209
(method time = 0 ms, total time = 96 ms)
 org.apache.solr.search.SolrIndexSearcher:getProcessedFilter:796
(method time = 0 ms, total time = 26 ms)
  org.apache.solr.search.BitDocSet:andNot:185 (method time = 0
ms, total time = 13 ms)
   org.apache.lucene.util.OpenBitSet:clone:732 (method time =
13 ms, total time = 13 ms)
  org.apache.solr.search.BitDocSet:intersection:31 (method
time = 0 ms, total time = 13 ms)
   org.apache.solr.search.DocSetBase:intersection:90 (method
time = 0 ms, total time = 13 ms)
org.apache.lucene.util.OpenBitSet:and:808 (method time =
13 ms, total time = 13 ms)
 org.apache.lucene.search.TopFieldCollector:create:916 (method
time = 0 ms, total time = 46 ms)
  org.apache.lucene.search.FieldValueHitQueue:create:175
(method time = 0 ms, total time = 46 ms)
   
org.apache.lucene.search.FieldValueHitQueue$MultiComparatorsFieldValueHitQueue:<init>:111
(method time = 0 ms, total time = 46 ms)
org.apache.lucene.search.SortField:getComparator:409
(method time = 0 ms, total time = 13 ms)
 org.apache.lucene.search.FieldComparator$FloatComparator:<init>:400
(method time = 13 ms, total time = 13 ms)
org.apache.lucene.util.PriorityQueue:initialize:108
(method time = 33 ms, total time = 33 ms)
---snip---


org.apache.lucene.util.PriorityQueue:initialize - hotspot is line 108:
heap = (T[]) new Object[heapSize]; // T is unbounded type, so this
unchecked cast works always

---PriorityQueue.java---
  /** Subclass constructors must call this. */
  @SuppressWarnings("unchecked")
  protected final void initialize(int maxSize) {
size = 0;
int heapSize;
if (0 == maxSize)
  // We allocate 1 extra to avoid if statement in top()
  heapSize = 2;
else {
  if (maxSize == Integer.MAX_VALUE) {
// Don't wrap heapSize to -1, in this case, which
// causes a confusing NegativeArraySizeException.
// Note that very likely this will simply then hit
// an OOME, but at least that's more indicative to
// caller that this values is too big.  We don't +1
// in this case, but it's very unlikely in practice
// one will actually insert this many objects into
// the PQ:
heapSize = Integer.MAX_VALUE;
  } else {
// NOTE: we add +1 because all access to heap is
// 1-based not 0-based.  heap[0] is unused.
heapSize = maxSize + 1;
  }
}
heap = (T[]) new Object[heapSize]; // T is unbounded type, so this
unchecked cast works always
this.maxSize = maxSize;

// If sentinel objects are supported, populate the queue with them
T sentinel = getSentinelObject();
if (sentinel != null) {
  heap[1] = sentinel;
  for (int i = 2; i < heap.length; i++) {
heap[i] = getSentinelObject();
  }
  size = maxSize;
}
  }
---snip---


Thanks, as always!
 Aaron


Re: Solr search

2012-10-04 Thread Tolga
Nope. Nutch says "Adding x documents" then "Error adding title 'Sabancı 
University'".


On 10/04/2012 03:59 PM, Otis Gospodnetic wrote:

Hi

Search for *:* to retrieve all docs. Got anything?

Otis
--
Performance Monitoring - http://sematext.com/spm
On Oct 4, 2012 5:50 AM, "Tolga"  wrote:


Hi,

I installed Solr and Nutch on a server, crawled with Nutch, and searched
at http://localhost:8983/solr/, to no avail. I mean it turns up no
results. What to do?

Regards,



Re: Solr search

2012-10-04 Thread Jack Krupansky
I wonder if nutch added documents but failed before it sent a commit to 
Solr. Do you see the commit in the Solr log file? If Solr is still running, 
you could manually send a commit yourself.
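(For instance, with the stock example port, issuing a request to
http://localhost:8983/solr/update?commit=true from curl or a browser should
trigger one; adjust the URL if you use a named core.)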


-- Jack Krupansky

-Original Message- 
From: Tolga

Sent: Friday, October 05, 2012 12:14 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr search

Nope. Nutch says "Adding x documents" then "Error adding title 'Sabancı
University'".

On 10/04/2012 03:59 PM, Otis Gospodnetic wrote:

Hi

Search for *:* to retrieve all docs. Got anything?

Otis
--
Performance Monitoring - http://sematext.com/spm
On Oct 4, 2012 5:50 AM, "Tolga"  wrote:


Hi,

I installed Solr and Nutch on a server, crawled with Nutch, and searched
at http://localhost:8983/solr/, to no avail. I mean it turns up no
results. What to do?

Regards,





Re: Solr search

2012-10-04 Thread Tolga

Here's the last 100 lines of my log:

2012-10-03 12:52:45,761 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:52:45,761 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:52:45,761 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:52:48,807 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:52:48,807 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:52:48,807 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:52:51,822 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:52:51,822 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:52:51,822 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:52:54,827 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:52:54,828 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:52:54,828 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:52:57,834 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:52:57,834 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:52:57,834 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:53:00,842 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:53:00,842 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:53:00,842 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:53:03,958 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:53:03,958 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:53:03,958 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:53:06,809 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:53:06,810 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:53:06,810 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:53:09,855 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:53:09,856 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:53:09,856 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:53:12,870 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:53:12,870 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:53:12,870 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:53:15,877 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:53:15,878 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:53:15,878 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:53:18,882 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:53:18,882 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:53:18,882 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:53:21,889 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:53:21,889 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:53:21,889 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:53:25,005 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:53:25,006 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:53:25,006 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:53:27,858 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:53:27,858 INFO anchor.AnchorIndexingFilter - Anchor 
deduplication is: off
2012-10-03 12:53:27,858 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-10-03 12:53:30,902 INFO indexer.IndexingFilters - Adding 
org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-10-03 12:53:30,903 INFO anchor.AnchorIndexingFilter - An

Re: Solr search

2012-10-04 Thread Tolga
The word 'commit' appears both in the log of the failed attempt and in the log 
of a successful attempt on another server with a different URL.


On 10/05/2012 07:18 AM, Jack Krupansky wrote:
I wonder if nutch added documents but failed before it sent a commit 
to Solr. Do you see the commit in the Solr log file? If Solr is still 
running, you could manually send a commit yourself.


-- Jack Krupansky

-Original Message- From: Tolga
Sent: Friday, October 05, 2012 12:14 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr search

Nope. Nutch says "Adding x documents" then "Error adding title 'Sabancı
University'".

On 10/04/2012 03:59 PM, Otis Gospodnetic wrote:

Hi

Search for *:* to retrieve all docs. Got anything?

Otis
--
Performance Monitoring - http://sematext.com/spm
On Oct 4, 2012 5:50 AM, "Tolga"  wrote:


Hi,

I installed Solr and Nutch on a server, crawled with Nutch, and 
searched

at http://localhost:8983/solr/, to no avail. I mean it turns up no
results. What to do?

Regards,
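
Regarding the manual-commit suggestion above: the commit can be issued either by requesting the core's update handler with commit=true, or from SolrJ. A minimal sketch follows; the URL is an assumption to be adjusted to the core Nutch writes to, and CommonsHttpSolrServer is the 1.4/3.x client class (newer SolrJ uses HttpSolrServer):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class ManualCommit {
    public static void main(String[] args) throws Exception {
        // Assumed URL; use the core Nutch was configured to index into.
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        server.commit(); // makes any pending (uncommitted) documents searchable
    }
}

If the documents show up only after this commit, the Nutch job is indeed failing before it reaches its final commit step.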





Re: Identify exact search in edismax

2012-10-04 Thread rhl4tr
But the user query can contain any number of terms; I cannot know in advance how
many field terms it has to match.

{
  "responseHeader":{
"status":0,
"QTime":1,
"params":{
  "mm":"0",
  "sort":"score desc",
  "indent":"true",
  "qf":"exact_keywords",
  "wt":"json",
  "rows":"1",
  "defType":"dismax",
  "pf":"exact_keywords",
  "debugQuery":"false",
  "fl":"data_id,data_name,exact_keywords",
  "start":"0",
  "q":"i want to by honda suzuki",
  "fq":"+data_type:pwords"}},
  "response":{"numFound":2,"start":0,"docs":[
  {
"data_name":"Cars ",
"data_id":"71",
"exact_keywords":"honda suzuki",
"term_mm":"100%"},
  {
"data_name":"bikes ",
"data_id":"72",
"exact_keywords":"suzuki",
"term_mm":"50%"}
]
  }}

A hypothetical solution would look like the JSON response above: a per-document
match parameter (term_mm in the sample) would tell what percentage of terms
matched the user query.
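
As far as I know, Solr does not return such a per-document match percentage out of the box, so one workaround is to compute the overlap client-side from the stored exact_keywords field of each returned document. The sketch below is only illustrative: it uses plain whitespace tokenization and ignores analysis details such as stemming or stopwords.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TermMatchRatio {

    /**
     * Fraction of the document's exact_keywords terms that also occur in the
     * user query (whitespace tokenization only, case-insensitive).
     */
    static double termMatchRatio(String userQuery, String exactKeywords) {
        Set<String> queryTerms = new HashSet<String>(
                Arrays.asList(userQuery.toLowerCase().split("\\s+")));
        String[] keywordTerms = exactKeywords.toLowerCase().split("\\s+");
        int matched = 0;
        for (String term : keywordTerms) {
            if (queryTerms.contains(term)) {
                matched++;
            }
        }
        return keywordTerms.length == 0 ? 0.0 : (double) matched / keywordTerms.length;
    }

    public static void main(String[] args) {
        // Both documents score 1.0 under this definition; producing 100% and 50%
        // as in the sample response would require normalizing by the best-matching
        // document instead, which is a design decision.
        System.out.println(termMatchRatio("i want to by honda suzuki", "honda suzuki"));
        System.out.println(termMatchRatio("i want to by honda suzuki", "suzuki"));
    }
}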




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Identify-exact-search-in-edismax-tp4011859p4011976.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Unknown format version: -11

2012-10-04 Thread Sushil jain
I am using Solr 1.4.1, and that same Solr instance is indexing the documents. I have
tried re-indexing, but nothing changed.

On Thu, Oct 4, 2012 at 8:26 PM, Otis Gospodnetic  wrote:

> Hi,
>
> I'd have to check the source to see what exactly -11 signifies, but
> why not paste the Solr version you see in the Solr Admin, plus ls -l of your
> lib directory (or directories)?
>
> Also, who indexed docs to those cores?  That same Solr?  Can you
> remove the core and reindex?  If so, do you still get -11?
>
> Otis
> --
> Search Analytics - http://sematext.com/search-analytics/index.html
> Performance Monitoring - http://sematext.com/spm/index.html
>
>
> On Thu, Oct 4, 2012 at 9:09 AM, Sushil jain 
> wrote:
> > Greetings,
> >
> > I have two cores (core0 & core1) in my Solr instance, and I am creating
> > the indexes using Solr.
> > I am trying to access (read) the same indexes using EmbeddedSolrServer.
> > I get the instance of core1 without any error, but in the case of core0 I am
> > getting this error:
> >
> > Exception in thread "main" java.lang.RuntimeException:
> > org.apache.lucene.index.CorruptIndexException: Unknown format version:
> -11
> > at
> >
> org.apache.solr.spelling.AbstractLuceneSpellChecker.init(AbstractLuceneSpellChecker.java:104)
> > at
> >
> org.apache.solr.spelling.IndexBasedSpellChecker.init(IndexBasedSpellChecker.java:56)
> > at
> >
> org.apache.solr.handler.component.SpellCheckComponent.inform(SpellCheckComponent.java:274)
> > at
> >
> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:508)
> > at org.apache.solr.core.SolrCore.<init>(SolrCore.java:588)
> > at TestCaseSolrJ.createSolrCore(TestCaseSolrJ.java:1166)
> > at TestCaseSolrJ.main(TestCaseSolrJ.java:234)
> > Caused by: org.apache.lucene.index.CorruptIndexException: Unknown format
> > version: -11
> > at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:247)
> > at
> org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:72)
> > at
> >
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:683)
> > at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:69)
> > at org.apache.lucene.index.IndexReader.open(IndexReader.java:476)
> > at org.apache.lucene.index.IndexReader.open(IndexReader.java:314)
> > at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:102)
> > at
> >
> org.apache.lucene.search.spell.SpellChecker.createSearcher(SpellChecker.java:542)
> > at
> >
> org.apache.lucene.search.spell.SpellChecker.swapSearcher(SpellChecker.java:519)
> > at
> >
> org.apache.lucene.search.spell.SpellChecker.setSpellIndex(SpellChecker.java:146)
> > at
> org.apache.lucene.search.spell.SpellChecker.<init>(SpellChecker.java:110)
> > at
> >
> org.apache.solr.spelling.AbstractLuceneSpellChecker.init(AbstractLuceneSpellChecker.java:102)
> > ... 6 more
> >
> >
> > I tried a lot but couldn't figure out what's wrong with it. I am using
> > compatible versions of the Lucene and Solr jar files.
> > Please let me know the possible solution for same.
> >
> > Thanks & Regards,
> > Sushil Jain
>
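
Two observations on the trace quoted above. First, the failure comes from the spellcheck component (AbstractLuceneSpellChecker opening its own index), so it is the spellcheck index directory configured for core0 (spellcheckIndexDir in solrconfig.xml), not necessarily the main index, that is in a format this Solr/Lucene version cannot read; deleting that directory and letting Solr rebuild the spellcheck index is worth trying. Second, for reference, a minimal two-core EmbeddedSolrServer setup for Solr 1.4.x looks roughly like the following sketch; the paths are assumptions, and CoreContainer.Initializer is the 1.4/3.x API:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;

public class TwoCoreCheck {
    public static void main(String[] args) throws Exception {
        // Assumed solr.home; it must be the same one used when indexing.
        System.setProperty("solr.solr.home", "/path/to/solr/home");
        CoreContainer.Initializer initializer = new CoreContainer.Initializer();
        CoreContainer container = initializer.initialize();

        // One embedded server per core.
        EmbeddedSolrServer core0 = new EmbeddedSolrServer(container, "core0");
        EmbeddedSolrServer core1 = new EmbeddedSolrServer(container, "core1");

        System.out.println("core0 docs: "
                + core0.query(new SolrQuery("*:*")).getResults().getNumFound());
        System.out.println("core1 docs: "
                + core1.query(new SolrQuery("*:*")).getResults().getNumFound());

        container.shutdown();
    }
}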