DIH solr cloud

2015-07-29 Thread Midas A
Hi,

I have to create a DIH with SolrCloud, shared across a multi-node architecture,
for Solr 5.2.1.

Please advise where I should create this.

~M


Re: Use faceted search to drill down in hierarchical structure and omit node data outside current selection

2015-07-29 Thread Charlie Hull
We've been doing some work on hierarchical faceting as part of the
BioSolr project:

https://github.com/flaxsearch/BioSolr/tree/master/spot/solr-hierarchical-facet
This is mainly to support indexing of biological ontologies/taxonomies.

Cheers

Charlie

On 28/07/2015 17:19, Alessandro Benedetti wrote:

Hi Peter,
yeah, i briefly read it, it seems quite similar !
There is no problem I can see yet with Multi values.
The token produced will be properly managed.

Cheers

2015-07-28 17:06 GMT+01:00 PeterKerk :


Oh and one more thing, I was Googling on this and found
http://www.springyweb.com/2012/01/hierarchical-faceting-with-elastic.html,
so apparently your solution is similar to this: hierarchical Faceting With
Elastic Search?
So does your solution allow items to be in multiple categories?
e.g. a product may be in:

Man
Man > top
Man > top > shirt
Man > top > shirt> sleeveless shirt

AND also fall under:

Clothing
Clothing > shirt
Clothing > shirt> sleeveless shirt

Thanks again!

From: Alessandro Benedetti [via Lucene]
Sent: Tuesday, July 28, 2015 10:26
To: PeterKerk
Subject: Re: Use faceted search to drill down in hierarchical structure
and omit node data outside current selection

The fact is that you are trying to model a hierarchical facet on documents
that actually index the content as a simple field.

What I would suggest, for example, is to use a PathHierarchyTokenizer for your
field with a proper separator.
This will produce these tokens in the index :

input : Man > top > shirt > sleeveless shirt
Tokenized :

Man
Man > top
Man > top > shirt
Man > top > shirt> sleeveless shirt

At this point the counts will be exactly what you want; you only need
to parse them on the Search API side and model the hierarchical facets as
nested elements.
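
To make the tokenization above concrete, here is a minimal Python sketch of the prefixes such a tokenizer would emit. It is a plain-Python stand-in, not the Solr tokenizer itself: the function name path_hierarchy_tokens is made up for illustration, and trimming the whitespace around the separator is an assumption (the real tokenizer may keep it).

```python
def path_hierarchy_tokens(path, delimiter=">"):
    """Return the cumulative prefixes of a delimited category path.

    Hypothetical helper: mimics the idea of a path hierarchy tokenizer,
    but trims whitespace around each segment (an assumption).
    """
    parts = [p.strip() for p in path.split(delimiter) if p.strip()]
    return [f" {delimiter} ".join(parts[: i + 1]) for i in range(len(parts))]


if __name__ == "__main__":
    for token in path_hierarchy_tokens("Man > top > shirt > sleeveless shirt"):
        print(token)
```

Because every ancestor path becomes its own token, a facet on such a field counts a document once per level, which is what makes the drill-down counts come out right.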

Cheers



2015-07-28 2:02 GMT+01:00 PeterKerk <[hidden email]>:



I have the following structure for my products, where a product may fall into
multiple categories. In my case, a "caketopper", which would be under
"cake/caketoppers" as well as "caketoppers" (don't focus on the logic behind
the category structure in this example).

Category structure:

 cake
   caketoppers
     funny

 caketoppers
   funny

What I want is that when the user has chosen a category on level 0 (the main
category selection), in this case 'caketoppers', I don't want to return the
attributes/values that same product has because it's also in a different
category.
I tried the following queries, but it keeps returning all data:


&f.slug_nl_0.facet.prefix=(caketoppers)&fq=slug_nl_0:"(caketoppers)"

&f.slug_nl_0.facet.prefix="caketoppers"&fq=slug_nl_0:"(caketoppers)"

I keep getting this result (cleaned for better readability):

 
 
 
 caketoppers
 cake
 6
 6
 
 
 

But my desired result would be:

 
 
 
 caketoppers
 6
 
 
 



field definition of 'slug_nl_0' in schema.xml:
 


I also tried with a more simple query but I'm getting the exact same
results:

 &facet.prefix=caketoppers&fq=slug_nl_0:caketoppers

I then was reading into grouping:
http://wiki.apache.org/solr/FieldCollapsing

So I tried adding that in my queries, but I get errors:




`&fq=slug_nl_0:taarttoppers&group=true&group.facet=true&group.field=slug_nl_0`

error: can not use FieldCache on multivalued field: slug_nl_0

`&fq=slug_nl_0:taarttoppers&group=true&group.field=slug_nl_0`

error: can not use FieldCache on multivalued field: slug_nl_0

`&fq=slug_nl_0:taarttoppers&group.facet=true&group.field=slug_nl_0`

error: Specify the group.field as parameter or local parameter

And then I noticed this at the bottom of the page:


Known Limitations Support for grouping on a multi-valued field has not
yet been implemented.

On that same Solr FieldCollapsing example page they refer to Best Buy as an
example. Now I wonder how that was implemented without support for
multivalued fields.

What can I do?




--
View this message in context:


http://lucene.472066.n3.nabble.com/Use-faceted-search-to-drill-down-in-hierarchical-structure-and-omit-node-data-outside-current-selectn-tp4219384.html

Sent from the Solr - User mailing list archive at Nabble.com.




--

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

Problem with Highlighting results

2015-07-29 Thread Zheng Lin Edwin Yeo
Hi,

I'm using Solr 5.2.1, and sometimes highlighting returns results even though
there is no correct match in any of the fields listed in hl.fl,
and there is also no  tag on the results at all.

What could be the reason for this?

I've included my highlighting request handler here.

  

explicit
10
json
true
text
id, title, content_type, last_modified, url, score

on
id, title, content, author, tag
true
true
html
200
0.6
 
  

Regards,
Edwin


Question about Stemmer

2015-07-29 Thread Ashish Mukherjee
Hello,

I am using a stemmer on an NGram field. I am getting better results with the
stemmer factory after the NGram filter, but I was wondering what the recommended
practice is when using a stemmer on an NGram field?

Regards,
Ashish


How can I use solr for hbase

2015-07-29 Thread weibaohui

Hi Everyone,

Recently I have wanted to use Solr to query HBase, but I can't find an
effective way.
So, how can I use Solr with HBase? Does Solr support HBase?

Hope for any answer,

Thank you!




weibaohui


Re: Question about Stemmer

2015-07-29 Thread Erick Erickson
I really can't imagine ngrams followed by a stemmer really
being that useful, but I've been wrong once or twice before.
Well, a lot more than once or twice. But this pair isn't
something I've ever really seen before.

I'd make use of the admin/analysis page for your field to see
why it appears to work better, I suspect that you're seeing
more matches per query than otherwise, but I also suspect
that the matches aren't very good.

FWIW,
Erick


On Wed, Jul 29, 2015 at 5:49 AM, Ashish Mukherjee
 wrote:
> Hello,
>
> I am using Stemmer on a Ngram field. I am getting better results with
> Stemmer factory after Ngram, but I was wondering what is the recommended
> practice when using Stemmer on Ngram field?
>
> Regards,
> Ashish


Re: DIH solr cloud

2015-07-29 Thread Erick Erickson
Just pick a node to run it on. I vastly prefer, though,
using a SolrJ client, here's a sample:

https://lucidworks.com/blog/indexing-with-solrj/

Best,
Erick

On Wed, Jul 29, 2015 at 4:37 AM, Midas A  wrote:
> Hi,
>
> I have  to create DIH with solr cloud shared with multi node architecture
> for solr 5.2.1.
>
> Please advise where should i create this.
>
> ~M


Query token access in solr function queries

2015-07-29 Thread Nutch Solr User
How can I access each query token separately in a function query? I want to
pass each token to the ttf function to get the total term frequency for that token.
Currently I have access to the main query using the $q parameter.

Do I have to write some code to tokenize the original query and add the tokens as
additional parameters to the main query, say t1, t2, t3, before
sending the query to Solr?

Is there any other way to do this using existing Solr functions?

One more question: if I have to write my own function for this, how should
I return these tokens?
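
The client-side approach described above could look roughly like the following Python sketch. Several things here are assumptions: the query is split on whitespace (which will not match the indexed terms if the field's analyzer lowercases or stems), the field name "text" and the aliases ttf_1, ttf_2, ... are hypothetical, and it is assumed that ttf() accepts a $param reference for its term argument, which should be checked against the Solr version in use.

```python
from urllib.parse import urlencode


def ttf_params(query, field):
    """Build request parameters that pass each query token as t1, t2, ...

    Assumption: whitespace tokenization is a stand-in for the field's real
    analysis chain; the pseudo-field aliases are made up for illustration.
    """
    params = {"q": query}
    pseudo_fields = []
    for i, token in enumerate(query.split(), start=1):
        params[f"t{i}"] = token
        pseudo_fields.append(f"ttf_{i}:ttf({field},$t{i})")
    params["fl"] = ",".join(["id"] + pseudo_fields)
    return params


if __name__ == "__main__":
    print(urlencode(ttf_params("solr function query", "text")))
```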



-
Nutch Solr User

"The ultimate search engine would basically understand everything in the world, 
and it would always give you the right thing."
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Query-token-access-in-solr-function-queries-tp4219695.html


Re: SOLR Exception with SOLR Cloud 5.1 setup on Linux

2015-07-29 Thread Shawn Heisey
On 7/28/2015 5:10 PM, Yonik Seeley wrote:
> On Tue, Jul 28, 2015 at 6:54 PM, Shawn Heisey  wrote:
>> To get out of the hole you're in now, either build a new collection with
>> the actual shard count that you want so it's correctly set up, or edit
>> the clusterstate in zookeeper to change the hash range (change 8000
>> to )
> 
> Actually, if you want a range that covers the entire 32 bit hash
> space, it would be
> 80000000-7fffffff  (hex representations of signed integers).

Good to know.  Thanks.  I was somewhat confused by something I saw in my
own clusterstate on a three-shard collection where the start value
appeared to be larger than the end value in one of the shards, this note
makes that understandable.  I find it irritating and confusing, but now
it makes sense.
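
A small Python sketch of why a start value can look larger than the end value when the bounds are hex representations of signed 32-bit integers. It assumes eight-hex-digit bounds such as 80000000 and 7fffffff; the helper name to_signed32 is made up for illustration.

```python
def to_signed32(hex_str):
    """Interpret a hex string as a two's-complement signed 32-bit integer."""
    value = int(hex_str, 16)
    return value - (1 << 32) if value >= (1 << 31) else value


start = to_signed32("80000000")  # most negative 32-bit value
end = to_signed32("7fffffff")    # most positive 32-bit value

if __name__ == "__main__":
    # Read as unsigned hex, 80000000 looks larger than 7fffffff,
    # but as signed integers the start is below the end.
    print(start, end, start < end)
```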

Shawn



RE: [ANN] New Features For Splainer

2015-07-29 Thread Davis, Daniel (NIH/NLM) [C]
I usually protect https://whatever.nlm.nih.gov/solr deeply, requiring CAS 
authentication against NIH Login, but I also make sure handleSelect=false, and 
reverse proxy https://whatever.nlm.nih.gov/search/core-name to /solr/select.

I'm surprised and gratified that http://splainer.io/ works in my environment. 

-Original Message-
From: Doug Turnbull [mailto:dturnb...@opensourceconnections.com] 
Sent: Friday, July 24, 2015 3:47 PM
To: solr-user@lucene.apache.org
Subject: [ANN] New Features For Splainer

First, I wanted to humbly thank the Solr community for their contributions and 
feedback for our open source Solr sandbox, Splainer (http://splainer.io and 
http://github.com/o19s/splainer). The reception and comments have been 
generally positive and helpful, and I very much appreciate being part of such a 
great open source community that wants to support each other.

What is Splainer exactly? Why should you care? Nobody likes working with Solr 
in the browser's URL bar.  Splainer lets you paste in your Solr URL and get an 
instant, easy to understand breakdown of why some documents are ranked higher 
than others. It then gives you a friendlier interface for tweaking Solr params 
and experimenting with different ideas than trying to parse 
through XML and JSON. You needn't worry about security rules to let some 
Splainer backend talk to your Solr. The interaction with Solr is 
100% through your browser. If your PC can see Solr, then so can Splainer 
running in your browser. If you leave work or turn off the VPN, then Splainer 
can't see your Solr. It's all running locally on your machine through the 
browser!

I wanted to share that we've been slowly adding features to Splainer. The two I 
wanted to highlight, are captured in this blog article ( 
http://opensourceconnections.com/blog/2015/07/24/splainer-a-solr-developers-best-friend/
)

To summarize, they include

- Explain Other
You often wonder why obviously relevant search results don't come back.
Splainer now gives you the ability to compare any document to a secondary 
document to see what factors caused one document to rank higher than another.

- Share Splainerized Solr Results
Once you paste a Solr URL into Splainer, you can then copy the splainer.io URL 
to share what you're seeing with a colleague. For example, here's some 
information about Virginia state laws about hunting deer from a boat:

http://splainer.io/#?solr=http:%2F%2Fsolr.quepid.com%2Fsolr%2Fstatedecoded%2Fselect%3Fq%3Ddeer%20hunt%20from%20watercraft%0A%26defType%3Dedismax%0A%26qf%3Dcatch_line%20text%0A%26bq%3Dtitle:deer

There are many more smaller features and tweaks, but I wanted to let you know 
this was out there. I hope you find Splainer useful. I'm very happy to field 
pull requests, ideas, suggestions, or try to figure out why Splainer isn't 
working for you!

Cheers!
--
Doug Turnbull | Search Relevance Consultant | OpenSource Connections, LLC | 240.476.9983
Author: Relevant Search


normalize accent solr

2015-07-29 Thread ojalà
Hi! I'm using Solr 3.4 and I need to normalize accents, for example é to e:
if I search joelè, Solr must give me results for both joelè and joele;
if I search joele, Solr must give me results for both joelè and joele.

In my server.xml I put:


and I tried to use solr.MappingCharFilterFactory in this way:










  

Where is the problem? Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/normalize-accent-solr-tp4219721.html


Re: normalize accent solr

2015-07-29 Thread Ahmet Arslan
Hi Ojala,

Do you have that charFilter configured in index analyser too?

Ahmet

On Wednesday, July 29, 2015 6:00 PM, ojalà  wrote:
Hi! I'm using solr 3.4 and I need to normalize for example é to e
if I search joelè solr must give me result for joelè and for joele
if I search joele solr must give me result for joelè and for joele

In my server.xml I put:


and I try to use solr.MappingCharFilterFactory in this way:










  

Where is the problem?Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/normalize-accent-solr-tp4219721.html


Re: normalize accent solr

2015-07-29 Thread ojalà
Yes, I put it in both the index and query analyzers.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/normalize-accent-solr-tp4219721p4219723.html


Re: normalize accent solr

2015-07-29 Thread Ahmet Arslan
Hi,

What happens when you test some sample text on admin/analysis page?
You should see that accents are removed in the first analysis step.
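
For comparison, this is the kind of output one would expect after accent removal, sketched in Python with the standard unicodedata module. It is only a stand-in for checking expectations: solr.MappingCharFilterFactory works from a mapping file, not from Unicode decomposition, and the function name fold_accents is made up for illustration.

```python
import unicodedata


def fold_accents(text):
    """Strip combining accent marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    # Both spellings should collapse to the same term.
    print(fold_accents("joel\u00e8"), fold_accents("joele"))
```

If the index-time and query-time output on the analysis page differ for the accented and unaccented spelling, the two searches cannot match the same documents.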



On Wednesday, July 29, 2015 6:06 PM, ojalà  wrote:
yes i put in index and query



--
View this message in context: 
http://lucene.472066.n3.nabble.com/normalize-accent-solr-tp4219721p4219723.html



Re: normalize accent solr

2015-07-29 Thread ojalà
I get a different number of results for joelè and joele.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/normalize-accent-solr-tp4219721p4219726.html


How Index Xml file using solrJ or DIH or post command

2015-07-29 Thread Mugeesh Husain
I have more than 30 million XML files stored in a filesystem.
Please suggest which method I should follow:
1.) Should I use SolrJ?
2.) Should I use DIH?
3.) Should I use the post method (in a terminal)?

Basically I am a Java and Lucene developer, new to Solr.

How many shards should I use?


Please advise and give appropriate suggestions for this.

Thanks in advance




--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-Index-Xml-file-using-solrJ-or-DIH-or-post-command-tp4219756.html


Using Update Request Handlers with Solr

2015-07-29 Thread Paden
Hello all,

I've been trying to integrate NER into my solr search so I can get some
really good facets out of it. I've already managed to plug in a search
handler with code from searchbox.com to get a feel for how it works. And now
I'm trying to plug in an update request processor so I can pull facets out
of it. But I've gotten kind of stuck on implementing it. I've bookmarked the
specific problem area with bolded messages below


   
 
   content
 
   
   
   
 
 
Here we see that we’re using the field content to determine the language,
though we could specify as many fields as we wished. Next we just need to
add the chain to the update request handler like so:

*RIGHT HERE I've used processor chains before (to trim whitespace and
remove blank fields) but I'm not quite sure what they are doing here. They
are using a totally different request handler. But go down further to the
other bolded part *


   
 mychain
   
  
 

And we’re good to go! After indexing some documents (via curl,
dataimporthandlers, etc), we can do a query and see if we have any results:

*They say that AFTER the indexing has happened you run a query and get results,
which he does. I guess my question is: where is the /update handler being
used? It's not being used to index, is it? It's not being used to search,
because down below they used the /select search handler. Where the heck is
the /update processor being implemented? This is probably a more generic
question about update handlers.*
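
As a rough conceptual sketch of the question above: an update processor chain is applied to each document as it passes through the update handler on its way into the index, so the extra fields already exist by the time a /select query runs. The Python below only models that ordering. Nothing in it is Solr API: the function names, the sample document, and the tiny lookup set standing in for a real NER model are all hypothetical.

```python
def trim_whitespace(doc):
    """Processor: strip leading/trailing whitespace from string fields."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in doc.items()}


def remove_blank_fields(doc):
    """Processor: drop fields whose value is an empty string."""
    return {k: v for k, v in doc.items() if v != ""}


KNOWN_ORGANIZATIONS = {"BCCI"}  # hypothetical stand-in for an NER model


def tag_organizations(doc):
    """Processor: add an ORGANIZATION field derived from the content field."""
    found = [w for w in doc.get("content", "").split() if w in KNOWN_ORGANIZATIONS]
    return {**doc, "ORGANIZATION": found} if found else doc


def run_update(index, docs, chain):
    """Model of an update request: each doc runs through the chain, then is indexed."""
    for doc in docs:
        for processor in chain:
            doc = processor(doc)
        index.append(doc)


index = []
run_update(
    index,
    [{"id": "1", "content": "  BCCI report  ", "note": ""}],
    [trim_whitespace, remove_blank_fields, tag_organizations],
)

if __name__ == "__main__":
    # A later "select" only reads what the chain already stored.
    print(index)
```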

 
*Query they use to get results.* 

http://192.168.56.101:8983/solr/ner/select?q=*%3A*&fl=ORGANIZATION%2CPERSON&wt=xml&indent=true&facet=true&facet.field=ORGANIZATION

 



0
1

true
ORGANIZATION,PERSON
true
*:*
ORGANIZATION
xml





Sauyet
Dave
Scott
Fuller





BCCI


Gregg
Jaeger
Jon
Livesey




Russell
Hemingway
Gregg
James
Jim
Allah
Hoban
Hogan




State
Iowa
University


Warren
Bruce
Cobb
Kurt
Salem
Mike





David
Einstien
McAloon
Einstein




Bill




Bill
Hausmann
Maddi




Mozumder
Bill
Bobby
Conner




 

There it is! Our documents now have a PERSON and ORGANIZATION field, which
are correctly populated from the index data. Now the question is, can we use
this information for better/easier information finding for our end users,
and the answer is of course a resounding yes. By faceting on this field:



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Using-Update-Request-Handlers-with-Solr-tp4219770.html


Re: Parameterized values

2015-07-29 Thread Mikhail Khludnev
hm. Did you try
PS127
hosp_quality_spec_boost:${pspec:${pspec}}
?


On Tue, Jul 28, 2015 at 8:16 PM, William Bell  wrote:

> http://yonik.com/solr-query-parameter-substitution/
>
> This is not working as part of QTs.
>
> Cannot load the core, since ${value} is being used for XML parameters for
> system property substitution.
>
> https://wiki.apache.org/solr/SolrConfigXml#System_property_substitution
>
> Can we support both?
>
> PS127
> hosp_quality_spec_boost:${pspec}
>
>
> This does not work.
>
>
> --
> Bill Bell
> billnb...@gmail.com
> cell 720-256-8076
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics





Re: Use faceted search to drill down in hierarchical structure and omit node data outside current selection

2015-07-29 Thread PeterKerk
Ok, I managed to get this as output via SQL for a single product:

ProductId  categorystring
2481445 cake > caketoppers > funny
2481445 caketoppers > funny

Before I start diving into the tokenization in Solr, this is what you meant
as the correct input of the data right? I should be able to support drilling
down in categories using your suggested solution?

Just want to make sure I'm on the right track here :)

Thanks!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Use-faceted-search-to-drill-down-in-hierarchical-structure-and-omit-node-data-outside-current-selectn-tp4219384p4219773.html


Specifying multiple query parsers

2015-07-29 Thread Jamie Johnson
I have a use case where I want to use the block join query parser for the
top level query and for the nested portion a custom query parser.  I was
originally doing this, which worked

{!parent which='type:parent'}_query_:{!myqp df='child_pay' v='"value foo"'}

but switched to this which also worked

{!parent which='type:parent'}{!myqp}child_pay:"value foo"

I have never seen this type of syntax where you can specify multiple query
parsers inline, is this supposed to work or am I taking advantage of some
oversight in the local params implementation?


Re: Specifying multiple query parsers

2015-07-29 Thread Jamie Johnson
Sorry answered my own question.  For those that are interested this is
related to how BlockJoinParentQParser handles sub queries and looks like
it's working as it should.

On Wed, Jul 29, 2015 at 3:31 PM, Jamie Johnson  wrote:

> I have a use case where I want to use the block join query parser for the
> top level query and for the nested portion a custom query parser.  I was
> originally doing this, which worked
>
> {!parent which='type:parent'}_query_:{!myqp df='child_pay' v='"value foo"'}
>
> but switched to this which also worked
>
> {!parent which='type:parent'}{!myqp}child_pay:"value foo"
>
> I have never seen this type of syntax where you can specify multiple query
> parsers inline, is this supposed to work or am I taking advantage of some
> oversight in the local params implementation?
>


Re: Specifying multiple query parsers

2015-07-29 Thread Mikhail Khludnev
BlockJoinParentQParser does nothing specific; it just calls a subroutine. The
question of whether _query_ is redundant is worthwhile in itself, and here is the
answer: https://issues.apache.org/jira/browse/SOLR-4093
Note: spaces in the subquery may lead to mistakes, so I'd rather recommend using the
{!.. v=$childq}&childq=... hack.
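
A minimal sketch of what that parameter-dereferencing form might look like when building the request in Python. The parameter name childq comes from the message above; the exact contents of q and childq are an assumption pieced together from the queries quoted earlier in this thread, and myqp is the poster's custom parser.

```python
from urllib.parse import parse_qs, urlencode

# Assumed parameter values: the parent parser reads its child query from
# $childq, so spaces and quotes in the child query stay out of the local params.
params = {
    "q": "{!parent which='type:parent' v=$childq}",
    "childq": '{!myqp}child_pay:"value foo"',
}

query_string = urlencode(params)

if __name__ == "__main__":
    print(query_string)
```

Keeping the child query in its own parameter means it is URL-encoded as a whole, which avoids quoting mistakes inside the local params block.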

On Wed, Jul 29, 2015 at 10:41 PM, Jamie Johnson  wrote:

> Sorry answered my own question.  For those that are interested this is
> related to how BlockJoinParentQParser handles sub queries and looks like
> it's working as it should.
>
> On Wed, Jul 29, 2015 at 3:31 PM, Jamie Johnson  wrote:
>
> > I have a use case where I want to use the block join query parser for the
> > top level query and for the nested portion a custom query parser.  I was
> > originally doing this, which worked
> >
> > {!parent which='type:parent'}_query_:{!myqp df='child_pay' v='"value
> foo"'}
> >
> > but switched to this which also worked
> >
> > {!parent which='type:parent'}{!myqp}child_pay:"value foo"
> >
> > I have never seen this type of syntax where you can specify multiple
> query
> > parsers inline, is this supposed to work or am I taking advantage of some
> > oversight in the local params implementation?
> >
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics





Leader election

2015-07-29 Thread Olivier Damiot
Hello everybody,

I use Solr 5.2.1 and am having a big problem.
I have about 1200 collections, 3 shards, replicationFactor = 3,
maxShardsPerNode = 3.
I have 3 boxes of 64G (32 JVM).
I have no problems with collection creation or indexing, but when I
lose a node (VMY full or kill) and restart it, all my collections are down.
When I look in the logs I can see leader election problems, e.g.:
  - Checking if I (core = test339_shard1_replica1, coreNodeName =
core_node5) should try and be the leader.
- Cloud says we are still state leader.

I feel that all the servers pass the buck!

I do not understand this error, especially since from reading the mailing list I
have the impression that this bug was solved long ago.

What should I do to start my collections properly?

Could someone help me?

Thanks a lot

Olivier


Re: Parameterized values

2015-07-29 Thread William Bell
That would be pretty bizarre.

I'll try it.

On Wed, Jul 29, 2015 at 1:00 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> hm. Did you try
> PS127
> hosp_quality_spec_boost:${pspec:${pspec}}
> ?
>
>
> On Tue, Jul 28, 2015 at 8:16 PM, William Bell  wrote:
>
> > http://yonik.com/solr-query-parameter-substitution/
> >
> > This is not working as part of QTs.
> >
> > Cannot load the core, since ${value} is being used for XML parameters for
> > system property substitution.
> >
> > https://wiki.apache.org/solr/SolrConfigXml#System_property_substitution
> >
> > Can we support both?
> >
> > PS127
> > hosp_quality_spec_boost:${pspec}
> >
> >
> > This does not work.
> >
> >
> > --
> > Bill Bell
> > billnb...@gmail.com
> > cell 720-256-8076
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> 
> 
>



-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


custom function for multivalued fields

2015-07-29 Thread Gopal Jee
I have a requirement where I want to maintain a multivalued field. However,
at query time, I want to query on only one value that we store in the multivalued
field. That one value should be the output of a custom function which
executes over all values of the multivalued field at query time.
Can we write such a function and plug it into Solr?

Gopal


PayloadSpanOrQuery

2015-07-29 Thread Jamie Johnson
I need to use payloads in a SpanOrQuery to influence the
score.  I noticed that there is no PayloadSpanOrQuery so I'd like to
implement one.  I couldn't find a ticket in JIRA for this so I created
https://issues.apache.org/jira/browse/LUCENE-6706. If this feature exists, I
will gladly close the JIRA if someone could point me in the right direction.


Re: Leader election

2015-07-29 Thread Timothy Potter
Hi Olivier,

Can you look at the collections to see if there are leader initiated
recovery nodes in the ZooKeeper tree? Go into the Solr Admin UI ->
Cloud panel -> Tree view and drill into one of the collections that's
not recovering /collections//leader_initiated_recovery/

You could try deleting the znodes for one of the shards under that and
see if that shard recovers.

Let me know what shakes out as there still may be a bug in this area
of the recovery logic.


On Wed, Jul 29, 2015 at 1:49 PM, Olivier Damiot
 wrote:
> Hello everybody,
>
> I use solr 5.2.1 and am having a big problem.
> I have about 1200 collections, 3 shards, replicationfactor = 3,
> MaxShardPerNode=3.
> I have 3 boxes of 64G (32 JVM).
> I have no problems with the creation of collection or indexing, but when I
> lose a node (VMY full or kill) and I restart, all my collections are down.
> I look in the logs I can see problems of leader election, eg:
>   - Checking if I (core = test339_shard1_replica1, coreNodeName =
> core_node5) shoulds try and be the leader.
> - Cloud says we are still state leader.
>
> I feel that all server pass the buck!
>
> I do not understand this error especially as if I read the mailing list I
> have the impression that this bug is solved long ago.
>
> what should I do to start my collections properly?
>
> Is someone could help me ?
>
> thank you a lot
>
> Olivier


Re: WordDelimiterFilter Leading & Trailing Special Character

2015-07-29 Thread Sathiya N Sundararajan
thanks for the suggestion Jack. We are already using @ and # as ,
will see if it makes sense to go that route.

On Tue, Jul 21, 2015 at 4:52 PM, Jack Krupansky 
wrote:

> You can also use the types attribute to change the type of specific
> characters, such as to treat the "!" or "&" as an .
>
> -- Jack Krupansky
>
> On Tue, Jul 21, 2015 at 7:43 PM, Sathiya N Sundararajan <
> ausat...@gmail.com>
> wrote:
>
> > Upayavira,
> >
> > thanks for the helpful suggestion, that works. I was looking for an
> option
> > to turn off/circumvent that particular WordDelimiterFilter's behavior
> > completely. Since our indexes are hundred's of Terabytes, every time we
> > find a term that needs to be added, it will be a cumbersome process to
> > reload all the cores.
> >
> >
> > thanks
> >
> > On Tue, Jul 21, 2015 at 12:57 AM, Upayavira  wrote:
> >
> > > Looking at the javadoc for the WordDelimiterFilterFactory, it suggests
> > > this config:
> > >
> > >   > >  positionIncrementGap="100">
> > >
> > >  
> > >   > >  protected="protectedword.txt"
> > >  preserveOriginal="0" splitOnNumerics="1"
> > >  splitOnCaseChange="1"
> > >  catenateWords="0" catenateNumbers="0" catenateAll="0"
> > >  generateWordParts="1" generateNumberParts="1"
> > >  stemEnglishPossessive="1"
> > >  types="wdfftypes.txt" />
> > >
> > >  
> > >
> > > Note the protected="x" attribute. I suspect if you put Yahoo! into
> a
> > > file referenced by that attribute, it may survive analysis. I'd be
> > > curious to hear whether it works.
> > >
> > > Upayavira
> > >
> > > On Tue, Jul 21, 2015, at 12:51 AM, Sathiya N Sundararajan wrote:
> > > > Question about WordDelimiterFilter. The search behavior that we
> > > > experience
> > > > with WordDelimiterFilter satisfies well, except for the case where
> > there
> > > > is
> > > > a special character either at the leading or trailing end of the
> term.
> > > >
> > > > For instance:
> > > >
> > > > *‘d&b’ *  —>  Works as expected. Finds all docs with ‘d&b’.
> > > > *‘p!nk’*  —>  Works fine as above.
> > > >
> > > > But on cases when, there is a special character towards the trailing
> > end
> > > > of
> > > > the term, like ‘Yahoo!’
> > > >
> > > > *‘yahoo!’* —> Turns out to be a search for just *‘yahoo’* with the
> > > > special
> > > > character *‘!’* stripped out.  This WordDelimiterFilter behavior is
> > > > documented
> > > >
> > >
> >
> http://lucene.apache.org/core/4_6_0/analyzers-common/index.html?org/apache/lucene/analysis/miscellaneous/WordDelimiterFilter.html
> > > >
> > > > What I would like to have is, the search performed without stripping
> > out
> > > > the leading & trailing special character. Is there a way to achieve
> > this
> > > > behavior with WordDelimiterFilter.
> > > >
> > > > This is current config that we have for the field:
> > > >
> > > >  > > > positionIncrementGap="100">
> > > > 
> > > > 
> > > >  > > > splitOnCaseChange="0" generateWordParts="0" generateNumberParts="0"
> > > > catenateWords="0" catenateNumbers="0" catenateAll="0"
> > > > preserveOriginal="1"
> > > > types="specialchartypes.txt"/>
> > > > 
> > > > 
> > > > 
> > > > 
> > > >  > > > splitOnCaseChange="0" generateWordParts="0" generateNumberParts="0"
> > > > catenateWords="0" catenateNumbers="0" catenateAll="0"
> > > > preserveOriginal="1"
> > > > types="specialchartypes.txt"/>
> > > > 
> > > > 
> > > > 
> > > >
> > > >
> > > > thanks
> > >
> >
>


Re: custom function for multivalued fields

2015-07-29 Thread Chris Hostetter

Thanks to the SortedSetDocValues this is in fact possible -- in fact I 
just uploaded a patch for SOLR-2522 that you can take a look at to get an 
idea of how to make it work. The main class you're probably going 
to want to look at is SortedSetSelector: you're going to want a similar 
"SortedDocValues proxy" class on top of SortedSetDocValues -- but instead 
of picking a single value, you want to pick your new synthetic value based 
on your custom function logic.
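
As a purely conceptual illustration of that idea (this is not Lucene or Solr code), the Python sketch below models a per-document sorted set of values and a pluggable function that picks one synthetic value from it. The name select_value and the sample values are hypothetical.

```python
def select_value(values, pick):
    """Collapse a document's multiple values into one via a custom function.

    The values are de-duplicated and sorted first, loosely mirroring how a
    sorted set of doc values is presented; `pick` is the custom logic.
    """
    ordered = sorted(set(values))
    if not ordered:
        return None
    return pick(ordered)


if __name__ == "__main__":
    doc_values = [7, 3, 9, 3]
    print(select_value(doc_values, min))                           # smallest value
    print(select_value(doc_values, lambda vs: vs[len(vs) // 2]))   # middle value
```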

https://issues.apache.org/jira/browse/SOLR-2522

: I have a requirement where i want to maintain a multivalued field. However,
: at query time, i want to query on only one value we store in  multivalued
: field. That one value should be output of a custom function which should
: execute on all values of multivalued field at query time.
: Can we write such function and plug into solr.


-Hoss
http://www.lucidworks.com/


Re: Use faceted search to drill down in hierarchical structure and omit node data outside current selection

2015-07-29 Thread PeterKerk
Hi Alessandro!

I'm having a hard time figuring out how to use the
PathHierarchyTokenizerFactory. I was reading here:
https://lucene.apache.org/core/4_4_0/analyzers-common/org/apache/lucene/analysis/path/PathHierarchyTokenizerFactory.html

And ended up with this:



   
 
   
   
 
   
 

I tried with these field definitions:
 



And these querystring parameters in the request:

1.  
&facet.field=categorystring_nl --> this returns a facet with count based on
full categorystring, e.g. "bruidstaart>taarttoppers>grappig", so I can't use
that for the count on the highest category level (in this case
"bruidstaart"):


15
6
6
3




2. 
&facet.field=categorystring_tokenized, this now returns:


15
15
6
6
6
6
6
3
3



I'm now wondering, is this the data you expected me to end up with? Right
now I still don't see how I can easily extract the hierarchy from this data,
except by looping through the facets and counting the number of ">" occurrences
in the "name" attribute to determine the actual level in the hierarchy.

Can you advise? 
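
One way the client-side parsing could be sketched in Python: split each facet name on the separator and fold the flat list into nested nodes. The facet names and the pairing of names with counts below are hypothetical (the names did not survive in the output above), and build_facet_tree is a made-up helper, not part of any Solr client.

```python
def build_facet_tree(facet_counts, delimiter=">"):
    """Turn flat 'a>b>c' facet names into nested {name: {count, children}} nodes."""
    tree = {}
    for name, count in facet_counts.items():
        level = tree
        node = None
        for part in (p.strip() for p in name.split(delimiter)):
            node = level.setdefault(part, {"count": 0, "children": {}})
            level = node["children"]
        node["count"] = count  # count belongs to the deepest segment of this name
    return tree


# Hypothetical facet output from a tokenized category field.
facets = {
    "bruidstaart": 15,
    "bruidstaart>taarttoppers": 6,
    "bruidstaart>taarttoppers>grappig": 6,
    "taarttoppers": 6,
    "taarttoppers>grappig": 6,
}

tree = build_facet_tree(facets)

if __name__ == "__main__":
    for name in facets:
        print(name.count(">"), name)  # depth of each facet, as described above
```

The depth of a facet is still just the number of separators in its name; the tree only saves the UI from recomputing parent/child relations on every render.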

Thanks again!








--
View this message in context: 
http://lucene.472066.n3.nabble.com/Use-faceted-search-to-drill-down-in-hierarchical-structure-and-omit-node-data-outside-current-selectn-tp4219384p4219832.html


Re: Use faceted search to drill down in hierarchical structure and omit node data outside current selection

2015-07-29 Thread PeterKerk
Hi Charlie,

Your solution seems to remove faceting capabilities...so that's not what I'm
looking for :) Thanks though!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Use-faceted-search-to-drill-down-in-hierarchical-structure-and-omit-node-data-outside-current-selectn-tp4219384p4219833.html


Re: custom function for multivalued fields

2015-07-29 Thread Gopal Jee
Thanks Chris.

On Thu, Jul 30, 2015 at 5:36 AM, Chris Hostetter 
wrote:

>
> Thanks to the SortedSetDocValues this is in fact possible -- in fact i
> just uploaded a patch for SOLR-2522 that you can take a look at to get an
> idea of how to make it work (the main class you're probably going
> to want to look at is SortedSetSelector: you're going to want a similar
> "SortedDocValues proxy" class on top of SortedSetDocValues -- but instead
> of picking a single value, you want to pick your new synthetic value based
> on your custom function logic.
>
> https://issues.apache.org/jira/browse/SOLR-2522
>
> : I have a requirement where i want to maintain a multivalued field.
> However,
> : at query time, i want to query on only one value we store in  multivalued
> : field. That one value should be output of a custom function which should
> : execute on all values of multivalued field at query time.
> : Can we write such function and plug into solr.
>
>
> -Hoss
> http://www.lucidworks.com/
>



--