Re: Indexing CSV data in Multicore setup

2011-07-02 Thread sandeep
> post.jar is used to post xml files. You can use curl to feed csv. 
> http://wiki.apache.org/solr/UpdateCSV


I also tried using curl to post the CSV data, with the following command:

curl http://localhost:8983/solr/core0/update/csv --data-binary @books.csv -H
'Content-type:text/plain;charset=utf-8'

It errors out saying there is a problem accessing "/solr/core0/update/csv":

"
HTTP ERROR 404

Problem accessing /solr/core0/update/csv. Reason:
NOT_FOUND
Powered by Jetty://
"
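The 404 usually means that core0's solrconfig.xml does not register a handler at /update/csv (in the multicore example configs it is often absent or commented out). One workaround, assuming a Solr version whose generic /update handler accepts CSV by content type, is to post there instead. A minimal sketch that only builds the request (URL, core name, and sample data are illustrative):

```python
import urllib.request

def build_csv_update_request(base_url, core, csv_bytes):
    """Build (but do not send) a CSV update request against the generic
    /update handler, which dispatches on the Content-Type header."""
    url = f"{base_url}/solr/{core}/update?commit=true"
    return urllib.request.Request(
        url,
        data=csv_bytes,
        headers={"Content-Type": "application/csv; charset=utf-8"},
        method="POST",
    )

req = build_csv_update_request("http://localhost:8983", "core0", b"id,name\n1,book")
print(req.full_url)  # http://localhost:8983/solr/core0/update?commit=true
```

Sending it is then one `urllib.request.urlopen(req)` call, or the equivalent curl against /update rather than /update/csv.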

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-CSV-data-in-Multicore-setup-tp3131252p3132350.html
Sent from the Solr - User mailing list archive at Nabble.com.


Many to Many Mapping with Solr

2016-04-29 Thread Sandeep Mestry
Hi All,

Hope the day is going on well for you.

This question has been asked before, but I couldn't find an answer to my
specific case. I have a many-to-many relationship, and the mapping table
has additional columns. What's the best way to model this as a Solr
entity?

For example: a user has many recordings and a recording belongs to many
users, but each user-recording pair has additional attributes like type,
number, etc. I'd like to fetch recordings for a user, and if the user adds,
updates, or deletes a recording, that should be reflected in the search.

I have 2 options:
1) Create a user entity, a recording entity, and a user_recording entity
- this works, but it treats Solr like an RDBMS, which I mostly avoid.

2) A user entity containing all the recording information, and each recording
containing user information
- this increases index size, but fetches and manipulation will be
faster.
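A minimal sketch of option 2, flattened to one document per user-recording mapping row so the mapping's own attributes (type, number, ...) stay searchable. The field names and helper are illustrative, not from the original post:

```python
def denormalize(users, recordings, mappings):
    """Produce one flat Solr-style doc per user-recording mapping row,
    copying in the user fields, recording fields, and mapping attributes."""
    docs = []
    for m in mappings:
        u = users[m["user_id"]]
        r = recordings[m["recording_id"]]
        docs.append({
            "id": f'{m["user_id"]}-{m["recording_id"]}',
            "user_id": m["user_id"],
            "user_name": u["name"],
            "recording_id": m["recording_id"],
            "recording_title": r["title"],
            "type": m["type"],  # mapping-level attribute, still searchable
        })
    return docs

users = {"u1": {"name": "Sandeep"}}
recordings = {"r1": {"title": "Demo"}}
mappings = [{"user_id": "u1", "recording_id": "r1", "type": "owner"}]
docs = denormalize(users, recordings, mappings)
```

Adding or removing a mapping row then maps to adding or deleting exactly one Solr document, which keeps updates cheap.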

Any guidance would be appreciated.

Thanks,
Sandeep


Re: Many to Many Mapping with Solr

2016-05-01 Thread Sandeep Mestry
Thanks Alexandre. I too am of the opinion not to use Solr the RDBMS way, but I
am concerned about updates to the indexes. We're expecting around 500
writes per second to the database, which will generate >500 updates to
the index per second. If the entities are denormalised, this will have an
impact on performance, hence I was inclined to design it like the db.

Joel,
I will explain it in a bit more detail what my use cases are, all of these
should be driven by search engine:

1) user logs in and the system should display all recordings for that user
2) user adds a recording, the system is updated with the additional
recording
3) user removes a recording, the system is updated with the recording
removed.
4) when the user searches for a recording, the system should only display
matches in his recordings. Every user-recording mapping has additional
properties which are also searchable attributes.

here, we are talking about 2M users and 500M recordings and this is
currently driven by database of size ~60-80GB.

I am going to do a small PoC for these use cases, and I will go with
denormalised entities, with search requirements as my main focus. However,
if you have anything more to add, do let me know; I would be grateful.

Many Thanks,
Sandeep


On 29 April 2016 at 14:54, Joel Bernstein  wrote:

> We really still need to know more about your use case. In particular what
> types of questions will you be asking of the data? It's useful to do this
> in plain english without mapping to any specific implementation.
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Fri, Apr 29, 2016 at 9:43 AM, Alexandre Rafalovitch  >
> wrote:
>
> > You do not structure Solr to represent your database. You structure it
> > to represent what you will search.
> >
> > In your case, it sounds like you want to return 'user-records', in
> > which case you will index the related information all together. Yes,
> > you will possibly need to recreate the multiple documents when you
> > update one record (or one user). And yes, you will have the same
> > information multiple times. But you can used index-only values or
> > docvalues to reduce storage and duplication.
> >
> > You may also want to have Solr return only the relevant IDs from the
> > search and you recreate the m-to-m object structure from the database.
> > Then, you don't need to store much at all, just index.
> >
> > Basically, don't think about your database as much when deciding Solr
> > structure. It does not map one-to-one.
> >
> > Regards,
> >Alex.
> > 
> > Newsletter and resources for Solr beginners and intermediates:
> > http://www.solr-start.com/
> >
> >
> > On 29 April 2016 at 20:48, Sandeep Mestry  wrote:
> > > Hi All,
> > >
> > > Hope the day is going on well for you.
> > >
> > > This question has been asked before, but I couldn't find answer to my
> > > specific request. I have many to many relationship and the mapping
> table
> > > has additional columns. Whats the best way I can model this into solr
> > > entity?
> > >
> > > For example: a user has many recordings and a recording belongs to many
> > > users. But each user-recording has additional feature like type, number
> > etc.
> > > I'd like to fetch recordings for the user. If the user adds/ updates/
> > > deletes a recording then that should be reflected in the search.
> > >
> > > I have 2 options:
> > > 1) to create user entity, recording entity and user_recording entity
> > > - this is good but it's like treating solr like rdbms which i mostly
> > avoid..
> > >
> > > 2) user entity containing all the recordings information and each
> > recording
> > > containing user information
> > > - this has impact on index size but the fetch and manipulation will be
> > > faster.
> > >
> > > Any guidance will be good..
> > >
> > > Thanks,
> > > Sandeep
> >
>


issue with query boost using qf and edismax

2015-07-21 Thread sandeep bonkra
 (MATCH)
ConstantScore(person_full_name:*louis*), product of:\n  1.0 =
boost\n  0.09805807 = queryNorm\n  0.125 = coord(1/8)\n",
  "11470328": "\n0.012257258 = (MATCH) product of:\n  0.09805807 =
(MATCH) sum of:\n0.09805807 = (MATCH)
ConstantScore(person_full_name:*louis*), product of:\n  1.0 =
boost\n  0.09805807 = queryNorm\n  0.125 = coord(1/8)\n",
  "11470331": "\n0.012257258 = (MATCH) product of:\n  0.09805807 =
(MATCH) sum of:\n0.09805807 = (MATCH)
ConstantScore(person_full_name:*louis*), product of:\n  1.0 =
boost\n  0.09805807 = queryNorm\n  0.125 = coord(1/8)\n"
},


So the issue is that sometimes it applies the filter, but not always. What
could be the reason?

Here is the tokenizer and filter I am using:

Any help on this would be appreciated.

Thanks,
Sandeep


How to split using multiple parameters

2015-06-12 Thread Sandeep Mellacheruvu
Hi,

I have a json document which has multiple json arrays and inner json
objects. From the documentation it seems like there is only one split
parameter. Following is the sample JSON that I have.

Can anyone help me with splitting this JSON? Also, I do not need some of the
fields, like websites; can I ignore such fields altogether?

{
  "groups": [
    {"name": "Airlines"},
    {"name": "Industrial Design"},
    {},
    {}
  ],
  "family_name": "Volante",
  "locality": "Chile",
  "industry": "Civil Engineering",
  "num_connections": "500+",
  "websites": [
    {"description": "Personal Website"}
  ],
  "summary": "Ingeniero Civil Industrial UC y Magister en Innovación UAI. Mis áreas de interés son la Innovación Empresarial, las Tecnologías de Información, la Gestión de Operaciones y Logística",
  "headline": "Value Senior Advisor en SAP",
  "given_name": "Martin",
  "full_name": "Martin Volante",
  "skills": [
    "Business Intelligence",
    "Team Leadership",
    "Business Strategy",
    "Project Management",
    "Management",
    "Business Process",
    "Business Analysis",
    "Change Management",
    "SOA",
    "Strategic Planning",
    "Software Project...",
    "Oracle",
    "Pre-sales",
    "Solution Architecture",
    "Management Consulting",
    "Project Planning",
    "IT Strategy"
  ],
  "experience": [
    {
      "end": "Present",
      "title": "Industry Value Engineering",
      "start": "September 2014",
      "location": "Santiago, Chile",
      "duration": "6 months",
      "organization": [
        {
          "name": "SAP",
          "profile_url": "http://www.linkedin.com/company/1115"
        }
      ]
    }
  ],
  "education": [
    {
      "start": "2010",
      "end": "2011",
      "name": "Universidad Adolfo Ibáñez",
      "degrees": [
        "Master en Innovación"
      ]
    }
  ]
}

Thanks,
Sandeep
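One option for the question above, when a single split parameter can't express all the arrays, is to flatten the JSON client-side before indexing and drop unwanted keys such as websites. A rough sketch; the flattening scheme and the "parent.child" field naming are assumptions, not Solr behaviour:

```python
def flatten(profile, ignore=("websites",)):
    """Flatten a nested profile into one flat doc, skipping unwanted keys;
    lists of objects become multivalued 'parent.child' fields."""
    doc = {}
    for key, value in profile.items():
        if key in ignore:
            continue
        if isinstance(value, list) and value and isinstance(value[0], dict):
            # list of objects: collect each inner field as a multivalued field
            for item in value:
                for k, v in item.items():
                    doc.setdefault(f"{key}.{k}", []).append(v)
        else:
            doc[key] = value  # scalars and lists of strings pass through
    return doc

doc = flatten({
    "full_name": "Martin Volante",
    "websites": [{"description": "Personal Website"}],
    "groups": [{"name": "Airlines"}, {"name": "Industrial Design"}, {}, {}],
    "skills": ["Oracle", "SOA"],
})
```

The flattened dict can then be posted as an ordinary single-level JSON document, with no split parameter needed at all.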


Re: Sorting in solr

2016-07-11 Thread Sandeep Mestry
Hi Naveen,

I am not too sure what you're after but the sorting mechanism is applied
after search results are fetched.

From the Solr Ref Guide:
https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-ThesortParameter

The sort parameter *arranges search results* in either ascending (asc) or
descending (desc) order.

Thanks,
Sandeep

On 11 July 2016 at 11:13, Naveen Pajjuri  wrote:

> Hi,
> If i apply some sorting order on solr. when are the Documents sorted.
>
>1. are documents sorted after fetching the results  ?
>2. or we get sorted documents ?
>
> Regards,
> Naveen
>


StrField with Wildcard Search

2016-09-08 Thread Sandeep Khanzode
Hello,
There are quite a few links that detail the difference between StrField and
TextField, and links that explain that, even though the field is indexed, it
is not tokenized and is stored as a single keyword, as can be verified via the
debug analysis on the Solr admin UI and the curl debugQuery option.
What I am unable to understand is how a wildcard works on StrFields. For
example, if the name is "John Doe" and I search for "John*", I get that match.
Which means that somewhere deep within, maybe a trie or dictionary
representation exists that allows this search with a partial string.
I would have assumed that wildcards would match on TextFields, which allow
(Edge)NGramFilters, etc.  -- SRK
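This matches how Lucene handles wildcards on a StrField: the whole value is one indexed term, and a wildcard query enumerates the term dictionary at search time looking for matches; no n-gram analysis is involved. A toy illustration, using fnmatch purely as a stand-in for Lucene's term enumeration:

```python
from fnmatch import fnmatchcase

# The indexed terms of a StrField: each whole value is a single term.
terms = ["John Doe", "Jane Doe", "Johnny Cash"]

def wildcard_match(terms, pattern):
    """Mimic wildcard handling on a StrField: scan the term dictionary and
    keep the full terms that match the pattern."""
    return [t for t in terms if fnmatchcase(t, pattern)]

print(wildcard_match(terms, "John*"))  # ['John Doe', 'Johnny Cash']
```

Note the scan is over whole terms, which is why a leading-wildcard pattern like `*Doe` also works but must visit every term.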

Re: StrField with Wildcard Search

2016-09-08 Thread Sandeep Khanzode
Hi,
Okay.
So it seems that a wildcard search performs a (sort-of) dictionary
search, where it inspects every (full keyword) token at search time and does a
match, instead of matching on pre-created index-time tokens as with TextField.
However, the wildcard/fuzzy functionality is provided either way... SRK

On Thursday, September 8, 2016 5:05 PM, Ahmet Arslan 
 wrote:
 

 Hi,

EdgeNGram and wildcard can be used to achieve the same goal: prefix search, or
starts-with search.

A wildcard enumerates the whole inverted index, so it may get slower for very
large indexes; on the other hand, no index-time manipulation is required.

EdgeNGram does its magic at index time, indexing a lot of tokens: all possible
prefixes. The index gets bigger, but at query time no wildcard operator is
required.

Ahmet
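The index-time expansion Ahmet describes can be illustrated like this; a toy rendering of what an edge n-gram filter emits, with illustrative parameter names:

```python
def edge_ngrams(token, min_gram=1, max_gram=None):
    """Emit all prefixes of a token, the way an edge n-gram filter would
    at index time (front-anchored grams only)."""
    max_gram = max_gram if max_gram is not None else len(token)
    return [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]

print(edge_ngrams("john"))  # ['j', 'jo', 'joh', 'john']
```

Every prefix becomes its own indexed term, so a prefix query is an exact term lookup instead of a dictionary scan: the query-time cost moves to index size.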



On Thursday, September 8, 2016 12:35 PM, Sandeep Khanzode 
 wrote:
Hello,
There are quite a few links that detail the difference between StrField and 
TextField. Also links that explain that, even though the field is indexed, it 
is not tokenized and stored as a single keyword, as can be verified by the 
debug analysis on Solr admin and CURL debugQuery options.
What I am unable to understand is how a wildcard works on StrFields? For 
example, if the name is "John Doe" and I search for "John*", I get that match. 
Which means, that somewhere deep within, maybe a Trie or Dictionary 
representation exists that allows this search with a partial string.
I would have assumed that wildcard would match on TextFields which allow 
(Edge)NGramFilters, etc.  -- SRK 


   

Custom Function-based Fields

2016-09-08 Thread Sandeep Khanzode
Hi,
Can someone please direct me to some documentation that shows how to do this 
... ?
I need to write a non-trivial function that will return a new custom (not in 
schema) field but which is more complicated than a simple sum/avg/etc. 
I want to create a function that looks at a few dateranges in the current 
records and return possible an enum or an integer ... 
Maybe something similar could also be helpful ...
Thanks. SRK

Solr DateRange Query with AND and different op types

2016-09-19 Thread Sandeep Khanzode
Hi,
Can I not query like this?

{!field f=schedule1 op=Contains}[1988-10-22T18:30:00Z TO *] AND -schedule3:[1988-10-22T18:30:00Z TO *] AND -schedule2:[1988-10-22T18:30:00Z TO *]

I keep getting parsing and date-math related errors.

If I change it to schedule1:[1988-10-22T18:30:00Z TO *] AND -schedule3:[1988-10-22T18:30:00Z TO *] AND -schedule2:[1988-10-22T18:30:00Z TO *], this works. But then I obviously have the functionality wrong (Intersects is the default).

Can I not mix and match multiple op types (like Contains, Within, Intersects) in an AND/OR joined query?
SRK

Re: Solr DateRange Query with AND and different op types

2016-09-19 Thread Sandeep Khanzode
Hi, Can someone please reply to my query? Let me know if it is not 
understandable. Thanks.
SRK 

On Monday, September 19, 2016 6:00 PM, Sandeep Khanzode 
 wrote:
 

 Hi,
Can I not query like this?
{!field f=schedule1 op=Contains}[1988-10-22T18:30:00Z TO *] AND 
-schedule3:[1988-10-22T18:30:00Z TO *] AND -schedule2:[1988-10-22T18:30:00Z TO 
*] I keep getting parsing and date math related errors.
If I change it to ...schedule1:[1988-10-22T18:30:00Z TO *] AND 
-schedule3:[1988-10-22T18:30:00Z TO *] AND -schedule2:[1988-10-22T18:30:00Z TO 
*]
... this works. But then I obviously have the functionality wrong (intersects 
is the default).
Can I not mix and match multiple op types (like contains, within, intersects) 
in a AND/OR joined query?
SRK

   

How to set NOT clause on Date range query in Solr

2016-09-20 Thread Sandeep Khanzode
Have been trying to understand this for a while... How can I specify a NOT clause in the following query?

{!field f=schedule op=Intersects}[2016-08-26T12:30:00Z TO 2016-08-26T18:30:00Z]
{!field f=schedule op=Contains}[2016-08-26T12:30:00Z TO 2016-08-26T18:30:00Z]

Without LocalParams, we can specify -DateField:[2016-08-26T12:30:00Z TO 2016-08-26T18:30:00Z] to get an equivalent NOT clause. But I need a NOT Contains date range query. I have tried a few options but I end up getting parsing errors. Surely there must be some obvious way I am missing. SRK
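As David Smiley explains later in this archive (the "Negative Date Query for Local Params" threads), a negated or non-leading {!field} clause must carry its range in the special v local-param, so the surrounding lucene parser never sees the brackets. A small sketch of building such clauses; the helper itself is illustrative:

```python
def range_clause(field, op, lower, upper, negate=False):
    """Build one {!field} clause with its own local-params; the range goes
    in the special 'v' param so it survives inside a larger boolean query."""
    clause = f"{{!field f={field} op={op} v='[{lower} TO {upper}]'}}"
    return ("-" if negate else "") + clause

# A NOT-Contains date range clause:
q = range_clause("schedule", "Contains",
                 "2016-08-26T12:30:00Z", "2016-08-26T18:30:00Z", negate=True)
print(q)
```

Because each clause is self-contained, several of them with different op types can then be joined with AND/OR in one query string.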

Negative Date Query for Local Params in Solr

2016-09-20 Thread Sandeep Khanzode
For Solr 6.1.0:

This works: -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z

This works: {!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]

Why does this not work? -{!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]
SRK

Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread Sandeep Khanzode
This is what I get ...

{
  "responseHeader": {
    "status": 400,
    "QTime": 1,
    "params": {
      "q": "-{!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]",
      "indent": "true",
      "wt": "json",
      "_": "1474373612202"
    }
  },
  "error": {
    "msg": "Invalid Date in Date Math String:'[2016-08-26T12:00:12Z'",
    "code": 400
  }
}
 SRK

On Tuesday, September 20, 2016 5:34 PM, David Smiley 
 wrote:
 

 It should, I think... what happens? Can you ascertain the nature of the
results?
~ David

On Tue, Sep 20, 2016 at 5:35 AM Sandeep Khanzode
 wrote:

> For Solr 6.1.0
> This works .. -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z
>
> This works .. {!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
> 2016-08-26T15:00:12Z]
>
>
> Why does this not work?-{!field f=schedule
> op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]
>  SRK

-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


   

Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread Sandeep Khanzode
Wow. Simply awesome!
Where can I read more about this? I am not sure I understand what is going on
behind the scenes: which parser is invoked for !field, how we can find out
which special local params exist, whether we should prefer edismax over
others, when the Lucene QParser is invoked in other conditions, etc. I would
appreciate it if you could point me to some references to catch up.
Thanks a lot ...  SRK

On Tuesday, September 20, 2016 5:54 PM, David
Smiley  wrote:
 

 OH!  Ok the moment the query no longer starts with "{!", the query is
parsed by defType (for 'q') and will default to lucene QParser.  So then it
appears we have a clause with a NOT operator.  In this parsing mode,
embedded "{!" terminates at the "}".  This means you can't put the
sub-query text after the "}", you instead need to put it in the special "v"
local-param.  e.g.:
-{!field f=schedule op=Contains v='[2016-08-26T12:00:12Z TO
2016-08-26T15:00:12Z]'}

On Tue, Sep 20, 2016 at 8:15 AM Sandeep Khanzode
 wrote:

> This is what I get ...
> { "responseHeader": { "status": 400, "QTime": 1, "params": { "q":
> "-{!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
> 2016-08-26T15:00:12Z]", "indent": "true", "wt": "json", "_":
> "1474373612202" } }, "error": { "msg": "Invalid Date in Date Math
> String:'[2016-08-26T12:00:12Z'", "code": 400 }}
>  SRK
>
>    On Tuesday, September 20, 2016 5:34 PM, David Smiley <
> david.w.smi...@gmail.com> wrote:
>
>
>  It should, I think... what happens? Can you ascertain the nature of the
> results?
> ~ David
>
> On Tue, Sep 20, 2016 at 5:35 AM Sandeep Khanzode
>  wrote:
>
> > For Solr 6.1.0
> > This works .. -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z
> >
> > This works .. {!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
> > 2016-08-26T15:00:12Z]
> >
> >
> > Why does this not work?-{!field f=schedule
> > op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]
> >  SRK
>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
>
>

-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


   

JSON Facet API

2016-09-20 Thread Sandeep Khanzode
Hello,
How can I specify JSON Facets in SolrJ? The below facet query for example ... 
&json.facet={
  facet1: {
    type: query,
    q: "field1:value1 AND field2:value2",
    facet: {
      facet1sub1: {
        type: query,
        q: "{!field f=mydate op=Intersects}2016-09-08T08:00:00",
        facet: {
          id: {
            type: terms,
            field: id
          }
        }
      },
      facet1sub2: {
        type: query,
        q: "-{!field f=myseconddate op=Intersects}2016-09-08T08:00:00 AND -{!field f=mydate op=Intersects}2016-05-08T08:00:00",
        facet: {
          id: {
            type: terms,
            field: id
          }
        }
      }
    }
  }
},

 SRK
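As Bram Van Dam's reply in this archive notes, SolrJ simply takes the JSON string as a request parameter (query.add("json.facet", ...)). One way to keep the quoting straight is to build the facet tree as plain data and serialize it; a sketch using a trimmed version of the facet above:

```python
import json

# A trimmed version of the facet tree above, as plain data.
facet = {
    "facet1": {
        "type": "query",
        "q": "field1:value1 AND field2:value2",
        "facet": {
            "facet1sub1": {
                "type": "query",
                "q": "{!field f=mydate op=Intersects}2016-09-08T08:00:00",
                "facet": {"id": {"type": "terms", "field": "id"}},
            }
        },
    }
}

# The serialized string is the value for the 'json.facet' request parameter.
param = json.dumps(facet)
```

Serializing from a structure avoids hand-escaping the nested quotes inside the local-params queries.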

Re: Negative Date Query for Local Params in Solr

2016-09-20 Thread Sandeep Khanzode
Thanks, David! Perhaps browsing the Solr sources may be a necessity at some 
point in time. :) SRK 

On Wednesday, September 21, 2016 9:08 AM, David Smiley 
 wrote:
 

 So that page referenced describes local-params, and describes the special
"v" local-param.  But first, see a list of all query parsers (which lists
"field"): https://cwiki.apache.org/confluence/display/solr/Other+Parsers
and
https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser for
the "lucene" one.

The "op" param is rather unique... it's not defined by any query parser.  A
trick is done in which a custom field type (DateRangeField in this case) is
able to inspect the local-params, and thus define and use params it needs.
https://cwiki.apache.org/confluence/display/solr/Working+with+Dates "More
DateRangeField Details" mentions "op".  {!lucene df=dateRange
op=Contains}... would also work.  I don't know of any other local-param
used in this way.

On Tue, Sep 20, 2016 at 11:21 PM David Smiley 
wrote:

> Personally I learned this by pouring over Solr's source code some time
> ago.  I suppose the only official reference to this stuff is:
>
> https://cwiki.apache.org/confluence/display/solr/Local+Parameters+in+Queries
> But that page doesn't address the implications for when the syntax is a
> clause of a larger query instead of being the whole query (i.e. has "{!"...
> but but not at the first char).
>
> On Tue, Sep 20, 2016 at 2:06 PM Sandeep Khanzode
>  wrote:
>
>> Wow. Simply awesome!
>> Where can I read more about this? I am not sure whether I understand what
>> is going on behind the scenes ... like which parser is invoked for !field,
>> how can we know which all special local params exist, whether we should
>> prefer edismax over others, when is the LuceneQParser invoked in other
>> conditions, etc? Would appreciate if you could indicate some references to
>> catch up.
>> Thanks a lot ...  SRK
>>
>>  Show original message    On Tuesday, September 20, 2016 5:54 PM, David
>> Smiley  wrote:
>>
>>
>>  OH!  Ok the moment the query no longer starts with "{!", the query is
>> parsed by defType (for 'q') and will default to lucene QParser.  So then
>> it
>> appears we have a clause with a NOT operator.  In this parsing mode,
>> embedded "{!" terminates at the "}".  This means you can't put the
>> sub-query text after the "}", you instead need to put it in the special
>> "v"
>> local-param.  e.g.:
>> -{!field f=schedule op=Contains v='[2016-08-26T12:00:12Z TO
>> 2016-08-26T15:00:12Z]'}
>>
>> On Tue, Sep 20, 2016 at 8:15 AM Sandeep Khanzode
>>  wrote:
>>
>> > This is what I get ...
>> > { "responseHeader": { "status": 400, "QTime": 1, "params": { "q":
>> > "-{!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
>> > 2016-08-26T15:00:12Z]", "indent": "true", "wt": "json", "_":
>> > "1474373612202" } }, "error": { "msg": "Invalid Date in Date Math
>> > String:'[2016-08-26T12:00:12Z'", "code": 400 }}
>> >  SRK
>> >
>> >    On Tuesday, September 20, 2016 5:34 PM, David Smiley <
>> > david.w.smi...@gmail.com> wrote:
>> >
>> >
>> >  It should, I think... what happens? Can you ascertain the nature of the
>> > results?
>> > ~ David
>> >
>> > On Tue, Sep 20, 2016 at 5:35 AM Sandeep Khanzode
>> >  wrote:
>> >
>> > > For Solr 6.1.0
>> > > This works .. -{!field f=schedule op=Intersects}2016-08-26T12:00:56Z
>> > >
>> > > This works .. {!field f=schedule op=Contains}[2016-08-26T12:00:12Z TO
>> > > 2016-08-26T15:00:12Z]
>> > >
>> > >
>> > > Why does this not work?-{!field f=schedule
>> > > op=Contains}[2016-08-26T12:00:12Z TO 2016-08-26T15:00:12Z]
>> > >  SRK
>> >
>> > --
>> > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>> > LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>> > http://www.solrenterprisesearchserver.com
>> >
>> >
>> >
>>
>> --
>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>> http://www.solrenterprisesearchserver.com
>>
>>
>>
>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


   

Re: JSON Facet API

2016-09-21 Thread Sandeep Khanzode
Thanks a lot, Bram. I will try that ...  SRK 

On Wednesday, September 21, 2016 11:57 AM, Bram Van Dam 
 wrote:
 

 On 21/09/16 05:40, Sandeep Khanzode wrote:
> How can I specify JSON Facets in SolrJ? The below facet query for example ... 

SolrQuery query = new SolrQuery();
query.add("json.facet", jsonStringGoesHere);

 - Bram




   

-field1:value1 OR field2:value2

2016-09-26 Thread Sandeep Khanzode
Hi,
If I query for
-field1:value1 ... I get, say, 100 records
and if I query for
field2:value2 ... I may get 200 records

I would assume that if I query for
-field1:value1 OR field2:value2

... I should get at least 100 records (and, if the two sets don't overlap, up to 300
records). I am assuming that the default operator is OR.
 But I do not ...
The result is that I get fewer than 100. If I didn't know better, I would say
that an AND is being done.

I am expecting records that EITHER do NOT contain field1:value1 OR do
contain field2:value2.

Please let me know what I am missing. Thanks.

SRK

Re: -field1:value1 OR field2:value2

2016-09-26 Thread Sandeep Khanzode
Yup. That works. So does (*:* NOT ...)
Thanks, Alex.  SRK 

On Monday, September 26, 2016 3:03 PM, Alexandre Rafalovitch 
 wrote:
 

Try field2:value2 OR (*:* -field1:value1)

There is a magic in negative query syntax that breaks down when it
gets more complex. It's been discussed on the mailing list a bunch of
times, though the discussions are hard to find by title.

Regards,
    Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/
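The drop in counts can be modelled with plain set algebra: a purely negative clause inside a Lucene BooleanQuery selects nothing on its own, so it contributes no documents to the OR; anchoring it to *:* restores the intended complement. A toy model:

```python
# Model the index as sets of doc ids.
docs = {1, 2, 3, 4, 5}
field1_value1 = {1, 2}   # docs matching field1:value1
field2_value2 = {2, 3}   # docs matching field2:value2

# A pure negative clause matches nothing by itself, so
# "-field1:value1 OR field2:value2" degenerates to just field2:value2.
naive = set() | field2_value2

# The rewrite "(*:* -field1:value1) OR field2:value2" anchors the
# negation to the set of all documents first.
rewritten = (docs - field1_value1) | field2_value2

print(sorted(naive), sorted(rewritten))  # [2, 3] [2, 3, 4, 5]
```

This also shows why AND with a negative clause still behaves: the positive clause supplies the base set for the exclusion to subtract from.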


On 26 September 2016 at 16:06, Sandeep Khanzode
 wrote:
> Hi,
> If I query for
> -field1=value1 ... I get, say, 100 records
> and if I query for
> field2:value2 ... I may get 200 records
>
> I would assume that if I query for
> -field1:value1 OR field2:value2
>
> ... I should get atleast 100 records (assuming they overlap, if not, upto 300 
> records). I am assuming that the default joining is OR.
>  But I do not ...
> The result is that I get less than 100. If I didn't know better, I would have 
> said that an AND is being done.
>
> I am expecting records that EITHER do NOT contain field1:value1 OR which 
> contain field2:value2.
>
> Please let me know what I am missing. Thanks.
>
> SRK


   

Re: -field1:value1 OR field2:value2

2016-09-26 Thread Sandeep Khanzode
Hi Alex,
It seems that this is not an issue with the AND clause. For example, if I do ...
field1:value1 AND -field2:value2
... the results seem to be the intersection of both.
Is this an issue only with OR? Is that why we replace it with an implicit (*:*
NOT)? SRK

On Monday, September 26, 2016 3:09 PM, Sandeep Khanzode 
 wrote:
 

 Yup. That works. So does (*:* NOT ...)
Thanks, Alex.  SRK 

    On Monday, September 26, 2016 3:03 PM, Alexandre Rafalovitch 
 wrote:
 

 Try field2:value2 OR (*:* -field1=value1)

There is a magic in negative query syntax that breaks down when it
gets more complex. It's been discussed on the mailing list a bunch of
times, though the discussions are hard to find by title.

Regards,
    Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 26 September 2016 at 16:06, Sandeep Khanzode
 wrote:
> Hi,
> If I query for
> -field1=value1 ... I get, say, 100 records
> and if I query for
> field2:value2 ... I may get 200 records
>
> I would assume that if I query for
> -field1:value1 OR field2:value2
>
> ... I should get atleast 100 records (assuming they overlap, if not, upto 300 
> records). I am assuming that the default joining is OR.
>  But I do not ...
> The result is that I get less than 100. If I didn't know better, I would have 
> said that an AND is being done.
>
> I am expecting records that EITHER do NOT contain field1:value1 OR which 
> contain field2:value2.
>
> Please let me know what I am missing. Thanks.
>
> SRK


  

   

Re: -field1:value1 OR field2:value2

2016-09-26 Thread Sandeep Khanzode
Sure. Noted. 
Thanks for the link ...  SRK 

On Monday, September 26, 2016 8:29 PM, Erick Erickson 
 wrote:
 

 Please do not cross post to multiple lists, it's considered bad
etiquette.

Solr does not implement strict boolean logic, please read:

https://lucidworks.com/blog/2011/12/28/why-not-and-or-and-not/

Best,
Erick

On Mon, Sep 26, 2016 at 2:58 AM, Alexandre Rafalovitch
 wrote:
> I don't remember specifically :-(. Search the archives
> http://search-lucene.com/ or follow-up on Solr Users list. Remember to
> mention the version of Solr, as there were some bugs/features/fixes
> with OR, I think.
>
> Regards,
>  Alex.
> 
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 26 September 2016 at 16:56, Sandeep Khanzode
>  wrote:
>> Hi Alex,
>> It seems that this is not an issue with AND clause. For example, if I do ...
>> field1:value1 AND -field2:value2
>> ... the results seem to be an intersection of both.
>> Is this an issue with OR? Which is which we replace it with an implicit (*:* 
>> NOT)? SRK
>>
>>    On Monday, September 26, 2016 3:09 PM, Sandeep Khanzode 
>> wrote:
>>
>>
>>  Yup. That works. So does (*:* NOT ...)
>> Thanks, Alex.  SRK
>>
>>    On Monday, September 26, 2016 3:03 PM, Alexandre Rafalovitch 
>> wrote:
>>
>>
>>  Try field2:value2 OR (*:* -field1=value1)
>>
>> There is a magic in negative query syntax that breaks down when it
>> gets more complex. It's been discussed on the mailing list a bunch of
>> times, though the discussions are hard to find by title.
>>
>> Regards,
>>    Alex.
>> 
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/
>>
>>
>> On 26 September 2016 at 16:06, Sandeep Khanzode
>>  wrote:
>>> Hi,
>>> If I query for
>>> -field1=value1 ... I get, say, 100 records
>>> and if I query for
>>> field2:value2 ... I may get 200 records
>>>
>>> I would assume that if I query for
>>> -field1:value1 OR field2:value2
>>>
>>> ... I should get atleast 100 records (assuming they overlap, if not, upto 
>>> 300 records). I am assuming that the default joining is OR.
>>>  But I do not ...
>>> The result is that I get less than 100. If I didn't know better, I would 
>>> have said that an AND is being done.
>>>
>>> I am expecting records that EITHER do NOT contain field1:value1 OR which 
>>> contain field2:value2.
>>>
>>> Please let me know what I am missing. Thanks.
>>>
>>> SRK
>>
>>
>>
>>
>>


   

SolrJ HTTPS problem

2016-10-31 Thread sandeep mukherjee
I followed the steps to make Solr SSL-enabled. I'm able to hit Solr at:
https://localhost:8985/solr/problem/select?indent=on&q=*:*&wt=json

For accessing it through the Solr client, I created it as follows:

System.setProperty("javax.net.ssl.keyStore", "/path/to/solr/server/etc/solr-ssl.keystore.jks");
System.setProperty("javax.net.ssl.keyStorePassword", "secret");
System.setProperty("javax.net.ssl.trustStore", "/path/to/solr/server/etc/solr-ssl.keystore.jks");
System.setProperty("javax.net.ssl.trustStorePassword", "secret");
return new CloudSolrClient.Builder()
    .withZkHost(solrConfig.getConnectString()).build();

The path to the keystore and truststore is correct. However, I still get the
following error:

Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192) ~[na:1.8.0_45]
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1937) ~[na:1.8.0_45]
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302) ~[na:1.8.0_45]
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296) ~[na:1.8.0_45]
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1478) ~[na:1.8.0_45]
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:212) ~[na:1.8.0_45]
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979) ~[na:1.8.0_45]
at sun.security.ssl.Handshaker.process_record(Handshaker.java:914) ~[na:1.8.0_45]
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1050) ~[na:1.8.0_45]
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1363) ~[na:1.8.0_45]
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1391) ~[na:1.8.0_45]
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1375) ~[na:1.8.0_45]
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:543) ~[httpclient-4.5.1.jar:4.5.1]
at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:409) ~[httpclient-4.5.1.jar:4.5.1]
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:177) ~[httpclient-4.5.1.jar:4.5.1]
at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:304) ~[httpclient-4.5.1.jar:4.5.1]
at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:611) ~[httpclient-4.5.1.jar:4.5.1]
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:446) ~[httpclient-4.5.1.jar:4.5.1]
at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:882) ~[httpclient-4.5.1.jar:4.5.1]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) ~[httpclient-4.5.1.jar:4.5.1]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107) ~[httpclient-4.5.1.jar:4.5.1]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55) ~[httpclient-4.5.1.jar:4.5.1]
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:495) ~[solr-solrj-6.1.0.jar:6.1.0 4726c5b2d2efa9ba160b608d46a977d0a6b83f94 - jpountz - 2016-06-13 09:46:59]
... 26 common frames omitted
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:387) ~[na:1.8.0_45]
at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:292) ~[na:1.8.0_45]
at sun.security.validator.Validator.validate(Validator.java:260) ~[na:1.8.0_45]
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324) ~[na:1.8.0_45]
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:229) ~[na:1.8.0_45]
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:124) ~[na:1.8.0_45]
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1460) ~[na:1.8.0_45]
... 44 common frames omitted
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:145) ~[na:1.8.0_45]
at 
sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:131)
 ~[na:1.8.0_45]
at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280) 
~[na:1.8.0_45]
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:382) 
~[na:1.8.0_45]
... 50 common frames omitted

What am I missing?
Thanks, Sandeep

Basic Auth for Solr Streaming Expressions

2016-11-09 Thread sandeep mukherjee
Hello everyone,
I'm trying to find the documentation for the Basic Auth plugin for Solr streaming expressions, but I'm not able to find it anywhere. Could you please point me in the right direction for how to enable basic auth for Solr streams?
I'm creating the StreamFactory as follows, and I wonder how and where I can specify the Basic Auth username and password:
@Bean
public StreamFactory streamFactory() {
SolrConfig solrConfig = ConfigManager.getNamedConfig("solr", 
SolrConfig.class);

return new StreamFactory().withDefaultZkHost(solrConfig.getConnectString())
.withFunctionName("gatherNodes", GatherNodesStream.class);
}

Re: solrj Https problem

2016-11-09 Thread sandeep mukherjee
Thanks, Bryan, for the response. That seems to have solved it.

On Monday, October 31, 2016 6:58 PM, Bryan Bende  wrote:
 

 A possible problem might be that your certificate was generated for
"localhost" which is why it works when you go to https://localhost:8985/solr
in your browser, but when SolrJ gets the cluster information from ZooKeeper
the hostnames of the Solr nodes might be using an IP address which won't
work when the SSL/TLS negotiation happens.

If this is the problem you will want to specify the hostname for Solr to
use when starting each node by passing "-h localhost".

-Bryan
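
The hostname-mismatch failure mode described above can be illustrated with a trivial sketch. This is not SolrJ or JSSE code; real TLS hostname verification follows RFC 6125 and happens inside the SSL stack, and the class below is purely hypothetical:

```java
public class HostCheck {

    // Grossly simplified: a certificate issued for "localhost" only
    // matches a URL whose host part is literally "localhost".
    static boolean hostnameMatches(String certHost, String urlHost) {
        return certHost.equalsIgnoreCase(urlHost);
    }

    public static void main(String[] args) {
        System.out.println(hostnameMatches("localhost", "localhost")); // true
        // ZooKeeper may advertise the node by IP, which no longer matches:
        System.out.println(hostnameMatches("localhost", "127.0.0.1")); // false
    }
}
```

Starting each node with "-h localhost" keeps the advertised host in line with the certificate.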

On Mon, Oct 31, 2016 at 1:05 PM, sandeep mukherjee <
wiredcit...@yahoo.com.invalid> wrote:

> I followed the steps to make the solr SSL enabled. I'm able to hit solr
> at: https://localhost:8985/solr/problem/select?indent=on&q=*:*&wt=json And
> for accessing it through Solr Client I created it as
> follows:System.setProperty("javax.net.ssl.keyStore",
> "/path/to/solr/server/etc/solr-ssl.keystore.jks");
> System.setProperty("javax.net.ssl.keyStorePassword", "secret");
> System.setProperty("javax.net.ssl.trustStore", "/path/to/solr/server/etc/
> solr-ssl.keystore.jks");
> System.setProperty("javax.net.ssl.trustStorePassword", "secret");
> return new CloudSolrClient.Builder()
>        .withZkHost(solrConfig.getConnectString()).build(); The path to
> the keystore and truststore is correct.  However I still get the following
> error:Caused by: javax.net.ssl.SSLHandshakeException:
> sun.security.validator.ValidatorException: PKIX path building failed:
> sun.security.provider.certpath.SunCertPathBuilderException: unable to
> find valid certification path to requested target
> at sun.security.ssl.Alerts.getSSLException(Alerts.java:192) ~[na:1.8.0_45]
> at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1937)
> ~[na:1.8.0_45]
> at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302) ~[na:1.8.0_45]
> at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296) ~[na:1.8.0_45]
> at 
> sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1478)
> ~[na:1.8.0_45]
> at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:212)
> ~[na:1.8.0_45]
> at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979)
> ~[na:1.8.0_45]
> at sun.security.ssl.Handshaker.process_record(Handshaker.java:914)
> ~[na:1.8.0_45]
> at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1050)
> ~[na:1.8.0_45]
> at 
> sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1363)
> ~[na:1.8.0_45]
> at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1391)
> ~[na:1.8.0_45]
> at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1375)
> ~[na:1.8.0_45]
> at 
> org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:543)
> ~[httpclient-4.5.1.jar:4.5.1]
> at 
> org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:409)
> ~[httpclient-4.5.1.jar:4.5.1]
> at org.apache.http.impl.conn.DefaultClientConnectionOperato
> r.openConnection(DefaultClientConnectionOperator.java:177)
> ~[httpclient-4.5.1.jar:4.5.1]
> at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(
> ManagedClientConnectionImpl.java:304) ~[httpclient-4.5.1.jar:4.5.1]
> at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(
> DefaultRequestDirector.java:611) ~[httpclient-4.5.1.jar:4.5.1]
> at org.apache.http.impl.client.DefaultRequestDirector.execute(
> DefaultRequestDirector.java:446) ~[httpclient-4.5.1.jar:4.5.1]
> at 
> org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:882)
> ~[httpclient-4.5.1.jar:4.5.1]
> at 
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
> ~[httpclient-4.5.1.jar:4.5.1]
> at 
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
> ~[httpclient-4.5.1.jar:4.5.1]
> at 
> org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
> ~[httpclient-4.5.1.jar:4.5.1]
> at org.apache.solr.client.solrj.impl.HttpSolrClient.
> executeMethod(HttpSolrClient.java:495) ~[solr-solrj-6.1.0.jar:6.1.0
> 4726c5b2d2efa9ba160b608d46a977d0a6b83f94 - jpountz - 2016-06-13 09:46:59]
> ... 26 common frames omitted
> Caused by: sun.security.validator.ValidatorException: PKIX path building
> failed: sun.security.provider.certpath.SunCertPathBuilderException:
> unable to find valid certification path to requested target
> at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:387)
> ~[na:1.8.0_45]
> at sun.security.validator.PKIXValidator.en

Re: Basic Auth for Solr Streaming Expressions

2016-11-09 Thread sandeep mukherjee
I have made more progress since my last mail. I figured out that the StreamContext object has a way to set the SolrClientCache object, which keeps references to all the CloudSolrClient instances, and where I could set a reference to an HttpClient that adds the Basic Auth header. The problem is that there is no way to put your own BasicAuth-enabled CloudSolrClient inside the SolrClientCache: it has no set method that takes a CloudSolrClient object.
So, long story short, we need an API in SolrClientCache to accept a CloudSolrClient object from the user.
Please let me know if there is a better way to enable Basic Auth when using StreamFactory as mentioned in my previous email.
Thanks much,
Sandeep

On Wednesday, November 9, 2016 11:44 AM, sandeep mukherjee 
 wrote:
 

 Hello everyone,
I'm trying to find the documentation for the Basic Auth plugin for Solr streaming expressions, but I'm not able to find it anywhere. Could you please point me in the right direction for how to enable basic auth for Solr streams?
I'm creating the StreamFactory as follows, and I wonder how and where I can specify the Basic Auth username and password:
@Bean
public StreamFactory streamFactory() {
    SolrConfig solrConfig = ConfigManager.getNamedConfig("solr", 
SolrConfig.class);

    return new StreamFactory().withDefaultZkHost(solrConfig.getConnectString())
            .withFunctionName("gatherNodes", GatherNodesStream.class);
}

   

Wildcard searches with space in TextField/StrField

2016-11-10 Thread Sandeep Khanzode
Hi,
How does a search like abc* work on a StrField? Since the entire value is stored as a single token, is there a trie-like structure that allows such wildcard matching?
How can searches with a space, like 'a b*', be executed on text fields (tokenized on whitespace)? If we specify this type of query, it is broken down into two queries, field:a and field:b*. I would like them to be contiguous, sort of like a phrase search with a wildcard.
SRK

Re: Wildcard searches with space in TextField/StrField

2016-11-10 Thread Sandeep Khanzode
Hi Erick, Reth,

The 'a\ b*' as well as the q.op=AND approach worked (successfully) only for 
StrField for me.

Any attempt at creating a 'a\ b*' for a TextField does not match any documents. 
The parsedQuery in debug mode does show 'field:a b*'. I am sure there are 
documents that should match.
Another (maybe unrelated) observation: if I have 'field:a\ b', then the parsedQuery is field:a field:b, which does not match as expected (the terms match individually).

Can you please provide an example that I can use in Solr Query dashboard? That 
will be helpful. 

I have also seen that wildcard queries work irrespective of field type, i.e., StrField as well as TextField. That makes sense because a WhitespaceTokenizer only creates word boundaries when we do not use an EdgeNGramFilter. If I am not wrong, that is.
SRK

On Friday, November 11, 2016 5:00 AM, Erick Erickson 
 wrote:
 

 You can escape the space with a backslash as  'a\ b*'

Best,
Erick

On Thu, Nov 10, 2016 at 2:37 PM, Reth RM  wrote:
> I don't think you can do wildcard on StrField. For text field, if your
> query is "category:(test m*)"  the parsed query will be  "category:test OR
> category:m*"
> You can add q.op=AND to make an AND between those terms.
>
> For phrase type wild card query support, as per docs, it
> is ComplexPhraseQueryParser that supports it. (I haven't tested it myself)
>
> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser
>
> On Thu, Nov 10, 2016 at 11:40 AM, Sandeep Khanzode <
> sandeep_khanz...@yahoo.com.invalid> wrote:
>
>> Hi,
>> How does a search like abc* work in StrField. Since the entire thing is
>> stored as a single token, is it a type of a trie structure that allows such
>> wildcard matching?
>> How can searches with space like 'a b*' be executed for text fields
>> (tokenized on whitespace)? If we specify this type of query, it is broken
>> down into two queries with field:a and field:b*. I would like them to be
>> contiguous, sort of, like a phrase search with wild card.
>> SRK


   

Re: Wildcard searches with space in TextField/StrField

2016-11-12 Thread Sandeep Khanzode
Thanks, Erick.
I am actually not trying to use the String field (prefer a TextField here). 
But, in my comparisons with TextField, it seems that something like phrase 
matching with whitespace and wildcard (like, 'my do*' or say, 'my dog*', or 
say, 'my dog has*') can only be accomplished with a string type field, 
especially because, with a WhitespaceTokenizer in TextField, the space will be 
lost, and all tokens will be individually considered. Am I missing something? 
SRK 

On Friday, November 11, 2016 10:05 PM, Erick Erickson 
 wrote:
 

 You have to query text and string fields differently, that's just the
way it works. The problem is getting the query string through the
parser as a _single_ token or as multiple tokens.

Let's say you have a string field with the "a b" example. You have a
single token
a b that starts at offset 0.

But with a text field, you have two tokens,
a at position 0
b at position 1

But when the query parser sees "a b" (without quotes) it splits it
into two tokens, and only the text field has both tokens so the string
field won't match.

OTOH, when the query parser sees "a\ b" it passes this through as a
single token, which only matches the string field as there's no
_single_ token "a b" in the text field.

But a more interesting question is why you want to search this way.
String fields are intended for keywords, machine-generated IDs and the
like. They're pretty useless for searching anything except
1> exact tokens
2> prefixes

While if you have "my dog has fleas" in a string field, you _can_
search "*dog*" and get a hit but the performance is poor when you get
a large corpus. Performance for "my*" will be pretty good though.

In all this sounds like an XY problem, what's the use-case you're
trying to solve?

Best,
Erick



On Thu, Nov 10, 2016 at 10:11 PM, Sandeep Khanzode
 wrote:
> Hi Erick, Reth,
>
> The 'a\ b*' as well as the q.op=AND approach worked (successfully) only for 
> StrField for me.
>
> Any attempt at creating a 'a\ b*' for a TextField does not match any 
> documents. The parsedQuery in debug mode does show 'field:a b*'. I am sure 
> there are documents that should match.
> Another (maybe unrelated) observation is if I have 'field:a\ b', then the 
> parsedQuery is field:a field:b. Which does not match as expected (matches 
> individually).
>
> Can you please provide an example that I can use in Solr Query dashboard? 
> That will be helpful.
>
> I have also seen that wildcard queries work irrespective of field type i.e. 
> StrField as well as TextField. That makes sense because with a 
> WhitespaceTokenizer only creates word boundaries when we do not use a 
> EdgeNGramFilter. If I am not wrong, that is. SRK
>
>    On Friday, November 11, 2016 5:00 AM, Erick Erickson 
> wrote:
>
>
>  You can escape the space with a backslash as  'a\ b*'
>
> Best,
> Erick
>
> On Thu, Nov 10, 2016 at 2:37 PM, Reth RM  wrote:
>> I don't think you can do wildcard on StrField. For text field, if your
>> query is "category:(test m*)"  the parsed query will be  "category:test OR
>> category:m*"
>> You can add q.op=AND to make an AND between those terms.
>>
>> For phrase type wild card query support, as per docs, it
>> is ComplexPhraseQueryParser that supports it. (I haven't tested it myself)
>>
>> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser
>>
>> On Thu, Nov 10, 2016 at 11:40 AM, Sandeep Khanzode <
>> sandeep_khanz...@yahoo.com.invalid> wrote:
>>
>>> Hi,
>>> How does a search like abc* work in StrField. Since the entire thing is
>>> stored as a single token, is it a type of a trie structure that allows such
>>> wildcard matching?
>>> How can searches with space like 'a b*' be executed for text fields
>>> (tokenized on whitespace)? If we specify this type of query, it is broken
>>> down into two queries with field:a and field:b*. I would like them to be
>>> contiguous, sort of, like a phrase search with wild card.
>>> SRK
>
>
>
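
Erick's token-position explanation above can be sketched in plain Java. This is illustrative only, not Solr's actual analyzer code, and the class and method names are made up:

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of how the two field types "see" the value "a b".
public class TokenSketch {

    // TextField with a whitespace tokenizer: one token per word.
    static List<String> textFieldTokens(String value) {
        return Arrays.asList(value.split("\\s+"));
    }

    // StrField: the whole value is a single token.
    static List<String> strFieldTokens(String value) {
        return List.of(value);
    }

    public static void main(String[] args) {
        System.out.println(textFieldTokens("a b")); // [a, b] -> no single "a b" token to match
        System.out.println(strFieldTokens("a b"));  // [a b]  -> 'a\ b' matches the one token
    }
}
```

This is why the escaped query 'a\ b*' can only find a single token in the StrField, never in the whitespace-tokenized TextField.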

   

Re: Basic Auth for Solr Streaming Expressions

2016-11-16 Thread sandeep mukherjee
Nope, never got past the login screen. Will create one today.


Sent from Yahoo Mail for iPhone


On Wednesday, November 16, 2016, 8:17 AM, Kevin Risden 
 wrote:

Was a JIRA ever created for this? I couldn't find it searching.

One that is semi related is SOLR-8213 for SolrJ JDBC auth.

Kevin Risden

On Wed, Nov 9, 2016 at 8:25 PM, Joel Bernstein  wrote:

> Thanks for digging into this, let's create a jira ticket for this.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Wed, Nov 9, 2016 at 6:23 PM, sandeep mukherjee <
> wiredcit...@yahoo.com.invalid> wrote:
>
> > I have more progress since my last mail. I figured out that  in the
> > StreamContext object there is a way to set the SolrClientCache object
> which
> > keep reference to all the CloudSolrClient where I can set a reference to
> > HttpClient which sets the Basic Auth header. However the problem is,
> inside
> > the SolrClientCache there is no way to set your own version of
> > CloudSolrClient with BasicAuth enabled. Unfortunately, SolrClientCache
> has
> > no set method which takes a CloudSolrClient object.
> > So long story short we need an API in SolrClientCache to
> > accept CloudSolrClient object from user.
> > Please let me know if there is a better way to enable Basic Auth when
> > using StreamFactory as mentioned in my previous email.
> > Thanks much,Sandeep
> >
> >    On Wednesday, November 9, 2016 11:44 AM, sandeep mukherjee
> >  wrote:
> >
> >
> >  Hello everyone,
> > I trying to find the documentation for Basic Auth plugin for Solr
> > Streaming expressions. But I'm not able to find it in the documentation
> > anywhere. Could you please point me in right direction of how to enable
> > Basic auth for Solr Streams?
> > I'm creating StreamFactory as follows: I wonder how and where can I
> > specify Basic Auth username and password
> > @Bean
> > public StreamFactory streamFactory() {
> >    SolrConfig solrConfig = ConfigManager.getNamedConfig("solr",
> > SolrConfig.class);
> >
> >    return new StreamFactory().withDefaultZkHost(solrConfig.
> > getConnectString())
> >            .withFunctionName("gatherNodes", GatherNodesStream.class);
> > }
> >
> >
> >
>

 



Re: Basic Auth for Solr Streaming Expressions

2016-11-16 Thread sandeep mukherjee
[SOLR-9779] Basic auth in not supported in Streaming Expressions - ASF JIRA

I have created the above JIRA ticket for the basic auth support in Solr streaming expressions.
Thanks, Sandeep

On Wednesday, November 16, 2016 8:22 AM, sandeep mukherjee 
 wrote:
 

Nope, never got past the login screen. Will create one today.


Sent from Yahoo Mail for iPhone


On Wednesday, November 16, 2016, 8:17 AM, Kevin Risden 
 wrote:

Was a JIRA ever created for this? I couldn't find it searching.

One that is semi related is SOLR-8213 for SolrJ JDBC auth.

Kevin Risden

On Wed, Nov 9, 2016 at 8:25 PM, Joel Bernstein  wrote:

> Thanks for digging into this, let's create a jira ticket for this.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Wed, Nov 9, 2016 at 6:23 PM, sandeep mukherjee <
> wiredcit...@yahoo.com.invalid> wrote:
>
> > I have more progress since my last mail. I figured out that  in the
> > StreamContext object there is a way to set the SolrClientCache object
> which
> > keep reference to all the CloudSolrClient where I can set a reference to
> > HttpClient which sets the Basic Auth header. However the problem is,
> inside
> > the SolrClientCache there is no way to set your own version of
> > CloudSolrClient with BasicAuth enabled. Unfortunately, SolrClientCache
> has
> > no set method which takes a CloudSolrClient object.
> > So long story short we need an API in SolrClientCache to
> > accept CloudSolrClient object from user.
> > Please let me know if there is a better way to enable Basic Auth when
> > using StreamFactory as mentioned in my previous email.
> > Thanks much,Sandeep
> >
> >    On Wednesday, November 9, 2016 11:44 AM, sandeep mukherjee
> >  wrote:
> >
> >
> >  Hello everyone,
> > I trying to find the documentation for Basic Auth plugin for Solr
> > Streaming expressions. But I'm not able to find it in the documentation
> > anywhere. Could you please point me in right direction of how to enable
> > Basic auth for Solr Streams?
> > I'm creating StreamFactory as follows: I wonder how and where can I
> > specify Basic Auth username and password
> > @Bean
> > public StreamFactory streamFactory() {
> >    SolrConfig solrConfig = ConfigManager.getNamedConfig("solr",
> > SolrConfig.class);
> >
> >    return new StreamFactory().withDefaultZkHost(solrConfig.
> > getConnectString())
> >            .withFunctionName("gatherNodes", GatherNodesStream.class);
> > }
> >
> >
> >
>

 


   

Re: Wildcard searches with space in TextField/StrField

2016-11-22 Thread Sandeep Khanzode
Hi Erick,
I gave this a try. 
These are my results. There is a record with "John D. Smith", and another named 
"John Doe".

1.] {!complexphrase inOrder=true}name:"John D.*" ... does not fetch any results.
2.] {!complexphrase inOrder=true}name:"John D*" ... fetches both results.

Second observation: there is a record with "John D Smith".
1.] {!complexphrase inOrder=true}name:"John*" ... does not fetch any results.
2.] {!complexphrase inOrder=true}name:"John D*" ... fetches that record.
3.] {!complexphrase inOrder=true}name:"John D S*" ... fetches that record.

SRK

On Sunday, November 13, 2016 7:43 AM, Erick Erickson 
 wrote:
 

 Right, for that kind of use case you want complexPhraseQueryParser,
see: 
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

Best,
Erick

On Sat, Nov 12, 2016 at 9:39 AM, Sandeep Khanzode
 wrote:
> Thanks, Erick.
>
> I am actually not trying to use the String field (prefer a TextField here).
> But, in my comparisons with TextField, it seems that something like phrase
> matching with whitespace and wildcard (like, 'my do*' or say, 'my dog*', or
> say, 'my dog has*') can only be accomplished with a string type field,
> especially because, with a WhitespaceTokenizer in TextField, the space will
> be lost, and all tokens will be individually considered. Am I missing
> something?
>
> SRK
>
>
> On Friday, November 11, 2016 10:05 PM, Erick Erickson
>  wrote:
>
>
> You have to query text and string fields differently, that's just the
> way it works. The problem is getting the query string through the
> parser as a _single_ token or as multiple tokens.
>
> Let's say you have a string field with the "a b" example. You have a
> single token
> a b that starts at offset 0.
>
> But with a text field, you have two tokens,
> a at position 0
> b at position 1
>
> But when the query parser sees "a b" (without quotes) it splits it
> into two tokens, and only the text field has both tokens so the string
> field won't match.
>
> OTOH, when the query parser sees "a\ b" it passes this through as a
> single token, which only matches the string field as there's no
> _single_ token "a b" in the text field.
>
> But a more interesting question is why you want to search this way.
> String fields are intended for keywords, machine-generated IDs and the
> like. They're pretty useless for searching anything except
> 1> exact tokens
> 2> prefixes
>
> While if you have "my dog has fleas" in a string field, you _can_
> search "*dog*" and get a hit but the performance is poor when you get
> a large corpus. Performance for "my*" will be pretty good though.
>
> In all this sounds like an XY problem, what's the use-case you're
> trying to solve?
>
> Best,
> Erick
>
>
>
> On Thu, Nov 10, 2016 at 10:11 PM, Sandeep Khanzode
>  wrote:
>> Hi Erick, Reth,
>>
>> The 'a\ b*' as well as the q.op=AND approach worked (successfully) only
>> for StrField for me.
>>
>> Any attempt at creating a 'a\ b*' for a TextField does not match any
>> documents. The parsedQuery in debug mode does show 'field:a b*'. I am sure
>> there are documents that should match.
>> Another (maybe unrelated) observation is if I have 'field:a\ b', then the
>> parsedQuery is field:a field:b. Which does not match as expected (matches
>> individually).
>>
>> Can you please provide an example that I can use in Solr Query dashboard?
>> That will be helpful.
>>
>> I have also seen that wildcard queries work irrespective of field type
>> i.e. StrField as well as TextField. That makes sense because with a
>> WhitespaceTokenizer only creates word boundaries when we do not use a
>> EdgeNGramFilter. If I am not wrong, that is. SRK
>>
>>    On Friday, November 11, 2016 5:00 AM, Erick Erickson
>>  wrote:
>>
>>
>>  You can escape the space with a backslash as  'a\ b*'
>>
>> Best,
>> Erick
>>
>> On Thu, Nov 10, 2016 at 2:37 PM, Reth RM  wrote:
>>> I don't think you can do wildcard on StrField. For text field, if your
>>> query is "category:(test m*)"  the parsed query will be  "category:test
>>> OR
>>> category:m*"
>>> You can add q.op=AND to make an AND between those terms.
>>>
>>> For phrase type wild card query support, as per docs, it
>>> is ComplexPhraseQueryParser that supports it. (I haven't 

Query parser behavior with AND and negative clause

2016-11-22 Thread Sandeep Khanzode
Hi,
I have a simple query that should intersect with dateRange1 and NOT be 
contained within dateRange2. I have tried the following options:

WORKS:
+{!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO 
2016-11-22T13:59:00Z]'} +(*:* -{!field f=dateRange2 op=Contains 
v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'}) 


DOES NOT WORK :
{!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO 
2016-11-22T13:59:00Z]'} AND (*:* -{!field f=dateRange2 op=Contains 
v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'}) 

Why?

WILL NOT WORK (because of the negative clause at the top level?):
{!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO 
2016-11-22T13:59:00Z]'} AND -{!field f=dateRange2 op=Contains 
v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'} 


SRK

Re: Wildcard searches with space in TextField/StrField

2016-11-24 Thread Sandeep Khanzode
Hi,
This is the typical TextField with ...

SRK

On Thursday, November 24, 2016 1:38 AM, Reth RM  
wrote:
 

 what is the fieldType of those records?  
On Tue, Nov 22, 2016 at 4:18 AM, Sandeep Khanzode 
 wrote:

Hi Erick,
I gave this a try. 
These are my results. There is a record with "John D. Smith", and another named 
"John Doe".

1.] {!complexphrase inOrder=true}name:"John D.*" ... does not fetch any 
results. 

2.] {!complexphrase inOrder=true}name:"John D*" ... fetches both results. 



Second observation: There is a record with "John D Smith"
1.] {!complexphrase inOrder=true}name:"John*" ... does not fetch any results. 

2.] {!complexphrase inOrder=true}name:"John D*" ... fetches that record. 

3.] {!complexphrase inOrder=true}name:"John D S*" ... fetches that record. 

SRK

    On Sunday, November 13, 2016 7:43 AM, Erick Erickson 
 wrote:


 Right, for that kind of use case you want complexPhraseQueryParser,
see: https://cwiki.apache.org/ confluence/display/solr/Other+ 
Parsers#OtherParsers- ComplexPhraseQueryParser

Best,
Erick

On Sat, Nov 12, 2016 at 9:39 AM, Sandeep Khanzode
 wrote:
> Thanks, Erick.
>
> I am actually not trying to use the String field (prefer a TextField here).
> But, in my comparisons with TextField, it seems that something like phrase
> matching with whitespace and wildcard (like, 'my do*' or say, 'my dog*', or
> say, 'my dog has*') can only be accomplished with a string type field,
> especially because, with a WhitespaceTokenizer in TextField, the space will
> be lost, and all tokens will be individually considered. Am I missing
> something?
>
> SRK
>
>
> On Friday, November 11, 2016 10:05 PM, Erick Erickson
>  wrote:
>
>
> You have to query text and string fields differently, that's just the
> way it works. The problem is getting the query string through the
> parser as a _single_ token or as multiple tokens.
>
> Let's say you have a string field with the "a b" example. You have a
> single token
> a b that starts at offset 0.
>
> But with a text field, you have two tokens,
> a at position 0
> b at position 1
>
> But when the query parser sees "a b" (without quotes) it splits it
> into two tokens, and only the text field has both tokens so the string
> field won't match.
>
> OTOH, when the query parser sees "a\ b" it passes this through as a
> single token, which only matches the string field as there's no
> _single_ token "a b" in the text field.
>
> But a more interesting question is why you want to search this way.
> String fields are intended for keywords, machine-generated IDs and the
> like. They're pretty useless for searching anything except
> 1> exact tokens
> 2> prefixes
>
> While if you have "my dog has fleas" in a string field, you _can_
> search "*dog*" and get a hit but the performance is poor when you get
> a large corpus. Performance for "my*" will be pretty good though.
>
> In all this sounds like an XY problem, what's the use-case you're
> trying to solve?
>
> Best,
> Erick
>
>
>
> On Thu, Nov 10, 2016 at 10:11 PM, Sandeep Khanzode
>  wrote:
>> Hi Erick, Reth,
>>
>> The 'a\ b*' as well as the q.op=AND approach worked (successfully) only
>> for StrField for me.
>>
>> Any attempt at creating a 'a\ b*' for a TextField does not match any
>> documents. The parsedQuery in debug mode does show 'field:a b*'. I am sure
>> there are documents that should match.
>> Another (maybe unrelated) observation is if I have 'field:a\ b', then the
>> parsedQuery is field:a field:b. Which does not match as expected (matches
>> individually).
>>
>> Can you please provide an example that I can use in Solr Query dashboard?
>> That will be helpful.
>>
>> I have also seen that wildcard queries work irrespective of field type
>> i.e. StrField as well as TextField. That makes sense because with a
>> WhitespaceTokenizer only creates word boundaries when we do not use a
>> EdgeNGramFilter. If I am not wrong, that is. SRK
>>
>>    On Friday, November 11, 2016 5:00 AM, Erick Erickson
>>  wrote:
>>
>>
>>  You can escape the space with a backslash as  'a\ b*'
>>
>> Best,
>> Erick
>>
>> On Thu, Nov 10, 2016 at 2:37 PM, Reth RM  wrote:
>>> I don't think you can do wildcard on StrField. For text field, if your
>>> query is "category:(test m*)"  the parsed query will be  "category:test
>>> OR
>>> 

Re: Query parser behavior with AND and negative clause

2016-11-24 Thread Sandeep Khanzode
Hi Erick,
The example record contains:
dateRange1 = [2016-11-22T18:00:00Z TO 2016-11-22T20:00:00Z], [2016-11-22T06:00:00Z TO 2016-11-22T14:00:00Z]
dateRange2 = [2016-11-22T12:00:00Z TO 2016-11-22T14:00:00Z]
The first query works: it is able to EXCLUDE this record from the results, since the negative dateRange2 clause filters it out. The second query should also work, but it does not, and actually pulls the record into the results.
WORKS:
+{!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO 
2016-11-22T13:59:00Z]'} +(*:* -{!field f=dateRange2 op=Contains 
v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'})


DOES NOT WORK :
{!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO 
2016-11-22T13:59:00Z]'} AND (*:* -{!field f=dateRange2 op=Contains 
v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'})
 SRK 

On Tuesday, November 22, 2016 9:41 PM, Erick Erickson 
 wrote:
 

 _How_ does it "not work"? You haven't told us what you expect .vs.
what you get back.

Plus a sample doc that that violates your expectations (just the
dateRange field) would
also help.

Best,
Erick

On Tue, Nov 22, 2016 at 4:23 AM, Sandeep Khanzode
 wrote:
> Hi,
> I have a simple query that should intersect with dateRange1 and NOT be 
> contained within dateRange2. I have tried the following options:
>
> WORKS:
> +{!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO 
> 2016-11-22T13:59:00Z]'} +(*:* -{!field f=dateRange2 op=Contains 
> v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'})
>
>
> DOES NOT WORK :
> {!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO 
> 2016-11-22T13:59:00Z]'} AND (*:* -{!field f=dateRange2 op=Contains 
> v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'})
>
> Why?
>
> WILL NOT WORK (because of the negative clause at the top level?):
> {!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO 
> 2016-11-22T13:59:00Z]'} AND -{!field f=dateRange2 op=Contains 
> v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'}
>
>
> SRK


   

Re: Wildcard searches with space in TextField/StrField

2016-11-24 Thread Sandeep Khanzode
Hi All, Erick,
Please suggest. Would like to use the ComplexPhraseQueryParser for searching 
text (with wildcard) that may contain special characters.
For example:
John* should match John V. Doe
John* should match Johnson Smith
Bruce-Willis* should match Bruce-Willis
V.* should match John V. F. Doe
SRK 

On Thursday, November 24, 2016 5:57 PM, Sandeep Khanzode 
 wrote:
 

 Hi,
This is the typical TextField with ...             
            



SRK 

    On Thursday, November 24, 2016 1:38 AM, Reth RM  
wrote:
 

 what is the fieldType of those records?  
On Tue, Nov 22, 2016 at 4:18 AM, Sandeep Khanzode 
 wrote:

Hi Erick,
I gave this a try. 
These are my results. There is a record with "John D. Smith", and another named 
"John Doe".

1.] {!complexphrase inOrder=true}name:"John D.*" ... does not fetch any 
results. 

2.] {!complexphrase inOrder=true}name:"John D*" ... fetches both results. 



Second observation: There is a record with "John D Smith"
1.] {!complexphrase inOrder=true}name:"John*" ... does not fetch any results. 

2.] {!complexphrase inOrder=true}name:"John D*" ... fetches that record. 

3.] {!complexphrase inOrder=true}name:"John D S*" ... fetches that record. 

SRK

    On Sunday, November 13, 2016 7:43 AM, Erick Erickson 
 wrote:


 Right, for that kind of use case you want complexPhraseQueryParser,
see: https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

Best,
Erick

On Sat, Nov 12, 2016 at 9:39 AM, Sandeep Khanzode
 wrote:
> Thanks, Erick.
>
> I am actually not trying to use the String field (prefer a TextField here).
> But, in my comparisons with TextField, it seems that something like phrase
> matching with whitespace and wildcard (like, 'my do*' or say, 'my dog*', or
> say, 'my dog has*') can only be accomplished with a string type field,
> especially because, with a WhitespaceTokenizer in TextField, the space will
> be lost, and all tokens will be individually considered. Am I missing
> something?
>
> SRK
>
>
> On Friday, November 11, 2016 10:05 PM, Erick Erickson
>  wrote:
>
>
> You have to query text and string fields differently, that's just the
> way it works. The problem is getting the query string through the
> parser as a _single_ token or as multiple tokens.
>
> Let's say you have a string field with the "a b" example. You have a
> single token
> a b that starts at offset 0.
>
> But with a text field, you have two tokens,
> a at position 0
> b at position 1
>
> But when the query parser sees "a b" (without quotes) it splits it
> into two tokens, and only the text field has both tokens so the string
> field won't match.
>
> OTOH, when the query parser sees "a\ b" it passes this through as a
> single token, which only matches the string field as there's no
> _single_ token "a b" in the text field.
>
> But a more interesting question is why you want to search this way.
> String fields are intended for keywords, machine-generated IDs and the
> like. They're pretty useless for searching anything except
> 1> exact tokens
> 2> prefixes
>
> While if you have "my dog has fleas" in a string field, you _can_
> search "*dog*" and get a hit but the performance is poor when you get
> a large corpus. Performance for "my*" will be pretty good though.
>
> In all this sounds like an XY problem, what's the use-case you're
> trying to solve?
>
> Best,
> Erick
>
>
>
> On Thu, Nov 10, 2016 at 10:11 PM, Sandeep Khanzode
>  wrote:
>> Hi Erick, Reth,
>>
>> The 'a\ b*' as well as the q.op=AND approach worked (successfully) only
>> for StrField for me.
>>
>> Any attempt at creating a 'a\ b*' for a TextField does not match any
>> documents. The parsedQuery in debug mode does show 'field:a b*'. I am sure
>> there are documents that should match.
>> Another (maybe unrelated) observation is if I have 'field:a\ b', then the
>> parsedQuery is field:a field:b. Which does not match as expected (matches
>> individually).
>>
>> Can you please provide an example that I can use in Solr Query dashboard?
>> That will be helpful.
>>
>> I have also seen that wildcard queries work irrespective of field type
>> i.e. StrField as well as TextField. That makes sense because with a
>> WhitespaceTokenizer only creates word boundaries when we do not use a
>> EdgeNGramFilter. If I am not wrong, that is. SRK
>>
>>    On Friday, November 11, 2016 5:00 AM, Erick Erickson
>>  wrote:
>>
>>
>> 
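
To make the two working approaches from this thread concrete, here are hedged examples (the field name `name` and the sample values are assumptions, not from a tested setup):

```
q=name:John\ D*
    (escaped space: 'John D*' reaches the parser as a single token,
     so it can match a string field where the whole name was indexed as one token)

q={!complexphrase inOrder=true}name:"John D*"
    (phrase-plus-wildcard on a tokenized text field via the ComplexPhraseQParser)
```

Remember that the space, quotes, and braces must be URL-encoded when the query is
sent over raw HTTP rather than through SolrJ.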

Re: Wildcard searches with space in TextField/StrField

2016-11-25 Thread Sandeep Khanzode
Hi All,

Can someone please assist with this query?

My data consists of:
1.] John Doe
2.] John V. Doe
3.] Johnson Doe
4.] Johnson V. Doe
5.] John Smith
6.] Johnson V. Smith
7.] Matt Doe
8.] Matt V. Doe
9.] Matt Doe
10.] Matthew V. Doe
11.] Matthew Smith

12.] Matthew V. Smith

Querying ...
(a) Matt/Matt* should return records 7-12
(b) John/John* should return records 1-6
(c) Doe/Doe* should return records 1-4, 7-10
(d) Smith/Smith* should return records 5,6,11,12
(e) V/V./V.*/V* should return records 2,4,6,8,10,12
(f) V. Doe/V. Doe* should return records 2,4,8,10
(g) John V/John V./John V*/John V.* should return record 2
(h) V. Smith/V. Smith* should return records 6,12

Any guidance would be appreciated!
I have tried ComplexPhraseQueryParser, but with a single token like Doe*, there 
is an error that indicates that the query is being identified as a prefix 
query. I may be missing something in the syntax.
 SRK 
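
Not a definitive answer, but a hedged sketch of how the cases above could map onto query syntax (field name `name` assumed, untested): the single-token cases are plain prefix queries and need no phrase parser at all, while only the multi-token cases need the complex-phrase parser. A lone wildcard token inside complexphrase quotes is likely what trips the prefix-query error, so the single-token form can be issued unquoted instead:

```
q=name:Matt*                                      cases like (a)-(d): plain prefix query
q={!complexphrase inOrder=true}name:"V. Doe*"     case (f): multi-token phrase wildcard
q={!complexphrase inOrder=true}name:"John V*"     case (g)
```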

On Thursday, November 24, 2016 11:16 PM, Sandeep Khanzode 
 wrote:
 

 Hi All, Erick,
Please suggest. Would like to use the ComplexPhraseQueryParser for searching 
text (with wildcard) that may contain special characters.
For example:
John* should match John V. Doe
John* should match Johnson Smith
Bruce-Willis* should match Bruce-Willis
V.* should match John V. F. Doe
SRK 

    On Thursday, November 24, 2016 5:57 PM, Sandeep Khanzode 
 wrote:
 

 Hi,
This is the typical TextField with ...             
            



SRK 

    On Thursday, November 24, 2016 1:38 AM, Reth RM  
wrote:
 

 what is the fieldType of those records?  
On Tue, Nov 22, 2016 at 4:18 AM, Sandeep Khanzode 
 wrote:

Hi Erick,
I gave this a try. 
These are my results. There is a record with "John D. Smith", and another named 
"John Doe".

1.] {!complexphrase inOrder=true}name:"John D.*" ... does not fetch any 
results. 

2.] {!complexphrase inOrder=true}name:"John D*" ... fetches both results. 



Second observation: There is a record with "John D Smith"
1.] {!complexphrase inOrder=true}name:"John*" ... does not fetch any results. 

2.] {!complexphrase inOrder=true}name:"John D*" ... fetches that record. 

3.] {!complexphrase inOrder=true}name:"John D S*" ... fetches that record. 

SRK

    On Sunday, November 13, 2016 7:43 AM, Erick Erickson 
 wrote:


 Right, for that kind of use case you want complexPhraseQueryParser,
see: https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

Best,
Erick

On Sat, Nov 12, 2016 at 9:39 AM, Sandeep Khanzode
 wrote:
> Thanks, Erick.
>
> I am actually not trying to use the String field (prefer a TextField here).
> But, in my comparisons with TextField, it seems that something like phrase
> matching with whitespace and wildcard (like, 'my do*' or say, 'my dog*', or
> say, 'my dog has*') can only be accomplished with a string type field,
> especially because, with a WhitespaceTokenizer in TextField, the space will
> be lost, and all tokens will be individually considered. Am I missing
> something?
>
> SRK
>
>
> On Friday, November 11, 2016 10:05 PM, Erick Erickson
>  wrote:
>
>
> You have to query text and string fields differently, that's just the
> way it works. The problem is getting the query string through the
> parser as a _single_ token or as multiple tokens.
>
> Let's say you have a string field with the "a b" example. You have a
> single token
> a b that starts at offset 0.
>
> But with a text field, you have two tokens,
> a at position 0
> b at position 1
>
> But when the query parser sees "a b" (without quotes) it splits it
> into two tokens, and only the text field has both tokens so the string
> field won't match.
>
> OTOH, when the query parser sees "a\ b" it passes this through as a
> single token, which only matches the string field as there's no
> _single_ token "a b" in the text field.
>
> But a more interesting question is why you want to search this way.
> String fields are intended for keywords, machine-generated IDs and the
> like. They're pretty useless for searching anything except
> 1> exact tokens
> 2> prefixes
>
> While if you have "my dog has fleas" in a string field, you _can_
> search "*dog*" and get a hit but the performance is poor when you get
> a large corpus. Performance for "my*" will be pretty good though.
>
> In all this sounds like an XY problem, what's the use-case you're
> trying to solve?
>
> Best,
> Erick
>
>
>
> On Thu, Nov 10, 2016 at 10:11 PM, Sandeep Khanzode
>  wrote:
>> Hi Erick, Reth,
>>
>> The 'a\ b*' as well as the q.op=AND approach worked (successfully) only
>> for StrField for me.

Re: Query parser behavior with AND and negative clause

2016-11-25 Thread Sandeep Khanzode
WORKS:
+{!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO 
2016-11-22T13:59:00Z]'} +(*:* -{!field f=dateRange2 op=Contains 
v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'})


+ConstantScore(IntersectsPrefixTreeFilter(fieldName=dateRange1,queryShape=[2016-11-22T12:01
 TO 2016-11-22T13:59:00],detailLevel=9,prefixGridScanLevel=7)) 
+(MatchAllDocsQuery(*:*) 
-ConstantScore(ContainsPrefixTreeFilter(fieldName=dateRange2,queryShape=[2016-11-22T12:01
 TO 2016-11-22T13:59:00],detailLevel=9,multiOverlappingIndexedShapes=true)))




DOES NOT WORK :
{!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO 
2016-11-22T13:59:00Z]'} AND (*:* -{!field f=dateRange2 op=Contains 
v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'})


ConstantScore(IntersectsPrefixTreeFilter(fieldName=dateRange1,queryShape=[2016-11-22T12:01
 TO 2016-11-22T13:59:00],detailLevel=9,prefixGridScanLevel=7))
 SRK 
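
The debug output above makes the difference visible: in the AND form only the first ConstantScore clause survives parsing, so the exclusion is silently dropped. One alternative that avoids mixing AND with embedded {!field} queries altogether is to send each clause as its own filter query, using the nested-query syntax for the negated one (an untested sketch):

```
fq={!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'}
fq=*:* -_query_:"{!field f=dateRange2 op=Contains v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'}"
```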


On Thursday, November 24, 2016 9:02 PM, Alessandro Benedetti 
 wrote:
 

 Hey Sandeep,
can you debug the query ( debugQuery=on) and show how the query is parsed ?

Cheers



On Thu, Nov 24, 2016 at 12:38 PM, Sandeep Khanzode <
sandeep_khanz...@yahoo.com.invalid> wrote:

> Hi Erick,
> The example record contains ...dateRange1 = [2016-11-22T18:00:00Z TO
> 2016-11-22T20:00:00Z], [2016-11-22T06:00:00Z TO 
> 2016-11-22T14:00:00Z]dateRange2
> = [2016-11-22T12:00:00Z TO 2016-11-22T14:00:00Z]"
> The first query works ... which means that it is able to EXCLUDE this
> record from the result (since the negative dateRange2 clause should return
> false). Whereas the second query should also work but it does not and
> actually pulls the record in the result.
> WORKS:
> +{!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO
> 2016-11-22T13:59:00Z]'} +(*:* -{!field f=dateRange2 op=Contains
> v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'})
>
>
> DOES NOT WORK :
> {!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO
> 2016-11-22T13:59:00Z]'} AND (*:* -{!field f=dateRange2 op=Contains
> v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'})
>  SRK
>
>    On Tuesday, November 22, 2016 9:41 PM, Erick Erickson <
> erickerick...@gmail.com> wrote:
>
>
>  _How_ does it "not work"? You haven't told us what you expect .vs.
> what you get back.
>
> Plus a sample doc that that violates your expectations (just the
> dateRange field) would
> also help.
>
> Best,
> Erick
>
> On Tue, Nov 22, 2016 at 4:23 AM, Sandeep Khanzode
>  wrote:
> > Hi,
> > I have a simple query that should intersect with dateRange1 and NOT be
> contained within dateRange2. I have tried the following options:
> >
> > WORKS:
> > +{!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO
> 2016-11-22T13:59:00Z]'} +(*:* -{!field f=dateRange2 op=Contains
> v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'})
> >
> >
> > DOES NOT WORK :
> > {!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO
> 2016-11-22T13:59:00Z]'} AND (*:* -{!field f=dateRange2 op=Contains
> v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'})
> >
> > Why?
> >
> > WILL NOT WORK (because of the negative clause at the top level?):
> > {!field f=dateRange1 op=Intersects v='[2016-11-22T12:01:00Z TO
> 2016-11-22T13:59:00Z]'} AND -{!field f=dateRange2 op=Contains
> v='[2016-11-22T12:01:00Z TO 2016-11-22T13:59:00Z]'}
> >
> >
> > SRK
>
>
>
>



-- 
--

Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


   

Collection API CREATE creates name like '_shard1_replica1'

2016-12-14 Thread Sandeep Khanzode
Hi,
I uploaded (upconfig) config (schema and solrconfig XMLs) to Zookeeper and then 
linked (linkconfig) the confname to a collection name.
When I attempt to create a collection using the API like this 
.../solr/admin/collections?action=CREATE&name=abc&numShards=1&collection.configName=abc
  ... it creates a core named abc_shard1_replica1 and not simply abc.
What is missing? 
SRK

Re: SolrJ doesn't work with Json facet api

2017-01-05 Thread Sandeep Khanzode
For me, these variants have worked ...

solrQuery.add("json.facet", "...");

solrQuery.setParam("json.facet", "...");
 
You get ...
QueryResponse.getResponse().get("facets");

SRK 
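
For reference, the raw request parameter that those SolrJ calls produce looks like the following (the field name `cat` and the facet shape are assumptions):

```
json.facet={categories:{type:terms,field:cat,limit:10}}
```

The facet results come back under the top-level "facets" key of the response, which is why rsp.getResponse().get("facets") retrieves them.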

On Thursday, January 5, 2017 1:19 PM, Jeffery Yuan  
wrote:
 

 Thanks for your response.
We definitely use solrQuery.set("json.facet", "the json query here");

Btw we are using Solr 5.2.1.





--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrJ-doesn-t-work-with-Json-facet-api-tp4299867p4312459.html
Sent from the Solr - User mailing list archive at Nabble.com.


   

solr in classic asp project

2014-08-07 Thread Sandeep Bohra
I am using a classic ASP 3.0 application and would like to implement Solr
with it. My database is SQL Server, and it also connects to AS/400 using
batch processing. Can someone suggest a starting point?



Regards,
Sandeep


Fwd: Searching within list of regions with 1:1 document-region mapping

2013-10-18 Thread Sandeep Gupta
Hi,

I have a Solr index of around 100 million documents, each assigned a region
id, growing at a rate of about 10 million documents per month - the average
document size being around 10KB of pure text. The total number of distinct
region ids is itself in the range of 2.5 million.

I want to search for a query with a given list of region ids. The number of
region ids in this list is usually around 250-300 (most of the time), but
can be up to 500, with a maximum cap of around 2000 ids in one request.


What is the best way to model such queries - an IN-style clause in the
query, a filter query (fq), or some other means?


If it may help, the index is on a VM with 4 virtual cores and currently has
4GB of Java heap allocated out of the machine's 16GB. The number of queries
does not exceed 1 per minute for now. If needed, we can throw more hardware
at the index - but the index will still be on a single machine for at least
6 months.

Best Regards,
Sandeep Gupta


Re: Class name of parsing the fq clause

2013-10-23 Thread Sandeep Gupta
Thanks Jack for detailing out the parser logic.
Would it be possible for you to say something more about the filter-cache code
flow? Sometimes we do not use the fq parameter in the query string and pass the
raw query instead.

Regards
Sandeep


On Mon, Oct 21, 2013 at 7:11 PM, Jack Krupansky wrote:

> Start with org.apache.solr.handler.component.QueryComponent#prepare
> which fetches the fq parameters and indirectly invokes the query parser(s):
>
> String[] fqs = req.getParams().getParams(**CommonParams.FQ);
> if (fqs!=null && fqs.length!=0) {
>   List filters = rb.getFilters();
>   // if filters already exists, make a copy instead of modifying the
> original
>   filters = filters == null ? new ArrayList(fqs.length) : new
> ArrayList(filters);
>   for (String fq : fqs) {
> if (fq != null && fq.trim().length()!=0) {
>   QParser fqp = QParser.getParser(fq, null, req);
>   filters.add(fqp.getQuery());
> }
>   }
>   // only set the filters if they are not empty otherwise
>   // fq=&someotherParam= will trigger all docs filter for every request
>   // if filter cache is disabled
>   if (!filters.isEmpty()) {
> rb.setFilters( filters );
>
> Note that this line actually invokes the parser:
>
>   filters.add(fqp.getQuery());
>
> Then in org.apache.lucene.search.Query.QParser#getParser:
>
> QParserPlugin qplug = req.getCore().getQueryPlugin(**parserName);
> QParser parser =  qplug.createParser(qstr, localParams, req.getParams(),
> req);
>
> And for the common case of the Lucene query parser,
> org.apache.solr.search.LuceneQParserPlugin#createParser:
>
> public QParser createParser(String qstr, SolrParams localParams,
> SolrParams params, SolrQueryRequest req) {
>  return new LuceneQParser(qstr, localParams, params, req);
> }
>
> And then in org.apache.lucene.search.Query.QParser#getQuery:
>
> public Query getQuery() throws SyntaxError {
>  if (query==null) {
>query=parse();
>
> And then in org.apache.lucene.search.Query.LuceneQParser#parse:
>
> lparser = new SolrQueryParser(this, defaultField);
>
> lparser.setDefaultOperator
>  (QueryParsing.getQueryParserDefaultOperator(getReq().getSchema(),
>  getParam(QueryParsing.OP)));
>
> return lparser.parse(qstr);
>
> And then in org.apache.solr.parser.SolrQueryParserBase#parse:
>
> Query res = TopLevelQuery(null);  // pass null so we can tell later if an
> explicit field was provided or not
>
> And then in org.apache.solr.parser.QueryParser#TopLevelQuery, the
> parsing begins.
>
> And org.apache.solr.parser.QueryParser.jj is the grammar for a basic
> Solr/Lucene query, and org.apache.solr.parser.QueryParser.java is
> generated by JFlex, and a lot of the logic is in the base class of the
> generated class, org.apache.solr.parser.SolrQueryParserBase.java.
>
> Good luck! Happy hunting!
>
> -- Jack Krupansky
>
> -Original Message- From: YouPeng Yang
> Sent: Monday, October 21, 2013 2:57 AM
> To: solr-user@lucene.apache.org
> Subject: Class name of parsing the fq clause
>
>
> Hi
>   I search the solr with fq clause,which is like:
>   fq=BEGINTIME:[2013-08-25T16:**00:00Z TO *] AND BUSID:(M3 OR M9)
>
>
>   I am curious about the parsing process . I want to study it.
>   What is the Java file name describes  the parsing  process of the fq
> clause.
>
>
>  Thanks
>
> Regards.
>


Re: Class name of parsing the fq clause

2013-10-23 Thread Sandeep Gupta
Yes, it is not related to this particular mail thread. I will post a
separate mail.

Thanks
Sandeep


On Wed, Oct 23, 2013 at 4:36 PM, Jack Krupansky wrote:

> Not in just a few words. Do you have specific questions? I mean none of
> that relates to parsing of fq, the topic of this particular email thread,
> right?
>
> -- Jack Krupansky
>
> -Original Message- From: Sandeep Gupta
> Sent: Wednesday, October 23, 2013 3:58 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Class name of parsing the fq clause
>
>
> Thanks Jack for detailing out the parser logic.
> Would it be possible for you to say something more about the filter-cache code
> flow? Sometimes we do not use the fq parameter in the query string and pass the
> raw query instead.
>
> Regards
> Sandeep
>
>
> On Mon, Oct 21, 2013 at 7:11 PM, Jack Krupansky *
> *wrote:
>
>> Start with org.apache.solr.handler.component.QueryComponent#prepare
>>
>> which fetches the fq parameters and indirectly invokes the query
>> parser(s):
>>
>> String[] fqs = req.getParams().getParams(CommonParams.FQ);
>>
>> if (fqs!=null && fqs.length!=0) {
>>   List filters = rb.getFilters();
>>   // if filters already exists, make a copy instead of modifying the
>> original
>>   filters = filters == null ? new ArrayList(fqs.length) : new
>> ArrayList(filters);
>>   for (String fq : fqs) {
>> if (fq != null && fq.trim().length()!=0) {
>>   QParser fqp = QParser.getParser(fq, null, req);
>>   filters.add(fqp.getQuery());
>> }
>>   }
>>   // only set the filters if they are not empty otherwise
>>   // fq=&someotherParam= will trigger all docs filter for every request
>>   // if filter cache is disabled
>>   if (!filters.isEmpty()) {
>> rb.setFilters( filters );
>>
>> Note that this line actually invokes the parser:
>>
>>   filters.add(fqp.getQuery());
>>
>> Then in org.apache.lucene.search.Query.QParser#getParser:
>>
>> QParserPlugin qplug = req.getCore().getQueryPlugin(parserName);
>> QParser parser =  qplug.createParser(qstr, localParams, req.getParams(),
>> req);
>>
>> And for the common case of the Lucene query parser,
>> org.apache.solr.search.LuceneQParserPlugin#createParser:
>>
>>
>> public QParser createParser(String qstr, SolrParams localParams,
>> SolrParams params, SolrQueryRequest req) {
>>  return new LuceneQParser(qstr, localParams, params, req);
>> }
>>
>> And then in org.apache.lucene.search.Query.QParser#getQuery:
>>
>>
>> public Query getQuery() throws SyntaxError {
>>  if (query==null) {
>>query=parse();
>>
>> And then in org.apache.lucene.search.Query.LuceneQParser#parse:
>>
>>
>> lparser = new SolrQueryParser(this, defaultField);
>>
>> lparser.setDefaultOperator
>>  (QueryParsing.getQueryParserDefaultOperator(getReq().getSchema(),
>>  getParam(QueryParsing.OP)));
>>
>> return lparser.parse(qstr);
>>
>> And then in org.apache.solr.parser.SolrQueryParserBase#parse:
>>
>>
>> Query res = TopLevelQuery(null);  // pass null so we can tell later if an
>> explicit field was provided or not
>>
>> And then in org.apache.solr.parser.QueryParser#TopLevelQuery, the
>> parsing begins.
>>
>> And org.apache.solr.parser.QueryParser.jj is the grammar for a basic
>> Solr/Lucene query, and org.apache.solr.parser.QueryParser.java is
>>
>> generated by JFlex, and a lot of the logic is in the base class of the
>> generated class, org.apache.solr.parser.SolrQueryParserBase.java.
>>
>>
>> Good luck! Happy hunting!
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: YouPeng Yang
>> Sent: Monday, October 21, 2013 2:57 AM
>> To: solr-user@lucene.apache.org
>> Subject: Class name of parsing the fq clause
>>
>>
>> Hi
>>   I search the solr with fq clause,which is like:
>>   fq=BEGINTIME:[2013-08-25T16:00:00Z TO *] AND BUSID:(M3 OR
>> M9)
>>
>>
>>
>>   I am curious about the parsing process . I want to study it.
>>   What is the Java file name describes  the parsing  process of the fq
>> clause.
>>
>>
>>  Thanks
>>
>> Regards.
>>
>>
>


Solr subset searching in 100-million document index

2013-10-24 Thread Sandeep Gupta
Hi,

We have a Solr index of around 100 million documents, each assigned a region
id, growing at a rate of about 10 million documents per month - the average
document size being around 10KB of pure text. The total number of distinct
region ids is itself in the range of 2.5 million.

We want to search for a query with a given list of region ids. The number
of region ids in this list is usually around 250-300 (most of the time),
but can be up to 500, with a maximum cap of around 2000 ids in one request.


What is the best way to model such queries - an IN-style clause in the
query or a filter query (fq)? Are there any other faster methods
available?


If it may help, the index is on a VM with 4 virtual cores and currently has
4GB of Java heap allocated out of the machine's 16GB. The number of queries
does not exceed 1 per minute for now. If needed, we can throw more hardware
at the index - but the index will still be on a single machine for at least
6 months.

Regards,
Sandeep Gupta


Re: Solr subset searching in 100-million document index

2013-10-24 Thread Sandeep Gupta
Hi Joel,

Thanks a lot for the information - I haven't worked with PostFilter's
before but found an example at
http://java.dzone.com/articles/custom-security-filtering-solr.

Will try it over the next few days and come back if still have questions.

Thanks again!



Keep Walking,
~ Sandeep


On Thu, Oct 24, 2013 at 8:25 PM, Joel Bernstein  wrote:

> Sandeep,
>
> This type of operation can often be expressed as a PostFilter very
> efficiently. This is particularly true if the region id's are integer keys.
>
> Joel
>
> On Thu, Oct 24, 2013 at 7:46 AM, Sandeep Gupta 
> wrote:
>
> > Hi,
> >
> > We have a Solr index of around 100 million documents with each document
> > being given a region id growing at a rate of about 10 million documents
> per
> > month - the average document size being aronud 10KB of pure text. The
> total
> > number of region ids are themselves in the range of 2.5 million.
> >
> > We want to search for a query with a given list of region ids. The number
> > of region ids in this list is usually around 250-300 (most of the time),
> > but can be upto 500, with a maximum cap of around 2000 ids in one
> request.
> >
> >
> > What is the best way to model such queries besides using an IN param in
> the
> > query, or using a Filter FQ in the query? Are there any other faster
> > methods available?
> >
> >
> > If it may help, the index is on a VM with 4 virtual-cores and has
> currently
> > 4GB of Java memory allocated out of 16GB in the machine. The number of
> > queries do not exceed more than 1 per minute for now. If needed, we can
> > throw more hardware to the index - but the index will still be only on a
> > single machine for atleast 6 months.
> >
> > Regards,
> > Sandeep Gupta
> >
>
>
>
> --
>
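
For the plain (non-PostFilter) variant discussed in this thread, the request could look like the following sketch - the field name region_id is an assumption, and cache=false keeps a 300-to-2000-term filter from churning the filter cache, since such id lists rarely repeat:

```
fq={!cache=false}region_id:(12 OR 34 OR 56 OR 78)
```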


SolrCould read-only replicas

2014-09-27 Thread Sandeep Tikoo
Hi-

I have been reading up on SolrCloud and it seems that it is not possible to 
have a cross-datacenter read-only slave anymore but wanted to ask here to be 
sure.
We currently have a pre Solr 4.0 installation with the master instance in our 
US mid-west datacenter. The datacenter in Europe has read-replicas which pull 
data using solr.ReplicationHandler. We wanted to upgrade to SolrCloud. As far 
as I have been able to figure out, with SolrCloud you cannot have a read-only 
replica anymore. A replica has to be able to become a leader and writes against 
all replicas for a shard have to succeed. Because of the strong consistency 
model across replicas, it seems that replicas cannot span datacenters 
anymore.

So my question is: how can we have a read-only replica in a remote datacenter in 
Solr 4.0+, similar to pre-4.0 Solr? Is it not possible anymore without doing it 
all yourself?

cheers,
Tikoo



RE: SolrCould read-only replicas

2014-10-02 Thread Sandeep Tikoo
Erick,

Thank you for your response. Yup, when I said it is not possible to have a 
cross continent data center replica, I meant that we never ever want to do that 
because of the latency.

What I was hoping is that I could have SolrCloud in my DataCenter A (DC-A) 
and get all the benefits of sharding (scaling/parallel computing) and failover 
redundancy within the same data center. If I could then have a read-only 
replica (with no guaranteed consistency, of course) of this entire cloud in my 
DataCenter B (DC-B), that would make my reads from DC-B faster without making 
my writes slow. To clarify, all the writes were going to go against DC-A only. 
The read-only cluster in DC-B could also be made the master in case the entire 
DC-A went down. DC-B wouldn't be guaranteed to be in sync with the DC-A 
master, but in my use case I could live with that. It seems that is not 
possible out-of-the-box with Solr 4.0+ in cloud mode: it is either SolrCloud 
or a cross-datacenter read-only replica - you can't have both at the same 
time.
I think that is what you confirmed as well. If I have it wrong, please let me 
know. Also, any thoughts on the easiest way to accomplish a read-only replica 
of the entire SolrCloud cluster?

Thanks!
Tikoo

From: Sandeep Tikoo
Sent: Saturday, September 27, 2014 9:43 PM
To: 'solr-user@lucene.apache.org'
Subject: SolrCould read-only replicas

Hi-

I have been reading up on SolrCloud and it seems that it is not possible to 
have a cross-datacenter read-only slave anymore but wanted to ask here to be 
sure.
We currently have a pre Solr 4.0 installation with the master instance in our 
US mid-west datacenter. The datacenter in Europe has read-replicas which pull 
data using solr.ReplicationHandler. We wanted to upgrade to SolrCloud. As far 
as I have been able to figure out, with SolrCloud you cannot have a read-only 
replica anymore. A replica has to be able to become a leader and writes against 
all replicas for a shard have to succeed. Because of the strong consistency 
model across replicas, it seems that replicas cannot span datacenters 
anymore.

So my question is: how can we have a read-only replica in a remote datacenter in 
Solr 4.0+, similar to pre-4.0 Solr? Is it not possible anymore without doing it 
all yourself?

cheers,
Tikoo



Solr 4.10.3 annotations on nested objects

2015-02-09 Thread Sandeep Jangra
Hello,

  I have Java beans with a parent-child relation that I am trying to index
using the @Field annotation (sample code copied below).

  I see that https://issues.apache.org/jira/browse/SOLR-1945 is open. Is
there any other way or document that describes how to use solrj annotations
to index nested objects in 4.10.3

  Please provide any references.

Thanks,
SJ

Solr version: 4.10.3

*Parent class:*

public class Asset {

@Field
private String name;

@Field("content_type")
private String contentType = "asset";

@Field("childDocuments")  // Does not work
private List<Attribute> attributes;

...
}

*Child class:*
public class Attribute {
@Field
private String name;

@Field("content_type")
private String contentType = "attribute";

@Field
private String value;
...
}
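
Until SOLR-1945 lands there is no annotation for child documents, but as a workaround the parent/child structure can be built by hand with SolrInputDocument.addChildDocument (available since Solr 4.5). A hedged, untested sketch, assuming a SolrServer named `server` is already configured:

```java
// Untested sketch: index an Asset with one Attribute as a nested (block-join)
// child document, bypassing the bean annotations for the child list.
SolrInputDocument parent = new SolrInputDocument();
parent.addField("name", "asset-1");
parent.addField("content_type", "asset");

SolrInputDocument child = new SolrInputDocument();
child.addField("name", "color");
child.addField("content_type", "attribute");
child.addField("value", "red");

parent.addChildDocument(child);  // nests the attribute under the asset
server.add(parent);
server.commit();
```

The parent can then be retrieved with a block-join query such as
q={!parent which='content_type:asset'}content_type:attribute.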


Re: Need Help in migrating Solr version 1.4 to 4.3

2013-06-25 Thread Sandeep Gupta
Thanks for all the answers.
Sure, I am going to build a new index from scratch with Solr 4.3.

Also, on the application development side: as I said, I am going to use the
HttpSolrServer API, and I found that we shouldn't create this object multiple
times (as per the wiki document http://wiki.apache.org/solr/Solrj#HttpSolrServer).
So I am planning to make my server class a singleton.
Please advise a little on this front as well.

Regards
Sandeep
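
The singleton idea can be sketched in plain Java. To keep the sketch self-contained and runnable, the SolrJ server is stood in by a String field; in real code it would be an org.apache.solr.client.solrj.impl.HttpSolrServer constructed once and reused, since it is thread-safe and pools its HTTP connections:

```java
// Untested sketch: one shared server handle for the whole application.
public final class SolrServerHolder {
    // In real code this field would be:
    //   private static final HttpSolrServer SERVER =
    //       new HttpSolrServer("http://localhost:8983/solr/core1");
    private static final SolrServerHolder INSTANCE =
            new SolrServerHolder("http://localhost:8983/solr/core1");

    private final String baseUrl;  // stand-in for the HttpSolrServer instance

    private SolrServerHolder(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    // Class loading guarantees the instance is created exactly once, thread-safely.
    public static SolrServerHolder getInstance() {
        return INSTANCE;
    }

    public String getBaseUrl() {
        return baseUrl;
    }
}
```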



On Tue, Jun 25, 2013 at 11:16 PM, André Widhani wrote:

> fwiw, I can confirm that Solr 4.x can definitely not read indexes created
> with 1.4.
>
> You'll get an exception like the following:
>
> Caused by: org.apache.lucene.index.IndexFormatTooOldException: Format
> version is not supported (resource: segment _16ofy in resource
> ChecksumIndexInput(MMapIndexInput(path="/var/opt/dcx/solr2/core-tex60l254lpachcjhtz4se-index2/data/index/segments_1dlof"))):
> 2.x. This version of Lucene only supports indexes created with release 3.0
> and later.
>
> But as Erick mentioned, you could get away with optimizing the index with
> 3.x instead of re-indexing from scratch before moving on to 4.x - I think I
> did that once and it worked.
>
> Regards,
> André
>
> 
> Von: Erick Erickson [erickerick...@gmail.com]
> Gesendet: Dienstag, 25. Juni 2013 19:37
> An: solr-user@lucene.apache.org
> Betreff: Re: Need Help in migrating Solr version 1.4 to 4.3
>
> bq: I'm not sure if Solr 4.3 will be able to read Solr 1.4 indexes
>
> Solr/Lucene explicitly try to read _one_ major revision backwards.
> Solr 3.x should be able to read 1.4 indexes. Solr 4.x should be
> able to read Solr 3.x. No attempt is made to allow Solr 4.x to read
> Solr 1.4 indexes, so I wouldn't even try.
>
> Shalin's comment is best. If at all possible I'd just forget about
> reading the old index and re-index from scratch. But if you _do_
> try upgrading 1.4 -> 3.x -> 4.x, you probably want to optimize
> at each step. That'll (I think) rewrite all the segments in the
> current format.
>
> Good luck!
> Erick
>
> On Tue, Jun 25, 2013 at 12:59 AM, Shalin Shekhar Mangar
>  wrote:
> > You must carefully go through the upgrade instructions starting from
> > 1.4 upto 4.3. In particular the instructions for 1.4 to 3.1 and from
> > 3.1 to 4.0 should be given special attention.
> >
> > On Tue, Jun 25, 2013 at 11:43 AM, Sandeep Gupta 
> wrote:
> >> Hello All,
> >>
> >> We are planning to migrate solr 1.4 to Solr 4.3 version.
> >> And I am seeking some help in this side.
> >>
> >> Considering Schema file change:
> >> By default there are lots of changes if I compare original Solr 1.4
> schema
> >> file to Sol 4.3 schema file.
> >> And that is the reason we are not copying paste of schema file.
> >> In our Solr 1.4 schema implementation, we have some custom fields with
> type
> >> "textgen" and "text"
> >> So in migration of these custom fields to Solr 4.3,  should I use type
> of
> >> "text_general" as replacement of "textgen" and
> >> "text_en" as replacement of "text"?
> >> Please confirm the same.
> >
> > Please check the text_general definition in 4.3 against the textgen
> > fieldtype in Solr 1.4 to see if they're equivalent. Same for text_en
> > and text.
> >
> >>
> >> Considering Solrconfig change:
> >> As we didn't have lots of changes in 1.4 solrconfig file except the
> >> dataimport request handler.
> >> And therefore in migration side, we are simply modifying the Solr 4.3
> >> solrconfig file with his request handler.
> >
> > And you need to add the dataimporthandler jar into Solr's lib
> > directory. DIH is not added automatically anymore.
> >
> >>
> >> Considering the application development:
> >>
> >> We used all the queries as BOOLEAN type style (was not good)  I mean put
> >> all the parameter in query fields i.e
> >> *:* AND EntityName: <<>> AND : AND .
> >>
> >> I think we should simplify our queries using other fields like df, qf
> 
> >>
> >
> > Probably. AND queries are best done by filter queries (fq).
> >
> >> We also used to create Solr server object via CommonsHttpSolrServer()
> so I
> >> am planning to use now HttpSolrServer API>
> >
> > Yes. Also, there was a compatibility break between Solr 1.4 and 3.1 in
> > the javabin format so old clients using javabin won't be able to
> > communicate with Solr until you upgrade both solr client and solr
> > servers.
> >
> >>
> >> Please let me know the suggestion for above points also what are the
> other
> >> factors I need to take care while considering the migration.
> >
> > There is no substitute for reading the upgrade sections in the
> changes.txt.
> >
> > I'm not sure if Solr 4.3 will be able to read Solr 1.4 indexes. You
> > will most likely need to re-index your documents.
> >
> > You should also think about switching to SolrCloud to take advantage
> > of its features.
> >
> > --
> > Regards,
> > Shalin Shekhar Mangar.
>


Re: Need Help in migrating Solr version 1.4 to 4.3

2013-06-26 Thread Sandeep Gupta
Thanks Shawn.
For a singleton design for SolrServer object creation,
I found that there are many approaches described at
http://en.wikipedia.org/wiki/Singleton_pattern
Which of the five examples mentioned at the above URL is best, in
general practice, for a web application?

I am sure many people on this mailing list have practical
experience
with which type of singleton pattern to implement for creating the
SolrServer object.

Waiting for some comments on this front.

Regards
Sandeep




On Wed, Jun 26, 2013 at 9:20 PM, Shawn Heisey  wrote:

> On 6/25/2013 11:52 PM, Sandeep Gupta wrote:
> > Also in application development side,
> > as I said that I am going to use HTTPSolrServer API and I found that we
> > shouldn't create this object multiple times
> > (as per the wiki document
> http://wiki.apache.org/solr/Solrj#HttpSolrServer)
> > So I am planning to have my Server class as singleton.
> >  Please advice little bit in this front also.
>
> This is always the way that SolrServer objects are intended to be used,
> including CommonsHttpSolrServer in version 1.4.  The only major
> difference between the two objects is that the new one uses
> HttpComponents 4.x and the old one uses HttpClient 3.x.  There are other
> differences, but they are just the result of incremental improvements
> from version to version.
>
> Thanks,
> Shawn
>
>
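For reference, the initialization-on-demand holder idiom is the usual answer to the question above: the JVM loads the nested holder class lazily and exactly once, so the shared instance needs no explicit locking. A minimal sketch follows; the plain Object stands in for an HttpSolrServer built with your core URL (that substitution, and the URL in the comment, are assumptions to keep the sketch self-contained, not SolrJ-specific code):

```java
// Initialization-on-demand holder idiom: a thread-safe lazy singleton
// without synchronization. The JVM guarantees Holder is initialized
// exactly once, on the first call to getInstance().
final class SolrServerHolder {
    private SolrServerHolder() {}  // prevent external instantiation

    private static final class Holder {
        // In real code this would be e.g.
        //   new HttpSolrServer("http://localhost:8983/solr/collection1")
        // (hypothetical URL); a plain Object keeps the sketch runnable.
        static final Object INSTANCE = new Object();
    }

    static Object getInstance() {
        return Holder.INSTANCE;
    }
}
```

Every servlet thread then calls SolrServerHolder.getInstance() and shares the one server object, which matches the SolrJ wiki's advice to create the server once.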


Re: Need Help in migrating Solr version 1.4 to 4.3

2013-06-27 Thread Sandeep Gupta
Thanks again Shawn for your comments.

I am a little worried about the multithreading of a web application that uses
servlets.

I also found one of your explanations (please confirm whether it is
indeed your comment) at
http://lucene.472066.n3.nabble.com/Memory-problems-with-HttpSolrServer-td4060985.html
for this question:
http://stackoverflow.com/questions/11931179/httpsolrserver-instance-management

As you correctly said, the number of SolrServer objects to create depends on
the number of shards/Solr cores, and only then does one need to think about
an implementation that may use the singleton pattern.

On my web application side, I have only one Solr core, the default
"collection1", so I will create one SolrServer object for my application.
Even if we decide to go for SolrCloud, I will still create one object.

Thanks Upayavira, yes, I will re-index. Is there anything you want to
suggest, given that you did the same migration?

Thanks
Sandeep









On Thu, Jun 27, 2013 at 1:33 PM, Upayavira  wrote:

> I have done this - upgraded a 1.4 index to 3.x then on to 4.x. It
> worked, but...
>
> New field types have been introduced over time that facilitate new
> functionality. To continue to use an upgraded index, you need to
> continue using the old field types, and thus loose some of the coolness
> of newer versions.
>
> So, a re-index will set you in far better stead, if it is at all
> possible.
>
> Upayavira
>
> On Tue, Jun 25, 2013, at 06:37 PM, Erick Erickson wrote:
> > bq: I'm not sure if Solr 4.3 will be able to read Solr 1.4 indexes
> >
> > Solr/Lucene explicitly try to read _one_ major revision backwards.
> > Solr 3.x should be able to read 1.4 indexes. Solr 4.x should be
> > able to read Solr 3.x. No attempt is made to allow Solr 4.x to read
> > Solr 1.4 indexes, so I wouldn't even try.
> >
> > Shalin's comment is best. If at all possible I'd just forget about
> > reading the old index and re-index from scratch. But if you _do_
> > try upgrading 1.4 -> 3.x -> 4.x, you probably want to optimize
> > at each step. That'll (I think) rewrite all the segments in the
> > current format.
> >
> > Good luck!
> > Erick
> >
> > On Tue, Jun 25, 2013 at 12:59 AM, Shalin Shekhar Mangar
> >  wrote:
> > > You must carefully go through the upgrade instructions starting from
> > > 1.4 upto 4.3. In particular the instructions for 1.4 to 3.1 and from
> > > 3.1 to 4.0 should be given special attention.
> > >
> > > On Tue, Jun 25, 2013 at 11:43 AM, Sandeep Gupta 
> wrote:
> > >> Hello All,
> > >>
> > >> We are planning to migrate solr 1.4 to Solr 4.3 version.
> > >> And I am seeking some help in this side.
> > >>
> > >> Considering Schema file change:
> > >> By default there are lots of changes if I compare original Solr 1.4
> schema
> > >> file to Sol 4.3 schema file.
> > >> And that is the reason we are not copying paste of schema file.
> > >> In our Solr 1.4 schema implementation, we have some custom fields
> with type
> > >> "textgen" and "text"
> > >> So in migration of these custom fields to Solr 4.3,  should I use
> type of
> > >> "text_general" as replacement of "textgen" and
> > >> "text_en" as replacement of "text"?
> > >> Please confirm the same.
> > >
> > > Please check the text_general definition in 4.3 against the textgen
> > > fieldtype in Solr 1.4 to see if they're equivalent. Same for text_en
> > > and text.
> > >
> > >>
> > >> Considering Solrconfig change:
> > >> As we didn't have lots of changes in 1.4 solrconfig file except the
> > >> dataimport request handler.
> > >> And therefore in migration side, we are simply modifying the Solr 4.3
> > >> solrconfig file with his request handler.
> > >
> > > And you need to add the dataimporthandler jar into Solr's lib
> > > directory. DIH is not added automatically anymore.
> > >
> > >>
> > >> Considering the application development:
> > >>
> > >> We used all the queries as BOOLEAN type style (was not good)  I mean
> put
> > >> all the parameter in query fields i.e
> > >> *:* AND EntityName: <<>> AND : AND .
> > >>
> > >> I think we should simplify our queries using other fields like df, qf
> 
> > >>
> > >
> > > Probably. AND queries are best done by filter queries (fq).
> > >
> >

Re: Dot operater issue.

2013-06-27 Thread Sandeep Mestry
Hi Sri,

This depends on how the fields that hold the value are defined and how
the query is generated.
Try running the query in the Solr admin console with &debug=true to see how
the query string is being parsed.

If that doesn't help, could you answer the following three questions about
your setup:

1) field definition in schema.xml
2) solr query url
3) parser config from solrconfig.xml
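On the "h.e.r.b.a.l" question itself, this usually comes down to the analyzer. A fieldType along these lines (a sketch, not a drop-in; the name and exact attributes are assumptions to adapt to your schema) uses WordDelimiterFilterFactory with catenateWords so that "h.e.r.b.a.l" is also indexed and queried as the joined token "herbal":

```xml
<fieldType name="text_dotted" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- splits on the dots and, with catenateWords="1", also emits
         the concatenated token "herbal" for the input "h.e.r.b.a.l" -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" catenateWords="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With the same analyzer applied at index and query time, both "h.e.r.b.a.l" and "herbal" should match the dotted product names.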


Thanks,
Sandeep


On 27 June 2013 10:41, Srinivasa Chegu  wrote:

> Hi team,
>
> When the user enter search term as "h.e.r.b.a.l"  in the search textbox
> and click on search button then  SOLR search engine is not returning any
>  results found. As I can see SOLR is accepting the request parameter as
> h.e.r.b.a.l. However we have many records with the string h.e.r.b.a.l as
> part of the product name.
>
> Look like there is an issue with dot operator in the search term.  If we
> enter search term as "herbal" then it is returning search results .
>
> Our requirement is search term should be "h.e.r.b.a.l" then it needs to
> display results based on dot operator .
>
> Please help us on this issue.
>
> Regards
> Srinivas
>
>
> ::DISCLAIMER::
>
> 
>
> The contents of this e-mail and any attachment(s) are confidential and
> intended for the named recipient(s) only.
> E-mail transmission is not guaranteed to be secure or error-free as
> information could be intercepted, corrupted,
> lost, destroyed, arrive late or incomplete, or may contain viruses in
> transmission. The e mail and its contents
> (with or without referred errors) shall therefore not attach any liability
> on the originator or HCL or its affiliates.
> Views or opinions, if any, presented in this email are solely those of the
> author and may not necessarily reflect the
> views or opinions of HCL or its affiliates. Any form of reproduction,
> dissemination, copying, disclosure, modification,
> distribution and / or publication of this message without the prior
> written consent of authorized representative of
> HCL is strictly prohibited. If you have received this email in error
> please delete it and notify the sender immediately.
> Before opening any email and/or attachments, please check them for viruses
> and other defects.
>
>
> 
>


Re: Newbie SolR - Need advice

2013-07-02 Thread Sandeep Mestry
Hi Fabio,

No, Solr isn't the database replacement for MS SQL.
Solr is built on top of Lucene which is a search engine library for text
searches.

Solr in itself is not a replacement for any database, as it does not support
relational DB features; however, as Jack and David mentioned, it is a fully
optimised search engine platform that can provide all search-related
features like faceting, highlighting, etc.
Solr does not have a *database*. It stores the data in binary files called
indexes <http://lucene.apache.org/core/3_0_3/fileformats.html>. These
indexes are populated with the data from the database. Solr provides an
inbuilt functionality through DataImportHandler component to get the data
and generate indexes.
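As an illustration, a minimal DataImportHandler data-config.xml for pulling rows out of an RDBMS looks roughly like this (the driver, URL, credentials, table, and column names are hypothetical placeholders):

```xml
<dataConfig>
  <!-- JDBC connection to the source database (placeholder values) -->
  <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost;databaseName=mydb"
              user="solr" password="secret"/>
  <document>
    <!-- one Solr document per row; joins can be expressed in the SQL -->
    <entity name="product" query="SELECT id, name, description FROM products">
      <field column="id"          name="id"/>
      <field column="name"        name="name"/>
      <field column="description" name="description"/>
    </entity>
  </document>
</dataConfig>
```

Running the /dataimport handler with command=full-import then reads the rows and builds the index from them.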

When you say your web servers are mainly doing search, do you
mean it is text search, using queries with clauses such as 'like', 'in',
etc. (in addition to multiple joins) to get the results? Does the web
application need faceting? If yes, then Solr can be your friend for getting
it done.

Do remember that it always takes some time to get the new concepts from
understanding through to implementation. As David mentioned already, it
*is* going to be a bumpy ride at the start but *definitely* a sensational
one.

Good Luck,
Sandeep



On 2 July 2013 17:09, fabio1605  wrote:

> Thanks guys
>
> So SolR is actually a database replacement for mssql...  Am I right
>
>
> We have a lot of perl scripts that contains lots of sql insert
> queries. Etc
>
>
> How do we query the SolR database from scripts  I know I have a lot to
> learn still so excuse my ignorance.
>
> Also...  What is mongo and how does it compare
>
> I just don't understand how in 10years of Web development I have never
> heard of SolR till last week
>
>
>
>
> Sent from Samsung Mobile
>
>  Original message 
> From: "David Quarterman [via Lucene]" <
> ml-node+s472066n4074772...@n3.nabble.com>
> Date: 02/07/2013  16:57  (GMT+00:00)
> To: fabio1605 
> Subject: RE: Newbie SolR - Need advice
>
> Hi Fabio,
>
> Like Jack says, try the tutorial. But to answer your question, SOLR isn't
> a bolt on to SQLServer or any other DB. It's a fantastically fast
> indexing/searching tool. You'll need to use the DataImportHandler (see the
> tutorial) to import your data from the DB into the indices that SOLR uses.
> Once in there, you'll have more power & flexibility than SQLServer would
> ever give you!
>
> Haven't tried SOLR on Windows (I guess your environment) but I'm sure
> it'll work using Jetty or Tomcat as web container.
>
> Stick with it. The ride can be bumpy but the experience is sensational!
>
> DQ
>
> -Original Message-
> From: fabio1605 [mailto:[hidden email]]
> Sent: 02 July 2013 16:16
> To: [hidden email]
> Subject: Newbie SolR - Need advice
>
> Hi
>
> we have a MSSQL Server which is just getting far to large now and
> performance is dying! the majority of our webservers mainly are doing
> search function so i thought it may be best to move to SolR But i know very
> little about it!
>
> My questions are!
>
> Does SolR Run as a bolt on to MSSQL - as in the data is still in MSSQL and
> SolR is just the search bit between?
>
> Im really struggling to understand the point of SOLR etc so if someone
> could point me to a Dummies website id apprecaite it! google is throwing to
> much confusion at me!
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Newbie-SolR-Need-advice-tp4074746.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
> If you reply to this email, your message will be added to the discussion
> below:
>
> http://lucene.472066.n3.nabble.com/Newbie-SolR-Need-advice-tp4074746p4074772.html
> To unsubscribe from Newbie SolR - Need advice, click here.
> NAML
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Newbie-SolR-Need-advice-tp4074746p4074782.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Newbie SolR - Need advice

2013-07-02 Thread Sandeep Mestry
Hi Fabio,

Yes, you're on the right track.

I'd now like to direct you to the first reply from Jack, to go through the
Solr tutorial.
Even with Solr, it will take some time to learn the various bits and pieces
about designing fields, their field types, server configuration, etc., and
then to tune the results to match what you're currently getting
from the database. There is lots of info available for Solr on the web, and
do check Lucidworks' Solr Reference Guide:
http://docs.lucidworks.com/display/solr/Apache+Solr+Reference+Guide;jsessionid=16ED0DB3B6F6BE8CEC6E6CDB207DBC49

Best of Solr Luck!

Sandeep



On 2 July 2013 20:47, fabio1605  wrote:

>
> So, you keep your mssql database, you just don't use it for searches -
> that'll relieve some of the load. Searches then all go through SOLR & its
> Lucene indexes. If your various tables need SQL joins, you specify those in
> the DataImportHandler (DIH) config. That way, when SOLR indexes everything,
> it indexes the data the way you want to see it.
>
> -- SO  by this you mean we keep mssql as we do!!
>
> But we use the website to run through SOLR SOLR will then handle the
> indexing and retrieval of data from its own index's, and will make its own
> calls to our MSSQL server when required(i.e updating/adding to
> indexs..)
>
> Am I on the right tracks there now!
>
> So MSSQL becomes the datastore
> SOLR becomes the search engine...
>
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Newbie-SolR-Need-advice-tp4074746p4074889.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Newbie SolR - Need advice

2013-07-03 Thread Sandeep Mestry
+1


On 3 July 2013 14:58, Jack Krupansky  wrote:

> Design your own application layer for both indexing and query that knows
> about both SQL and Solr. Give it a REST API and then your client
> applications can talk to your REST API and not have to care about the
> details of Solr or SQL. That's the best starting point.
>
>
> -- Jack Krupansky
>
> -Original Message- From: fabio1605
> Sent: Wednesday, July 03, 2013 4:55 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Newbie SolR - Need advice
>
>
> Hi Sandeep
>
> Thank you for your reply
>
> Il have a read through the tutorials now that i understand the principle of
> all this,
>
> i would ideally like to keep mssql and bolt solr on top of this so that we
> can keep mssql as we have a 200GB database
>
> Cheers
>
>
>
> --
> View this message in context: http://lucene.472066.n3.**
> nabble.com/Newbie-SolR-Need-**advice-tp4074746p4075026.html<http://lucene.472066.n3.nabble.com/Newbie-SolR-Need-advice-tp4074746p4075026.html>
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: HTTP Status 503 - Server is shutting down

2013-07-15 Thread Sandeep Gupta
Hello,

I was able to configure Solr 4.3.1 with Tomcat 6.

I followed these steps:
1. Extract solr431 package. In my case I did in
"E:\solr-4.3.1\example\solr"
2. Now copied solr dir from extracted package (E:\solr-4.3.1\example\solr)
into TOMCAT_HOME dir.
In my case TOMCAT_HOME dir is pointed to E:\Apache\Tomcat 6.0.
3. I can refer now SOLR_HOME as " E:\Apache\Tomcat 6.0\solr"  (please
remember this)
4. Copy the solr.war file from the extracted package to the SOLR_HOME dir,
i.e. E:\Apache\Tomcat 6.0\solr. This is required to create the context, as I
do not want to pass the path as a Java option.
5. Create solr1.xml file into TOMCAT_HOME\conf\Catalina\localhost (I gave
file name as solr1.xml )
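A typical context fragment for this layout (the docBase and solr/home paths are assumptions matching the steps above, not the author's exact file) looks like:

```xml
<!-- TOMCAT_HOME\conf\Catalina\localhost\solr1.xml (paths are illustrative) -->
<Context docBase="E:\Apache\Tomcat 6.0\solr\solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="E:\Apache\Tomcat 6.0\solr" override="true"/>
</Context>
```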

  

6.  Also copy solr.war file into TOMCAT_HOME\webapps for deployment purpose
7.  If you start Tomcat now you will get the errors Shawn mentioned, so you
need to copy all 5 jar files from the extracted Solr package
(E:\solr-4.3.1\example\lib\ext) to the TOMCAT_HOME\lib dir (jul-to-slf4j-1.6.6,
jcl-over-slf4j-1.6.6, slf4j-log4j12-1.6.6, slf4j-api-1.6.6, log4j-1.2.16).
8. Also copy the log4j.properties file from the
E:\solr-4.3.1\example\resources dir to the TOMCAT_HOME\lib dir.
9. Now if you start Tomcat you won't have any problems.

10. On my side I am using an additional jar for the data import request
handler, so for this, modify the solrconfig.xml file to point to the
location of the data import jar.
11. What I did :
In solrconfig.xml file :  In section
 

I add one line after this section (If I use above line then I need to
create lib dir inside Collection1 dir)


12. In SOLR_HOME (E:\Apache\Tomcat 6.0\solr) I created a lib folder because
in my solrconfig.xml file I am referring this lib dir.
And copied all the dataimport related jar
files.(solr-dataimporthandler-4.3.1***)
I did it in this way because I do not want to use TOMCAT_HOME\lib.
13.  Now restart Tomcat; I am sure there should not be any problem. If
there is, refer to the solr.log file in the TOMCAT_HOME\logs dir.

As I said in point 12, I did not want to put Solr-related jar files into the
TOMCAT_HOME\lib dir, but for the logging mechanism I had to. I tried
to put all 5 jars into the SOLR_HOME lib folder instead and removed them
from the Tomcat lib, but then I got the error again.

Ideally, we should not have to put Solr-related jar files into the
Tomcat lib dir.

Regards
Sandeep



On Mon, Jul 15, 2013 at 12:27 AM, PeterKerk  wrote:

> Ok, still getting the same error "HTTP Status 503 - Server is shutting
> down",
> so here's what I did now:
>
> - reinstalled tomcat
> - deployed solr-4.3.1.war in C:\Program Files\Apache Software
> Foundation\Tomcat 6.0\webapps
> - copied log4j-1.2.16.jar,slf4j-api-1.6.6.jar,slf4j-log4j12-1.6.6.jar to
> C:\Program Files\Apache Software Foundation\Tomcat
> 6.0\webapps\solr-4.3.1\WEB-INF\lib
> - copied log4j.properties from
> C:\Dropbox\Databases\solr-4.3.1\example\resources to
> C:\Dropbox\Databases\solr-4.3.1\example\lib
> - restarted tomcat
>
>
> Now this shows in my Tomcat console:
>
> 14-jul-2013 20:54:38 org.apache.catalina.core.AprLifecycleListener init
> INFO: The APR based Apache Tomcat Native library which allows optimal
> performanc
> e in production environments was not found on the java.library.path:
> C:\Program
> Files\Apache Software Foundation\Tomcat
> 6.0\bin;C:\Windows\Sun\Java\bin;C:\Windo
> ws\system32;C:\Windows;C:\Program Files\Common Files\Microsoft
> Shared\Windows Li
> ve;C:\Program Files (x86)\Common Files\Microsoft Shared\Windows
> Live;C:\Windows\
>
> system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShe
> ll\v1.0\;C:\Program Files\TortoiseSVN\bin;c:\msxsl;C:\Program Files
> (x86)\Window
> s Live\Shared;C:\Program Files\Microsoft\Web Platform Installer\;C:\Program
> File
> s (x86)\Microsoft ASP.NET\ASP.NET Web Pages\v1.0\;C:\Program Files
> (x86)\Windows
>  Kits\8.0\Windows Performance Toolkit\;C:\Program Files\Microsoft SQL
> Server\110
> \Tools\Binn\;C:\Program Files (x86)\Microsoft SQL
> Server\110\Tools\Binn\;C:\Prog
> ram Files\Microsoft SQL Server\110\DTS\Binn\;C:\Program Files
> (x86)\Microsoft SQ
> L Server\110\Tools\Binn\ManagementStudio\;C:\Program Files (x86)\Microsoft
> SQL S
> erver\110\DTS\Binn\;C:\Program Files (x86)\Java\jre6\bin;C:\Program
> Files\Java\j
> re631\bin;.
> 14-jul-2013 20:54:39 org.apache.coyote.http11.Http11Protocol init
> INFO: Initializing Coyote HTTP/1.1 on http-8080
> 14-jul-2013 20:54:39 org.apache.catalina.startup.Catalina load
> INFO: Initialization processed in 287 ms
> 14-jul-2013 20:54:39 org.apache.catalina.core.StandardService start
> INFO: Starting service Catalina
> 14-jul-2013 20:54:39 org.apache.catalina.core.StandardEngine start
> INFO: Starting Servlet Engine: Apache Tomcat/6.0.37
> 14-jul-2013 20:54:39 org.apache.catalina.startup.HostCon

Re: Book contest idea - feedback requested

2013-07-15 Thread Sandeep Gupta
Hi Alex,

Great, please go ahead.

-Sandeep


On Tue, Jul 16, 2013 at 9:40 AM, Ali, Saqib  wrote:

> Hello Alex,
>
> This sounds like an excellent idea! :)
>
> Saqib
>
>
> On Mon, Jul 15, 2013 at 8:11 PM, Alexandre Rafalovitch
> wrote:
>
> > Hello,
> >
> > Packt Publishing has kindly agreed to let me run a contest with e-copies
> of
> > my book as prizes:
> > http://www.packtpub.com/apache-solr-for-indexing-data/book
> >
> > Since my book is about learning Solr and targeted at beginners and early
> > intermediates, here is what I would like to do. I am asking for feedback
> on
> > whether people on the mailing list like the idea or have specific
> > objections to it.
> >
> > 1) The basic idea is to get Solr users and write and vote on what they
> find
> > hard with Solr, especially in understanding the features (as contrasted
> > with just missing ones).
> > 2) I'll probably set it up as a User Voice forum, which has all the
> > mechanisms for suggesting and voting on ideas. With an easier interface
> > than Jira
> > 3) The top N voted ideas will get the books as prizes and I will try to
> > fix/document/create JIRAs for those issues.
> > 4) I am hoping to specifically reach out to the communities where Solr
> is a
> > component and where they don't necessarily hang out on our mailing list.
> I
> > am thinking SolrNet, Drupal, project Blacklight, Cloudera, CrafterCMS,
> > SiteCore, Typo3, SunSpot, Nutch. Obviously, anybody and everybody from
> this
> > list would be absolutely welcome to participate as well.
> >
> > Yes? No? Suggestions?
> >
> > Also, if you are maintainer of one of the products/services/libraries
> that
> > has Solr in it and want to reach out to your community yourself, I think
> it
> > would be a lot better than If I did it. Contact me directly and I will
> let
> > you know what template/FAQ I want you to include in the announcement
> > message when it is ready.
> >
> > Thank you all in advance for the comments and suggestions.
> >
> > Regards,
> >Alex.
> >
> > Personal website: http://www.outerthoughts.com/
> > LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> > - Time is the quality of nature that keeps events from happening all at
> > once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)
> >
>


Re: solr 4.3.1 Installation

2013-07-16 Thread Sandeep Gupta
This problem looks to me to be caused by Solr logging.
See the detailed description below (taken from another mail thread):

-
Solr 4.3.0 and later does not have ANY slf4j jarfiles in the .war file,
so you need to put them in your classpath.  Jarfiles are included in the
example, in example/lib/ext, and those jarfiles set up logging to use
log4j, a much more flexible logging framework than JDK logging.

JDK logging is typically set up with a file called logging.properties,
which I think you must use a system property to configure.  You aren't
using JDK logging, you are using log4j, which uses a file called
log4j.properties.

http://wiki.apache.org/solr/SolrLogging#Using_the_example_logging_setup_in_containers_other_than_Jetty
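For completeness, once the slf4j/log4j jars are on the classpath, a minimal log4j.properties along these lines (the log file path is an assumption) is enough to get Solr logging to a rolling file:

```properties
# Root logger: INFO level, writing to the "file" appender
log4j.rootLogger=INFO, file

# Rolling file appender (path is illustrative)
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=logs/solr.log
log4j.appender.file.MaxFileSize=4MB
log4j.appender.file.MaxBackupIndex=9
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %-5p %c - %m%n
```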




On Tue, Jul 16, 2013 at 6:28 PM, Sujatha Arun  wrote:

> Hi ,
>
> We have been using solr 3.6.1 .Recently  downloaded the solr 4.3.1  version
>  and installed the same  as multicore setup as follows
>
> Folder Structure
> solr.war
> solr
>  conf
>core0
> core1
> solr.xml
>
> Created the context fragment xml file in tomcat/conf/catalina/localhost
> which refers to the solr.war file and the solr home folder
>
> copied the muticore conf folder without the zoo.cfg file
>
> I get the following error and admin page does not load
> 16 Jul, 2013 11:36:09 PM org.apache.catalina.core.StandardContext start
> SEVERE: Error filterStart
> 16 Jul, 2013 11:36:09 PM org.apache.catalina.core.StandardContext start
> SEVERE: Context [/solr_4.3.1] startup failed due to previous errors
> 16 Jul, 2013 11:36:39 PM org.apache.catalina.startup.HostConfig
> checkResources
> INFO: Undeploying context [/solr_4.3.1]
> 16 Jul, 2013 11:36:39 PM org.apache.catalina.core.StandardContext start
> SEVERE: Error filterStart
> 16 Jul, 2013 11:36:39 PM org.apache.catalina.core.StandardContext start
> SEVERE: Context [/solr_4.3.1] startup failed due to previous errors
>
>
> Please let me know what I am missing If i need to install this with the
> default multicore setup without the cloud .Thanks
>
> Regards
> Sujatha
>


Re: HTTP Status 503 - Server is shutting down

2013-07-17 Thread Sandeep Gupta
Hi,

I think I will also wait for other people to reply, as I do not have much
more insight here.
I suggested these things because I did the same recently, but I have only one
collection (the default one).

As you said, and as I can guess, you have multiple collections (tt, shop and
home) in one Solr instance. By default all the collections should go inside
the Solr dir (tomcat\solr), and you may need to modify the solr.xml file
(tomcat\solr\solr.xml). See below.

  

  



There is another XML file, which I have also named solr.xml
(\tomcat\conf\localhost\solr.xml), and which holds the Solr home path;
on startup, Tomcat therefore reads this file after host-manager.xml.

Thanks
-Sandeep
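
To illustrate the solr.xml layout being described, a legacy-style multicore
file for the three cores in this thread might look as follows (core names and
instanceDir values are assumptions, not the poster's actual configuration):

```xml
<!-- Hypothetical tomcat\solr\solr.xml; names and paths are assumptions
     based on the cores mentioned in this thread. -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="tt" instanceDir="tt" />
    <core name="shop" instanceDir="shop" />
    <core name="homes" instanceDir="homes" />
  </cores>
</solr>
```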



On Wed, Jul 17, 2013 at 3:40 PM, PeterKerk  wrote:

> I can now approach http://localhost:8080/solr-4.3.1/#/, thanks!!
>
> I also noticed you mentioning something about a data import handler.
>
> Now, what I will be requiring after I've completed the basic setup of
> Tomcat6 and Solr431 I want to migrate my Solr350 (now running on Cygwin)
> cores to that environment.
>
> C:\Dropbox\Databases\apache-solr-3.5.0\example\example-DIH\solr\tt
> C:\Dropbox\Databases\apache-solr-3.5.0\example\example-DIH\solr\shop
> C:\Dropbox\Databases\apache-solr-3.5.0\example\example-DIH\solr\homes
>
> Where do I need to copy the above cores for this all to work?
> What I don't understand is how Tomcat knows where it can find my Solr 4.3.1
> folder, in my case C:\Dropbox\Databases\solr-4.3.1, is that folder even any
> longer required?
>
> Many thanks again! :)
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/HTTP-Status-503-Server-is-shutting-down-tp4065958p4078567.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Synonyms with wildcard search

2013-07-30 Thread Sandeep Gupta
Hello All,

I want to know whether it is possible to query for a word using both a
synonym and a wildcard.

For example :  I have one field which is type of text_en (default fieldType
in 4.3.1)
And synonym.txt file has this entry
colour => color

Now, when I run a full-text search for colour* (with a wildcard), the results
do not include documents containing words like "colorology" (whereas if I
search for color* directly, I do get them).

Any suggestions on how I can achieve this, or is it not possible?

Thanks
Sandeep
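
A note on why this happens: wildcard terms such as colour* are treated as
"multiterm" queries and bypass most of the query-time analysis chain, so
query-time synonyms are never applied to them. A common workaround is to do
the synonym mapping at index time instead, for example (a sketch only; the
field type name is arbitrary and this is not the poster's schema):

```xml
<!-- Sketch: apply synonyms at index time with expand="false", so "colour"
     is indexed as "color"; the wildcard query color* then matches without
     any query-time synonym rewriting. -->
<fieldType name="text_en_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="false"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The trade-off of index-time synonyms is that changing synonyms.txt requires a
re-index, but it is the usual answer when synonyms must combine with wildcards.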


Question on Exact Matches - edismax

2013-04-03 Thread Sandeep Mestry
Hi All,

I have a requirement where in exact matches for 2 fields (Series Title,
Title) should be ranked higher than the partial matches. The configuration
looks like below:



edismax
explicit
0.01
pg_series_title_ci^500 title_ci^300
pg_series_title^200 title^25 classifications^15 classifications_texts^15
parent_classifications^10 synonym_classifications^5 pg_brand_title^5
pg_series_working_title^5 p_programme_title^5 p_item_title^5
p_interstitial_title^5 description^15 pg_series_description annotations^0.1
classification_notes^0.05 pv_program_version_number^2
pv_program_version_number_ci^2 pv_program_number^2 pv_program_number_ci^2
p_program_number^2 ma_version_number^2 ma_recording_location
ma_contributions^0.001 rel_pg_series_title rel_programme_title
rel_programme_number rel_programme_number_ci pg_uuid^0.5 p_uuid^0.5
pv_uuid^0.5 ma_uuid^0.5
pg_series_title_ci^500 title_ci^500
0
*:*
100%
AND
true
-1
1



As you can see above, the search is against many fields. What I'd want is
the documents that have exact matches for series title and title fields
should rank higher than the rest.

I have added 2 case insensitive fields (pg_series_title_ci, title_ci) for
series title and title and have boosted them higher over the tokenized and
rest of the fields. I have also implemented a similarity class to override
idf however I still get documents having partial matches in title and other
fields ranking higher than exact match in pg_series_title_ci.

Many Thanks,
Sandeep


Re: Question on Exact Matches - edismax

2013-04-04 Thread Sandeep Mestry
Hi Jan,

Thanks for your reply. I have defined string_ci like below:








When I analyse the query in solr, I saw that document containing
pg_series_title_ci:"funny"  matches when I do a search for
pg_series_title_ci:"funny games" and is ranked higher than the document
containing the exact matches. I can use the default string data type but
then the match will be on exact casing.

Thanks,
Sandeep


On 3 April 2013 22:20, Jan Høydahl  wrote:

> Can you show us your *_ci field type? Solr does not really have a way to
> tell whether a match is "exact" or only partial, but you could hack around
> it with the fieldType. See https://github.com/cominvent/exactmatch for a
> possible solution.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> 3. apr. 2013 kl. 15:55 skrev Sandeep Mestry :
>
> > Hi All,
> >
> > I have a requirement where in exact matches for 2 fields (Series Title,
> > Title) should be ranked higher than the partial matches. The
> configuration
> > looks like below:
> >
> > 
> >
> >edismax
> >explicit
> >0.01
> >*pg_series_title_ci*^500 *title_ci*^300 *
> > pg_series_title*^200 *title*^25 classifications^15
> classifications_texts^15
> > parent_classifications^10 synonym_classifications^5 pg_brand_title^5
> > pg_series_working_title^5 p_programme_title^5 p_item_title^5
> > p_interstitial_title^5 description^15 pg_series_description
> annotations^0.1
> > classification_notes^0.05 pv_program_version_number^2
> > pv_program_version_number_ci^2 pv_program_number^2 pv_program_number_ci^2
> > p_program_number^2 ma_version_number^2 ma_recording_location
> > ma_contributions^0.001 rel_pg_series_title rel_programme_title
> > rel_programme_number rel_programme_number_ci pg_uuid^0.5 p_uuid^0.5
> > pv_uuid^0.5 ma_uuid^0.5
> >pg_series_title_ci^500 title_ci^500
> >0
> >*:*
> >100%
> >AND
> >true
> >-1
> >1
> >
> >
> >
> > As you can see above, the search is against many fields. What I'd want is
> > the documents that have exact matches for series title and title fields
> > should rank higher than the rest.
> >
> > I have added 2 case insensitive (*pg_series_title_ci, title_ci*) fields
> for
> > series title and title and have boosted them higher over the tokenized
> and
> > rest of the fields. I have also implemented a similarity class to
> override
> > idf however I still get documents having partial matches in title and
> other
> > fields ranking higher than exact match in pg_series_title_ci.
> >
> > Many Thanks,
> > Sandeep
>
>


Re: Question on Exact Matches - edismax

2013-04-04 Thread Sandeep Mestry
Another problem that I see in Solr analysis is that a query term which
matches the tokenized field does not match on the case-insensitive field.
So, if I'm searching for 'coast to coast', I see that the tokenized series
title (pg_series_title) is matched but not the ci field which is
pg_series_title_ci.

The definition of both field is as below:

























Can this copyField directive be an issue? Should it be the other way round,
or does it matter?

Thanks,
Sandeep





On 4 April 2013 10:38, Sandeep Mestry  wrote:

> Hi Jan,
>
> Thanks for your reply. I have defined string_ci like below:
>
>  omitNorms="true" compressThreshold="10">
> 
> 
> 
> 
> 
>
> When I analyse the query in solr, I saw that document containing
> pg_series_title_ci:"funny"  matches when I do a search for
> pg_series_title_ci:"funny games" and is ranked higher than the document
> containing the exact matches. I can use the default string data type but
> then the match will be on exact casing.
>
> Thanks,
> Sandeep
>
>
> On 3 April 2013 22:20, Jan Høydahl  wrote:
>
>> Can you show us your *_ci field type? Solr does not really have a way to
>> tell whether a match is "exact" or only partial, but you could hack around
>> it with the fieldType. See https://github.com/cominvent/exactmatch for a
>> possible solution.
>>
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> Solr Training - www.solrtraining.com
>>
>> 3. apr. 2013 kl. 15:55 skrev Sandeep Mestry :
>>
>> > Hi All,
>> >
>> > I have a requirement where in exact matches for 2 fields (Series Title,
>> > Title) should be ranked higher than the partial matches. The
>> configuration
>> > looks like below:
>> >
>> > 
>> >
>> >edismax
>> >explicit
>> >0.01
>> >*pg_series_title_ci*^500 *title_ci*^300 *
>> > pg_series_title*^200 *title*^25 classifications^15
>> classifications_texts^15
>> > parent_classifications^10 synonym_classifications^5 pg_brand_title^5
>> > pg_series_working_title^5 p_programme_title^5 p_item_title^5
>> > p_interstitial_title^5 description^15 pg_series_description
>> annotations^0.1
>> > classification_notes^0.05 pv_program_version_number^2
>> > pv_program_version_number_ci^2 pv_program_number^2
>> pv_program_number_ci^2
>> > p_program_number^2 ma_version_number^2 ma_recording_location
>> > ma_contributions^0.001 rel_pg_series_title rel_programme_title
>> > rel_programme_number rel_programme_number_ci pg_uuid^0.5 p_uuid^0.5
>> > pv_uuid^0.5 ma_uuid^0.5
>> >pg_series_title_ci^500 title_ci^500
>> >0
>> >*:*
>> >100%
>> >AND
>> >true
>> >-1
>> >1
>> >
>> >
>> >
>> > As you can see above, the search is against many fields. What I'd want
>> is
>> > the documents that have exact matches for series title and title fields
>> > should rank higher than the rest.
>> >
>> > I have added 2 case insensitive (*pg_series_title_ci, title_ci*) fields
>> for
>> > series title and title and have boosted them higher over the tokenized
>> and
>> > rest of the fields. I have also implemented a similarity class to
>> override
>> > idf however I still get documents having partial matches in title and
>> other
>> > fields ranking higher than exact match in pg_series_title_ci.
>> >
>> > Many Thanks,
>> > Sandeep
>>
>>
>


Re: Exact matching in Solr 3.6.1

2013-04-25 Thread Sandeep Mestry
Hi Pawel,

Not sure which parser you are using, I am using edismax and tried using the
bq parameter to boost the results having exact matches at the top.
You may try something like:
q="cats" AND London NOT Leeds&bq="cats"^50

In edismax, pf and pf2 parameters also need some tuning to get the results
at the top.

HTH,
Sandeep


On 25 April 2013 10:33, vsl  wrote:

> Hi,
>  is it possible to get exact matched result if the search term is combined
> e.g. "cats" AND London NOT Leeds
>
>
> In the previus threads I have read that it is possible to create new field
> of String type and perform phrase search on it but nowhere the above
> mentioned combined search term had been taken into consideration.
>
> BR
> Pawel
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Exact-matching-in-Solr-3-6-1-tp4058865.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Exact matching in Solr 3.6.1

2013-04-25 Thread Sandeep Mestry
I think in that case, making the field a String type is your option; however,
remember that it'd be case sensitive.
Another approach is to create a case insensitive field type and doing
searches on those fields only.


   





Can you provide your fields and dismax config and if possible records you
would like and records you do not want?

-S
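
A case-insensitive exact-match field type of the kind suggested above is
commonly defined along the following lines (a sketch; the type name is
arbitrary and the original definition was not preserved in this thread):

```xml
<!-- Sketch of a case-insensitive exact-match type: the whole field value
     is kept as a single token and only lowercased and trimmed, so
     "Funny Games" matches the query "funny games" but not "funny" alone. -->
<fieldType name="string_ci" class="solr.TextField"
           sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>
```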


On 25 April 2013 11:50, vsl  wrote:

> Thanks for your reply. I am using edismax as well. What I want to get is
> the
> exact match without other results that could be close to the given term.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Exact-matching-in-Solr-3-6-1-tp4058865p4058876.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Exact matching in Solr 3.6.1

2013-04-25 Thread Sandeep Mestry
Agree with Jack.

The current field type text_general is designed to match the query tokens
instead of exact matches - so it's not able to fulfill your requirements.

Can you use a flat file as the spell check dictionary instead? That way you
can search on the exact-matched field while generating spell check
suggestions from the file instead of from the index.

-S


On 25 April 2013 16:25, Jack Krupansky  wrote:

> Well then just do an exact match ONLY!
>
> It sounds like you haven't worked out the inconsistencies in your
> requirements.
>
> To be clear: We're not offering you "solutions" - that's your job. We're
> only pointing out tools that you can use. It is up to you to utilize the
> tools wisely to implement your solution.
>
> I suspect that you simply haven't experimented enough with various boosts
> to assure that the unstemmed result is consistently higher.
>
> Maybe you need a custom stemmer or stemmer overide so that "passengers"
> does get stemmed to "passenger", but "cats" does not (but "dogs" does.)
> That can be a choice that you can make, but I would urge caution. Still, it
> is a decision that you can make - it's not a matter of Solr forcing or
> preventing you. I still think boosting of an unstemmed field should be
> sufficient.
>
> But until you clarify the inconsistencies in your requirements, we won't
> be able to make much progress.
>
>
> -- Jack Krupansky
>
> -Original Message- From: vsl
> Sent: Thursday, April 25, 2013 10:45 AM
>
> To: solr-user@lucene.apache.org
> Subject: Re: Exact matching in Solr 3.6.1
>
> Thanks for your reply but this solution does not fullfil my requirment
> because other documents (not exact matched) will be returned as well.
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Exact-matching-in-Solr-3-6-1-tp4058865p4058929.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Exact and Partial Matches

2013-04-29 Thread Sandeep Mestry
Dear Experts,

I have a requirement for the exact matches and applying alphabetical
sorting thereafter.

To illustrate, exact matches should be listed first, with the remaining
results sorted alphabetically.

So, if there are 5 documents as below

Doc1
title: trees

Doc 2
title: plum trees

Doc 3
title: Money Trees (Legendary Trees)

Doc 4
title: Cork Trees

Doc 5
title: Old Trees

Then, if user searches with query term as 'trees', the results should be in
following order:

Doc 1 trees - Highest Rank
Doc 4 Cork Trees - Alphabetical afterwards..
Doc 3 Money Trees (Legendary Trees)
Doc 5 Old Trees
Doc 2 plum trees

I can achieve the alphabetical sorting by adding the title sort parameter.
However, Solr relevancy is higher for Doc 3 (due to matches in 2 terms), so
it arranges Doc 3 above Docs 4, 5 and 2, and the order looks like:

Doc 1 trees - Highest Rank
Doc 3 Money Trees (Legendary Trees)
Doc 4 Cork Trees - Alphabetical afterwards..
Doc 5 Old Trees
Doc 2 plum trees

Can you tell me an easy way to achieve this requirement please?

I'm using Solr 4.0 and the *title *field is defined as follows:
















Many Thanks in advance,
Sandeep


Custom sorting of Solr Results

2013-04-30 Thread Sandeep Mestry
Dear Experts,

>
> I have a requirement for the exact matches and applying alphabetical
> sorting thereafter.
>
> To illustrate, the results should be sorted in exact matches and all later
> alphabetical.
>
> So, if there are 5 documents as below
>
> Doc1
> title: trees
>
> Doc 2
> title: plum trees
>
> Doc 3
> title: Money Trees (Legendary Trees)
>
> Doc 4
> title: Cork Trees
>
> Doc 5
> title: Old Trees
>
> Then, if user searches with query term as 'trees', the results should be
> in following order:
>
> Doc 1 trees - Highest Rank
> Doc 4 Cork Trees - Alphabetical afterwards..
> Doc 3 Money Trees (Legendary Trees)
> Doc 5 Old Trees
> Doc 2 plum trees
>
> I can achieve the alphabetical sorting by adding the title sort parameter, 
> However,
> Solr relevancy is higher for Doc 3 (due to matches in 2 terms and so it 
> arranges
> Doc 3 above Doc 4, 5 and 2).
> So, it looks like:
>
> Doc 1 trees - Highest Rank
> Doc 3 Money Trees (Legendary Trees)
> Doc 4 Cork Trees - Alphabetical afterwards..
> Doc 5 Old Trees
> Doc 2 plum trees
>
> Can you tell me an easy way to achieve this requirement please?
>
> I'm using Solr 4.0 and the *title *field is defined as follows:
>
>  positionIncrementGap="100" >
> 
> 
>  stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
> catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
> splitOnNumerics="0" preserveOriginal="1" />
> 
> 
> 
>     
>  stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
> catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
> splitOnNumerics="0" preserveOriginal="1" />
> 
> 
> 
>
>
>
> Many Thanks in advance,
> Sandeep
>


Re: Exact and Partial Matches

2013-04-30 Thread Sandeep Mestry
Thanks Erick,

I tried grouping and it appears to work okay. However, I will need to
change the client to parse the output.

&fq=title:(tree)&group=true&group.query=title:(trees) NOT
title_ci:trees&group.query=title_ci:blair&group.sort=title_sort
desc&sort=score desc,title_sort asc

I used the actual query as the filter query so my scores will be 1, and then
used 2 group queries: one that gives me the exact matches and another that
gives me the partial matches minus the exact ones.
I have tried this with operators too and it seems to be doing the job I
want, do you see any issue in this?

Thanks again for your reply and by the way thanks for SOLR-4662.

-S


On 30 April 2013 15:06, Erick Erickson  wrote:

> I don't think you can do that. You're essentially
> trying to mix ordering of the result set. You
> _might_ be able to kludge some of this with
> grouping, but I doubt it.
>
> You'll need two queries I'd guess.
>
> Best
> Erick
>
> On Mon, Apr 29, 2013 at 9:44 AM, Sandeep Mestry 
> wrote:
> > Dear Experts,
> >
> > I have a requirement for the exact matches and applying alphabetical
> > sorting thereafter.
> >
> > To illustrate, the results should be sorted in exact matches and all
> later
> > alphabetical.
> >
> > So, if there are 5 documents as below
> >
> > Doc1
> > title: trees
> >
> > Doc 2
> > title: plum trees
> >
> > Doc 3
> > title: Money Trees (Legendary Trees)
> >
> > Doc 4
> > title: Cork Trees
> >
> > Doc 5
> > title: Old Trees
> >
> > Then, if user searches with query term as 'trees', the results should be
> in
> > following order:
> >
> > Doc 1 trees - Highest Rank
> > Doc 4 Cork Trees - Alphabetical afterwards..
> > Doc 3 Money Trees (Legendary Trees)
> > Doc 5 Old Trees
> > Doc 2 plum trees
> >
> > I can achieve the alphabetical sorting by adding the title sort
> > parameter, However,
> > Solr relevancy is higher for Doc 3 (due to matches in 2 terms and so
> > it arranges
> > Doc 3 above Doc 4, 5 and 2).
> > So, it looks like:
> >
> > Doc 1 trees - Highest Rank
> > Doc 3 Money Trees (Legendary Trees)
> > Doc 4 Cork Trees - Alphabetical afterwards..
> > Doc 5 Old Trees
> > Doc 2 plum trees
> >
> > Can you tell me an easy way to achieve this requirement please?
> >
> > I'm using Solr 4.0 and the *title *field is defined as follows:
> >
> >  positionIncrementGap="100"
> >>
> > 
> > 
> >  > stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
> > catenateWords="1" catenateNumbers="1" catenateAll="1"
> splitOnCaseChange="1"
> > splitOnNumerics="0" preserveOriginal="1" />
> > 
> > 
> > 
> > 
> >  > stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
> > catenateWords="1" catenateNumbers="1" catenateAll="1"
> splitOnCaseChange="1"
> > splitOnNumerics="0" preserveOriginal="1" />
> > 
> > 
> > 
> >
> >
> >
> > Many Thanks in advance,
> > Sandeep
>


Re: commit in solr4 takes a longer time

2013-05-02 Thread Sandeep Mestry
Hi Vicky,

I faced this issue as well, and after some playing around I found the
autowarmCount on the caches to be the problem.
I changed it from a fixed count (3072) to a percentage (10%) and all commit
times were stable from then onwards.





HTH,
Sandeep
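
For reference, the solrconfig.xml change described is along these lines
(the sizes and cache classes here are illustrative, not the original values):

```xml
<!-- Illustrative cache settings: autowarmCount given as a percentage of
     the current cache contents instead of a fixed count, which bounds the
     warming work done when a new searcher is opened on commit. -->
<filterCache class="solr.FastLRUCache" size="4096"
             initialSize="1024" autowarmCount="10%"/>
<queryResultCache class="solr.LRUCache" size="4096"
                  initialSize="1024" autowarmCount="10%"/>
<documentCache class="solr.LRUCache" size="4096"
               initialSize="1024" autowarmCount="0"/>
```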


On 2 May 2013 16:31, Alexandre Rafalovitch  wrote:

> If you don't re-open the searcher, you will not see new changes. So,
> if you only have hard commit, you never see those changes (until
> restart). But if you also have soft commit enabled, that will re-open
> your searcher for you.
>
> Regards,
>Alex.
> Personal blog: http://blog.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all
> at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
> book)
>
>
> On Thu, May 2, 2013 at 11:21 AM, Furkan KAMACI 
> wrote:
> > What happens exactly when you don't open searcher at commit?
> >
> > 2013/5/2 Gopal Patwa 
> >
> >> you might want to added openSearcher=false for hard commit, so hard
> commit
> >> also act like soft commit
> >>
> >>
> >> 5
> >> 30
> >>false
> >> 
> >>
> >>
> >>
> >> On Thu, May 2, 2013 at 12:16 AM, vicky desai  >> >wrote:
> >>
> >> > Hi,
> >> >
> >> > I am using 1 shard and two replicas. Document size is around 6 lakhs
> >> >
> >> >
> >> > My solrconfig.xml is as follows
> >> > 
> >> > 
> >> > LUCENE_40
> >> > 
> >> >
> >> >
> >> > 2147483647
> >> > simple
> >> > true
> >> > 
> >> > 
> >> > 
> >> > 500
> >> > 1000
> >> > 
> >> > 
> >> > 5
> >> > 30
> >> > 
> >> > 
> >> >
> >> > 
> >> >  >> > multipartUploadLimitInKB="204800" />
> >> > 
> >> >
> >> >  >> class="solr.StandardRequestHandler"
> >> > default="true" />
> >> >  class="solr.UpdateRequestHandler"
> >> />
> >> >  >> > class="org.apache.solr.handler.admin.AdminHandlers" />
> >> >  >> > class="solr.ReplicationHandler" />
> >> >  >> > class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}" />
> >> > true
> >> > 
> >> > *:*
> >> > 
> >> > 
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > View this message in context:
> >> >
> >>
> http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060402.html
> >> > Sent from the Solr - User mailing list archive at Nabble.com.
> >> >
> >>
>


Re: commit in solr4 takes a longer time

2013-05-03 Thread Sandeep Mestry
That's not ideal.
Can you post solrconfig.xml?
On 3 May 2013 07:41, "vicky desai"  wrote:

> Hi sandeep,
>
> I made the changes you mentioned and tested again for the same set of docs
> but
> unfortunately the commit time increased.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/commit-in-solr4-takes-a-longer-time-tp4060396p4060622.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Solr Sorting Algorithm

2013-05-13 Thread Sandeep Mestry
Good Morning All,

The alphabetical sorting is causing slight issues as below:

I have 3 documents with title value as below:

1) "Acer Palmatum (Tree)"
2) "Aceraceae (Tree Family)"
3) "Acer Pseudoplatanus (Tree)"

I have created title_sort field which is defined with field type as
alphaNumericalSort (that comes with solr example schema)

When I apply the sort order (sort=title_sort asc), I get the results as:

"Aceraceae (Tree Family)"
"Acer Palmatum (Tree)"
"Acer Pseudoplatanus (Tree)"

But, the expected order is (spaces first),

"Acer Palmatum (Tree)"
"Acer Pseudoplatanus (Tree)"
"Aceraceae (Tree Family)"

My unit test contains a Collections.sort call and I get the expected
results, but I'm not sure why Solr is doing it a different way.

From the Collections.sort API, I can see that it uses a modified merge sort.
Could you tell me which algorithm Solr follows for its sorting logic, and
also whether there is any other approach I can take?

Many Thanks,
Sandeep
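
For what it's worth, the difference is almost certainly not the sort
algorithm but the analysis applied to title_sort: an alphaOnlySort-style
type strips everything that is not a letter, including spaces, so
"Acer Palmatum" becomes "acerpalmatum" and sorts after "aceraceae". The
effect can be reproduced without Solr at all (a sketch; the analyzed()
method only mimics what such an analyzer does, it is not Solr code):

```java
import java.util.ArrayList;
import java.util.List;

public class SortDemo {
    // Mimics an alphaOnlySort-style analyzer: lowercase, strip non-letters.
    static String analyzed(String s) {
        return s.toLowerCase().replaceAll("[^a-z]", "");
    }

    public static void main(String[] args) {
        List<String> raw = new ArrayList<>(List.of(
            "Acer Palmatum (Tree)",
            "Aceraceae (Tree Family)",
            "Acer Pseudoplatanus (Tree)"));

        // Sorting the raw titles: the space (0x20) collates before any
        // letter, so both "Acer ..." titles come before "Aceraceae ...".
        List<String> bySpace = new ArrayList<>(raw);
        bySpace.sort((a, b) -> a.toLowerCase().compareTo(b.toLowerCase()));
        System.out.println(bySpace);

        // Sorting the analyzed keys: with spaces stripped,
        // "acerpalmatum..." > "aceraceae...", so the order flips --
        // exactly the order reported from Solr.
        List<String> byAnalyzed = new ArrayList<>(raw);
        byAnalyzed.sort((a, b) -> analyzed(a).compareTo(analyzed(b)));
        System.out.println(byAnalyzed);
    }
}
```

If this matches your schema, one option is a sort type whose analyzer keeps
spaces (e.g. a pattern like "[^a-z ]" instead of "[^a-z]"), so the expected
space-first collation is preserved.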


Question about Edismax - Solr 4.0

2013-05-16 Thread Sandeep Mestry
-- Edismax and Filter Queries with Commas and Spaces --

Dear Experts,

This appears to be a bug, please suggest if I'm wrong.

If I search with the following filter query,

1) fq=title:(, 10)

- I get no results.
- The debug output does NOT show the section containing
parsed_filter_queries

if I carry a search with the filter query,

2) fq=title:(,10) - (No space between , and 10)

- I get results and the debug output shows the parsed filter queries
section as,

(titles:(,10))
(collection:assets)

As you can see above, I'm also passing in other filter queries
(collection:assets) which appear correctly but they do not appear in case 1
above.

I can't make this as part of the query parameter as that needs to be
searched against multiple fields.

Can someone suggest a fix in this case please. I'm using Solr 4.0.

Many Thanks,
Sandeep


Re: Question about Edismax - Solr 4.0

2013-05-16 Thread Sandeep Mestry
Thanks Jack for your reply..

The problem is, I'm finding results for fq=title:(,10) but not for
fq=title:(, 10) - apologies if that was not clear from my first mail.
I have already mentioned the debug analysis in my previous mail.

Additionally, the title field is defined as below:

 











I have set the catenate options to 1 for all types.
I can understand ',' getting ignored when it is on its own (title:(, 10)),
but:
- Why solr is not searching for 10 in that case just like it did when the
query was (title:(,10))?
- And why other filter queries did not show up (collection:assets) in debug
section?


Thanks,
Sandeep


On 16 May 2013 13:57, Jack Krupansky  wrote:

> You haven't indicated any problem here! What is the symptom that you
> actually think is a problem.
>
> There is no comma operator in any of the Solr query parsers. Comma is just
> another character that may or may not be included or discarded depending on
> the specific field type and analyzer. For example, a white space analyzer
> will keep commas, but the standard analyzer or the word delimiter filter
> will discard them. If "title" were a "string" type, all punctuation would
> be preserved, including commas and spaces (but spaces would need to be
> escaped or the term text enclosed in parentheses.)
>
> Let us know what your symptom is though, first.
>
> I mean, the filter query looks perfectly reasonable from an abstract
> perspective.
>
> -- Jack Krupansky
>
> -Original Message- From: Sandeep Mestry
> Sent: Thursday, May 16, 2013 6:51 AM
> To: solr-user@lucene.apache.org
> Subject: Question about Edismax - Solr 4.0
>
> -- *Edismax and Filter Queries with Commas and spaces* --
>
>
> Dear Experts,
>
> This appears to be a bug, please suggest if I'm wrong.
>
> If I search with the following filter query,
>
> 1) fq=title:(, 10)
>
> - I get no results.
> - The debug output does NOT show the section containing
> parsed_filter_queries
>
> if I carry a search with the filter query,
>
> 2) fq=title:(,10) - (No space between , and 10)
>
> - I get results and the debug output shows the parsed filter queries
> section as,
> 
> (titles:(,10))
> (collection:assets)
>
> As you can see above, I'm also passing in other filter queries
> (collection:assets) which appear correctly but they do not appear in case 1
> above.
>
> I can't make this as part of the query parameter as that needs to be
> searched against multiple fields.
>
> Can someone suggest a fix in this case please. I'm using Solr 4.0.
>
> Many Thanks,
> Sandeep
>


Re: Question about Edismax - Solr 4.0

2013-05-16 Thread Sandeep Mestry
Hi Jack,

Thanks for your response again and for helping me out to get through this.

The URL is definitely encoded for spaces and it looks like below. As I
mentioned in my previous mail, I can't add it to query parameter as that
searches on multiple fields.

q=countryside&rows=20&qt=assdismax&fq=%28title%3A%28,10%29%29&fq=collection:assets

The request handler is configured as below:



edismax
explicit
0.01
title^10 description^5 annotations^3 notes^2 categories
title
0
*:*
*,score
100%
AND
score desc
true
-1
1
uniq_subtype_id
component_type
genre_type


collection:assets



The term 'countryside' needs to be searched against multiple fields
including titles, descriptions, annotations, categories, notes but the UI
also has a feature to limit results by providing a title field.


I can see that the filter queries are always parsed by LuceneQueryParser
however I'd expect it to generate the parsed_filter_queries debug output in
every situation.

I have tried it as the main query with both edismax and lucene defType and
it gives me correct output and correct results.
But there is some problem when this is used as a filter query, as the
parser is not able to parse a comma followed by a space.

Thanks again Jack, please let me know in case you need more inputs from my
side.

Best Regards,
Sandeep

On 16 May 2013 18:03, Jack Krupansky  wrote:

> Could you show us the full query URL - spaces must be encoded in URL query
> parameters.
>
> Also show the actual field XML - you omitted that.
>
> Try the same query as a main query, using both defType=edismax and
> defType=lucene.
>
> Note that the filter query is parsed using the Lucene query parser, not
> edismax, independent of the defType parameter. But you don't have any
> edismax features in your fq anyway.
>
> But you can stick {!edismax} in front of the query to force edismax to be
> used for the fq, although it really shouldn't change anything:
>
> Also, catenate is fine for indexing, but will mess up your queries at
> query time, so set them to "0" in the query analyzer
>
> Also, make sure you have autoGeneratePhraseQueries="true" on the field
> type, but that's not the issue here.
>
>
> -- Jack Krupansky
>
> -Original Message- From: Sandeep Mestry
> Sent: Thursday, May 16, 2013 12:42 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Question about Edismax - Solr 4.0
>
>
> Thanks Jack for your reply..
>
> The problem is, I'm finding results for fq=title:(,10) but not for
> fq=title:(, 10) - apologies if that was not clear from my first mail.
> I have already mentioned the debug analysis in my previous mail.
>
> Additionally, the title field is defined as below:
> 
>>
>>  
>
> stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
> catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
> splitOnNumerics="0" preserveOriginal="1" />
>
>
>
>
> stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
> catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
> splitOnNumerics="0" preserveOriginal="1" />
>
>
>
>
> I have the set catenate options to 1 for all types.
> I can understand if ',' getting ignored when it is on its own (title:(,
> 10)) but
> - Why solr is not searching for 10 in that case just like it did when the
> query was (title:(,10))?
> - And why other filter queries did not show up (collection:assets) in debug
> section?
>
>
> Thanks,
> Sandeep
>
>
> On 16 May 2013 13:57, Jack Krupansky  wrote:
>
>  You haven't indicated any problem here! What is the symptom that you
>> actually think is a problem.
>>
>> There is no comma operator in any of the Solr query parsers. Comma is just
>> another character that may or may not be included or discarded depending
>> on
>> the specific field type and analyzer. For example, a white space analyzer
>> will keep commas, but the standard analyzer or the word delimiter filter
>> will discard them. If "title" were a "string" type, all punctuation would
>> be preserved, including commas and spaces (but spaces would need to be
>> escaped or the term text enclosed in parentheses.)
>>
>> Let us know what your symptom is though, first.
>>
>> I mean, the filter query looks perfectly reasonable from an abstr

Re: Question about Edismax - Solr 4.0

2013-05-17 Thread Sandeep Mestry
Hello Jack,

Thanks for pointing the issues out and for your valuable suggestion. My
preliminary tests were okay on search but I will be doing more testing to
see if this has impacted any other searches.

Thanks once again and have a nice sunny weekend,
Sandeep
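For reference, the working field type Jack describes ends up with asymmetric word-delimiter settings - catenation and preserveOriginal at index time only, switched off at query time. A sketch (the field-type name and tokenizer class are assumptions; the filter attributes are the ones discussed in this thread):

```xml
<fieldType name="text_title" class="solr.TextField" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- index time: keep catenated variants and the original token -->
    <filter class="solr.WordDelimiterFilterFactory"
            stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="1"
            splitOnCaseChange="1" splitOnNumerics="0" preserveOriginal="1"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- query time: no catenation, no preserved original -->
    <filter class="solr.WordDelimiterFilterFactory"
            stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            splitOnCaseChange="1" splitOnNumerics="0" preserveOriginal="0"/>
  </analyzer>
</fieldType>
```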


On 17 May 2013 05:35, Jack Krupansky  wrote:

> Ah... I think your issue is the preserveOriginal=1 on the query analyzer
> as well as the fact that you have all of these catenatexx="1" options on
> the query analyzer - I indicated that you should remove them all.
>
> The problem is that the whitespace analyzer leaves the leading comma in
> place, and the preserveOriginal="1" also generates an extra token for the
> term, with the comma in place . But, with the space, the comma and "10" are
> separate terms and get analyzed independently.
>
> The query results probably indicate that you don't have that exact
> combination of the term and leading punctuation - or that there is no
> standalone comma in your input data.
>
> Try the following replacement for the query-time WDF:
>
> <filter class="solr.WordDelimiterFilterFactory"
>     stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
>     catenateWords="0" catenateNumbers="0" catenateAll="0"
>     splitOnCaseChange="1" splitOnNumerics="0" preserveOriginal="0" />
>
>
> -- Jack Krupansky
>
> -Original Message- From: Sandeep Mestry
> Sent: Thursday, May 16, 2013 5:50 PM
>
> To: solr-user@lucene.apache.org
> Subject: Re: Question about Edismax - Solr 4.0
>
> Hi Jack,
>
> Thanks for your response again and for helping me out to get through this.
>
> The URL is definitely encoded for spaces and it looks like below. As I
> mentioned in my previous mail, I can't add it to query parameter as that
> searches on multiple fields.
>
> The title field is defined as below:
> <field name="title" ... multiValued="true"/>
>
> q=countryside&rows=20&qt=assdismax&fq=%28title%3A%28,10%29%29&fq=collection:assets
>
> 
> 
> edismax
> explicit
> 0.01
> title^10 description^5 annotations^3 notes^2
> categories
> title
> 0
> *:*
> *,score
> 100%
> AND
> score desc
> true
> -1
> 1
uniq_subtype_id
component_type
genre_type
> 
> 
> collection:assets
> 
> 
>
> The term 'countryside' needs to be searched against multiple fields
> including titles, descriptions, annotations, categories, notes but the UI
> also has a feature to limit results by providing a title field.
>
>
> I can see that the filter queries are always parsed by LuceneQueryParser
> however I'd expect it to generate the parsed_filter_queries debug output in
> every situation.
>
> I have tried it as the main query with both edismax and lucene defType and
> it gives me correct output and correct results.
> But, there is some problem when this is used as a filter query as the the
> parser is not able to parse a comma with a space.
>
> Thanks again Jack, please let me know in case you need more inputs from my
> side.
>
> Best Regards,
> Sandeep
>
> On 16 May 2013 18:03, Jack Krupansky  wrote:
>
>  Could you show us the full query URL - spaces must be encoded in URL query
>> parameters.
>>
>> Also show the actual field XML - you omitted that.
>>
>> Try the same query as a main query, using both defType=edismax and
>> defType=lucene.
>>
>> Note that the filter query is parsed using the Lucene query parser, not
>> edismax, independent of the defType parameter. But you don't have any
>> edismax features in your fq anyway.
>>
>> But you can stick {!edismax} in front of the query to force edismax to be
>> used for the fq, although it really shouldn't change anything:
>>
>> Also, catenate is fine for indexing, but will mess up your queries at
>> query time, so set them to "0" in the query analyzer
>>
>> Also, make sure you have autoGeneratePhraseQueries="true" on the
>> field
>>
>> type, but that's not the issue here.
>>
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Sandeep Mestry
>> Sent: Thursday, May 16, 2013 12:42 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Question about Edismax - Solr 4.0
>>
>>
>> Thanks Jack for your reply..
>>
>> The problem is, I'm finding results for fq=title:(,10) but not for
>> fq=title:(, 10) - apologies if that was not clear from my first mail.
>> I have already mentioned the debug analysis in my previous mail.
>>
>> Additionally,

Highlight only when all keywords match

2013-05-20 Thread Sandeep Mestry
Dear All,

I have a requirement to highlight a field only when all keywords entered
match. This also needs to support phrase, operator or wildcard queries.
I'm using Solr 4.0 with edismax because the search needs to be carried out
on multiple fields.
I know that with the highlighting feature I can configure a field to indicate
a match, but I cannot find a setting to highlight only if all keywords match.
That makes me wonder whether this is the right approach to take. Can you
please guide me in the right direction?

The edismax config looks like below:



edismax
explicit
0.01
title^10 description^5 annotations^3 notes^2 categories
title
0
*:*
*,score
100%
AND
score desc
true
-1
1
uniq_subtype_id
component_type
genre_type


collection:assets



If I search for 'countryside number 10', then the 'annotations' field should
be highlighted only if it contains all of the entered search terms. Any
document containing just one or two of the terms is not a match.

Thanks,
Sandeep
(p.s: I haven't enabled the highlighting feature yet on this config and
will be doing so only if that will fulfil the requirement I have mentioned
above.)
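For reference, enabling highlighting on that handler would mean adding defaults along these lines. The parameter values here are illustrative, and note that none of these standard options by itself enforces the all-terms condition described in this thread:

```xml
<str name="hl">true</str>
<str name="hl.fl">annotations</str>
<!-- only highlight terms that actually queried this field -->
<str name="hl.requireFieldMatch">true</str>
<str name="hl.snippets">3</str>
```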


Re: Highlight only when all keywords match

2013-05-20 Thread Sandeep Mestry
Hi Jaideep,

The edismax config I posted specifies AND as the default operator. I am
sorry if I was not clear in my previous mail; what I really need is to
highlight a field only when all search query terms are present. The current
highlighter produces a highlight when *any* of the terms match, not when
*all* terms match.

Thanks,
Sandeep


On 20 May 2013 11:40, Jaideep Dhok  wrote:

> Sandeep,
> If you AND all keywords, that should be OK?
>
> Thanks
> Jaideep
>
>
> On Mon, May 20, 2013 at 3:44 PM, Sandeep Mestry 
> wrote:
>
> > Dear All,
> >
> > I have a requirement to highlight a field only when all keywords entered
> > match. This also needs to support phrase, operator or wildcard queries.
> > I'm using Solr 4.0 with edismax because the search needs to be carried
> out
> > on multiple fields.
> > I know with highlighting feature I can configure a field to indicate a
> > match, however I do not find a setting to highlight only if all keywords
> > match. That makes me think is that the right approach to take? Can you
> > please guide me in right direction?
> >
> > The edsimax config looks like below:
> >
> > 
> > 
> > edismax
> > explicit
> > 0.01
> > title^10 description^5 annotations^3 notes^2
> > categories
> > title
> > 0
> > *:*
> > *,score
> > 100%
> > AND
> > score desc
> > true
> > -1
> > 1
> > uniq_subtype_id
> > component_type
> > genre_type
> > 
> > 
> > collection:assets
> > 
> > 
> >
> > If I search for 'countryside number 10' as the keyword then highlight
> only
> > if the 'annotations' contain all these entered search terms. Any document
> > containing just one or two terms is not a match.
> >
> > Thanks,
> > Sandeep
> > (p.s: I haven't enabled the highlighting feature yet on this config and
> > will be doing so only if that will fulfil the requirement I have
> mentioned
> > above.)
> >
>
> --
> _
> The information contained in this communication is intended solely for the
> use of the individual or entity to whom it is addressed and others
> authorized to receive it. It may contain confidential or legally privileged
> information. If you are not the intended recipient you are hereby notified
> that any disclosure, copying, distribution or taking any action in reliance
> on the contents of this information is strictly prohibited and may be
> unlawful. If you have received this communication in error, please notify
> us immediately by responding to this email and then delete it from your
> system. The firm is neither liable for the proper and complete transmission
> of the information contained in this communication nor for any delay in its
> receipt.
>


Re: Highlight only when all keywords match

2013-05-20 Thread Sandeep Mestry
I doubt that will be the correct approach, as it will be hard to generate
the query grammar given that we support phrase, operator, wildcard and group
queries.
That's why I have kept it simple, passing the query text with minimal
parsing (escaping Lucene special characters) to the configured edismax
handler.
The fields I have mentioned above are far fewer than the actual number of
fields - around 50 in total :-). Forming such a long query would be both
time and resource consuming. Further, it would not fulfil my requirement
anyway, because I do not want to change my search results; the requirement
is only to show a highlight when a field matches all the query terms.

Thanks,
Sandeep


On 20 May 2013 12:02, Jaideep Dhok  wrote:

> If you know all fields that need to be queried, you can rewrite it as -
> (assuming, f1, f2 are the fields that you have to search)
> (f1:kw1 AND f1:kw2 ... f1:kwn) OR (f2:kw1 AND f2:kw2 ... f2:kwn)
>
> -
> Jaideep
>
>
> On Mon, May 20, 2013 at 4:22 PM, Sandeep Mestry 
> wrote:
>
> > Hi Jaideep,
> >
> > The edismax config I have posted mentioned that the default operator is
> > AND. I am sorry if I was not clear in my previous mail, what I need
> really
> > is highlight a field when all search query terms present. The current
> > highlighter works for *any* of the terms match and not for *all* terms
> > match.
> >
> > Thanks,
> > Sandeep
> >
> >
> > On 20 May 2013 11:40, Jaideep Dhok  wrote:
> >
> > > Sandeep,
> > > If you AND all keywords, that should be OK?
> > >
> > > Thanks
> > > Jaideep
> > >
> > >
> > > On Mon, May 20, 2013 at 3:44 PM, Sandeep Mestry 
> > > wrote:
> > >
> > > > Dear All,
> > > >
> > > > I have a requirement to highlight a field only when all keywords
> > entered
> > > > match. This also needs to support phrase, operator or wildcard
> queries.
> > > > I'm using Solr 4.0 with edismax because the search needs to be
> carried
> > > out
> > > > on multiple fields.
> > > > I know with highlighting feature I can configure a field to indicate
> a
> > > > match, however I do not find a setting to highlight only if all
> > keywords
> > > > match. That makes me think is that the right approach to take? Can
> you
> > > > please guide me in right direction?
> > > >
> > > > The edsimax config looks like below:
> > > >
> > > > 
> > > > 
> > > > edismax
> > > > explicit
> > > > 0.01
> > > > title^10 description^5 annotations^3 notes^2
> > > > categories
> > > > title
> > > > 0
> > > > *:*
> > > > *,score
> > > > 100%
> > > > AND
> > > > score desc
> > > > true
> > > > -1
> > > > 1
> > > > uniq_subtype_id
> > > > component_type
> > > > genre_type
> > > > 
> > > > 
> > > > collection:assets
> > > > 
> > > > 
> > > >
> > > > If I search for 'countryside number 10' as the keyword then highlight
> > > only
> > > > if the 'annotations' contain all these entered search terms. Any
> > document
> > > > containing just one or two terms is not a match.
> > > >
> > > > Thanks,
> > > > Sandeep
> > > > (p.s: I haven't enabled the highlighting feature yet on this config
> and
> > > > will be doing so only if that will fulfil the requirement I have
> > > mentioned
> > > > above.)
> > > >
> > >
> > > --
> > > _
> > > The information contained in this communication is intended solely for
> > the
> > > use of the individual or entity to whom it is addressed and others
> > > authorized to receive it. It may contain confidential or legally
> > privileged
> > > information. If you are not the intended recipient you are hereby
> > notified
> > > that any disclosure, copying, distribution or taking any action in
> > reliance
> > > on the contents of this information is strictly prohibited and may be
> > > unlawful. If you have received this communication in error, please
> notify
> > > us immediately by responding to this email and then delete it from your
> > > system. The firm is neither liable for the proper and complete
> > transmission
> > > of the information contained in this communication nor for any delay in
> > its
> > > receipt.
> > >
> >
>
> --
> _
> The information contained in this communication is intended solely for the
> use of the individual or entity to whom it is addressed and others
> authorized to receive it. It may contain confidential or legally privileged
> information. If you are not the intended recipient you are hereby notified
> that any disclosure, copying, distribution or taking any action in reliance
> on the contents of this information is strictly prohibited and may be
> unlawful. If you have received this communication in error, please notify
> us immediately by responding to this email and then delete it from your
> system. The firm is neither liable for the proper and complete transmission
> of the information contained in this communication nor for any delay in its
> receipt.
>


Re: Highlight only when all keywords match

2013-05-20 Thread Sandeep Mestry
Thanks Upayavira for that valuable suggestion.

I believe overriding the highlight component should be the way forward.
Could you tell me if there is an existing example, or which methods in
particular I should override?

Thanks,
Sandeep
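Independent of where the check ends up living (a HighlightComponent subclass or client-side post-processing), the core decision - suppress the highlight unless every query term occurs in the field - can be sketched as a small standalone routine. This is not Solr API; it is a simplified illustration (whitespace tokenization only, no phrase/wildcard handling):

```java
import java.util.Arrays;

// Standalone sketch: report a match only when EVERY query term occurs in
// the field value, which is the condition Sandeep wants before emitting a
// highlight. Purely illustrative, not part of Solr.
public class AllTermsMatch {

    static boolean allTermsMatch(String query, String fieldValue) {
        String value = fieldValue.toLowerCase();
        // Every whitespace-separated term must occur in the field value.
        return Arrays.stream(query.trim().toLowerCase().split("\\s+"))
                     .allMatch(value::contains);
    }

    public static void main(String[] args) {
        System.out.println(allTermsMatch("countryside number 10",
                "a walk in the countryside near number 10")); // true
        System.out.println(allTermsMatch("countryside number 10",
                "number 10 downing street"));                 // false
    }
}
```

A real implementation inside Solr would reuse the parsed query's extracted terms rather than re-tokenizing the raw query string.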


On 20 May 2013 12:47, Upayavira  wrote:

> If you are saying that you want to change highlighting behaviour, not
> query behaviour, then I suspect you are going to have to interact with
> the java HighlightComponent. If you can work out how to update that
> component to behave as you wish, you could either subclass it, or create
> your own implementation that you can include in your Solr setup. Or, if
> you make it generic enough, offer it back as a contribution that can be
> included in future Solr releases.
>
> Upayavira
>
> On Mon, May 20, 2013, at 12:14 PM, Sandeep Mestry wrote:
> > I doubt if that will be the correct approach as it will be hard to
> > generate
> > the query grammar considering we have support for phrase, operator,
> > wildcard and group queries.
> > That's why I have kept it simple and only passing the query text with
> > minimal parsing (escaping lucene special characters) to configured
> > edismax.
> > The number of fields I have mentioned above are a lot lesser than the
> > actual number of fields - around 50 in number :-). So forming such a long
> > query will both be time and resource consuming. Further, it's not going
> > to
> > fulfill my requirement anyway because I do not want to change my search
> > results, the requirement is only to provide a highlight if a field is
> > matched for all the query terms.
> >
> > Thanks,
> > Sandeep
> >
> >
> > On 20 May 2013 12:02, Jaideep Dhok  wrote:
> >
> > > If you know all fields that need to be queried, you can rewrite it as -
> > > (assuming, f1, f2 are the fields that you have to search)
> > > (f1:kw1 AND f1:kw2 ... f1:kwn) OR (f2:kw1 AND f2:kw2 ... f2:kwn)
> > >
> > > -
> > > Jaideep
> > >
> > >
> > > On Mon, May 20, 2013 at 4:22 PM, Sandeep Mestry 
> > > wrote:
> > >
> > > > Hi Jaideep,
> > > >
> > > > The edismax config I have posted mentioned that the default operator
> is
> > > > AND. I am sorry if I was not clear in my previous mail, what I need
> > > really
> > > > is highlight a field when all search query terms present. The current
> > > > highlighter works for *any* of the terms match and not for *all*
> terms
> > > > match.
> > > >
> > > > Thanks,
> > > > Sandeep
> > > >
> > > >
> > > > On 20 May 2013 11:40, Jaideep Dhok  wrote:
> > > >
> > > > > Sandeep,
> > > > > If you AND all keywords, that should be OK?
> > > > >
> > > > > Thanks
> > > > > Jaideep
> > > > >
> > > > >
> > > > > On Mon, May 20, 2013 at 3:44 PM, Sandeep Mestry <
> sanmes...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Dear All,
> > > > > >
> > > > > > I have a requirement to highlight a field only when all keywords
> > > > entered
> > > > > > match. This also needs to support phrase, operator or wildcard
> > > queries.
> > > > > > I'm using Solr 4.0 with edismax because the search needs to be
> > > carried
> > > > > out
> > > > > > on multiple fields.
> > > > > > I know with highlighting feature I can configure a field to
> indicate
> > > a
> > > > > > match, however I do not find a setting to highlight only if all
> > > > keywords
> > > > > > match. That makes me think is that the right approach to take?
> Can
> > > you
> > > > > > please guide me in right direction?
> > > > > >
> > > > > > The edsimax config looks like below:
> > > > > >
> > > > > > 
> > > > > > 
> > > > > > edismax
> > > > > > explicit
> > > > > > 0.01
> > > > > > title^10 description^5 annotations^3 notes^2
> > > > > > categories
> > > > > > title
> > > > > > 0
> > > > > > *:*
> > > > > > *,score
> > > > > > 100%
> > > > > > AND
> > > > > > score desc
> > > > > > true

Solr 4.0 war startup issue - apache-solr-core.jar Vs solr-core

2013-05-20 Thread Sandeep Mestry
Hi All,

I want to override a component from solr-core, and for that I need the
solr-core jar.

I am using the solr.war that comes from an Apache mirror, and if I open the
war I see the solr-core jar is actually named apache-solr-core.jar. This is
also true of the solrj jar.

If I declare a dependency on apache-solr-core.jar in my module, it cannot be
found in the repository. And if I use solr-core.jar instead, I get a strange
ClassCastException during Solr startup for MorfologikFilterFactory.

(I'm not using this factory at all in my project.)

at
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.lang.ClassCastException: class
org.apache.lucene.analysis.morfologik.MorfologikFilterFactory
at java.lang.Class.asSubclass(Unknown Source)
at org.apache.lucene.util.SPIClassIterator.next(SPIClassIterator.java:126)
at
org.apache.lucene.analysis.util.AnalysisSPILoader.reload(AnalysisSPILoader.java:73)
at
org.apache.lucene.analysis.util.AnalysisSPILoader.(AnalysisSPILoader.java:55)

I tried manually removing the apache-solr-core.jar from the solr
distribution war and then providing the dependency and everything worked
fine.

And I do remember the discussion on the forum about dropping the name
*apache* from solr jars. If that's what caused this issue, then can you
tell me if the mirrors need updating with solr-core.jar instead of
apache-solr-core.jar?

Many Thanks,
Sandeep


Re: Solr 4.0 war startup issue - apache-solr-core.jar Vs solr-core

2013-05-21 Thread Sandeep Mestry
Hi Shawn,

Thanks for your reply.

I'm not mixing versions.
The problem I faced is that I want to override the Highlighter from the
solr-core jar, and if I add that as a dependency in my project there is a
clash between solr-core.jar and the apache-solr-core.jar that comes bundled
within the Solr distribution. It complains with a ClassCastException for
MorfologikFilterFactory.
I can't use apache-solr-core.jar as a dependency as no such jar exists in
any maven repo.

The only thing I could do is to remove apache-solr-core.jar from solr.war
and then use solr-core.jar as a dependency - however I do not think this is
the ideal solution.

Thanks,
Sandeep


On 20 May 2013 15:18, Shawn Heisey  wrote:

> On 5/20/2013 8:01 AM, Sandeep Mestry wrote:
> > And I do remember the discussion on the forum about dropping the name
> > *apache* from solr jars. If that's what caused this issue, then can you
> > tell me if the mirrors need updating with solr-core.jar instead of
> > apache-solr-core.jar?
>
> If it's named apache-solr-core, then it's from 4.0 or earlier.  If it's
> named solr-core, then it's from 4.1 or later.  That might mean that you
> are mixing versions - don't do that.  Make sure that you have jars from
> the exact same version as your server.
>
> Thanks,
> Shawn
>
>


Re: Solr 4.0 war startup issue - apache-solr-core.jar Vs solr-core

2013-05-21 Thread Sandeep Mestry
Hi Steve,

Solr 4.0 - mentioned in the subject.. :-)

Thanks,
Sandeep


On 21 May 2013 14:58, Steve Rowe  wrote:

> Sandeep,
>
> What version of Solr are you using?
>
> Steve
>
> On May 21, 2013, at 6:55 AM, Sandeep Mestry  wrote:
>
> > Hi Shawn,
> >
> > Thanks for your reply.
> >
> > I'm not mixing versions.
> > The problem I faced is I want to override Highlighter from solr-core jar
> > and if I add that as a dependency in my project then there was a clash
> > between solr-core.jar and the apache-solr-core.jar that comes bundled
> > within the solr distribution. It was complaining about
> MorfologikFilterFactory
> > classcastexception.
> > I can't use apache-solr-core.jar as a dependency as no such jar exists in
> > any maven repo.
> >
> > The only thing I could do is to remove apache-solr-core.jar from solr.war
> > and then use solr-core.jar as a dependency - however I do not think this
> is
> > the ideal solution.
> >
> > Thanks,
> > Sandeep
> >
> >
> > On 20 May 2013 15:18, Shawn Heisey  wrote:
> >
> >> On 5/20/2013 8:01 AM, Sandeep Mestry wrote:
> >>> And I do remember the discussion on the forum about dropping the name
> >>> *apache* from solr jars. If that's what caused this issue, then can you
> >>> tell me if the mirrors need updating with solr-core.jar instead of
> >>> apache-solr-core.jar?
> >>
> >> If it's named apache-solr-core, then it's from 4.0 or earlier.  If it's
> >> named solr-core, then it's from 4.1 or later.  That might mean that you
> >> are mixing versions - don't do that.  Make sure that you have jars from
> >> the exact same version as your server.
> >>
> >> Thanks,
> >> Shawn
> >>
> >>
>
>


Re: Solr 4.0 war startup issue - apache-solr-core.jar Vs solr-core

2013-05-21 Thread Sandeep Mestry
Thanks Steve,

I could find solr-core.jar in the repo but could not find
apache-solr-core.jar.
I think my issue got misunderstood - which is totally my fault.

Anyway, I took into account Shawn's comment and will use solr-core.jar only
for compiling the project - not for deploying.

Thanks,
Sandeep
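In Maven terms, that amounts to compiling against solr-core from Maven Central while keeping it out of the deployed webapp - e.g. with `provided` scope. A sketch (adjust the version to match your server exactly):

```xml
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-core</artifactId>
  <version>4.0.0</version>
  <!-- compile against it, but do not bundle it into the war -->
  <scope>provided</scope>
</dependency>
```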


On 21 May 2013 16:46, Steve Rowe  wrote:

> The 4.0 solr-core jar is available in Maven Central: <
> http://search.maven.org/#artifactdetails%7Corg.apache.solr%7Csolr-core%7C4.0.0%7Cjar
> >
>
> Steve
>
> On May 21, 2013, at 11:26 AM, Sandeep Mestry  wrote:
>
> > Hi Steve,
> >
> > Solr 4.0 - mentioned in the subject.. :-)
> >
> > Thanks,
> > Sandeep
> >
> >
> > On 21 May 2013 14:58, Steve Rowe  wrote:
> >
> >> Sandeep,
> >>
> >> What version of Solr are you using?
> >>
> >> Steve
> >>
> >> On May 21, 2013, at 6:55 AM, Sandeep Mestry 
> wrote:
> >>
> >>> Hi Shawn,
> >>>
> >>> Thanks for your reply.
> >>>
> >>> I'm not mixing versions.
> >>> The problem I faced is I want to override Highlighter from solr-core
> jar
> >>> and if I add that as a dependency in my project then there was a clash
> >>> between solr-core.jar and the apache-solr-core.jar that comes bundled
> >>> within the solr distribution. It was complaining about
> >> MorfologikFilterFactory
> >>> classcastexception.
> >>> I can't use apache-solr-core.jar as a dependency as no such jar exists
> in
> >>> any maven repo.
> >>>
> >>> The only thing I could do is to remove apache-solr-core.jar from
> solr.war
> >>> and then use solr-core.jar as a dependency - however I do not think
> this
> >> is
> >>> the ideal solution.
> >>>
> >>> Thanks,
> >>> Sandeep
> >>>
> >>>
> >>> On 20 May 2013 15:18, Shawn Heisey  wrote:
> >>>
> >>>> On 5/20/2013 8:01 AM, Sandeep Mestry wrote:
> >>>>> And I do remember the discussion on the forum about dropping the name
> >>>>> *apache* from solr jars. If that's what caused this issue, then can
> you
> >>>>> tell me if the mirrors need updating with solr-core.jar instead of
> >>>>> apache-solr-core.jar?
> >>>>
> >>>> If it's named apache-solr-core, then it's from 4.0 or earlier.  If
> it's
> >>>> named solr-core, then it's from 4.1 or later.  That might mean that
> you
> >>>> are mixing versions - don't do that.  Make sure that you have jars
> from
> >>>> the exact same version as your server.
> >>>>
> >>>> Thanks,
> >>>> Shawn
> >>>>
> >>>>
> >>
> >>
>
>


Re: Boosting Documents

2013-05-22 Thread Sandeep Mestry
Hi Oussama,

This is explained very nicely on Solr Wiki..
http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts
http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22

All you need to do is something similar to below..

<add>
  <doc boost="2.5">
    <field name="employeeId">05991</field>
    <field name="office" boost="2.0">Bridgewater</field>
  </doc>
</add>

What is not clear from your message is whether you need better scoring or
better sorting. so, additionally, you can consider adding a secondary sort
parameter for the docs having the same score.
http://wiki.apache.org/solr/CommonQueryParameters#sort


HTH,
Sandeep
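To make the secondary-sort suggestion concrete, ties on score can be broken by another field; the secondary field name here is purely illustrative:

```
sort=score desc,title_sort asc
```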


On 22 May 2013 09:21, Oussama Jilal  wrote:

> Thank you for your reply bbarani,
>
> I can't do that because I want to boost some documents over others,
> independent of the query.
>
>
> On 05/21/2013 05:41 PM, bbarani wrote:
>
>>  Why don't you boost during query time?
>>
>> Something like q=superman&qf=title^2 subject
>>
>> You can refer: http://wiki.apache.org/solr/SolrRelevancyFAQ
>>
>>
>>
>> --
>> View this message in context: http://lucene.472066.n3.nabble.com/Boosting-Documents-tp4064955p4064966.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>
>


Re: Boosting Documents

2013-05-22 Thread Sandeep Mestry
I think that is applicable only to field-level boosting, not to
document-level boosting.

Can you post your query, field definition and the results you're expecting?

I am using index- and query-time boosting without any issues so far. Also,
which version of Solr are you using?


On 22 May 2013 10:44, Oussama Jilal  wrote:

> I don't know if this is the issue or not but, concidering this note from
> the wiki :
>
> NOTE: make sure norms are enabled (omitNorms="false" in the schema.xml)
> for any fields where the index-time boost should be stored.
>
> In my case where I only need to boost the whole document (not a specific
> field), do I have to activate the << omitNorms="false" >> for all the
> fields in the schema ?
>
>
>
>
> On 05/22/2013 10:41 AM, Oussama Jilal wrote:
>
>> Thank you Sandeep,
>>
>> I did post the document like that (a minor difference is that I did not
>> add the boost to the field since I don't want to boost on specific field, I
>> boosted the whole document '  '), but the issue
>> is that everything in the queries results has the same score even if they
>> had been indexed with different boosts, and I can't sort on another field
>> since this is independent from any field value.
>>
>> Any ideas ?
>>
>> On 05/22/2013 10:30 AM, Sandeep Mestry wrote:
>>
>>> Hi Oussama,
>>>
>>> This is explained very nicely on Solr Wiki..
>>> http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts
>>> http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22
>>>
>>> All you need to do is something similar to below..
>>>
>>> -
>>>
>>>   05991
>>>Bridgewater 
>>>
>>>
>>> What is not clear from your message is whether you need better scoring or
>>> better sorting. so, additionally, you can consider adding a secondary
>>> sort
>>> parameter for the docs having the same score.
>>> http://wiki.apache.org/solr/CommonQueryParameters#sort
>>>
>>>
>>> HTH,
>>> Sandeep
>>>
>>>
>>> On 22 May 2013 09:21, Oussama Jilal  wrote:
>>>
>>>  Thank you for your reply bbarani,
>>>>
>>>> I can't do that because I want to boost some documents over others,
>>>> independing of the query.
>>>>
>>>>
>>>> On 05/21/2013 05:41 PM, bbarani wrote:
>>>>
>>>>Why don't you boost during query time?
>>>>>
>>>>> Something like q=superman&qf=title^2 subject
>>>>>
>>>>> You can refer: 
>>>>> http://wiki.apache.org/solr/SolrRelevancyFAQ<http://wiki.apache.org/solr/**SolrRelevancyFAQ>
>>>>> <http://wiki.**apache.org/solr/**SolrRelevancyFAQ<http://wiki.apache.org/solr/SolrRelevancyFAQ>
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> View this message in context: http://lucene.472066.n3.nabble.com/Boosting-Documents-tp4064955p4064966.html
>>>>>
>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>
>>>>>
>>>>
>>
>


Re: Boosting Documents

2013-05-22 Thread Sandeep Mestry
Did you use debugQuery=true in the Solr console to see how the query is
being interpreted and how the score is calculated?

Also, I'm not sure, but this copyField directive seems a bit confusing to me:

<copyField source="Id" dest="Suggestion"/>

Because multiValued is false for the Suggestion field, does that schema mean
Suggestion takes its value only from Id and not from any other input?

You haven't mentioned the version of Solr; can you also post the query
params?
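For reference, the kind of debug request being suggested might look like this; the host, port and core name are assumptions:

```
http://localhost:8983/solr/collection1/select?q=Suggestion:"Olive Oil"&debugQuery=true&fl=*,score
```

The debug section of the response shows the parsed query and a per-document score explanation, which is where a missing index-time boost would show up.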



On 22 May 2013 11:04, Oussama Jilal  wrote:

> I don't know if this can help (since the document boost should be
> independent of any schema) but here is my schema :
>
>
>
> 
>   sortMissingLast="true"  />
>   sortMissingLast="true"  precisionStep="0"  positionIncrementGap="0"  />
>   sortMissingLast="true"  omitNorms="true">
> 
>   />
> 
>   maxGramSize="255"  />
> 
> 
>   />
> 
> 
> 
> 
> 
>   stored="true"  multiValued="false"  required="true"  />
>   stored="true"  multiValued="false"  required="false"  />
>   stored="true"  multiValued="false"  required="true"  />
>   stored="true"  multiValued="true"  required="false"  />
>   stored="true"/>
> 
> 
> <uniqueKey>Id</uniqueKey>
> <defaultSearchField>Suggestion</defaultSearchField>
> <copyField source="Id" dest="Suggestion"/>
>
> My query is something like: Suggestion:"Olive Oil".
>
> The result is 9 documents, which all have the same score "11.287682", even
> though they had been indexed with different boosts (I am sure of this).
>
>
>
>
> On 05/22/2013 10:54 AM, Sandeep Mestry wrote:
>
>> I think that is applicable only for the field level boosting and not at
>> document level boosting.
>>
>> Can you post your query, field definition and results you're expecting.
>>
>> I am using index and query time boosting without any issues so far. also
>> which version of Solr you're using?
>>
>>
>> On 22 May 2013 10:44, Oussama Jilal  wrote:
>>
>>> I don't know if this is the issue or not but, considering this note from
>>> the wiki:
>>>
>>> NOTE: make sure norms are enabled (omitNorms="false" in the schema.xml)
>>> for any fields where the index-time boost should be stored.
>>>
>>> In my case where I only need to boost the whole document (not a specific
>>> field), do I have to activate the << omitNorms="false" >> for all the
>>> fields in the schema ?
>>>
>>>
>>>
>>>
>>> On 05/22/2013 10:41 AM, Oussama Jilal wrote:
>>>
>>>  Thank you Sandeep,
>>>>
>>>> I did post the document like that (a minor difference is that I did not
>>>> add the boost to the field since I don't want to boost on specific
>>>> field, I
>>>> boosted the whole document '  '), but the
>>>> issue
>>>> is that everything in the queries results has the same score even if
>>>> they
>>>> had been indexed with different boosts, and I can't sort on another
>>>> field
>>>> since this is independent from any field value.
>>>>
>>>> Any ideas ?
>>>>
>>>> On 05/22/2013 10:30 AM, Sandeep Mestry wrote:
>>>>
>>>>  Hi Oussama,
>>>>>
>>>>> This is explained very nicely on Solr Wiki..
>>>>> http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts
>>>>> http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22
>>>>>
>>>>>
>>>>> All 

Re: Boosting Documents

2013-05-22 Thread Sandeep Mestry
I'm running out of options now; I can't really see the issue you're facing
unless the debug analysis is posted. A thorough debugging session is needed
at both the application and Solr levels.

If you want custom scoring from Solr, you can also consider overriding the
DefaultSimilarity implementation - but that'll be a separate exercise.
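For reference, a whole-document boost travels in the update XML itself. A minimal sketch (the boost value and the Id value are illustrative; the Id and Suggestion field names come from this thread) looks like:

```xml
<add>
  <!-- a boost on <doc> applies to the whole document; it is folded into
       the field norms, so the queried field (Suggestion here) must have
       omitNorms="false" for the boost to survive indexing -->
  <doc boost="2.5">
    <field name="Id">123</field>
    <field name="Suggestion">Olive Oil</field>
  </doc>
</add>
```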


On 22 May 2013 11:32, Oussama Jilal  wrote:

> Yes I did debug it and there is nothing special about it, everything is
> treated the same,
>
> My Solr version is 4.2
>
> The copy field is used because the 2 field are of different types but only
> one value is indexed in them (so no multiValue is required and it works
> perfectly).
>
>
>
>
> On 05/22/2013 11:18 AM, Sandeep Mestry wrote:
>
>> Did you use debugQuery=true in the Solr console to see how the query is
>> being interpreted and how the result score is calculated?
>>
>> Also, I'm not sure but this copyfield directive seems a bit confusing to
>> me..
>> 
>> Because multiValued is false for Suggestion field so does that schema mean
>> Suggestion has value only from Id and not from any other input?
>>
>> You haven't mentioned the version of Solr, can you also post the query
>> params?
>>
>>
>>
>> On 22 May 2013 11:04, Oussama Jilal  wrote:
>>
>>  I don't know if this can help (since the document boost should be
>>> independent of any schema) but here is my schema :
>>>
>>> |
>>> 
>>>  
>>>  >>   sortMissingLast="true"  />
>>>  >>   sortMissingLast="true"  precisionStep="0"  positionIncrementGap="0"  />
>>>  >>   sortMissingLast="true"  omitNorms="true">
>>>  
>>>  >>   />
>>>  >>  />
>>>  >>
>>>   maxGramSize="255"  />
>>>  
>>>  
>>>  >>   />
>>>  >>  />
>>>
>>>  
>>>  
>>>  
>>>  
>>>  >>   stored="true"  multiValued="false"  required="true"  />
>>>  >>   stored="true"  multiValued="false"  required="false"  />
>>>  >>   stored="true"  multiValued="false"  required="true"  />
>>>  >>   stored="true"  multiValued="true"  required="false"  />
>>>  >>   stored="true"/>
>>>  
>>>  
>>>  Id
>>>  Suggestion
>>>
>>> |
>>>
>>> My query is somthing like : Suggestion:"Olive Oil".
>>>
>>> The result is 9 documents, wich all has the same score "11.287682", even
>>> if they had been indexed with different boosts (I am sure of this).
>>>
>>>
>>>
>>>
>>> On 05/22/2013 10:54 AM, Sandeep Mestry wrote:
>>>
>>>  I think that is applicable only for the field level boosting and not at
>>>> document level boosting.
>>>>
>>>> Can you post your query, field definition and results you're expecting.
>>>>
>>>> I am using index and query time boosting without any issues so far. also
>>>> which version of Solr you're using?
>>>>
>>>>
>>>> On 22 May 2013 10:44, Oussama Jilal  wrote:
>>>>
>>>>   I don't know if this is the issue or not but, concidering this note
>>>> from
>>>>
>>>>> the wiki :
>>>>>
>>>>> NOTE: make sure norms are enabled (omitNorms="false" in the schema.xml)
>>>>> for any fields where the index-time boost should be stored.
>>>>>
>>>>> In my case where I only need to boost the whole document (not a
>>>>> specific
>>>>> field), do I have to activate the << omitNorms="false" >> for all the
>>>>> fields in the schema ?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 05/22/2013 10:41 AM, Oussama Ji

Re: Solr 4.0 war startup issue - apache-solr-core.jar Vs solr-core

2013-05-22 Thread Sandeep Mestry
Thanks Erick for your suggestion.

Turns out I won't be going that route after all, as the highlighter
component is quite complicated - to follow and to override - and with not
much time left in hand, I did it the manual (dirty) way.

Best Regards,
Sandeep


On 22 May 2013 12:21, Erick Erickson  wrote:

> Sandeep:
>
> You need to be a little careful here, I second Shawn's comment that
> you are mixing versions. You say you are using solr 4.0. But the jar
> that ships with that is apache-solr-core-4.0.0.jar. Then you talk
> about using solr-core, which is called solr-core-4.1.jar.
>
> Maven is not officially supported, so grabbing some solr-core.jar
> (with no apache) and doing _anything_ with it from a 4.0 code base is
> not a good idea.
>
> You can check out the 4.0 code branch and just compile the whole
> thing. Or you can get a new 4.0 distro and use the jars there. But I'd
> be _really_ cautious about using a 4.1 or later jar with 4.0.
>
> FWIW,
> Erick
>
> On Tue, May 21, 2013 at 12:05 PM, Sandeep Mestry 
> wrote:
> > Thanks Steve,
> >
> > I could find solr-core.jar in the repo but could not find
> > apache-solr-core.jar.
> > I think my issue got misunderstood - which is totally my fault.
> >
> > Anyway, I took into account Shawn's comment and will use solr-core.jar
> only
> > for compiling the project - not for deploying.
> >
> > Thanks,
> > Sandeep
> >
> >
> > On 21 May 2013 16:46, Steve Rowe  wrote:
> >
> >> The 4.0 solr-core jar is available in Maven Central: <
> >>
> http://search.maven.org/#artifactdetails%7Corg.apache.solr%7Csolr-core%7C4.0.0%7Cjar
> >> >
> >>
> >> Steve
> >>
> >> On May 21, 2013, at 11:26 AM, Sandeep Mestry 
> wrote:
> >>
> >> > Hi Steve,
> >> >
> >> > Solr 4.0 - mentioned in the subject.. :-)
> >> >
> >> > Thanks,
> >> > Sandeep
> >> >
> >> >
> >> > On 21 May 2013 14:58, Steve Rowe  wrote:
> >> >
> >> >> Sandeep,
> >> >>
> >> >> What version of Solr are you using?
> >> >>
> >> >> Steve
> >> >>
> >> >> On May 21, 2013, at 6:55 AM, Sandeep Mestry 
> >> wrote:
> >> >>
> >> >>> Hi Shawn,
> >> >>>
> >> >>> Thanks for your reply.
> >> >>>
> >> >>> I'm not mixing versions.
> >> >>> The problem I faced is I want to override Highlighter from solr-core
> >> jar
> >> >>> and if I add that as a dependency in my project then there was a
> clash
> >> >>> between solr-core.jar and the apache-solr-core.jar that comes
> bundled
> >> >>> within the solr distribution. It was complaining about
> >> >> MorfologikFilterFactory
> >> >>> classcastexception.
> >> >>> I can't use apache-solr-core.jar as a dependency as no such jar
> exists
> >> in
> >> >>> any maven repo.
> >> >>>
> >> >>> The only thing I could do is to remove apache-solr-core.jar from
> >> solr.war
> >> >>> and then use solr-core.jar as a dependency - however I do not think
> >> this
> >> >> is
> >> >>> the ideal solution.
> >> >>>
> >> >>> Thanks,
> >> >>> Sandeep
> >> >>>
> >> >>>
> >> >>> On 20 May 2013 15:18, Shawn Heisey  wrote:
> >> >>>
> >> >>>> On 5/20/2013 8:01 AM, Sandeep Mestry wrote:
> >> >>>>> And I do remember the discussion on the forum about dropping the
> name
> >> >>>>> *apache* from solr jars. If that's what caused this issue, then
> can
> >> you
> >> >>>>> tell me if the mirrors need updating with solr-core.jar instead of
> >> >>>>> apache-solr-core.jar?
> >> >>>>
> >> >>>> If it's named apache-solr-core, then it's from 4.0 or earlier.  If
> >> it's
> >> >>>> named solr-core, then it's from 4.1 or later.  That might mean that
> >> you
> >> >>>> are mixing versions - don't do that.  Make sure that you have jars
> >> from
> >> >>>> the exact same version as your server.
> >> >>>>
> >> >>>> Thanks,
> >> >>>> Shawn
> >> >>>>
> >> >>>>
> >> >>
> >> >>
> >>
> >>
>


Re: Solr Faceting doesn't return values.

2013-05-22 Thread Sandeep Mestry
Hi There,

Not sure I understand your problem correctly, but is 'mm_state_code' a real
value or a field name?
Also, as Erick pointed out above, facets are not calculated if there are no
results - hence you get no facets.

You have mentioned which facets you want, but you haven't mentioned which
field you want to search against. That field should be defined in the df
parameter instead of sa_property_id.

Can you post an example Solr document you're indexing?

-Sandeep
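As a concrete sketch (the host and field names are assumed from this thread), querying the state field explicitly instead of relying on the df default, and faceting on the city field, would look like:

```python
from urllib.parse import urlencode

# Build the facet request explicitly against mm_state_code, so the df
# default-field setting is not involved (field names taken from the thread).
params = {
    "q": "mm_state_code:TX",   # fielded query, no df needed
    "facet": "true",
    "facet.field": "sa_site_city",
    "rows": "0",               # we only want facet counts back
    "wt": "xml",
}
url = "http://localhost:8983/solr/collection1/select?" + urlencode(params)
print(url)
```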


On 22 May 2013 14:28, samabhiK  wrote:

> Ok my bad.
>
> I do have a default field defined in the /select handler in the config
> file.
>
> 
>explicit
>10
>sa_property_id
> 
>
> But then how do I change my query now?
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-Faceting-doesn-t-return-values-tp4065276p4065298.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: filter query by string length or word count?

2013-05-22 Thread Sandeep Mestry
I doubt there is any straight out-of-the-box feature that supports this
requirement; you will probably need to handle it at index time.
You can also play around with function queries
(http://wiki.apache.org/solr/FunctionQuery) for any such feature.
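A minimal index-time sketch (the body_word_count field name is hypothetical and would need to be declared as an int field in schema.xml): compute the word count client-side before sending each document, then filter later with fq=body_word_count:[80 TO *]:

```python
# Sketch: enrich a document with its word count before indexing, so it
# can later be range-filtered, e.g. fq=body_word_count:[80 TO *].
# The field name body_word_count is hypothetical.
def with_word_count(doc, text_field="body"):
    doc = dict(doc)  # don't mutate the caller's document
    doc["body_word_count"] = len(doc.get(text_field, "").split())
    return doc

doc = with_word_count({"id": "1", "body": "a short example body text"})
print(doc["body_word_count"])  # → 5
```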



On 22 May 2013 16:37, Sam Lee  wrote:

> I have schema.xml
>  omitNorms="true"/>
> ...
>  positionIncrementGap="100">
> 
> 
> 
>  ignoreCase="true"
> words="stopwords_en.txt"
> enablePositionIncrements="true"
> />
> 
> 
>  protected="protwords.txt"/>
> 
> 
> 
> 
>  ignoreCase="true" expand="true"/>
>  ignoreCase="true"
> words="stopwords_en.txt"
> enablePositionIncrements="true"
> />
> 
> 
>  protected="protwords.txt"/>
> 
> 
> 
>
>
> how can I query docs whose body has more than 80 words (or 80 characters) ?
>


Re: Solr Faceting doesn't return values.

2013-05-22 Thread Sandeep Mestry
From the response you've mentioned, it appears to me that the query term TX
is searched against sa_site_city instead of mm_state_code.
Can you try your query like below:

http://xx.xx.xx.xx/solr/collection1/select?q=*mm_state_code:(**TX)*
&wt=xml&indent=true&facet=true&facet.field=sa_site_city&debug=all

and post your output?

On 22 May 2013 17:13, samabhiK  wrote:

> sa_site_city


Re: Solr Faceting doesn't return values.

2013-05-23 Thread Sandeep Mestry
*org.apache.solr.search.SyntaxError: Cannot parse
'*mm_state_code:(**TX)*': Encountered " ":" ": "" at line 1, column 14.
Was expecting one of:*

This suggests to me that you kept the df parameter in the query, hence it
was forming mm_state_code:mm_state_code:(TX). Can you try it exactly the way
I gave it to you, i.e. without the df parameter?
Also, can you post your schema.xml and the /select handler config from
solrconfig.xml?


On 22 May 2013 18:36, samabhiK  wrote:

> When I use your query, I get :
>
> 
> 
>
> 
>   400
>   12
>   
> true
> mm_state_code
> true
> *mm_state_code:(**TX)*
> 1369244078714
> all
> sa_site_city
> xml
>   
> 
> 
>   org.apache.solr.search.SyntaxError: Cannot parse
> '*mm_state_code:(**TX)*': Encountered " ":" ": "" at line 1, column 14.
> Was expecting one of:
> 
>  ...
>  ...
>  ...
> "+" ...
> "-" ...
>  ...
> "(" ...
> "*" ...
> "^" ...
>  ...
>  ...
>  ...
>  ...
>  ...
>  ...
> "[" ...
> "{" ...
>  ...
>  ...
> 
>   400
> 
> 
>
> Not sure why the data wont show up. Almost all the records has the field
> sa_site_city has data and is also indexed. :(
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-Faceting-doesn-t-return-values-tp4065276p4065406.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Solr 4.2.1 + Distribution scripts (rsync) Issue

2013-06-04 Thread Sandeep Mestry
Dear All,

*Background:*
I have recently upgraded Solr from 4.0 to 4.2.1 and have re-indexed all the
data. All good so far: we got better query times and a smaller index, and
now it all looks shiny and nice.
However, we haven't yet implemented SolrCloud and still relying on
distribution scripts - rsync, indexpuller mechanism.

*Issue:*
I see that the indexes are getting created on indexer boxes, snapshots
being created and then pulled across to search boxes. The snapshots are
getting installed on search boxes as well. There are no errors in the
scripts logs and this process works well.
However, when I check for the update in the Solr console (on the search
boxes), I do not see the updated result. The updates do not appear on the
search boxes even after a manual commit. Only after a *restart* of the
search application (deployed in Tomcat) can I see the updated results.
I have done minimal changes for the upgrade in solrconfig.xml and is pasted
below. Please can someone take a look and let me know what the issue is.
The same config was working fine on Solr 4.0 (as well as Solr 1.4.1).

Thanks,
Sandeep
p.s: We'll be upgrading to SolrCloud in the next release of the project but
this release will be managed with only Solr 4.2.1 upgrade.


--
 solrconfig.xml
--



  LUCENE_42


${solr.abortOnConfigurationError:true}

  ${solr.data.dir:./solr/data}

  

  
14
10
32
1


  1
  0

  

  

  50
  9
  false


  50
  9
  false

  

  
1024





true

20

100


  

  (*:*)
  AND
  0
  10
  standard

  



  

  (*:*)
  AND
  0
  10
  standard

  


false

2
  

  



  

  

  explicit

  

  
  

  
  

  
  

  standard
  *:*
  all

  

  

  explicit
  true

  

  
5
  

  
*:*
  



Re: Solr 4.2.1 + Distribution scripts (rsync) Issue

2013-06-05 Thread Sandeep Mestry
Hi Hoss,

Thanks for your reply, Please find answers to your questions below.

*Well, for starters -- have you considered at least looking into using the java
based Replicationhandler instead of the rsync scripts?*
- There was an attempt to implement Java-based replication, but it was very
slow, so that option was discarded and rsync was used instead. This was done
a couple of years ago, and until February of this year we were using Solr
1.4. I upgraded Solr to 4.0 with rsync; however, due to time and resource
constraints an alternative to rsync was not evaluated, and it can't be done
even today - only in the next release will we move to SolrCloud.

My setup looks like below - this was working correctly with Solr 1.4, Solr
4.0 versions.

1) Index Feeder applications feeds indexes to indexer boxes.
2) A cron job runs every minute on the indexer boxes (committer); it commits
the indexes (commit) and invokes snapshooter to create a snapshot. An rsync
daemon runs on the indexer boxes.
3) Another cron job runs on the search boxes every minute; it pulls the
snapshot (snappuller), installs it on the search boxes (snapinstaller), and
notifies search to open a new searcher (commit).

Additionally, a cron job runs every morning at 4 am on the indexer boxes; it
optimises the index (optimize) and cleans up snapshots older than a day
(snapcleaner).
This is as per http://wiki.apache.org/solr/SolrCollectionDistributionScripts
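As crontab entries, the wiring above looks roughly like this. The paths are hypothetical, and the flag spellings (beyond commit -h/-p, which is quoted later in this message) are sketched from memory - check each script's usage before relying on them:

```
# indexer boxes: commit and create a snapshot, every minute
* * * * * /opt/solr/bin/commit -h localhost -p 8983 && /opt/solr/bin/snapshooter
# search boxes: pull and install the latest snapshot, every minute
* * * * * /opt/solr/bin/snappuller && /opt/solr/bin/snapinstaller
# indexer boxes: optimise and clean snapshots older than a day, at 4 am
0 4 * * * /opt/solr/bin/optimize && /opt/solr/bin/snapcleaner -D 1
```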

*Which config is this, your indexer or your searcher? (i'm assuming it's the
searcher since i don't see any postCommit commands to exec snapshooter but
i wanted to sanity check that wasn't a simple explanation for your problem)*
- Because of this set up, I do not have any post commit setup in
solrconfig.xml.
- This solrconfig.xml is used for both indexer and searcher boxes.

I can see that after my upgrade to Solr 4.2.1, all these scripts behave
normally just that I do not see the updates getting refreshed on search
boxes unless I restart.
*
*
*What exactly does your "manual commit" command look like?  *
- This is by using commit script under bin directory (commit -h localhost
-p 8983)
- I have also tried URL based commit as you had mentioned but no luck

*Are you doing this on the indexer box or the searcher boxes? *
- I executed manual commit on searcher boxes, the indexer boxes do show the
commit and updates correctly.

*what is the HTTP response from this command? what do the logs show when
you do this?
*
- I have attached the logs, please note that I have enabled the
openSearcher for testing.

Thanks, please let me know if I'm missing something. I remembered people
not getting their deletes, and the workaround was to add the _version_ field
to the schema, which I have done - but no luck. I know it might be
unrelated, but I am just trying all my options.
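For reference, the _version_ field definition as it appears in the stock 4.x example schema (it is required by the update log and real-time get machinery):

```xml
<field name="_version_" type="long" indexed="true" stored="true"/>
```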

Thanks again,
Sandeep


On 5 June 2013 00:41, Chris Hostetter  wrote:

>
> : However, we haven't yet implemented SolrCloud and still relying on
> : distribution scripts - rsync, indexpuller mechanism.
>
> Well, for starters -- have you considered at least looking into using the
> java based ReplicationHandler instead of the rsync scripts?
>
> Script based replication has not been actively maintained since java
> replication was added back in Solr 1.4!
>
> : I see that the indexes are getting created on indexer boxes, snapshots
> : being created and then pulled across to search boxes. The snapshots are
> : getting installed on search boxes as well. There are no errors in the
> : scripts logs and this process works well.
> : However, when I check the update in solr console (on search boxes), I do
> : not see the updated result. The updates do not appear in search boxes
> even
> : after manual commit. Only after a *restart* of the search application
> : (deployed in tomcat) I can see the updated results.
>
> What exactly does your "manual commit" command look like?  Are you
> doing this on the indexer box or the searcher boxes?  What is the HTTP
> response from this command? What do the logs show when you do this?
>
> It's possible that some internal changes in Solr relating to NRT
> improvements may have optimized away re-opening on commit if solr doesn't
> think the index has changed -- but i doubt it.  because I just tried a
> simple test using the 4.3.0 example where i manually simulated
> snapinstaller replacing hte index files with a newer index and issued
> "http://localhost:8983/solr/update?commit=true" and solr loaded up that
> new index and started searching it -- so i suspect the devil is in the
> details of your setup.
>
> you're sure each of the snapshooter, snappuller, snapinstaller scripts are
> executing properly?
>
> : I have done minimal changes for the upgrade in solrconfig.xml and is
> pasted
> : below. Please can someone take a look and let me know

Need Help in migrating Solr version 1.4 to 4.3

2013-06-24 Thread Sandeep Gupta
Hello All,

We are planning to migrate from Solr 1.4 to Solr 4.3,
and I am seeking some help with this.

Considering the schema file change:
There are lots of differences by default if I compare the original Solr 1.4
schema file to the Solr 4.3 schema file, which is why we are not simply
copy-pasting the schema file.
In our Solr 1.4 schema we have some custom fields of type "textgen" and
"text".
So, when migrating these custom fields to Solr 4.3, should I use
"text_general" as a replacement for "textgen" and "text_en" as a
replacement for "text"?
Please confirm the same.
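If that mapping is right, the migrated declarations would look something like this (the field names here are hypothetical; text_general and text_en are the types shipped in the 4.3 example schema):

```xml
<!-- old 1.4 definitions used type="textgen" and type="text" -->
<field name="product_name" type="text_general" indexed="true" stored="true"/>
<field name="description"  type="text_en"      indexed="true" stored="true"/>
```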

Considering the solrconfig change:
We didn't have many changes in the 1.4 solrconfig file apart from the
dataimport request handler, so for the migration we are simply adding that
request handler to the Solr 4.3 solrconfig file.

Considering the application development:

We built all our queries in a Boolean style (which was not good), I mean we
put all the parameters in the query field, i.e.
*:* AND EntityName: <<>> AND : AND .

I think we should simplify our queries using other fields like df and qf.

We also used to create the Solr server object via CommonsHttpSolrServer(),
so I am now planning to use the HttpSolrServer API.

Please let me know your suggestions on the above points, and also what other
factors I need to take care of while considering the migration.

Regards
Sandeep


SOLR Search Query : Exception : Software caused connection abort

2010-07-15 Thread sandeep kumar

Hi,
I am trying to test the SOLR search with a very big query, but when I try it
throws the exception: "Software caused connection abort".
I'm using HTTP POST and the server I'm using is Tomcat.
Does a SOLR query have any limitations on size or length, etc.?
Please help me and let me know a solution to this problem ASAP.

Regards
Sandeep 
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-Search-Query-Exception-Software-caused-connection-abort-tp969331p969331.html
Sent from the Solr - User mailing list archive at Nabble.com.




  1   2   >