Re: Index corruption with replication

2017-03-16 Thread santosh sidnal
Hi Erick/David,

The schema is the same on both the live and stage servers. We are using the
same schema files on the stage and live servers.


   - Schema files are included in replication, but they are not being
   changed when we observe the index corruption issue (see the config
   sketch below).
   - My guess is that the core is getting corrupted because of replication.
   - Solr version used is 4.7.0.
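For reference, a replication handler that ships config files along with the
index is configured along these lines (a generic sketch, not our exact
config -- the file list, host and poll interval are placeholders):

<!-- master solrconfig.xml -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <!-- config files copied to slaves together with the index -->
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

<!-- slave solrconfig.xml -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/corename</str>
    <str name="pollInterval">00:05:00</str>
  </lst>
</requestHandler>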



The exception I see in the log is:

org.apache.solr.common.SolrException log
org.apache.lucene.index.CorruptIndexException: Corrupted: docID=8195,
docBase=7, chunkDocs=249, numDocs=10596
(resource=MMapIndexInput(path="/app/IBM/WebSphere/CommerceServer70/instances/RBUATLV/search/solr/home/MC_10001/fr_FR/CatalogEntry/data/index/_5a.fdt"))
at
org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:236)
at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:276)
at
org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
at org.apache.solr.search.SolrIndexSearcher.doc(SolrIndexSearcher.java:661)
at
org.apache.solr.util.SolrPluginUtils.optimizePreFetchDocs(SolrPluginUtils.java:213)
at
org.apache.solr.handler.component.QueryComponent.doPrefetch(QueryComponent.java:568)
at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:475)
at

On 15 March 2017 at 22:42, Erick Erickson  wrote:

> You can specify your replication to include config files, but if the
> schema has changed you'll have to restart your Solr afterwards.
>
> How is it corrupt? what is the symptom? Any error messages in the solr
> log on the slave? What version of Solr? Details matter.
>
> Best,
> Erick
>
> On Wed, Mar 15, 2017 at 9:12 AM, David Hastings
>  wrote:
> > are you certain the schema is the same on both master and slave?  I find
> > that the schema file doesnt always go with the replication and if a field
> > is different on the slave it will cause problems
> >
> > On Wed, Mar 15, 2017 at 12:08 PM, Santosh Sidnal <
> sidnal.sant...@gmail.com>
> > wrote:
> >
> >> Hi all,
> >>
> >> I am facing index corruption at regular intervals on the live server,
> >> where I pull index data from one master server.
> >>
> >> Can anyone please give us some pointers on why we are facing this issue
> >> at regular intervals?
> >> I am aware of how to correct a corrupted index, but I am looking for
> >> some pointers on how I can stop or reduce this occurrence.
> >>
> >> Thanks in advance.
> >>
> >>
> >> Sent from my iPhone
>



-- 
Regards,
Santosh Sidnal


sum multivalued field index with banana

2017-03-16 Thread tkg_cangkul

Hi, sorry if this is a little bit off topic.

I've just started using the Banana dashboard, and I want to do a summarize
process on data that is indexed in Solr.


Can I do a sum process with the Banana dashboard when I have some
multivalued data indexed in my field?


This is my sample data in Solr:

"timestamp_dt":"2016-12-30T15:50:00Z",
"FR":["fr1"],
"EV":"89v",
"RC":[0],
"SF":["SSP"],
"CT":["POST"],
"rb.id":["rb30", "rb30"],
"rb.co":[1,  2],
"rb.lat":[47, 9]

OK, from the data above, is it possible to sum the values of
"rb.co" with EV as a group-by?

On my Banana dashboard panel, I've tried to set something like this:



but nothing happens.

Any suggestions, please?
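For reference, the roll-up I am after would look roughly like this as a plain
Solr JSON Facet request (just a sketch: the collection name is a placeholder,
and whether sum() accepts a multivalued field such as rb.co depends on the
Solr version and field definition, so that part needs verifying):

http://localhost:8983/solr/mycollection/query?q=*:*&rows=0&json.facet={
  by_ev : {
    type  : terms,
    field : EV,
    facet : { total_rb_co : "sum(rb.co)" }
  }
}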



Best Regards,

Yuza


fq performance

2017-03-16 Thread Ganesh M
Hi,

We have 1 million documents and would like to query with multiple fq values.

We have an access_control field (multivalued) which holds information about
which groups a document is accessible to.

Now, to get the list of all documents for a user, we would like to pass
multiple fq values (one for each group the user belongs to):

q:somefiled:value&
fq:access_control:g1&fq:access_control:g2&fq:access_control:g3&fq:access_control:g4&fq:access_control:g5...

Like this, there could be 100 groups for a user.

If we fire a query with 100 values in the fq, what's the penalty on
performance? Can we get the result in less than one second for 1 million
documents?

Let us know your valuable inputs on this.

Regards,


Re: fq performance

2017-03-16 Thread Michael Kuhlmann
First of all, from what I can see, this won't do what you're expecting. 
Multiple fq conditions are always combined using AND, so if a user is a
member of 100 groups, but the document is accessible to only 99 of them,
then the user won't find it.


Or in other words, if you add a user to some group, then she would get
*fewer* results than before.


But coming back to your performance question: Just try it. Having 100 fq 
conditions will of course slow down your query a bit, but not that much. 
I rather see the problem with the filter cache: It will only be fast 
enough if all of your fq filters fit into the cache. Each possible fq 
filter will take 1 million/8 == 125k bytes, so having hundreds of
possible access group conditions might blow up your filter cache (which
must fit into RAM).


-Michael


On 16.03.2017 at 13:02, Ganesh M wrote:

Hi,

We have 1 million of documents and would like to query with multiple fq values.

We have kept the access_control ( multi value field ) which holds information 
about for which group that document is accessible.

Now to get the list of all the documents of an user, we would like to pass 
multiple fq values ( one for each group user belongs to )

q:somefiled:value&
fq:access_control:g1&fq:access_control:g2&fq:access_control:g3&fq:access_control:g4&fq:access_control:g5...

Like this, there could be 100 groups for an user.

If we fire query with 100 values in the fq, whats the penalty on the 
performance ? Can we get the result in less than one second for 1 million of 
documents.

Let us know your valuable inputs on this.

Regards,





Re: fq performance

2017-03-16 Thread Shawn Heisey
On 3/16/2017 6:02 AM, Ganesh M wrote:
> We have 1 million of documents and would like to query with multiple fq 
> values.
>
> We have kept the access_control ( multi value field ) which holds information 
> about for which group that document is accessible.
>
> Now to get the list of all the documents of an user, we would like to pass 
> multiple fq values ( one for each group user belongs to )
>
> q:somefiled:value&fq:access_control:g1&fq:access_control:g2&fq:access_control:g3&fq:access_control:g4&fq:access_control:g5...
>
> Like this, there could be 100 groups for an user.

The correct syntax is fq=field:value -- what you have there is not going
to work.

This might not do what you expect.  Filter queries are ANDed together --
*every* filter must match, which means that if a document that you want
has only one of those values in access_control, or has 98 of them but
not all 100, then the query isn't going to match that document.  The
solution is one filter query that can match ANY of them, which also
might run faster.  I can't say whether this is a problem for you or
not.  Your data might be completely correct for matching 100 filters.
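If matching ANY of the groups is what you want, a sketch of that single
filter (untested, with your field and group names) would be either of:

fq=access_control:(g1 OR g2 OR g3 OR ... OR g100)

fq={!terms f=access_control}g1,g2,g3,...,g100

The second form uses the terms query parser, which takes a comma-separated
list of values and is meant for exactly this kind of long list.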

Also keep in mind that there is a limit to the size of a URL that you
can send into any webserver, including the container that runs Solr. 
That default limit is 8192 bytes, and includes the "GET " or "POST " at
the beginning and the " HTTP/1.1" at the end (note the spaces).  The
filter query information for 100 of the filters you mentioned is going
to be over 2K, which will fit in the default, but if your query has more
complexity than you have mentioned here, the total URL might not fit. 
There's a workaround to this -- use a POST request and put the
parameters in the request body.
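As a sketch of that workaround (the collection name is a placeholder), the
same request sent as a POST with the parameters in the body:

curl "http://localhost:8983/solr/mycollection/select" \
  --data-urlencode "q=somefield:value" \
  --data-urlencode "fq=access_control:(g1 OR g2 OR ... OR g100)" \
  --data-urlencode "wt=json"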

> If we fire query with 100 values in the fq, whats the penalty on the 
> performance ? Can we get the result in less than one second for 1 million of 
> documents.

With one million documents, each internal filter query result is 125
kilobytes -- the number of documents divided by eight.  That's 12.5
megabytes for 100 of them.  In addition, every time a filter is run, it
must examine every document in the index to create that 125 kilobyte
structure, which means that filters which *aren't* found in the
filterCache are relatively slow.  If they are found in the cache,
they're lightning fast, because the cache will contain the entire 125
kilobyte bitset.

If you make your filterCache large enough, it's going to consume a LOT
of java heap memory, particularly if the index gets bigger.  The nice
thing about the filterCache is that once the cache entries exist, the
filters are REALLY fast, and if they're all cached, you would DEFINITELY
be able to get results in under one second.  I have no idea whether the
same would happen when filters aren't cached.  It might.  Filters that
do not exist in the cache will be executed in parallel, so the number of
CPUs that you have in the machine, along with the query rate, will have
a big impact on the overall performance of a single query with a lot of
filters.

Also related to the filterCache, keep in mind that every time a commit
is made that opens a new searcher, the filterCache will be autowarmed. 
If the autowarmCount value for the filterCache is large, that can make
commits take a very long time, which will cause problems if commits are
happening frequently.  On the other hand, a very small autowarmCount can
cause slow performance after a commit if you use a lot of filters.
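For reference, those knobs live in the <query> section of solrconfig.xml and
look like this (the numbers here are placeholders, not a recommendation):

<!-- size: how many filter bitsets are kept; autowarmCount: how many cached
     filters are re-executed against the new searcher after a commit -->
<filterCache class="solr.FastLRUCache"
             size="1024"
             initialSize="512"
             autowarmCount="64"/>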

My reply is longer and more dense than I had anticipated.  Apologies if
it's information overload.

Thanks,
Shawn



DIH delta import with cache 5.3.1 issue

2017-03-16 Thread Sujay Bawaskar
Hi,

We are using DIH with a cache (SortedMapBackedCache) with Solr 5.3.1. We have
around 2.8 million documents in Solr and the total index size is 4 GB. DIH
delta import is dumping all values of the mapped columns into their respective
multivalued fields. This is causing the size of one Solr document to reach up
to 2 GB. Is this a known issue with Solr 5.3.1?

Thanks,
Sujay


Re: DIH delta import with cache 5.3.1 issue

2017-03-16 Thread Alexandre Rafalovitch
Could you give a bit more details. Do you mean one document gets the
content of multiple documents? And only on delta?

Regards,
Alex

On 16 Mar 2017 8:53 AM, "Sujay Bawaskar" 
wrote:

Hi,

We are using DIH with cache(SortedMapBackedCache) with solr 5.3.1. We have
around 2.8 million documents in solr and total index size is 4 GB. DIH
delta import is dumping all values of mapped columns to their respective
multi valued fields. This is causing size of one solr document upto 2 GB.
Is this a known issue with solr 5.3.1?

Thanks,
Sujay


Partial Match with DF

2017-03-16 Thread Mark Johnson
Forgive me if I'm missing something obvious -- I'm new to Solr, but I can't
seem to find an explanation for the behavior I'm seeing.

If I have a document that looks like this:
{
field1: "aaa bbb",
field2: "ccc ddd",
field3: "eee fff"
}

And I do a search where "q" is "aaa ccc", I get the document in the
results. This is because (please correct me if I'm wrong) the default "df"
is set to the "_text_" field, which contains the text values from all
fields.

However, if I do a search where "df" is "field1" and "field2" and "q" is
"aaa ccc" (words from field1 and field2) I get no results.

In a simpler example, if I do a search where "df" is "field1" and "q" is
"aaa" (a word from field1) I still get no results.

If I do a search where "df" is "field1" and "q" is "aaa bbb" (the full
value of field1) then I get the document in the results.

So I'm concluding that when using "df" to specify which fields to search
then only an exact match on the full field value will return a document.

Is that a correct conclusion? Is there another way to specify which fields
to search without requiring an exact match? The results I'd like to achieve
are:

Would Match:
q=aaa
q=aaa bbb
q=aaa ccc
q=aaa fff

Would Not Match:
q=eee
q=fff
q=eee fff

-- 
*This message is intended only for the use of the individual or entity to 
which it is addressed and may contain information that is privileged, 
confidential and exempt from disclosure under applicable law. If you have 
received this message in error, you are hereby notified that any use, 
dissemination, distribution or copying of this message is prohibited. If 
you have received this communication in error, please notify the sender 
immediately and destroy the transmitted information.*


DbVisualizer challenges with a secured solr

2017-03-16 Thread Marvin NotMyRealNameDuh
Hi,

I'm working with a product which includes solr under the covers, and
this has been secured using a custom authentication scheme. The admin UI on
port 8983 works correct once authenticated. I've also hacked the zkcli.sh
script thusly:

SOLR_ZK_CREDS_AND_ACLS="-DzkACLProvider=com.i2group.disco.search.solr.common.zookeeper.auth.internal.EncodedZkCredentialsACLProvider
\

-DzkCredentialsProvider=com.i2group.disco.search.solr.common.zookeeper.auth.internal.EncodedZkCredentialsProvider
\
  -Dsolr.solr.home=/data/cluster-nodes/clusters/is_cluster/nodes/node1"

CLASSPATH=
for i in $(ls
/i2a/deploy/wlp/usr/servers/awc/apps/awc.war/WEB-INF/lib/*.jar); do
CLASSPATH=$CLASSPATH:$i
done
for i in $(ls /i2a/deploy/wlp/usr/shared/resources/i2-common/lib/*.jar); do
CLASSPATH=$CLASSPATH:$i
done

PATH=$JAVA_HOME/bin:$PATH /opt/IBM/i2analyze/deploy/java/bin/java
$SOLR_ZK_CREDS_AND_ACLS  -Dlog4j.configuration=$log4j_config \
-classpath $CLASSPATH org.apache.solr.cloud.ZkCLI ${1+"$@"}

...and it works.

The credentials to authenticate to solr are stored in a file in
solr.solr.home - which is why that system property is needed.

 I've also hacked the launch script for dbvis to add the properties:

#!/bin/sh

# Uncomment the following line to override the JVM search sequence
# INSTALL4J_JAVA_HOME_OVERRIDE=
# Uncomment the following line to add additional VM parameters
# INSTALL4J_ADD_VM_PARAMS=
INSTALL4J_ADD_VM_PARAMS="-DzkACLProvider=com.i2group.disco.search.solr.common.zookeeper.auth.internal.EncodedZkCredentialsACLProvider
\

-DzkCredentialsProvider=com.i2group.disco.search.solr.common.zookeeper.auth.internal.EncodedZkCredentialsProvider
\
  -Dsolr.solr.home=/data/cluster-nodes/clusters/is_cluster/nodes/node1"

(no, adding these as database properties doesn't get me authenticated
to zookeeper)

...and now when I try to connect, DbVisualizer seems to connect to
zookeeper, but then I get:

2017-03-16 06:06:07.375 INFO   897 [ExecutorRunner-pool-3-thread-1 - H.??]
Exception while connecting kstephe-eia-reco
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://kstephe-eia-reco.softlayer.com:8983/solr: Expected
mime type application/octet-stream but got text/html. 


Error 401 Unauthorized request, Response code: 401

HTTP ERROR 401
Problem accessing /solr/admin/info/system. Reason:
Unauthorized request, Response code: 401



at
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:261)
at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:250)
at
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:942)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:957)
at
org.apache.solr.client.solrj.io.sql.DatabaseMetaDataImpl.getDatabaseProductVersion(DatabaseMetaDataImpl.java:124)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:95)

...and the URL is weird, because it's talking to port 8983 - which is
solr, not zookeeper. Once authenticated to solr, I can validate that the
/solr/admin/info/system URL responds correctly, but the problem seems to be
that when DbVisualizer asks zookeeper for the db metadata, it doesn't seem
to know how to authenticate to solr.

So... (1) is there something I can do to fix things, or (2) is there a
problem in the solr / zookeeper code or (3) are the problems somewhere in
zkCredentialsProvider or zkACLProvider (where I can't fix a thing)?

Thanks
Marvin the paranoid


Re: Get handler not working

2017-03-16 Thread Chris Ulicny
iqdocid is already set to be the uniqueKey value.

I tried reindexing a few documents back into the problematic cloud and am
getting the same behavior of no document found for get handler.

I've also done some testing on standalone instances as well as some quick
cloud setups (with embedded zk), and I cannot seem to replicate the
problem. For each test, I used the exact same configset that is causing the
issue for us and indexed a document from that instance as well. I can
provide more details if that would be useful in anyway.

Standalone instance worked
Cloud mode worked regardless of the use of the security plugin
Cloud mode worked regardless of explicit get handler definition
Cloud mode consistently worked with explicitly defining the get handler,
then removing it and reloading the collection

The only differences that I know of between the tests and the problematic
cloud is that solr is running as a different user and using an external
zookeeper ensemble. The running user has ownership of the solr
installation, log, and data directories.

I'm going to keep trying different setups to see if I can replicate the
issue, but if anyone has any ideas on what direction might make the most
sense, please let me know.
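For clarity, the get requests I am issuing look like this (the id value is
one of the documents mentioned earlier in the thread; the second value in the
ids form is just a placeholder):

solr/TestCollection/get?id=2957-TV-201604141900
solr/TestCollection/get?ids=2957-TV-201604141900,<another-iqdocid>

As I understand it, the id/ids parameters map to whatever field is defined as
the uniqueKey, which is iqdocid for us.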

Thanks again

On Wed, Mar 15, 2017 at 5:49 PM Erick Erickson 
wrote:

Wait... Is iqdocid set to the uniqueKey in your schema? That might
be the missing thing.



On Wed, Mar 15, 2017 at 11:20 AM, Chris Ulicny  wrote:
> Unless the behavior's changed on the way to version 6.3.0, the get handler
> used to use whatever field is set to be the uniqueKey. We have
successfully
> been using get on a 4.9.0 standalone core with no explicit "id" field
> defined by passing in the value for the uniqueKey field to the get
handler.
> We tend to have a bunch of id fields floating around from different
> sources, so we avoid keeping any of them named as "id"
>
> iqdocid is just a basic string type
> <field name="iqdocid" type="string" ... required="true" stored="true"/>
>
> I'll do some more testing on standalone versions, and see how that goes.
>
> On Wed, Mar 15, 2017 at 1:52 PM David Hastings <
hastings.recurs...@gmail.com>
> wrote:
>
>> from your previous email:
>> "There is no "id"
>> field defined in the schema."
>>
>> you need an id field to use the get handler
>>
>> On Wed, Mar 15, 2017 at 1:45 PM, Chris Ulicny  wrote:
>>
>> > I thought that "id" and "ids" were fixed parameters for the get
handler,
>> > but I never remember, so I've already tried both. Each time it comes
back
>> > with the same response of no document.
>> >
>> > On Wed, Mar 15, 2017 at 1:31 PM Alexandre Rafalovitch <
>> arafa...@gmail.com>
>> > wrote:
>> >
>> > > Actually.
>> > >
>> > > I think Real Time Get handler has "id" as a magical parameter, not as
>> > > a field name. It maps to the real id field via the uniqueKey
>> > > definition:
>> > > https://cwiki.apache.org/confluence/display/solr/RealTime+Get
>> > >
>> > > So, if you have not, could you try the way you originally wrote it.
>> > >
>> > > Regards,
>> > >Alex.
>> > > 
>> > > http://www.solr-start.com/ - Resources for Solr users, new and
>> > experienced
>> > >
>> > >
>> > > On 15 March 2017 at 13:22, Chris Ulicny  wrote:
>> > > > Sorry, that is a typo. The get is using the iqdocid field. There is
>> no
>> > > "id"
>> > > > field defined in the schema.
>> > > >
>> > > > solr/TestCollection/get?iqdocid=2957-TV-201604141900
>> > > >
>> > > > solr/TestCollection/select?q=*:*&fq=iqdocid:2957-TV-201604141900
>> > > >
>> > > > On Wed, Mar 15, 2017 at 1:15 PM Erick Erickson <
>> > erickerick...@gmail.com>
>> > > > wrote:
>> > > >
>> > > >> Is this a typo or are you trying to use get with an "id" field and
>> > > >> your filter query uses "iqdocid"?
>> > > >>
>> > > >> Best,
>> > > >> Erick
>> > > >>
>> > > >> On Wed, Mar 15, 2017 at 8:31 AM, Chris Ulicny 
>> > wrote:
>> > > >> > Yes, we're using a fixed schema with the iqdocid field set as
the
>> > > >> uniqueKey.
>> > > >> >
>> > > >> > On Wed, Mar 15, 2017 at 11:28 AM Alexandre Rafalovitch <
>> > > >> arafa...@gmail.com>
>> > > >> > wrote:
>> > > >> >
>> > > >> >> What is your uniqueKey? Is it iqdocid?
>> > > >> >>
>> > > >> >> Regards,
>> > > >> >>Alex.
>> > > >> >> 
>> > > >> >> http://www.solr-start.com/ - Resources for Solr users, new and
>> > > >> experienced
>> > > >> >>
>> > > >> >>
>> > > >> >> On 15 March 2017 at 11:24, Chris Ulicny 
>> wrote:
>> > > >> >> > Hi,
>> > > >> >> >
>> > > >> >> > I've been trying to use the get handler for a new solr cloud
>> > > >> collection
>> > > >> >> we
>> > > >> >> > are using, and something seems to be amiss.
>> > > >> >> >
>> > > >> >> > We are running 6.3.0, so we did not explicitly define the
>> request
>> > > >> handler
>> > > >> >> > in the solrconfig since it's supposed to be implicitly
defined.
>> > We
>> > > >> also
>> > > >> >> > have the update log enabled with the default configuration.
>> > > >> >> >
>> > > >> >> > Whenever I send a get query for a document already known to
be
>> in
>> > > the
>> > > >> 

Solr shingles is not working in solr 6.4.0

2017-03-16 Thread Aman Deep Singh
Hi,

Recently I migrated from Solr 4 to 6.
In Solr 4 the ShingleFilterFactory is working correctly.
My configuration is:



 
 
  


  
 
  
  

  



But after updating to Solr 6, shingles are not working. The schema is as below:
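(Roughly, the field type is of this shape -- the type name and the tokenizer
are assumptions on my part, since the XML did not paste cleanly:)

<fieldType name="cust_shingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="5"
            outputUnigrams="false" outputUnigramsIfNoShingles="false"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="5"
            outputUnigrams="false" outputUnigramsIfNoShingles="false"/>
  </analyzer>
</fieldType>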



 
 
  


  
 
  

  

Although the Analysis tab was showing the proper shingle result, when
used via the query parser it was not giving proper results.

My sample hit is:

http://localhost:8983/solr/shingel_test/select?q=one%20plus%20one&wt=xml&debugQuery=true&defType=edismax&qf=cust_shingle

It creates the parsed query as:

rawquerystring: one plus one
querystring: one plus one
parsedquery: (+())/no_coord
parsedquery_toString: +()

QParser: ExtendedDismaxQParser


Re: DIH delta import with cache 5.3.1 issue

2017-03-16 Thread Sujay Bawaskar
This behaviour is for delta import only. One document gets the field values of
all documents. These fields come from child entities which map columns to
multivalued fields.



 






On Thu, Mar 16, 2017 at 6:35 PM, Alexandre Rafalovitch 
wrote:

> Could you give a bit more details. Do you mean one document gets the
> content of multiple documents? And only on delta?
>
> Regards,
> Alex
>
> On 16 Mar 2017 8:53 AM, "Sujay Bawaskar" 
> wrote:
>
> Hi,
>
> We are using DIH with cache(SortedMapBackedCache) with solr 5.3.1. We have
> around 2.8 million documents in solr and total index size is 4 GB. DIH
> delta import is dumping all values of mapped columns to their respective
> multi valued fields. This is causing size of one solr document upto 2 GB.
> Is this a known issue with solr 5.3.1?
>
> Thanks,
> Sujay
>


Re: Get handler not working

2017-03-16 Thread Alexandre Rafalovitch
If you have the test bed, could you just enable full trace log mode and run
the two most similar tests? Then look for the differences in the logs.

It sounds like a bug, but of what kind...?

Regards,
   Alex

On 16 Mar 2017 9:16 AM, "Chris Ulicny"  wrote:

> iqdocid is already set to be the uniqueKey value.
>
> I tried reindexing a few documents back into the problematic cloud and am
> getting the same behavior of no document found for get handler.
>
> I've also done some testing on standalone instances as well as some quick
> cloud setups (with embedded zk), and I cannot seem to replicate the
> problem. For each test, I used the exact same configset that is causing the
> issue for us and indexed a document from that instance as well. I can
> provide more details if that would be useful in anyway.
>
> Standalone instance worked
> Cloud mode worked regardless of the use of the security plugin
> Cloud mode worked regardless of explicit get handler definition
> Cloud mode consistently worked with explicitly defining the get handler,
> then removing it and reloading the collection
>
> The only differences that I know of between the tests and the problematic
> cloud is that solr is running as a different user and using an external
> zookeeper ensemble. The running user has ownership of the solr
> installation, log, and data directories.
>
> I'm going to keep trying different setups to see if I can replicate the
> issue, but if anyone has any ideas on what direction might make the most
> sense, please let me know.
>
> Thanks again
>
> On Wed, Mar 15, 2017 at 5:49 PM Erick Erickson 
> wrote:
>
> Wait... Is iqdocid set to the uniqueKey in your schema? That might
> be the missing thing.
>
>
>
> On Wed, Mar 15, 2017 at 11:20 AM, Chris Ulicny  wrote:
> > Unless the behavior's changed on the way to version 6.3.0, the get
> handler
> > used to use whatever field is set to be the uniqueKey. We have
> successfully
> > been using get on a 4.9.0 standalone core with no explicit "id" field
> > defined by passing in the value for the uniqueKey field to the get
> handler.
> > We tend to have a bunch of id fields floating around from different
> > sources, so we avoid keeping any of them named as "id"
> >
> > iqdocid is just a basic string type
> >  > required="true" stored="true"/>
> >
> > I'll do some more testing on standalone versions, and see how that goes.
> >
> > On Wed, Mar 15, 2017 at 1:52 PM David Hastings <
> hastings.recurs...@gmail.com>
> > wrote:
> >
> >> from your previous email:
> >> "There is no "id"
> >> field defined in the schema."
> >>
> >> you need an id field to use the get handler
> >>
> >> On Wed, Mar 15, 2017 at 1:45 PM, Chris Ulicny  wrote:
> >>
> >> > I thought that "id" and "ids" were fixed parameters for the get
> handler,
> >> > but I never remember, so I've already tried both. Each time it comes
> back
> >> > with the same response of no document.
> >> >
> >> > On Wed, Mar 15, 2017 at 1:31 PM Alexandre Rafalovitch <
> >> arafa...@gmail.com>
> >> > wrote:
> >> >
> >> > > Actually.
> >> > >
> >> > > I think Real Time Get handler has "id" as a magical parameter, not
> as
> >> > > a field name. It maps to the real id field via the uniqueKey
> >> > > definition:
> >> > > https://cwiki.apache.org/confluence/display/solr/RealTime+Get
> >> > >
> >> > > So, if you have not, could you try the way you originally wrote it.
> >> > >
> >> > > Regards,
> >> > >Alex.
> >> > > 
> >> > > http://www.solr-start.com/ - Resources for Solr users, new and
> >> > experienced
> >> > >
> >> > >
> >> > > On 15 March 2017 at 13:22, Chris Ulicny  wrote:
> >> > > > Sorry, that is a typo. The get is using the iqdocid field. There
> is
> >> no
> >> > > "id"
> >> > > > field defined in the schema.
> >> > > >
> >> > > > solr/TestCollection/get?iqdocid=2957-TV-201604141900
> >> > > >
> >> > > > solr/TestCollection/select?q=*:*&fq=iqdocid:2957-TV-201604141900
> >> > > >
> >> > > > On Wed, Mar 15, 2017 at 1:15 PM Erick Erickson <
> >> > erickerick...@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > >> Is this a typo or are you trying to use get with an "id" field
> and
> >> > > >> your filter query uses "iqdocid"?
> >> > > >>
> >> > > >> Best,
> >> > > >> Erick
> >> > > >>
> >> > > >> On Wed, Mar 15, 2017 at 8:31 AM, Chris Ulicny 
> >> > wrote:
> >> > > >> > Yes, we're using a fixed schema with the iqdocid field set as
> the
> >> > > >> uniqueKey.
> >> > > >> >
> >> > > >> > On Wed, Mar 15, 2017 at 11:28 AM Alexandre Rafalovitch <
> >> > > >> arafa...@gmail.com>
> >> > > >> > wrote:
> >> > > >> >
> >> > > >> >> What is your uniqueKey? Is it iqdocid?
> >> > > >> >>
> >> > > >> >> Regards,
> >> > > >> >>Alex.
> >> > > >> >> 
> >> > > >> >> http://www.solr-start.com/ - Resources for Solr users, new
> and
> >> > > >> experienced
> >> > > >> >>
> >> > > >> >>
> >> > > >> >> On 15 March 2017 at 11:24, Chris Ulicny 
> >> wrote:
> >> > > >> >> > Hi,
> >> > > >> >> >
> >> > > >> >> > I've been trying to use the get handler for a new solr cloud
> 

RE: Group by range results

2017-03-16 Thread Mikhail Ibraheem
Any help on this please?

 

From: Mikhail Ibraheem 
Sent: 15 March 2017 08:53 PM
To: solr-user@lucene.apache.org
Subject: Group by range results

 

Hi, 

Can we group by ranges? something like:

facet=true
stats=true
stats.field={!tag=piv1 min=true max=true}price
facet.range={!tag=r1}manufacturedate_dt
facet.range.start=2006-01-01T00:00:00Z
facet.range.end=NOW/YEAR
facet.range.gap=+1YEAR
facet.pivot={!stats=piv1}r1

 

Where I want the max price and min price for each range of manufacturedate_dt.

Please advise.
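In case it helps frame what I am after, the same thing expressed with the
JSON Facet API would look roughly like this (only a sketch -- the collection
name is a placeholder and I have not verified it against our version):

http://localhost:8983/solr/mycollection/query?q=*:*&rows=0&json.facet={
  prices_by_year : {
    type  : range,
    field : manufacturedate_dt,
    start : "2006-01-01T00:00:00Z",
    end   : "NOW/YEAR",
    gap   : "+1YEAR",
    facet : {
      min_price : "min(price)",
      max_price : "max(price)"
    }
  }
}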

 

Thanks


Re: Partial Match with DF

2017-03-16 Thread Alexandre Rafalovitch
df is the default field - you can only give one. To search over multiple
fields, you switch to the eDisMax query parser and the qf parameter.

Then, the question will be what type definition your fields have. When you
search the _text_ field, you are using its definition because of the
copyField. Your original fields may be strings.
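For example, a request along these lines searches both fields (the collection
name is a placeholder):

/solr/mycollection/select?defType=edismax&q=aaa%20ccc&qf=field1%20field2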

Remember to reload the core and reindex when you change definitions.

Regards,
   Alex


On 16 Mar 2017 9:15 AM, "Mark Johnson" 
wrote:

> Forgive me if I'm missing something obvious -- I'm new to Solr, but I can't
> seem to find an explanation for the behavior I'm seeing.
>
> If I have a document that looks like this:
> {
> field1: "aaa bbb",
> field2: "ccc ddd",
> field3: "eee fff"
> }
>
> And I do a search where "q" is "aaa ccc", I get the document in the
> results. This is because (please correct me if I'm wrong) the default "df"
> is set to the "_text_" field, which contains the text values from all
> fields.
>
> However, if I do a search where "df" is "field1" and "field2" and "q" is
> "aaa ccc" (words from field1 and field2) I get no results.
>
> In a simpler example, if I do a search where "df" is "field1" and "q" is
> "aaa" (a word from field1) I still get no results.
>
> If I do a search where "df" is "field1" and "q" is "aaa bbb" (the full
> value of field1) then I get the document in the results.
>
> So I'm concluding that when using "df" to specify which fields to search
> then only an exact match on the full field value will return a document.
>
> Is that a correct conclusion? Is there another way to specify which fields
> to search without requiring an exact match? The results I'd like to achieve
> are:
>
> Would Match:
> q=aaa
> q=aaa bbb
> q=aaa ccc
> q=aaa fff
>
> Would Not Match:
> q=eee
> q=fff
> q=eee fff
>
> --
> *This message is intended only for the use of the individual or entity to
> which it is addressed and may contain information that is privileged,
> confidential and exempt from disclosure under applicable law. If you have
> received this message in error, you are hereby notified that any use,
> dissemination, distribution or copying of this message is prohibited. If
> you have received this communication in error, please notify the sender
> immediately and destroy the transmitted information.*
>


Re: Get handler not working

2017-03-16 Thread Yonik Seeley
Something to do with routing perhaps? (the mapping of ids to shards,
by default is based on hashes of the id)
-Yonik


On Thu, Mar 16, 2017 at 9:16 AM, Chris Ulicny  wrote:
> iqdocid is already set to be the uniqueKey value.
>
> I tried reindexing a few documents back into the problematic cloud and am
> getting the same behavior of no document found for get handler.
>
> I've also done some testing on standalone instances as well as some quick
> cloud setups (with embedded zk), and I cannot seem to replicate the
> problem. For each test, I used the exact same configset that is causing the
> issue for us and indexed a document from that instance as well. I can
> provide more details if that would be useful in anyway.
>
> Standalone instance worked
> Cloud mode worked regardless of the use of the security plugin
> Cloud mode worked regardless of explicit get handler definition
> Cloud mode consistently worked with explicitly defining the get handler,
> then removing it and reloading the collection
>
> The only differences that I know of between the tests and the problematic
> cloud is that solr is running as a different user and using an external
> zookeeper ensemble. The running user has ownership of the solr
> installation, log, and data directories.
>
> I'm going to keep trying different setups to see if I can replicate the
> issue, but if anyone has any ideas on what direction might make the most
> sense, please let me know.
>
> Thanks again
>
> On Wed, Mar 15, 2017 at 5:49 PM Erick Erickson 
> wrote:
>
> Wait... Is iqdocid set to the uniqueKey in your schema? That might
> be the missing thing.
>
>
>
> On Wed, Mar 15, 2017 at 11:20 AM, Chris Ulicny  wrote:
>> Unless the behavior's changed on the way to version 6.3.0, the get handler
>> used to use whatever field is set to be the uniqueKey. We have
> successfully
>> been using get on a 4.9.0 standalone core with no explicit "id" field
>> defined by passing in the value for the uniqueKey field to the get
> handler.
>> We tend to have a bunch of id fields floating around from different
>> sources, so we avoid keeping any of them named as "id"
>>
>> iqdocid is just a basic string type
>> > required="true" stored="true"/>
>>
>> I'll do some more testing on standalone versions, and see how that goes.
>>
>> On Wed, Mar 15, 2017 at 1:52 PM David Hastings <
> hastings.recurs...@gmail.com>
>> wrote:
>>
>>> from your previous email:
>>> "There is no "id"
>>> field defined in the schema."
>>>
>>> you need an id field to use the get handler
>>>
>>> On Wed, Mar 15, 2017 at 1:45 PM, Chris Ulicny  wrote:
>>>
>>> > I thought that "id" and "ids" were fixed parameters for the get
> handler,
>>> > but I never remember, so I've already tried both. Each time it comes
> back
>>> > with the same response of no document.
>>> >
>>> > On Wed, Mar 15, 2017 at 1:31 PM Alexandre Rafalovitch <
>>> arafa...@gmail.com>
>>> > wrote:
>>> >
>>> > > Actually.
>>> > >
>>> > > I think Real Time Get handler has "id" as a magical parameter, not as
>>> > > a field name. It maps to the real id field via the uniqueKey
>>> > > definition:
>>> > > https://cwiki.apache.org/confluence/display/solr/RealTime+Get
>>> > >
>>> > > So, if you have not, could you try the way you originally wrote it.
>>> > >
>>> > > Regards,
>>> > >Alex.
>>> > > 
>>> > > http://www.solr-start.com/ - Resources for Solr users, new and
>>> > experienced
>>> > >
>>> > >
>>> > > On 15 March 2017 at 13:22, Chris Ulicny  wrote:
>>> > > > Sorry, that is a typo. The get is using the iqdocid field. There is
>>> no
>>> > > "id"
>>> > > > field defined in the schema.
>>> > > >
>>> > > > solr/TestCollection/get?iqdocid=2957-TV-201604141900
>>> > > >
>>> > > > solr/TestCollection/select?q=*:*&fq=iqdocid:2957-TV-201604141900
>>> > > >
>>> > > > On Wed, Mar 15, 2017 at 1:15 PM Erick Erickson <
>>> > erickerick...@gmail.com>
>>> > > > wrote:
>>> > > >
>>> > > >> Is this a typo or are you trying to use get with an "id" field and
>>> > > >> your filter query uses "iqdocid"?
>>> > > >>
>>> > > >> Best,
>>> > > >> Erick
>>> > > >>
>>> > > >> On Wed, Mar 15, 2017 at 8:31 AM, Chris Ulicny 
>>> > wrote:
>>> > > >> > Yes, we're using a fixed schema with the iqdocid field set as
> the
>>> > > >> uniqueKey.
>>> > > >> >
>>> > > >> > On Wed, Mar 15, 2017 at 11:28 AM Alexandre Rafalovitch <
>>> > > >> arafa...@gmail.com>
>>> > > >> > wrote:
>>> > > >> >
>>> > > >> >> What is your uniqueKey? Is it iqdocid?
>>> > > >> >>
>>> > > >> >> Regards,
>>> > > >> >>Alex.
>>> > > >> >> 
>>> > > >> >> http://www.solr-start.com/ - Resources for Solr users, new and
>>> > > >> experienced
>>> > > >> >>
>>> > > >> >>
>>> > > >> >> On 15 March 2017 at 11:24, Chris Ulicny 
>>> wrote:
>>> > > >> >> > Hi,
>>> > > >> >> >
>>> > > >> >> > I've been trying to use the get handler for a new solr cloud
>>> > > >> collection
>>> > > >> >> we
>>> > > >> >> > are using, and something seems to be amiss.
>>> > > >> >> >
>>> > > >> >> > We are running 6.3.0, so we did not expl

Re: Get handler not working

2017-03-16 Thread David Hastings
I still would like to see an experiment where you change the field to id
instead of iqdocid.

On Thu, Mar 16, 2017 at 9:33 AM, Yonik Seeley  wrote:

> Something to do with routing perhaps? (the mapping of ids to shards,
> by default is based on hashes of the id)
> -Yonik
>
>
> On Thu, Mar 16, 2017 at 9:16 AM, Chris Ulicny  wrote:
> > iqdocid is already set to be the uniqueKey value.
> >
> > I tried reindexing a few documents back into the problematic cloud and am
> > getting the same behavior of no document found for get handler.
> >
> > I've also done some testing on standalone instances as well as some quick
> > cloud setups (with embedded zk), and I cannot seem to replicate the
> > problem. For each test, I used the exact same configset that is causing
> the
> > issue for us and indexed a document from that instance as well. I can
> > provide more details if that would be useful in anyway.
> >
> > Standalone instance worked
> > Cloud mode worked regardless of the use of the security plugin
> > Cloud mode worked regardless of explicit get handler definition
> > Cloud mode consistently worked with explicitly defining the get handler,
> > then removing it and reloading the collection
> >
> > The only differences that I know of between the tests and the problematic
> > cloud is that solr is running as a different user and using an external
> > zookeeper ensemble. The running user has ownership of the solr
> > installation, log, and data directories.
> >
> > I'm going to keep trying different setups to see if I can replicate the
> > issue, but if anyone has any ideas on what direction might make the most
> > sense, please let me know.
> >
> > Thanks again
> >
> > On Wed, Mar 15, 2017 at 5:49 PM Erick Erickson 
> > wrote:
> >
> > Wait... Is iqdocid set to the uniqueKey in your schema? That might
> > be the missing thing.
> >
> >
> >
> > On Wed, Mar 15, 2017 at 11:20 AM, Chris Ulicny  wrote:
> >> Unless the behavior's changed on the way to version 6.3.0, the get
> handler
> >> used to use whatever field is set to be the uniqueKey. We have
> > successfully
> >> been using get on a 4.9.0 standalone core with no explicit "id" field
> >> defined by passing in the value for the uniqueKey field to the get
> > handler.
> >> We tend to have a bunch of id fields floating around from different
> >> sources, so we avoid keeping any of them named as "id"
> >>
> >> iqdocid is just a basic string type
> >>  >> required="true" stored="true"/>
> >>
> >> I'll do some more testing on standalone versions, and see how that goes.
> >>
> >> On Wed, Mar 15, 2017 at 1:52 PM David Hastings <
> > hastings.recurs...@gmail.com>
> >> wrote:
> >>
> >>> from your previous email:
> >>> "There is no "id"
> >>> field defined in the schema."
> >>>
> >>> you need an id field to use the get handler
> >>>
> >>> On Wed, Mar 15, 2017 at 1:45 PM, Chris Ulicny 
> wrote:
> >>>
> >>> > I thought that "id" and "ids" were fixed parameters for the get
> > handler,
> >>> > but I never remember, so I've already tried both. Each time it comes
> > back
> >>> > with the same response of no document.
> >>> >
> >>> > On Wed, Mar 15, 2017 at 1:31 PM Alexandre Rafalovitch <
> >>> arafa...@gmail.com>
> >>> > wrote:
> >>> >
> >>> > > Actually.
> >>> > >
> >>> > > I think Real Time Get handler has "id" as a magical parameter, not
> as
> >>> > > a field name. It maps to the real id field via the uniqueKey
> >>> > > definition:
> >>> > > https://cwiki.apache.org/confluence/display/solr/RealTime+Get
> >>> > >
> >>> > > So, if you have not, could you try the way you originally wrote it.
> >>> > >
> >>> > > Regards,
> >>> > >Alex.
> >>> > > 
> >>> > > http://www.solr-start.com/ - Resources for Solr users, new and
> >>> > experienced
> >>> > >
> >>> > >
> >>> > > On 15 March 2017 at 13:22, Chris Ulicny  wrote:
> >>> > > > Sorry, that is a typo. The get is using the iqdocid field. There
> is
> >>> no
> >>> > > "id"
> >>> > > > field defined in the schema.
> >>> > > >
> >>> > > > solr/TestCollection/get?iqdocid=2957-TV-201604141900
> >>> > > >
> >>> > > > solr/TestCollection/select?q=*:*&fq=iqdocid:2957-TV-201604141900
> >>> > > >
> >>> > > > On Wed, Mar 15, 2017 at 1:15 PM Erick Erickson <
> >>> > erickerick...@gmail.com>
> >>> > > > wrote:
> >>> > > >
> >>> > > >> Is this a typo or are you trying to use get with an "id" field
> and
> >>> > > >> your filter query uses "iqdocid"?
> >>> > > >>
> >>> > > >> Best,
> >>> > > >> Erick
> >>> > > >>
> >>> > > >> On Wed, Mar 15, 2017 at 8:31 AM, Chris Ulicny  >
> >>> > wrote:
> >>> > > >> > Yes, we're using a fixed schema with the iqdocid field set as
> > the
> >>> > > >> uniqueKey.
> >>> > > >> >
> >>> > > >> > On Wed, Mar 15, 2017 at 11:28 AM Alexandre Rafalovitch <
> >>> > > >> arafa...@gmail.com>
> >>> > > >> > wrote:
> >>> > > >> >
> >>> > > >> >> What is your uniqueKey? Is it iqdocid?
> >>> > > >> >>
> >>> > > >> >> Regards,
> >>> > > >> >>Alex.
> >>> > > >> >> 
> >>> > > >> >> http://www.solr-start.com/ - Resources 

Re: Partial Match with DF

2017-03-16 Thread Mark Johnson
Oh, great! Thank you!

So if I switch over to eDisMax I'd specify the fields to query via the "qf"
parameter, right? That seems to have the same result (only matches when I
specify the exact phrase in the field, not just certain words from it).

On Thu, Mar 16, 2017 at 9:33 AM, Alexandre Rafalovitch 
wrote:

> df is default field - you can only give one. To search over multiple
> fields, you switch to eDisMax query parser and fl parameter.
>
> Then, the question will be what type definition your fields have. When you
> search text field, you are using its definition because of copyField. Your
> original fields may be strings.
>
> Remember to reload core and reminded when you change definitions.
>
> Regards,
>Alex
>
>
> On 16 Mar 2017 9:15 AM, "Mark Johnson" 
> wrote:
>
> > Forgive me if I'm missing something obvious -- I'm new to Solr, but I
> can't
> > seem to find an explanation for the behavior I'm seeing.
> >
> > If I have a document that looks like this:
> > {
> > field1: "aaa bbb",
> > field2: "ccc ddd",
> > field3: "eee fff"
> > }
> >
> > And I do a search where "q" is "aaa ccc", I get the document in the
> > results. This is because (please correct me if I'm wrong) the default
> "df"
> > is set to the "_text_" field, which contains the text values from all
> > fields.
> >
> > However, if I do a search where "df" is "field1" and "field2" and "q" is
> > "aaa ccc" (words from field1 and field2) I get no results.
> >
> > In a simpler example, if I do a search where "df" is "field1" and "q" is
> > "aaa" (a word from field1) I still get no results.
> >
> > If I do a search where "df" is "field1" and "q" is "aaa bbb" (the full
> > value of field1) then I get the document in the results.
> >
> > So I'm concluding that when using "df" to specify which fields to search
> > then only an exact match on the full field value will return a document.
> >
> > Is that a correct conclusion? Is there another way to specify which
> fields
> > to search without requiring an exact match? The results I'd like to
> achieve
> > are:
> >
> > Would Match:
> > q=aaa
> > q=aaa bbb
> > q=aaa ccc
> > q=aaa fff
> >
> > Would Not Match:
> > q=eee
> > q=fff
> > q=eee fff
> >
> > --
> > *This message is intended only for the use of the individual or entity to
> > which it is addressed and may contain information that is privileged,
> > confidential and exempt from disclosure under applicable law. If you have
> > received this message in error, you are hereby notified that any use,
> > dissemination, distribution or copying of this message is prohibited. If
> > you have received this communication in error, please notify the sender
> > immediately and destroy the transmitted information.*
> >
>



-- 

Best Regards,

*Mark Johnson* | .NET Software Engineer

Office: 603-392-7017

Emerson Ecologics, LLC | 1230 Elm Street | Suite 301 | Manchester NH | 03101

  

*Supporting The Practice Of Healthy Living*









-- 
*This message is intended only for the use of the individual or entity to 
which it is addressed and may contain information that is privileged, 
confidential and exempt from disclosure under applicable law. If you have 
received this message in error, you are hereby notified that any use, 
dissemination, distribution or copying of this message is prohibited. If 
you have received this communication in error, please notify the sender 
immediately and destroy the transmitted information.*


Fwd: block join - search together at parent and childern

2017-03-16 Thread Jan Nekuda
Hi,
I have a question for which I wasn't able to find a good solution.
I have this structure of documents

A
|\
| \
B \
 \
  C
   \
\
 \
  D

Document type A has fields id_number, date_from, date_to
Document type C  has fields first_name, surname, birthdate
Document types D and B have fields street_name, house_number, city.


I want to find *all parents with block join and edismax*.
The problem is that I have found it is only possible to find children by
parent, or parents by children.
*I want to find parents by values in the parent and in the children*. I want
to use edismax with all fields from all documents (id_number, date_from,
date_to, first_name, surname, birthdate, street_name, house_number, city).
I want to write *Hynek* AND *Brojova* AND 14 and I expect that it returns
document A because it found Hynek in surname, Brojova in street and 14 in
house number.
This is easy with {!parent which=type:A}
The problem is that I'm not able to search by the condition 789 AND *Brojova*,
where 789 is id_number from type A and Brojova is street_name from D.

In short, I need to find all parents of the tree (parent and children) in
which all the words I send in the condition are matched.


My only solution is to make a root type X. Then A will be its child. Then I
can use {!parent which=type:X}.
Then this will work:

http://localhost:8983/solr/demo/select?q=*:*&fq={!parent
which=type:X}brojova*&fq={!parent which=type:X}16&wt=json&
indent=true&defType=edismax&qf=id_number date_from date_to
first_name surname birthdate street_name house_number city&stopwords=true&
lowercaseOperators=true


But I believe it can be solved much better.

X
|
A
|\
| \
B \
 \
  C
   \
\
 \
  D
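For completeness, a per-word sketch that ORs a parent-level clause with a
block-join clause over the children and then ANDs the groups together
(untested; words that hit numeric or date fields would need extra handling):

q=+(id_number:789 OR _query_:"{!parent which=type:A}street_name:789 OR surname:789")
  +(id_number:Brojova OR _query_:"{!parent which=type:A}street_name:Brojova OR surname:Brojova")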


Thanks for your help
Jan


Re: Solr shingles is not working in solr 6.4.0

2017-03-16 Thread alessandro.benedetti
Hi Aman, are you using stopwords in your analysis by any chance?
Can you show us your request handler config?
With edismax you can configure stopwords to take effect at the query parsing
stage.
Let's try to figure it out first.

Cheers



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-shingles-is-not-working-in-solr-6-4-0-tp4325342p4325351.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Partial Match with DF

2017-03-16 Thread Erick Erickson
My guess: Your analysis chain for the fields is different, i.e. they
have a different fieldType. In particular, watch out for the "string"
type, people are often confused about it. It does _not_ break input
into tokens, you need a text-based field type, text_en is one example
that is usually in the configs by default.

Two tools that'll help you enormously:

admin UI>>select core (or collection) from the drop-down>>analysis
That shows you exactly how Solr/Lucene break up text at query and index time

add &debug=query to the URL. That'll show you how the query was parsed.
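As an illustration, the difference usually comes down to the field definition
(field name taken from your example; the type names are the stock ones that
ship with the default configs):

<!-- "string" indexes the whole value as a single token: only an exact
     match on "aaa bbb" will hit -->
<field name="field1" type="string" indexed="true" stored="true"/>

<!-- a tokenized text type lets q=aaa or q=aaa ccc match -->
<field name="field1" type="text_general" indexed="true" stored="true"/>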

Best,
Erick

On Thu, Mar 16, 2017 at 6:52 AM, Mark Johnson
 wrote:
> Oh, great! Thank you!
>
> So if I switch over to eDisMax I'd specify the fields to query via the "qf"
> parameter, right? That seems to have the same result (only matches when I
> specify the exact phrase in the field, not just certain words from it).
>
> On Thu, Mar 16, 2017 at 9:33 AM, Alexandre Rafalovitch 
> wrote:
>
>> df is default field - you can only give one. To search over multiple
>> fields, you switch to eDisMax query parser and fl parameter.
>>
>> Then, the question will be what type definition your fields have. When you
>> search text field, you are using its definition because of copyField. Your
>> original fields may be strings.
>>
>> Remember to reload core and reminded when you change definitions.
>>
>> Regards,
>>Alex
>>
>>
>> On 16 Mar 2017 9:15 AM, "Mark Johnson" 
>> wrote:
>>
>> > Forgive me if I'm missing something obvious -- I'm new to Solr, but I
>> can't
>> > seem to find an explanation for the behavior I'm seeing.
>> >
>> > If I have a document that looks like this:
>> > {
>> > field1: "aaa bbb",
>> > field2: "ccc ddd",
>> > field3: "eee fff"
>> > }
>> >
>> > And I do a search where "q" is "aaa ccc", I get the document in the
>> > results. This is because (please correct me if I'm wrong) the default
>> "df"
>> > is set to the "_text_" field, which contains the text values from all
>> > fields.
>> >
>> > However, if I do a search where "df" is "field1" and "field2" and "q" is
>> > "aaa ccc" (words from field1 and field2) I get no results.
>> >
>> > In a simpler example, if I do a search where "df" is "field1" and "q" is
>> > "aaa" (a word from field1) I still get no results.
>> >
>> > If I do a search where "df" is "field1" and "q" is "aaa bbb" (the full
>> > value of field1) then I get the document in the results.
>> >
>> > So I'm concluding that when using "df" to specify which fields to search
>> > then only an exact match on the full field value will return a document.
>> >
>> > Is that a correct conclusion? Is there another way to specify which
>> fields
>> > to search without requiring an exact match? The results I'd like to
>> achieve
>> > are:
>> >
>> > Would Match:
>> > q=aaa
>> > q=aaa bbb
>> > q=aaa ccc
>> > q=aaa fff
>> >
>> > Would Not Match:
>> > q=eee
>> > q=fff
>> > q=eee fff
>> >
>> > --
>> > *This message is intended only for the use of the individual or entity to
>> > which it is addressed and may contain information that is privileged,
>> > confidential and exempt from disclosure under applicable law. If you have
>> > received this message in error, you are hereby notified that any use,
>> > dissemination, distribution or copying of this message is prohibited. If
>> > you have received this communication in error, please notify the sender
>> > immediately and destroy the transmitted information.*
>> >
>>
>
>
>
> --
>
> Best Regards,
>
> *Mark Johnson* | .NET Software Engineer
>
> Office: 603-392-7017
>
> Emerson Ecologics, LLC | 1230 Elm Street | Suite 301 | Manchester NH | 03101
>
>   
>
> *Supporting The Practice Of Healthy Living*
>
> 
> 
> 
> 
> 
> 
> 
>
> --
> *This message is intended only for the use of the individual or entity to
> which it is addressed and may contain information that is privileged,
> confidential and exempt from disclosure under applicable law. If you have
> received this message in error, you are hereby notified that any use,
> dissemination, distribution or copying of this message is prohibited. If
> you have received this communication in error, please notify the sender
> immediately and destroy the transmitted information.*


Re: DIH delta import with cache 5.3.1 issue

2017-03-16 Thread Alexandre Rafalovitch
You have nested entities and accumulate the content of the inner
entities in the outer one with caching on an inner one. Your
description sounds like the inner cache is not reset on the next
iteration of the outer loop.

This may be connected to
https://issues.apache.org/jira/browse/SOLR-7843 (Fixed in 5.4)

Or it may be a different bug. I would make a simplest test case (based
on DIH-db example) and then try it on 5.3.1 and 5.4. And then 6.4 if
the problem is still there. If it is still there in 6.4, then we may
have a new bug.
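For the test case, something shaped like the DIH db example should be enough
-- a sketch with made-up table and field names:

<document>
  <entity name="parent"
          query="SELECT id, name FROM parent"
          deltaQuery="SELECT id FROM parent WHERE updated > '${dataimporter.last_index_time}'"
          deltaImportQuery="SELECT id, name FROM parent WHERE id='${dih.delta.id}'">
    <field column="name" name="name"/>
    <entity name="child"
            query="SELECT parent_id, tag FROM child"
            cacheImpl="SortedMapBackedCache"
            cacheKey="parent_id"
            cacheLookup="parent.id">
      <!-- tag maps to a multiValued field; on the delta run, check whether
           values from other parents leak into this document -->
      <field column="tag" name="tags"/>
    </entity>
  </entity>
</document>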

Regards,
   Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 16 March 2017 at 09:17, Sujay Bawaskar  wrote:
> This behaviour is for delta import only. One document get field values of
> all documents. These fields are child entities which maps column to multi
> valued fields.
>
>  query="IMPORT_QUERY"
> deltaQuery="DELTA_QUERY"
> pk="buildingUserId"
> deletedPkQuery="DELETE_QUERY"
> onError="continue">
>
>   query="SELECT_QUERY"
> transformer="RegexTransformer" cacheImpl="SortedMapBackedCache"
> cacheKey="bldId" cacheLookup="user_building.plainBuildingId"
> onError="continue">
> 
>  splitBy="," />
>  dateTimeFormat="-MM-dd" />
> 
> 
>
> On Thu, Mar 16, 2017 at 6:35 PM, Alexandre Rafalovitch 
> wrote:
>
>> Could you give a bit more details. Do you mean one document gets the
>> content of multiple documents? And only on delta?
>>
>> Regards,
>> Alex
>>
>> On 16 Mar 2017 8:53 AM, "Sujay Bawaskar" 
>> wrote:
>>
>> Hi,
>>
>> We are using DIH with cache(SortedMapBackedCache) with solr 5.3.1. We have
>> around 2.8 million documents in solr and total index size is 4 GB. DIH
>> delta import is dumping all values of mapped columns to their respective
>> multi valued fields. This is causing size of one solr document upto 2 GB.
>> Is this a known issue with solr 5.3.1?
>>
>> Thanks,
>> Sujay
>>


Re: Solr shingles is not working in solr 6.4.0

2017-03-16 Thread Alexandre Rafalovitch
Sanity check. Is your 'df' pointing at the field you think it is
pointing at? It really does look like all tokens were eaten and
nothing was left. But you should have seen that in the Analysis screen
too, if you have the right field.

Try adding echoParams=all to your request to see the full final
parameter list. Maybe some parameters in initParams sections override
your assumed config.

Regards,
   Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 16 March 2017 at 08:30, Aman Deep Singh  wrote:
> Hi,
>
> Recently I migrated from solr 4 to 6
> IN solr 4 shinglefilterfactory is working correctly
> my configration  i
>
>  positionIncrementGap="100">
> 
>  
>   maxShingleSize="5"
>  outputUnigrams="false" outputUnigramsIfNoShingles="false" />
>   
> 
> 
>   
>   maxShingleSize="5"
>  outputUnigrams="false" outputUnigramsIfNoShingles="false" />
>   
>   
> 
>   
>
>
>
> But after updating to solr 6 shingles is not working ,schema is as below,
>
>  positionIncrementGap="100">
> 
>  
>   maxShingleSize="5"
>  outputUnigrams="false" outputUnigramsIfNoShingles="false" />
>   
> 
> 
>   
>   maxShingleSize="5"
>  outputUnigrams="false" outputUnigramsIfNoShingles="false" />
>   
> 
>   
>
> Although in the Analysis tab is was showing proper shingle result but when
> using in the queryParser it was not giving proper results
>
> my sample hit is
>
> http://localhost:8983/solr/shingel_test/select?q=one%20plus%20one&wt=xml&debugQuery=true&defType=edismax&qf=cust_shingle
>
> it create the parsed query as
>
> one plus one
> one plus one
> (+())/no_coord
> +()
> 
> ExtendedDismaxQParser


Re: DIH delta import with cache 5.3.1 issue

2017-03-16 Thread Sujay Bawaskar
Thanks Alex. I will test it with 5.4 and 6.4 and let you know.

On Thu, Mar 16, 2017 at 7:40 PM, Alexandre Rafalovitch 
wrote:

> You have nested entities and accumulate the content of the inner
> entities in the outer one with caching on an inner one. Your
> description sounds like the inner cache is not reset on the next
> iteration of the outer loop.
>
> This may be connected to
> https://issues.apache.org/jira/browse/SOLR-7843 (Fixed in 5.4)
>
> Or it may be a different bug. I would make a simplest test case (based
> on DIH-db example) and then try it on 5.3.1 and 5.4. And then 6.4 if
> the problem is still there. If it is still there in 6.4, then we may
> have a new bug.
>
> Regards,
>Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 16 March 2017 at 09:17, Sujay Bawaskar 
> wrote:
> > This behaviour is for delta import only. One document get field values of
> > all documents. These fields are child entities which maps column to multi
> > valued fields.
> >
> >  > query="IMPORT_QUERY"
> > deltaQuery="DELTA_QUERY"
> > pk="buildingUserId"
> > deletedPkQuery="DELETE_QUERY"
> > onError="continue">
> >
> >   > query="SELECT_QUERY"
> > transformer="RegexTransformer" cacheImpl="SortedMapBackedCache"
> > cacheKey="bldId" cacheLookup="user_building.plainBuildingId"
> > onError="continue">
> > 
> >  > splitBy="," />
> >  > dateTimeFormat="-MM-dd" />
> > 
> > 
> >
> > On Thu, Mar 16, 2017 at 6:35 PM, Alexandre Rafalovitch <
> arafa...@gmail.com>
> > wrote:
> >
> >> Could you give a bit more details. Do you mean one document gets the
> >> content of multiple documents? And only on delta?
> >>
> >> Regards,
> >> Alex
> >>
> >> On 16 Mar 2017 8:53 AM, "Sujay Bawaskar" 
> >> wrote:
> >>
> >> Hi,
> >>
> >> We are using DIH with cache(SortedMapBackedCache) with solr 5.3.1. We
> have
> >> around 2.8 million documents in solr and total index size is 4 GB. DIH
> >> delta import is dumping all values of mapped columns to their respective
> >> multi valued fields. This is causing size of one solr document upto 2
> GB.
> >> Is this a known issue with solr 5.3.1?
> >>
> >> Thanks,
> >> Sujay
> >>
>


Re: Get handler not working

2017-03-16 Thread Alexandre Rafalovitch
Does real time get implementation reroutes the request internally to a
different shard? If not, then maybe the request is going to a
non-primary shard.

Regards,
   Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 16 March 2017 at 09:33, Yonik Seeley  wrote:
> Something to do with routing perhaps? (the mapping of ids to shards,
> by default is based on hashes of the id)
> -Yonik
>
>
> On Thu, Mar 16, 2017 at 9:16 AM, Chris Ulicny  wrote:
>> iqdocid is already set to be the uniqueKey value.
>>
>> I tried reindexing a few documents back into the problematic cloud and am
>> getting the same behavior of no document found for get handler.
>>
>> I've also done some testing on standalone instances as well as some quick
>> cloud setups (with embedded zk), and I cannot seem to replicate the
>> problem. For each test, I used the exact same configset that is causing the
>> issue for us and indexed a document from that instance as well. I can
>> provide more details if that would be useful in anyway.
>>
>> Standalone instance worked
>> Cloud mode worked regardless of the use of the security plugin
>> Cloud mode worked regardless of explicit get handler definition
>> Cloud mode consistently worked with explicitly defining the get handler,
>> then removing it and reloading the collection
>>
>> The only differences that I know of between the tests and the problematic
>> cloud is that solr is running as a different user and using an external
>> zookeeper ensemble. The running user has ownership of the solr
>> installation, log, and data directories.
>>
>> I'm going to keep trying different setups to see if I can replicate the
>> issue, but if anyone has any ideas on what direction might make the most
>> sense, please let me know.
>>
>> Thanks again
>>
>> On Wed, Mar 15, 2017 at 5:49 PM Erick Erickson 
>> wrote:
>>
>> Wait... Is iqdocid set to the  in your schema? That might
>> be the missing thing.
>>
>>
>>
>> On Wed, Mar 15, 2017 at 11:20 AM, Chris Ulicny  wrote:
>>> Unless the behavior's changed on the way to version 6.3.0, the get handler
>>> used to use whatever field is set to be the uniqueKey. We have
>> successfully
>>> been using get on a 4.9.0 standalone core with no explicit "id" field
>>> defined by passing in the value for the uniqueKey field to the get
>> handler.
>>> We tend to have a bunch of id fields floating around from different
>>> sources, so we avoid keeping any of them named as "id"
>>>
>>> iqdocid is just a basic string type
>>> >> required="true" stored="true"/>
>>>
>>> I'll do some more testing on standalone versions, and see how that goes.
>>>
>>> On Wed, Mar 15, 2017 at 1:52 PM David Hastings <
>> hastings.recurs...@gmail.com>
>>> wrote:
>>>
 from your previous email:
 "There is no "id"
 field defined in the schema."

 you need an id field to use the get handler

 On Wed, Mar 15, 2017 at 1:45 PM, Chris Ulicny  wrote:

 > I thought that "id" and "ids" were fixed parameters for the get
>> handler,
 > but I never remember, so I've already tried both. Each time it comes
>> back
 > with the same response of no document.
 >
 > On Wed, Mar 15, 2017 at 1:31 PM Alexandre Rafalovitch <
 arafa...@gmail.com>
 > wrote:
 >
 > > Actually.
 > >
 > > I think Real Time Get handler has "id" as a magical parameter, not as
 > > a field name. It maps to the real id field via the uniqueKey
 > > definition:
 > > https://cwiki.apache.org/confluence/display/solr/RealTime+Get
 > >
 > > So, if you have not, could you try the way you originally wrote it.
 > >
 > > Regards,
 > >Alex.
 > > 
 > > http://www.solr-start.com/ - Resources for Solr users, new and
 > experienced
 > >
 > >
 > > On 15 March 2017 at 13:22, Chris Ulicny  wrote:
 > > > Sorry, that is a typo. The get is using the iqdocid field. There is
 no
 > > "id"
 > > > field defined in the schema.
 > > >
 > > > solr/TestCollection/get?iqdocid=2957-TV-201604141900
 > > >
 > > > solr/TestCollection/select?q=*:*&fq=iqdocid:2957-TV-201604141900
 > > >
 > > > On Wed, Mar 15, 2017 at 1:15 PM Erick Erickson <
 > erickerick...@gmail.com>
 > > > wrote:
 > > >
 > > >> Is this a typo or are you trying to use get with an "id" field and
 > > >> your filter query uses "iqdocid"?
 > > >>
 > > >> Best,
 > > >> Erick
 > > >>
 > > >> On Wed, Mar 15, 2017 at 8:31 AM, Chris Ulicny 
 > wrote:
 > > >> > Yes, we're using a fixed schema with the iqdocid field set as
>> the
 > > >> uniqueKey.
 > > >> >
 > > >> > On Wed, Mar 15, 2017 at 11:28 AM Alexandre Rafalovitch <
 > > >> arafa...@gmail.com>
 > > >> > wrote:
 > > >> >
 > > >> >> What is your uniqueKey? Is it iqdocid?
 > > >> >>
 > > >> >> Regards,
 > > >> >>Alex.
 > > >> >> 
 > > >> >> http://www.solr-start.com/ 

Re: Get handler not working

2017-03-16 Thread Chris Ulicny
Speaking of routing, I realized I completely forgot to add the routing
setup to the test cloud, so it probably has something to do with the issue.
I'll add that in and report back.

So the routing and uniqueKey setup is as follows:

Schema setup:
<uniqueKey>iqdocid</uniqueKey>
<field name="iqdocid" type="string" multiValued="false" indexed="true" required="true" stored="true"/>

I don't think it's mentioned in the documentation about using routerField
for the compositeId router, but based on the resolution of SOLR-5017
, we decided to use the
compositeId router with routerField set to 'iqroutingkey' which is using
the "!" notation. In general, the iqroutingkey field is of the form:
!!

Unless I misunderstood what was changed with that patch, that form should
still route appropriately, and it seems that it has distributed the
documents appropriately from our basic testing.

On Thu, Mar 16, 2017 at 9:42 AM David Hastings 
wrote:

i still would like to see an experiment where you change the field to id
instead of iqdocid,

On Thu, Mar 16, 2017 at 9:33 AM, Yonik Seeley  wrote:

> Something to do with routing perhaps? (the mapping of ids to shards,
> by default is based on hashes of the id)
> -Yonik
>
>
> On Thu, Mar 16, 2017 at 9:16 AM, Chris Ulicny  wrote:
> > iqdocid is already set to be the uniqueKey value.
> >
> > I tried reindexing a few documents back into the problematic cloud and
am
> > getting the same behavior of no document found for get handler.
> >
> > I've also done some testing on standalone instances as well as some
quick
> > cloud setups (with embedded zk), and I cannot seem to replicate the
> > problem. For each test, I used the exact same configset that is causing
> the
> > issue for us and indexed a document from that instance as well. I can
> > provide more details if that would be useful in anyway.
> >
> > Standalone instance worked
> > Cloud mode worked regardless of the use of the security plugin
> > Cloud mode worked regardless of explicit get handler definition
> > Cloud mode consistently worked with explicitly defining the get handler,
> > then removing it and reloading the collection
> >
> > The only differences that I know of between the tests and the
problematic
> > cloud is that solr is running as a different user and using an external
> > zookeeper ensemble. The running user has ownership of the solr
> > installation, log, and data directories.
> >
> > I'm going to keep trying different setups to see if I can replicate the
> > issue, but if anyone has any ideas on what direction might make the most
> > sense, please let me know.
> >
> > Thanks again
> >
> > On Wed, Mar 15, 2017 at 5:49 PM Erick Erickson 
> > wrote:
> >
> > Wait... Is iqdocid set to the  in your schema? That might
> > be the missing thing.
> >
> >
> >
> > On Wed, Mar 15, 2017 at 11:20 AM, Chris Ulicny  wrote:
> >> Unless the behavior's changed on the way to version 6.3.0, the get
> handler
> >> used to use whatever field is set to be the uniqueKey. We have
> > successfully
> >> been using get on a 4.9.0 standalone core with no explicit "id" field
> >> defined by passing in the value for the uniqueKey field to the get
> > handler.
> >> We tend to have a bunch of id fields floating around from different
> >> sources, so we avoid keeping any of them named as "id"
> >>
> >> iqdocid is just a basic string type
> >>  >> required="true" stored="true"/>
> >>
> >> I'll do some more testing on standalone versions, and see how that
goes.
> >>
> >> On Wed, Mar 15, 2017 at 1:52 PM David Hastings <
> > hastings.recurs...@gmail.com>
> >> wrote:
> >>
> >>> from your previous email:
> >>> "There is no "id"
> >>> field defined in the schema."
> >>>
> >>> you need an id field to use the get handler
> >>>
> >>> On Wed, Mar 15, 2017 at 1:45 PM, Chris Ulicny 
> wrote:
> >>>
> >>> > I thought that "id" and "ids" were fixed parameters for the get
> > handler,
> >>> > but I never remember, so I've already tried both. Each time it comes
> > back
> >>> > with the same response of no document.
> >>> >
> >>> > On Wed, Mar 15, 2017 at 1:31 PM Alexandre Rafalovitch <
> >>> arafa...@gmail.com>
> >>> > wrote:
> >>> >
> >>> > > Actually.
> >>> > >
> >>> > > I think Real Time Get handler has "id" as a magical parameter, not
> as
> >>> > > a field name. It maps to the real id field via the uniqueKey
> >>> > > definition:
> >>> > > https://cwiki.apache.org/confluence/display/solr/RealTime+Get
> >>> > >
> >>> > > So, if you have not, could you try the way you originally wrote
it.
> >>> > >
> >>> > > Regards,
> >>> > >Alex.
> >>> > > 
> >>> > > http://www.solr-start.com/ - Resources for Solr users, new and
> >>> > experienced
> >>> > >
> >>> > >
> >>> > > On 15 March 2017 at 13:22, Chris Ulicny  wrote:
> >>> > > > Sorry, that is a typo. The get is using the iqdocid field. There
> is
> >>> no
> >>> > > "id"
> >>> > > > field defined in the schema.
> >>> > > >
> >>> > > > solr/TestCollection/get?iqdocid=2957-TV-201604141900
> >>> > > >
> >>> > > > solr/TestCollection/select?q=*:*&fq=iqdocid:2957-TV-201604141900
> >

Re: Partial Match with DF

2017-03-16 Thread Mark Johnson
You're right! The fields I'm searching are all "string" type. I switched to
"text_en" and now it's working exactly as I need it to! I'll do some
research to see if "text_en" or another "text" type field is best for our
needs.

Also, those debug options are amazing! They'll help tremendously in the
future.

Thank you much!

On Thu, Mar 16, 2017 at 10:02 AM, Erick Erickson 
wrote:

> My guess: Your analysis chain for the fields is different, i.e. they
> have a different fieldType. In particular, watch out for the "string"
> type, people are often confused about it. It does _not_ break input
> into tokens, you need a text-based field type, text_en is one example
> that is usually in the configs by default.
>
> Two tools that'll help you enormously:
>
> admin UI>>select core (or collection) from the drop-down>>analysis
> That shows you exactly how Solr/Lucene break up text at query and index
> time
>
> add &debug=query to the URL. That'll show you how the query was parsed.
>
> Best,
> Erick
>
> On Thu, Mar 16, 2017 at 6:52 AM, Mark Johnson
>  wrote:
> > Oh, great! Thank you!
> >
> > So if I switch over to eDisMax I'd specify the fields to query via the
> "qf"
> > parameter, right? That seems to have the same result (only matches when I
> > specify the exact phrase in the field, not just certain words from it).
> >
> > On Thu, Mar 16, 2017 at 9:33 AM, Alexandre Rafalovitch <
> arafa...@gmail.com>
> > wrote:
> >
> >> df is default field - you can only give one. To search over multiple
> >> fields, you switch to eDisMax query parser and fl parameter.
> >>
> >> Then, the question will be what type definition your fields have. When
> you
> >> search text field, you are using its definition because of copyField.
> Your
> >> original fields may be strings.
> >>
> >> Remember to reload core and reminded when you change definitions.
> >>
> >> Regards,
> >>Alex
> >>
> >>
> >> On 16 Mar 2017 9:15 AM, "Mark Johnson" 
> >> wrote:
> >>
> >> > Forgive me if I'm missing something obvious -- I'm new to Solr, but I
> >> can't
> >> > seem to find an explanation for the behavior I'm seeing.
> >> >
> >> > If I have a document that looks like this:
> >> > {
> >> > field1: "aaa bbb",
> >> > field2: "ccc ddd",
> >> > field3: "eee fff"
> >> > }
> >> >
> >> > And I do a search where "q" is "aaa ccc", I get the document in the
> >> > results. This is because (please correct me if I'm wrong) the default
> >> "df"
> >> > is set to the "_text_" field, which contains the text values from all
> >> > fields.
> >> >
> >> > However, if I do a search where "df" is "field1" and "field2" and "q"
> is
> >> > "aaa ccc" (words from field1 and field2) I get no results.
> >> >
> >> > In a simpler example, if I do a search where "df" is "field1" and "q"
> is
> >> > "aaa" (a word from field1) I still get no results.
> >> >
> >> > If I do a search where "df" is "field1" and "q" is "aaa bbb" (the full
> >> > value of field1) then I get the document in the results.
> >> >
> >> > So I'm concluding that when using "df" to specify which fields to
> search
> >> > then only an exact match on the full field value will return a
> document.
> >> >
> >> > Is that a correct conclusion? Is there another way to specify which
> >> fields
> >> > to search without requiring an exact match? The results I'd like to
> >> achieve
> >> > are:
> >> >
> >> > Would Match:
> >> > q=aaa
> >> > q=aaa bbb
> >> > q=aaa ccc
> >> > q=aaa fff
> >> >
> >> > Would Not Match:
> >> > q=eee
> >> > q=fff
> >> > q=eee fff
> >> >
> >> > --
> >> > *This message is intended only for the use of the individual or
> entity to
> >> > which it is addressed and may contain information that is privileged,
> >> > confidential and exempt from disclosure under applicable law. If you
> have
> >> > received this message in error, you are hereby notified that any use,
> >> > dissemination, distribution or copying of this message is prohibited.
> If
> >> > you have received this communication in error, please notify the
> sender
> >> > immediately and destroy the transmitted information.*
> >> >
> >>
> >
> >
> >
> > --
> >
> > Best Regards,
> >
> > *Mark Johnson* | .NET Software Engineer
> >
> > Office: 603-392-7017
> >
> > Emerson Ecologics, LLC | 1230 Elm Street | Suite 301 | Manchester NH |
> 03101
> >
> >   
> >
> > *Supporting The Practice Of Healthy Living*
> >
> > 
> > 
> > 
> > 
> > 
> > 
> >  Ecologics-EI_IE388367.11,28.htm>
> >
> > --
> > *This message is intended only for the use of the individual or entity to
> > which it is addressed and may contain information that is privileged,
> > confidential and exempt from di

Re: Get handler not working

2017-03-16 Thread Yonik Seeley
Ah, yeah, if you're using a different route field it's highly likely
that's the issue.
I was always against that "feature", and this thread demonstrates part
of the problem (complicating clients, including us human clients
trying to make sense of what's going on).

-Yonik


On Thu, Mar 16, 2017 at 10:31 AM, Chris Ulicny  wrote:
> Speaking of routing, I realized I completely forgot to add the routing
> setup to the test cloud, so it probably has something to do with the issue.
> I'll add that in and report back.
>
> So the routing and uniqueKey setup is as follows:
>
> Schema setup:
> iqdocid  multiValued="false" indexed="true" required="true" stored="true"/>  name="iqdocid" type="string" multiValued="false" indexed="true" required=
> "true" stored="true"/>
>
> I don't think it's mentioned in the documentation about using routerField
> for the compositeId router, but based on the resolution of SOLR-5017
> , we decided to use the
> compositeId router with routerField set to 'iqroutingkey' which is using
> the "!" notation. In general, the iqroutingkey field is of the form:
> !!
>
> Unless I misunderstood what was changed with that patch, that form should
> still route appropriately, and it seems that it has distributed the
> documents appropriately from our basic testing.
>
> On Thu, Mar 16, 2017 at 9:42 AM David Hastings 
> wrote:
>
> i still would like to see an experiment where you change the field to id
> instead of iqdocid,
>
> On Thu, Mar 16, 2017 at 9:33 AM, Yonik Seeley  wrote:
>
>> Something to do with routing perhaps? (the mapping of ids to shards,
>> by default is based on hashes of the id)
>> -Yonik
>>
>>
>> On Thu, Mar 16, 2017 at 9:16 AM, Chris Ulicny  wrote:
>> > iqdocid is already set to be the uniqueKey value.
>> >
>> > I tried reindexing a few documents back into the problematic cloud and
> am
>> > getting the same behavior of no document found for get handler.
>> >
>> > I've also done some testing on standalone instances as well as some
> quick
>> > cloud setups (with embedded zk), and I cannot seem to replicate the
>> > problem. For each test, I used the exact same configset that is causing
>> the
>> > issue for us and indexed a document from that instance as well. I can
>> > provide more details if that would be useful in anyway.
>> >
>> > Standalone instance worked
>> > Cloud mode worked regardless of the use of the security plugin
>> > Cloud mode worked regardless of explicit get handler definition
>> > Cloud mode consistently worked with explicitly defining the get handler,
>> > then removing it and reloading the collection
>> >
>> > The only differences that I know of between the tests and the
> problematic
>> > cloud is that solr is running as a different user and using an external
>> > zookeeper ensemble. The running user has ownership of the solr
>> > installation, log, and data directories.
>> >
>> > I'm going to keep trying different setups to see if I can replicate the
>> > issue, but if anyone has any ideas on what direction might make the most
>> > sense, please let me know.
>> >
>> > Thanks again
>> >
>> > On Wed, Mar 15, 2017 at 5:49 PM Erick Erickson 
>> > wrote:
>> >
>> > Wait... Is iqdocid set to the  in your schema? That might
>> > be the missing thing.
>> >
>> >
>> >
>> > On Wed, Mar 15, 2017 at 11:20 AM, Chris Ulicny  wrote:
>> >> Unless the behavior's changed on the way to version 6.3.0, the get
>> handler
>> >> used to use whatever field is set to be the uniqueKey. We have
>> > successfully
>> >> been using get on a 4.9.0 standalone core with no explicit "id" field
>> >> defined by passing in the value for the uniqueKey field to the get
>> > handler.
>> >> We tend to have a bunch of id fields floating around from different
>> >> sources, so we avoid keeping any of them named as "id"
>> >>
>> >> iqdocid is just a basic string type
>> >> > >> required="true" stored="true"/>
>> >>
>> >> I'll do some more testing on standalone versions, and see how that
> goes.
>> >>
>> >> On Wed, Mar 15, 2017 at 1:52 PM David Hastings <
>> > hastings.recurs...@gmail.com>
>> >> wrote:
>> >>
>> >>> from your previous email:
>> >>> "There is no "id"
>> >>> field defined in the schema."
>> >>>
>> >>> you need an id field to use the get handler
>> >>>
>> >>> On Wed, Mar 15, 2017 at 1:45 PM, Chris Ulicny 
>> wrote:
>> >>>
>> >>> > I thought that "id" and "ids" were fixed parameters for the get
>> > handler,
>> >>> > but I never remember, so I've already tried both. Each time it comes
>> > back
>> >>> > with the same response of no document.
>> >>> >
>> >>> > On Wed, Mar 15, 2017 at 1:31 PM Alexandre Rafalovitch <
>> >>> arafa...@gmail.com>
>> >>> > wrote:
>> >>> >
>> >>> > > Actually.
>> >>> > >
>> >>> > > I think Real Time Get handler has "id" as a magical parameter, not
>> as
>> >>> > > a field name. It maps to the real id field via the uniqueKey
>> >>> > > definition:
>> >>> > > https://cwiki.apache.org/confluence/display/solr/RealTime+

Re: Partial Match with DF

2017-03-16 Thread Erick Erickson
Yeah, they've saved me on numerous occasions, glad to see they helped.

One caution BTW when you start changing fieldTypes is you have to
watch punctuation. StandardTokenizerFactory won't pass through most
punctuation.

WordDelimiterFilterFactory breaks on non-alphanumerics, including
punctuation, effectively throwing it out.

But WhitespaceTokenizer does just that and spits out punctuation as
part of tokens, i.e.
"my words." (note period) is broken up as "my" "words." and wouldn't
match a search on "words".

One other note, there's a tokenizer/filter for a zillion different
cases, you can go wild. Here's a partial
list: https://cwiki.apache.org/confluence/display/solr/Understanding+Analyzers%2C+Tokenizers%2C+and+Filters,
see the "Tokenizer", "Filters" and "CharFilters" links. There are 12
tokenizers listed and 40 or so filters... and the list is not
guaranteed to be complete.
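
For example, a minimal sketch of the whitespace-based option (the field type
name here is made up, adjust to taste):

<fieldType name="text_ws_example" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- split on whitespace only; punctuation stays attached to the tokens -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- then break on non-alphanumerics, so "words." indexes as "words" -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

The Analysis screen is the quickest way to check what any such chain really
does to your data.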

On Thu, Mar 16, 2017 at 7:39 AM, Mark Johnson
 wrote:
> You're right! The fields I'm searching are all "string" type. I switched to
> "text_en" and now it's working exactly as I need it to! I'll do some
> research to see if "text_en" or another "text" type field is best for our
> needs.
>
> Also, those debug options are amazing! They'll help tremendously in the
> future.
>
> Thank you much!
>
> On Thu, Mar 16, 2017 at 10:02 AM, Erick Erickson 
> wrote:
>
>> My guess: Your analysis chain for the fields is different, i.e. they
>> have a different fieldType. In particular, watch out for the "string"
>> type, people are often confused about it. It does _not_ break input
>> into tokens, you need a text-based field type, text_en is one example
>> that is usually in the configs by default.
>>
>> Two tools that'll help you enormously:
>>
>> admin UI>>select core (or collection) from the drop-down>>analysis
>> That shows you exactly how Solr/Lucene break up text at query and index
>> time
>>
>> add &debug=query to the URL. That'll show you how the query was parsed.
>>
>> Best,
>> Erick
>>
>> On Thu, Mar 16, 2017 at 6:52 AM, Mark Johnson
>>  wrote:
>> > Oh, great! Thank you!
>> >
>> > So if I switch over to eDisMax I'd specify the fields to query via the
>> "qf"
>> > parameter, right? That seems to have the same result (only matches when I
>> > specify the exact phrase in the field, not just certain words from it).
>> >
>> > On Thu, Mar 16, 2017 at 9:33 AM, Alexandre Rafalovitch <
>> arafa...@gmail.com>
>> > wrote:
>> >
>> >> df is default field - you can only give one. To search over multiple
>> >> fields, you switch to eDisMax query parser and fl parameter.
>> >>
>> >> Then, the question will be what type definition your fields have. When
>> you
>> >> search text field, you are using its definition because of copyField.
>> Your
>> >> original fields may be strings.
>> >>
>> >> Remember to reload core and reminded when you change definitions.
>> >>
>> >> Regards,
>> >>Alex
>> >>
>> >>
>> >> On 16 Mar 2017 9:15 AM, "Mark Johnson" 
>> >> wrote:
>> >>
>> >> > Forgive me if I'm missing something obvious -- I'm new to Solr, but I
>> >> can't
>> >> > seem to find an explanation for the behavior I'm seeing.
>> >> >
>> >> > If I have a document that looks like this:
>> >> > {
>> >> > field1: "aaa bbb",
>> >> > field2: "ccc ddd",
>> >> > field3: "eee fff"
>> >> > }
>> >> >
>> >> > And I do a search where "q" is "aaa ccc", I get the document in the
>> >> > results. This is because (please correct me if I'm wrong) the default
>> >> "df"
>> >> > is set to the "_text_" field, which contains the text values from all
>> >> > fields.
>> >> >
>> >> > However, if I do a search where "df" is "field1" and "field2" and "q"
>> is
>> >> > "aaa ccc" (words from field1 and field2) I get no results.
>> >> >
>> >> > In a simpler example, if I do a search where "df" is "field1" and "q"
>> is
>> >> > "aaa" (a word from field1) I still get no results.
>> >> >
>> >> > If I do a search where "df" is "field1" and "q" is "aaa bbb" (the full
>> >> > value of field1) then I get the document in the results.
>> >> >
>> >> > So I'm concluding that when using "df" to specify which fields to
>> search
>> >> > then only an exact match on the full field value will return a
>> document.
>> >> >
>> >> > Is that a correct conclusion? Is there another way to specify which
>> >> fields
>> >> > to search without requiring an exact match? The results I'd like to
>> >> achieve
>> >> > are:
>> >> >
>> >> > Would Match:
>> >> > q=aaa
>> >> > q=aaa bbb
>> >> > q=aaa ccc
>> >> > q=aaa fff
>> >> >
>> >> > Would Not Match:
>> >> > q=eee
>> >> > q=fff
>> >> > q=eee fff
>> >> >
>> >> > --
>> >> > *This message is intended only for the use of the individual or
>> entity to
>> >> > which it is addressed and may contain information that is privileged,
>> >> > confidential and exempt from disclosure under applicable law. If you
>> have
>> >> > received this message in error, you are hereby notified that any use,
>> >> > dissemination, distribution or copying of this message is prohibited.
>> If

Re: Partial Match with DF

2017-03-16 Thread Charlie Hull
Hi Mark,

OpenSource Connections' excellent www.splainer.io might also be useful to
help you break down exactly what your query is doing.

Cheers

Charlie

P.S. planning a blog soon listing 'useful Solr tools'

On 16 March 2017 at 14:39, Mark Johnson 
wrote:

> You're right! The fields I'm searching are all "string" type. I switched to
> "text_en" and now it's working exactly as I need it to! I'll do some
> research to see if "text_en" or another "text" type field is best for our
> needs.
>
> Also, those debug options are amazing! They'll help tremendously in the
> future.
>
> Thank you much!
>
> On Thu, Mar 16, 2017 at 10:02 AM, Erick Erickson 
> wrote:
>
> > My guess: Your analysis chain for the fields is different, i.e. they
> > have a different fieldType. In particular, watch out for the "string"
> > type, people are often confused about it. It does _not_ break input
> > into tokens, you need a text-based field type, text_en is one example
> > that is usually in the configs by default.
> >
> > Two tools that'll help you enormously:
> >
> > admin UI>>select core (or collection) from the drop-down>>analysis
> > That shows you exactly how Solr/Lucene break up text at query and index
> > time
> >
> > add &debug=query to the URL. That'll show you how the query was parsed.
> >
> > Best,
> > Erick
> >
> > On Thu, Mar 16, 2017 at 6:52 AM, Mark Johnson
> >  wrote:
> > > Oh, great! Thank you!
> > >
> > > So if I switch over to eDisMax I'd specify the fields to query via the
> > "qf"
> > > parameter, right? That seems to have the same result (only matches
> when I
> > > specify the exact phrase in the field, not just certain words from it).
> > >
> > > On Thu, Mar 16, 2017 at 9:33 AM, Alexandre Rafalovitch <
> > arafa...@gmail.com>
> > > wrote:
> > >
> > >> df is default field - you can only give one. To search over multiple
> > >> fields, you switch to eDisMax query parser and fl parameter.
> > >>
> > >> Then, the question will be what type definition your fields have. When
> > you
> > >> search text field, you are using its definition because of copyField.
> > Your
> > >> original fields may be strings.
> > >>
> > >> Remember to reload core and reminded when you change definitions.
> > >>
> > >> Regards,
> > >>Alex
> > >>
> > >>
> > >> On 16 Mar 2017 9:15 AM, "Mark Johnson"  >
> > >> wrote:
> > >>
> > >> > Forgive me if I'm missing something obvious -- I'm new to Solr, but
> I
> > >> can't
> > >> > seem to find an explanation for the behavior I'm seeing.
> > >> >
> > >> > If I have a document that looks like this:
> > >> > {
> > >> > field1: "aaa bbb",
> > >> > field2: "ccc ddd",
> > >> > field3: "eee fff"
> > >> > }
> > >> >
> > >> > And I do a search where "q" is "aaa ccc", I get the document in the
> > >> > results. This is because (please correct me if I'm wrong) the
> default
> > >> "df"
> > >> > is set to the "_text_" field, which contains the text values from
> all
> > >> > fields.
> > >> >
> > >> > However, if I do a search where "df" is "field1" and "field2" and
> "q"
> > is
> > >> > "aaa ccc" (words from field1 and field2) I get no results.
> > >> >
> > >> > In a simpler example, if I do a search where "df" is "field1" and
> "q"
> > is
> > >> > "aaa" (a word from field1) I still get no results.
> > >> >
> > >> > If I do a search where "df" is "field1" and "q" is "aaa bbb" (the
> full
> > >> > value of field1) then I get the document in the results.
> > >> >
> > >> > So I'm concluding that when using "df" to specify which fields to
> > search
> > >> > then only an exact match on the full field value will return a
> > document.
> > >> >
> > >> > Is that a correct conclusion? Is there another way to specify which
> > >> fields
> > >> > to search without requiring an exact match? The results I'd like to
> > >> achieve
> > >> > are:
> > >> >
> > >> > Would Match:
> > >> > q=aaa
> > >> > q=aaa bbb
> > >> > q=aaa ccc
> > >> > q=aaa fff
> > >> >
> > >> > Would Not Match:
> > >> > q=eee
> > >> > q=fff
> > >> > q=eee fff
> > >> >
> > >> > --
> > >> > *This message is intended only for the use of the individual or
> > entity to
> > >> > which it is addressed and may contain information that is
> privileged,
> > >> > confidential and exempt from disclosure under applicable law. If you
> > have
> > >> > received this message in error, you are hereby notified that any
> use,
> > >> > dissemination, distribution or copying of this message is
> prohibited.
> > If
> > >> > you have received this communication in error, please notify the
> > sender
> > >> > immediately and destroy the transmitted information.*
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > >
> > > Best Regards,
> > >
> > > *Mark Johnson* | .NET Software Engineer
> > >
> > > Office: 603-392-7017
> > >
> > > Emerson Ecologics, LLC | 1230 Elm Street | Suite 301 | Manchester NH |
> > 03101
> > >
> > >   
> > >
> > > *Supporting The Practice Of Healthy Living*
> > >
> > > 

question about function query

2017-03-16 Thread Bernd Fehling
I'm testing some function queries and have some questions.

original queries:
1. q=collection:ftmuenster&fl=*
--> numFound="6029"

2. q=collection:ftmuenster+AND+-description:*&fl=*
--> numFound="1877"

3. q=collection:ftmuenster+AND+description:*&fl=*
--> numFound="4152"

This looks good.

But now with function query:

q={!func}exists(description)&fq=collection:ftmuenster&fl=*
--> numFound="6029"

I was hoping to get numFound=4152, why not?

I also tried:
q={!func}exists(description)&fq=collection:ftmuenster&q.op=AND&fl=*
--> numFound="6029"

What are the function queries equivalent to queries 2. and 3. above?
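
(One untested idea, in case it is relevant: since a {!func} query apparently
matches every document and only uses the function for scoring, a filter built
on the frange parser might be needed to actually restrict the result set,
something like

q=collection:ftmuenster&fq={!frange l=1}if(exists(description),1,0)&fl=*

but I have not verified this against the index above.)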

Regards
Bernd



Re: Get handler not working

2017-03-16 Thread Chris Ulicny
I think I've figured out where the issue is, at least superficially. It comes
down to which parameter is used to define the field to route on. I set up two
collections to use the same configset but slightly altered calls to the
Collections API.

action=CREATE&name=CollectionOne&numShards=2&router.name=compositeId&*router.field*=iqroutingkey&maxShardsPerNode=2&collection.configName=RoutingTest
action=CREATE&name=CollectionTwo&numShards=2&router.name=compositeId&*routerField*=iqroutingkey&maxShardsPerNode=2&collection.configName=RoutingTest

The get handler returns null for CollectionOne (even with a _route_
parameter), but it will return the document for CollectionTwo in any case.
I will gather and post the trace logs when I get a chance.
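
One quick way to double-check what each collection actually got (a sketch
only, using the names from the CREATE calls above) is the Collections API
CLUSTERSTATUS call, e.g.

/admin/collections?action=CLUSTERSTATUS&collection=CollectionOne&wt=json

which should show the collection's router name and, if one was registered,
the router field.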



On Thu, Mar 16, 2017 at 10:52 AM Yonik Seeley  wrote:

> Ah, yeah, if you're using a different route field it's highly likely
> that's the issue.
> I was always against that "feature", and this thread demonstrates part
> of the problem (complicating clients, including us human clients
> trying to make sense of what's going on).
>
> -Yonik
>
>
> On Thu, Mar 16, 2017 at 10:31 AM, Chris Ulicny  wrote:
> > Speaking of routing, I realized I completely forgot to add the routing
> > setup to the test cloud, so it probably has something to do with the
> issue.
> > I'll add that in and report back.
> >
> > So the routing and uniqueKey setup is as follows:
> >
> > Schema setup:
> > iqdocid  > multiValued="false" indexed="true" required="true" stored="true"/>  > name="iqdocid" type="string" multiValued="false" indexed="true" required=
> > "true" stored="true"/>
> >
> > I don't think it's mentioned in the documentation about using routerField
> > for the compositeId router, but based on the resolution of SOLR-5017
> > , we decided to use the
> > compositeId router with routerField set to 'iqroutingkey' which is using
> > the "!" notation. In general, the iqroutingkey field is of the form:
> > !!
> >
> > Unless I misunderstood what was changed with that patch, that form should
> > still route appropriately, and it seems that it has distributed the
> > documents appropriately from our basic testing.
> >
> > On Thu, Mar 16, 2017 at 9:42 AM David Hastings <
> hastings.recurs...@gmail.com>
> > wrote:
> >
> > i still would like to see an experiment where you change the field to id
> > instead of iqdocid,
> >
> > On Thu, Mar 16, 2017 at 9:33 AM, Yonik Seeley  wrote:
> >
> >> Something to do with routing perhaps? (the mapping of ids to shards,
> >> by default is based on hashes of the id)
> >> -Yonik
> >>
> >>
> >> On Thu, Mar 16, 2017 at 9:16 AM, Chris Ulicny  wrote:
> >> > iqdocid is already set to be the uniqueKey value.
> >> >
> >> > I tried reindexing a few documents back into the problematic cloud and
> > am
> >> > getting the same behavior of no document found for get handler.
> >> >
> >> > I've also done some testing on standalone instances as well as some
> > quick
> >> > cloud setups (with embedded zk), and I cannot seem to replicate the
> >> > problem. For each test, I used the exact same configset that is
> causing
> >> the
> >> > issue for us and indexed a document from that instance as well. I can
> >> > provide more details if that would be useful in anyway.
> >> >
> >> > Standalone instance worked
> >> > Cloud mode worked regardless of the use of the security plugin
> >> > Cloud mode worked regardless of explicit get handler definition
> >> > Cloud mode consistently worked with explicitly defining the get
> handler,
> >> > then removing it and reloading the collection
> >> >
> >> > The only differences that I know of between the tests and the
> > problematic
> >> > cloud is that solr is running as a different user and using an
> external
> >> > zookeeper ensemble. The running user has ownership of the solr
> >> > installation, log, and data directories.
> >> >
> >> > I'm going to keep trying different setups to see if I can replicate
> the
> >> > issue, but if anyone has any ideas on what direction might make the
> most
> >> > sense, please let me know.
> >> >
> >> > Thanks again
> >> >
> >> > On Wed, Mar 15, 2017 at 5:49 PM Erick Erickson <
> erickerick...@gmail.com>
> >> > wrote:
> >> >
> >> > Wait... Is iqdocid set to the  in your schema? That might
> >> > be the missing thing.
> >> >
> >> >
> >> >
> >> > On Wed, Mar 15, 2017 at 11:20 AM, Chris Ulicny 
> wrote:
> >> >> Unless the behavior's changed on the way to version 6.3.0, the get
> >> handler
> >> >> used to use whatever field is set to be the uniqueKey. We have
> >> > successfully
> >> >> been using get on a 4.9.0 standalone core with no explicit "id" field
> >> >> defined by passing in the value for the uniqueKey field to the get
> >> > handler.
> >> >> We tend to have a bunch of id fields floating around from different
> >> >> sources, so we avoid keeping any of them named as "id"
> >> >>
> >> >> iqdocid is just a basic string type
> >> >>  indexed="true"
> >> >> require

Re: Solr shingles is not working in solr 6.4.0

2017-03-16 Thread Aman Deep Singh
Already checked that. I am sending screenshots of various scenarios.

On Thu, Mar 16, 2017 at 7:46 PM Alexandre Rafalovitch 
wrote:

> Sanity check. Is your 'df' pointing at the field you think it is
> pointing at? It really does look like all tokens were eaten and
> nothing was left. But you should have seen that in the Analysis screen
> too, if you have the right field.
>
> Try adding echoParams=all to your request to see the full final
> parameter list. Maybe some parameters in initParams sections override
> your assumed config.
>
> Regards,
>Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 16 March 2017 at 08:30, Aman Deep Singh 
> wrote:
> > Hi,
> >
> > Recently I migrated from solr 4 to 6
> > IN solr 4 shinglefilterfactory is working correctly
> > my configration  i
> >
> >  > positionIncrementGap="100">
> > 
> >  
> >   > maxShingleSize="5"
> >  outputUnigrams="false"
> outputUnigramsIfNoShingles="false" />
> >   
> > 
> > 
> >   
> >   > maxShingleSize="5"
> >  outputUnigrams="false"
> outputUnigramsIfNoShingles="false" />
> >   
> >   
> > 
> >   
> >
> >
> >
> > But after updating to solr 6 shingles is not working ,schema is as below,
> >
> >  > positionIncrementGap="100">
> > 
> >  
> >   > maxShingleSize="5"
> >  outputUnigrams="false"
> outputUnigramsIfNoShingles="false" />
> >   
> > 
> > 
> >   
> >   > maxShingleSize="5"
> >  outputUnigrams="false"
> outputUnigramsIfNoShingles="false" />
> >   
> > 
> >   
> >
> > Although in the Analysis tab is was showing proper shingle result but
> when
> > using in the queryParser it was not giving proper results
> >
> > my sample hit is
> >
> >
> http://localhost:8983/solr/shingel_test/select?q=one%20plus%20one&wt=xml&debugQuery=true&defType=edismax&qf=cust_shingle
> >
> > it create the parsed query as
> >
> > one plus one
> > one plus one
> > (+())/no_coord
> > +()
> > 
> > ExtendedDismaxQParser
>


Re: Solr shingles is not working in solr 6.4.0

2017-03-16 Thread Alexandre Rafalovitch
Images do not come through.

But I was wrong too. You use eDismax and pass "cust_shingle" in, so
the "df" value is irrelevant.

You definitely reloaded the core after changing definitions?

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 16 March 2017 at 12:37, Aman Deep Singh  wrote:
> Already check that i am sending sceenshots of various senarios
>
>
> On Thu, Mar 16, 2017 at 7:46 PM Alexandre Rafalovitch 
> wrote:
>>
>> Sanity check. Is your 'df' pointing at the field you think it is
>> pointing at? It really does look like all tokens were eaten and
>> nothing was left. But you should have seen that in the Analysis screen
>> too, if you have the right field.
>>
>> Try adding echoParams=all to your request to see the full final
>> parameter list. Maybe some parameters in initParams sections override
>> your assumed config.
>>
>> Regards,
>>Alex.
>> 
>> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>>
>>
>> On 16 March 2017 at 08:30, Aman Deep Singh 
>> wrote:
>> > Hi,
>> >
>> > Recently I migrated from solr 4 to 6
>> > IN solr 4 shinglefilterfactory is working correctly
>> > my configration  i
>> >
>> > > > positionIncrementGap="100">
>> > 
>> >  
>> >  > > maxShingleSize="5"
>> >  outputUnigrams="false"
>> > outputUnigramsIfNoShingles="false" />
>> >   
>> > 
>> > 
>> >   
>> >  > > maxShingleSize="5"
>> >  outputUnigrams="false"
>> > outputUnigramsIfNoShingles="false" />
>> >   
>> >   
>> > 
>> >   
>> >
>> >
>> >
>> > But after updating to solr 6 shingles is not working ,schema is as
>> > below,
>> >
>> > > > positionIncrementGap="100">
>> > 
>> >  
>> >  > > maxShingleSize="5"
>> >  outputUnigrams="false"
>> > outputUnigramsIfNoShingles="false" />
>> >   
>> > 
>> > 
>> >   
>> >  > > maxShingleSize="5"
>> >  outputUnigrams="false"
>> > outputUnigramsIfNoShingles="false" />
>> >   
>> > 
>> >   
>> >
>> > Although in the Analysis tab is was showing proper shingle result but
>> > when
>> > using in the queryParser it was not giving proper results
>> >
>> > my sample hit is
>> >
>> >
>> > http://localhost:8983/solr/shingel_test/select?q=one%20plus%20one&wt=xml&debugQuery=true&defType=edismax&qf=cust_shingle
>> >
>> > it create the parsed query as
>> >
>> > one plus one
>> > one plus one
>> > (+())/no_coord
>> > +()
>> > 
>> > ExtendedDismaxQParser


Re: Solr shingles is not working in solr 6.4.0

2017-03-16 Thread Aman Deep Singh
Yes I have reloaded the core after config changes

On 16-Mar-2017 10:28 PM, "Alexandre Rafalovitch"  wrote:

Images do not come through.

But I was wrong too. You use eDismax and pass "cust_shingle" in, so
the "df" value is irrelevant.

You definitely reloaded the core after changing definitions?

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 16 March 2017 at 12:37, Aman Deep Singh 
wrote:
> Already check that i am sending sceenshots of various senarios
>
>
> On Thu, Mar 16, 2017 at 7:46 PM Alexandre Rafalovitch 
> wrote:
>>
>> Sanity check. Is your 'df' pointing at the field you think it is
>> pointing at? It really does look like all tokens were eaten and
>> nothing was left. But you should have seen that in the Analysis screen
>> too, if you have the right field.
>>
>> Try adding echoParams=all to your request to see the full final
>> parameter list. Maybe some parameters in initParams sections override
>> your assumed config.
>>
>> Regards,
>>Alex.
>> 
>> http://www.solr-start.com/ - Resources for Solr users, new and
experienced
>>
>>
>> On 16 March 2017 at 08:30, Aman Deep Singh 
>> wrote:
>> > Hi,
>> >
>> > Recently I migrated from solr 4 to 6
>> > IN solr 4 shinglefilterfactory is working correctly
>> > my configration  i
>> >
>> > > > positionIncrementGap="100">
>> > 
>> >  
>> >  > > maxShingleSize="5"
>> >  outputUnigrams="false"
>> > outputUnigramsIfNoShingles="false" />
>> >   
>> > 
>> > 
>> >   
>> >  > > maxShingleSize="5"
>> >  outputUnigrams="false"
>> > outputUnigramsIfNoShingles="false" />
>> >   
>> >   
>> > 
>> >   
>> >
>> >
>> >
>> > But after updating to solr 6 shingles is not working ,schema is as
>> > below,
>> >
>> > > > positionIncrementGap="100">
>> > 
>> >  
>> >  > > maxShingleSize="5"
>> >  outputUnigrams="false"
>> > outputUnigramsIfNoShingles="false" />
>> >   
>> > 
>> > 
>> >   
>> >  > > maxShingleSize="5"
>> >  outputUnigrams="false"
>> > outputUnigramsIfNoShingles="false" />
>> >   
>> > 
>> >   
>> >
>> > Although in the Analysis tab is was showing proper shingle result but
>> > when
>> > using in the queryParser it was not giving proper results
>> >
>> > my sample hit is
>> >
>> >
>> > http://localhost:8983/solr/shingel_test/select?q=one%
20plus%20one&wt=xml&debugQuery=true&defType=edismax&qf=cust_shingle
>> >
>> > it create the parsed query as
>> >
>> > one plus one
>> > one plus one
>> > (+())/no_coord
>> > +()
>> > 
>> > ExtendedDismaxQParser


Re: Solr shingles is not working in solr 6.4.0

2017-03-16 Thread Aman Deep Singh
For the images, the Dropbox URL is:
https://www.dropbox.com/sh/6dy6a8ajabjtxrt/AAAoxhZQe2vp3sTl3Av71_eHa?dl=0


On Thu, Mar 16, 2017 at 10:29 PM Aman Deep Singh 
wrote:

> Yes I have reloaded the core after config changes
>
>
> On 16-Mar-2017 10:28 PM, "Alexandre Rafalovitch" 
> wrote:
>
> Images do not come through.
>
> But I was wrong too. You use eDismax and pass "cust_shingle" in, so
> the "df" value is irrelevant.
>
> You definitely reloaded the core after changing definitions?
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 16 March 2017 at 12:37, Aman Deep Singh 
> wrote:
> > Already check that i am sending sceenshots of various senarios
> >
> >
> > On Thu, Mar 16, 2017 at 7:46 PM Alexandre Rafalovitch <
> arafa...@gmail.com>
> > wrote:
> >>
> >> Sanity check. Is your 'df' pointing at the field you think it is
> >> pointing at? It really does look like all tokens were eaten and
> >> nothing was left. But you should have seen that in the Analysis screen
> >> too, if you have the right field.
> >>
> >> Try adding echoParams=all to your request to see the full final
> >> parameter list. Maybe some parameters in initParams sections override
> >> your assumed config.
> >>
> >> Regards,
> >>Alex.
> >> 
> >> http://www.solr-start.com/ - Resources for Solr users, new and
> experienced
> >>
> >>
> >> On 16 March 2017 at 08:30, Aman Deep Singh 
> >> wrote:
> >> > Hi,
> >> >
> >> > Recently I migrated from solr 4 to 6
> >> > IN solr 4 shinglefilterfactory is working correctly
> >> > my configration  i
> >> >
> >> >  >> > positionIncrementGap="100">
> >> > 
> >> >  
> >> >   >> > maxShingleSize="5"
> >> >  outputUnigrams="false"
> >> > outputUnigramsIfNoShingles="false" />
> >> >   
> >> > 
> >> > 
> >> >   
> >> >   >> > maxShingleSize="5"
> >> >  outputUnigrams="false"
> >> > outputUnigramsIfNoShingles="false" />
> >> >   
> >> >   
> >> > 
> >> >   
> >> >
> >> >
> >> >
> >> > But after updating to solr 6 shingles is not working ,schema is as
> >> > below,
> >> >
> >> >  >> > positionIncrementGap="100">
> >> > 
> >> >  
> >> >   >> > maxShingleSize="5"
> >> >  outputUnigrams="false"
> >> > outputUnigramsIfNoShingles="false" />
> >> >   
> >> > 
> >> > 
> >> >   
> >> >   >> > maxShingleSize="5"
> >> >  outputUnigrams="false"
> >> > outputUnigramsIfNoShingles="false" />
> >> >   
> >> > 
> >> >   
> >> >
> >> > Although in the Analysis tab is was showing proper shingle result but
> >> > when
> >> > using in the queryParser it was not giving proper results
> >> >
> >> > my sample hit is
> >> >
> >> >
> >> >
> http://localhost:8983/solr/shingel_test/select?q=one%20plus%20one&wt=xml&debugQuery=true&defType=edismax&qf=cust_shingle
> >> >
> >> > it create the parsed query as
> >> >
> >> > one plus one
> >> > one plus one
> >> > (+())/no_coord
> >> > +()
> >> > 
> >> > ExtendedDismaxQParser
>
>
>


Re: Partial Match with DF

2017-03-16 Thread Mark Johnson
Wow, that's really powerful! Thank you!

On Thu, Mar 16, 2017 at 11:19 AM, Charlie Hull  wrote:

> Hi Mark,
>
> Open Source Connection's excellent www.splainer.io might also be useful to
> help you break down exactly what your query is doing.
>
> Cheers
>
> Charlie
>
> P.S. planning a blog soon listing 'useful Solr tools'
>
> On 16 March 2017 at 14:39, Mark Johnson 
> wrote:
>
> > You're right! The fields I'm searching are all "string" type. I switched
> to
> > "text_en" and now it's working exactly as I need it to! I'll do some
> > research to see if "text_en" or another "text" type field is best for our
> > needs.
> >
> > Also, those debug options are amazing! They'll help tremendously in the
> > future.
> >
> > Thank you much!
> >
> > On Thu, Mar 16, 2017 at 10:02 AM, Erick Erickson <
> erickerick...@gmail.com>
> > wrote:
> >
> > > My guess: Your analysis chain for the fields is different, i.e. they
> > > have a different fieldType. In particular, watch out for the "string"
> > > type, people are often confused about it. It does _not_ break input
> > > into tokens, you need a text-based field type, text_en is one example
> > > that is usually in the configs by default.
> > >
> > > Two tools that'll help you enormously:
> > >
> > > admin UI>>select core (or collection) from the drop-down>>analysis
> > > That shows you exactly how Solr/Lucene break up text at query and index
> > > time
> > >
> > > add &debug=query to the URL. That'll show you how the query was parsed.
> > >
> > > Best,
> > > Erick
> > >
> > > On Thu, Mar 16, 2017 at 6:52 AM, Mark Johnson
> > >  wrote:
> > > > Oh, great! Thank you!
> > > >
> > > > So if I switch over to eDisMax I'd specify the fields to query via
> the
> > > "qf"
> > > > parameter, right? That seems to have the same result (only matches
> > when I
> > > > specify the exact phrase in the field, not just certain words from
> it).
> > > >
> > > > On Thu, Mar 16, 2017 at 9:33 AM, Alexandre Rafalovitch <
> > > arafa...@gmail.com>
> > > > wrote:
> > > >
> > > >> df is default field - you can only give one. To search over multiple
> > > >> fields, you switch to eDisMax query parser and fl parameter.
> > > >>
> > > >> Then, the question will be what type definition your fields have.
> When
> > > you
> > > >> search text field, you are using its definition because of
> copyField.
> > > Your
> > > >> original fields may be strings.
> > > >>
> > > >> Remember to reload core and reminded when you change definitions.
> > > >>
> > > >> Regards,
> > > >>Alex
> > > >>
> > > >>
> > > >> On 16 Mar 2017 9:15 AM, "Mark Johnson" <
> mjohn...@emersonecologics.com
> > >
> > > >> wrote:
> > > >>
> > > >> > Forgive me if I'm missing something obvious -- I'm new to Solr,
> but
> > I
> > > >> can't
> > > >> > seem to find an explanation for the behavior I'm seeing.
> > > >> >
> > > >> > If I have a document that looks like this:
> > > >> > {
> > > >> > field1: "aaa bbb",
> > > >> > field2: "ccc ddd",
> > > >> > field3: "eee fff"
> > > >> > }
> > > >> >
> > > >> > And I do a search where "q" is "aaa ccc", I get the document in
> the
> > > >> > results. This is because (please correct me if I'm wrong) the
> > default
> > > >> "df"
> > > >> > is set to the "_text_" field, which contains the text values from
> > all
> > > >> > fields.
> > > >> >
> > > >> > However, if I do a search where "df" is "field1" and "field2" and
> > "q"
> > > is
> > > >> > "aaa ccc" (words from field1 and field2) I get no results.
> > > >> >
> > > >> > In a simpler example, if I do a search where "df" is "field1" and
> > "q"
> > > is
> > > >> > "aaa" (a word from field1) I still get no results.
> > > >> >
> > > >> > If I do a search where "df" is "field1" and "q" is "aaa bbb" (the
> > full
> > > >> > value of field1) then I get the document in the results.
> > > >> >
> > > >> > So I'm concluding that when using "df" to specify which fields to
> > > search
> > > >> > then only an exact match on the full field value will return a
> > > document.
> > > >> >
> > > >> > Is that a correct conclusion? Is there another way to specify
> which
> > > >> fields
> > > >> > to search without requiring an exact match? The results I'd like
> to
> > > >> achieve
> > > >> > are:
> > > >> >
> > > >> > Would Match:
> > > >> > q=aaa
> > > >> > q=aaa bbb
> > > >> > q=aaa ccc
> > > >> > q=aaa fff
> > > >> >
> > > >> > Would Not Match:
> > > >> > q=eee
> > > >> > q=fff
> > > >> > q=eee fff
> > > >> >
> > > >> > --
> > > >> > *This message is intended only for the use of the individual or
> > > entity to
> > > >> > which it is addressed and may contain information that is
> > privileged,
> > > >> > confidential and exempt from disclosure under applicable law. If
> you
> > > have
> > > >> > received this message in error, you are hereby notified that any
> > use,
> > > >> > dissemination, distribution or copying of this message is
> > prohibited.
> > > If
> > > >> > you have received this communication in error, please notify the
> >

Re: Partial Match with DF

2017-03-16 Thread Mark Johnson
Thank you for the heads up! I think in some cases we will want to strip out
punctuation but in others we might need it (for example, "liquid courage."
should tokenize to "liquid" and "courage", while "1.5 oz liquid courage"
should tokenize to "1.5", "oz", "liquid" and "courage").

I'll have to do some experimenting to see which one will work best for us.
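
If it helps, a quick sketch to compare in the Analysis screen (the type name
is a placeholder): the standard tokenizer follows the Unicode word-break
rules, so it should keep a decimal like "1.5" as one token while dropping the
trailing period from "courage.".

<fieldType name="text_std_example" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>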

On Thu, Mar 16, 2017 at 11:09 AM, Erick Erickson 
wrote:

> Yeah, they've saved me on numerous occasions, glad to see they helped.
>
> One caution BTW when you start changing fieldTypes is you have to
> watch punctuation. StandardTokenizerFactory won't pass through most
> punctuation.
>
> WordDelimiterFilterFactory breaks on non alpha-num, including
> punctuation effectively throwing it out.
>
> But WhitespaceTokenizer does just that and spits out punctuation as
> part of tokens, i.e.
> "my words." (note period) is broken up as "my" "words." and wouldn't
> match a search on "word".
>
> One other note, there's a tokenizer/filter for a zillion different
> cases, you can go wild. Here's a partial
> list:https://cwiki.apache.org/confluence/display/solr/
> Understanding+Analyzers%2C+Tokenizers%2C+and+Filters,
> see the "Tokenizer", "Filters" and CharFilters" links. There are 12
> tokenizers listed and 40 or so filters... and the list is not
> guaranteed to be complete.
>
> On Thu, Mar 16, 2017 at 7:39 AM, Mark Johnson
>  wrote:
> > You're right! The fields I'm searching are all "string" type. I switched
> to
> > "text_en" and now it's working exactly as I need it to! I'll do some
> > research to see if "text_en" or another "text" type field is best for our
> > needs.
> >
> > Also, those debug options are amazing! They'll help tremendously in the
> > future.
> >
> > Thank you much!
> >
> > On Thu, Mar 16, 2017 at 10:02 AM, Erick Erickson <
> erickerick...@gmail.com>
> > wrote:
> >
> >> My guess: Your analysis chain for the fields is different, i.e. they
> >> have a different fieldType. In particular, watch out for the "string"
> >> type, people are often confused about it. It does _not_ break input
> >> into tokens, you need a text-based field type, text_en is one example
> >> that is usually in the configs by default.
> >>
> >> Two tools that'll help you enormously:
> >>
> >> admin UI>>select core (or collection) from the drop-down>>analysis
> >> That shows you exactly how Solr/Lucene break up text at query and index
> >> time
> >>
> >> add &debug=query to the URL. That'll show you how the query was parsed.
> >>
> >> Best,
> >> Erick
> >>
> >> On Thu, Mar 16, 2017 at 6:52 AM, Mark Johnson
> >>  wrote:
> >> > Oh, great! Thank you!
> >> >
> >> > So if I switch over to eDisMax I'd specify the fields to query via the
> >> "qf"
> >> > parameter, right? That seems to have the same result (only matches
> when I
> >> > specify the exact phrase in the field, not just certain words from
> it).
> >> >
> >> > On Thu, Mar 16, 2017 at 9:33 AM, Alexandre Rafalovitch <
> >> arafa...@gmail.com>
> >> > wrote:
> >> >
> >> >> df is default field - you can only give one. To search over multiple
> >> >> fields, you switch to eDisMax query parser and fl parameter.
> >> >>
> >> >> Then, the question will be what type definition your fields have.
> When
> >> you
> >> >> search text field, you are using its definition because of copyField.
> >> Your
> >> >> original fields may be strings.
> >> >>
> >> >> Remember to reload core and reminded when you change definitions.
> >> >>
> >> >> Regards,
> >> >>Alex
> >> >>
> >> >>
> >> >> On 16 Mar 2017 9:15 AM, "Mark Johnson" <
> mjohn...@emersonecologics.com>
> >> >> wrote:
> >> >>
> >> >> > Forgive me if I'm missing something obvious -- I'm new to Solr,
> but I
> >> >> can't
> >> >> > seem to find an explanation for the behavior I'm seeing.
> >> >> >
> >> >> > If I have a document that looks like this:
> >> >> > {
> >> >> > field1: "aaa bbb",
> >> >> > field2: "ccc ddd",
> >> >> > field3: "eee fff"
> >> >> > }
> >> >> >
> >> >> > And I do a search where "q" is "aaa ccc", I get the document in the
> >> >> > results. This is because (please correct me if I'm wrong) the
> default
> >> >> "df"
> >> >> > is set to the "_text_" field, which contains the text values from
> all
> >> >> > fields.
> >> >> >
> >> >> > However, if I do a search where "df" is "field1" and "field2" and
> "q"
> >> is
> >> >> > "aaa ccc" (words from field1 and field2) I get no results.
> >> >> >
> >> >> > In a simpler example, if I do a search where "df" is "field1" and
> "q"
> >> is
> >> >> > "aaa" (a word from field1) I still get no results.
> >> >> >
> >> >> > If I do a search where "df" is "field1" and "q" is "aaa bbb" (the
> full
> >> >> > value of field1) then I get the document in the results.
> >> >> >
> >> >> > So I'm concluding that when using "df" to specify which fields to
> >> search
> >> >> > then only an exact match on the full field value will return a
> >> document.
> >> >> >
> >> >> > Is that a correct conclusion? Is there another way to spe

Exact match works only for some of the strings

2017-03-16 Thread Gintautas Sulskus
Hi All,

I am trying to figure out why Solr returns an empty result when searching
for the following query:

nameExact:"Guardian EU-referendum"


The field definition:




The type definition:













The analysis, as expected, matches the query parameter against the stored
value. Please take a look at the attached image. I am using
KeywordTokenizer and LowerCaseFilter.
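
In essence the type is just KeywordTokenizer plus LowerCaseFilter, i.e.
something along these lines (simplified; the type name here is a placeholder):

<fieldType name="string_lowercase" class="solr.TextField">
  <analyzer>
    <!-- keep the whole field value as a single token, then lowercase it -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>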
What is more strange, the query below works just fine:

nameExact:"Guardian US"


Could you please provide me with some clues on what could be wrong?

Thanks,
Gintas


Re: Solr 6.3 will not stay connected to zookeeper

2017-03-16 Thread Walter Underwood
Still broken with zk 3.4.6. Even the “solr zk” commands can’t fetch the file. 
It is there. I can list it.

$ bin/solr zk ls /configs/tutors

Connecting to ZooKeeper at 
zookeeper01.prod2.cloud.cheggnet.com:2181,zookeeper02.prod2.cloud.cheggnet.com:2181,zookeeper03.prod2.cloud.cheggnet.com,zookeeper04.prod2.cloud.cheggnet.com,zookeeper05.prod2.cloud.cheggnet.com:2181/solr-cloud
 ...
Getting listing for Zookeeper node /configs/tutors from ZooKeeper at 
zookeeper01.prod2.cloud.cheggnet.com:2181,zookeeper02.prod2.cloud.cheggnet.com:2181,zookeeper03.prod2.cloud.cheggnet.com,zookeeper04.prod2.cloud.cheggnet.com,zookeeper05.prod2.cloud.cheggnet.com:2181/solr-cloud
 recurse: false
enumsConfig.xml
solrconfig.xml
admin-extra.menu-top.html
tutors-subject-names.txt
schema.xml
velocity
xslt
admin-extra.html
admin-extra.menu-bottom.html
$ bin/solr zk cp zk:/configs/tutors/solrconfig.xml .

Connecting to ZooKeeper at 
zookeeper01.prod2.cloud.cheggnet.com:2181,zookeeper02.prod2.cloud.cheggnet.com:2181,zookeeper03.prod2.cloud.cheggnet.com,zookeeper04.prod2.cloud.cheggnet.com,zookeeper05.prod2.cloud.cheggnet.com:2181/solr-cloud
 ...
Copying from 'zk:/configs/tutors/solrconfig.xml' to '.'. ZooKeeper at 
zookeeper01.prod2.cloud.cheggnet.com:2181,zookeeper02.prod2.cloud.cheggnet.com:2181,zookeeper03.prod2.cloud.cheggnet.com,zookeeper04.prod2.cloud.cheggnet.com,zookeeper05.prod2.cloud.cheggnet.com:2181/solr-cloud
WARN  - 2017-03-16 11:27:39.418; 
org.apache.solr.common.cloud.ConnectionManager; Watcher 
org.apache.solr.common.cloud.ConnectionManager@1a74e9d1 name: 
ZooKeeperConnection 
Watcher:zookeeper01.prod2.cloud.cheggnet.com:2181,zookeeper02.prod2.cloud.cheggnet.com:2181,zookeeper03.prod2.cloud.cheggnet.com,zookeeper04.prod2.cloud.cheggnet.com,zookeeper05.prod2.cloud.cheggnet.com:2181/solr-cloud
 got event WatchedEvent state:Disconnected type:None path:null path: null type: 
None
WARN  - 2017-03-16 11:27:39.420; 
org.apache.solr.common.cloud.ConnectionManager; zkClient has disconnected
WARN  - 2017-03-16 11:28:00.040; 
org.apache.solr.common.cloud.ConnectionManager; Watcher 
org.apache.solr.common.cloud.ConnectionManager@1a74e9d1 name: 
ZooKeeperConnection 
Watcher:zookeeper01.prod2.cloud.cheggnet.com:2181,zookeeper02.prod2.cloud.cheggnet.com:2181,zookeeper03.prod2.cloud.cheggnet.com,zookeeper04.prod2.cloud.cheggnet.com,zookeeper05.prod2.cloud.cheggnet.com:2181/solr-cloud
 got event WatchedEvent state:Disconnected type:None path:null path: null type: 
None
WARN  - 2017-03-16 11:28:00.040; 
org.apache.solr.common.cloud.ConnectionManager; zkClient has disconnected
WARN  - 2017-03-16 11:28:20.620; 
org.apache.solr.common.cloud.ConnectionManager; Watcher 
org.apache.solr.common.cloud.ConnectionManager@1a74e9d1 name: 
ZooKeeperConnection 
Watcher:zookeeper01.prod2.cloud.cheggnet.com:2181,zookeeper02.prod2.cloud.cheggnet.com:2181,zookeeper03.prod2.cloud.cheggnet.com,zookeeper04.prod2.cloud.cheggnet.com,zookeeper05.prod2.cloud.cheggnet.com:2181/solr-cloud
 got event WatchedEvent state:Disconnected type:None path:null path: null type: 
None
WARN  - 2017-03-16 11:28:20.620; 
org.apache.solr.common.cloud.ConnectionManager; zkClient has disconnected
WARN  - 2017-03-16 11:28:41.366; 
org.apache.solr.common.cloud.ConnectionManager; Watcher 
org.apache.solr.common.cloud.ConnectionManager@1a74e9d1 name: 
ZooKeeperConnection 
Watcher:zookeeper01.prod2.cloud.cheggnet.com:2181,zookeeper02.prod2.cloud.cheggnet.com:2181,zookeeper03.prod2.cloud.cheggnet.com,zookeeper04.prod2.cloud.cheggnet.com,zookeeper05.prod2.cloud.cheggnet.com:2181/solr-cloud
 got event WatchedEvent state:Disconnected type:None path:null path: null type: 
None
WARN  - 2017-03-16 11:28:41.366; 
org.apache.solr.common.cloud.ConnectionManager; zkClient has disconnected
WARN  - 2017-03-16 11:29:01.485; 
org.apache.solr.common.cloud.ConnectionManager; Watcher 
org.apache.solr.common.cloud.ConnectionManager@1a74e9d1 name: 
ZooKeeperConnection 
Watcher:zookeeper01.prod2.cloud.cheggnet.com:2181,zookeeper02.prod2.cloud.cheggnet.com:2181,zookeeper03.prod2.cloud.cheggnet.com,zookeeper04.prod2.cloud.cheggnet.com,zookeeper05.prod2.cloud.cheggnet.com:2181/solr-cloud
 got event WatchedEvent state:Disconnected type:None path:null path: null type: 
None
WARN  - 2017-03-16 11:29:01.486; 
org.apache.solr.common.cloud.ConnectionManager; zkClient has disconnected
WARN  - 2017-03-16 11:29:22.316; 
org.apache.solr.common.cloud.ConnectionManager; Watcher 
org.apache.solr.common.cloud.ConnectionManager@1a74e9d1 name: 
ZooKeeperConnection 
Watcher:zookeeper01.prod2.cloud.cheggnet.com:2181,zookeeper02.prod2.cloud.cheggnet.com:2181,zookeeper03.prod2.cloud.cheggnet.com,zookeeper04.prod2.cloud.cheggnet.com,zookeeper05.prod2.cloud.cheggnet.com:2181/solr-cloud
 got event WatchedEvent state:Disconnected type:None path:null path: null type: 
None
WARN  - 2017-03-16 11:29:22.317; 
org.apache.solr.common.cloud.ConnectionManager; zkClient has disconnected
WARN  - 2017-03-16 11:29:42.912; 
org.apache.solr.commo

Re: Exact match works only for some of the strings

2017-03-16 Thread Mikhail Khludnev
You can try debugQuery to understand how this query is parsed: double quotes
are hardly compatible with KeywordTokenizer. You can also check which terms
are indexed in the Schema Browser, and there is the Analysis page in the
Solr Admin UI.
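
For example, something like this (an illustrative URL - adjust the host and
core name) shows exactly what the parser produced:

http://localhost:8983/solr/yourcore/select?q=nameExact:%22Guardian%20EU-referendum%22&debugQuery=true

The parsedquery entry in the debug section is the thing to compare against the
indexed terms.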

On Thu, Mar 16, 2017 at 8:55 PM, Gintautas Sulskus <
gintautas.suls...@gmail.com> wrote:

> Hi All,
>
> I am trying to figure out why Solr returns an empty result when searching
> for the following query:
>
> nameExact:"Guardian EU-referendum"
>
>
> The field definition:
>
> 
>
>
> The type definition:
>
>  sortMissingLast="true" omitNorms="true">
>
> 
>
> 
>
> 
>
> 
>
> 
>
> The analysis, as expected, matches the query parameter against the stored
> value. Please take a look at the attached image. I am using
> KeywordTokenizer and LowerCaseFilter.
> ​
> What is more strange, the query below works just fine:
>
> nameExact:"Guardian US"
>
>
> Could you please provide me with some clues on what could be wrong?
>
> Thanks,
> Gintas
>



-- 
Sincerely yours
Mikhail Khludnev


Re: question about function query

2017-03-16 Thread Mikhail Khludnev
Hello,
A function query matches all docs. Use {!frange} if you want to select docs
with some particular values.
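
For example (an untested sketch), the rough equivalent of query 3 above would be
something along these lines:

q=*:*&fq=collection:ftmuenster&fq={!frange l=1 u=1}if(exists(description),1,0)

and flipping the range to l=0 u=0 should give the equivalent of query 2.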

On Thu, Mar 16, 2017 at 6:08 PM, Bernd Fehling <
bernd.fehl...@uni-bielefeld.de> wrote:

> I'm testing some function queries and have some questions.
>
> original queries:
> 1. q=collection:ftmuenster&fl=*
> --> numFound="6029"
>
> 2. q=collection:ftmuenster+AND+-description:*&fl=*
> --> numFound="1877"
>
> 3. q=collection:ftmuenster+AND+description:*&fl=*
> --> numFound="4152"
>
> This looks good.
>
> But now with function query:
>
> q={!func}exists(description)&fq=collection:ftmuenster&fl=*
> --> numFound="6029"
>
> I'm was hoping to get numFound=4152, why not?
>
> I also tried:
> q={!func}exists(description)&fq=collection:ftmuenster&q.op=AND&fl=*
> --> numFound="6029"
>
> What are the function queries equivalent to queries 2. and 3. above?
>
> Regards
> Bernd
>
>


-- 
Sincerely yours
Mikhail Khludnev


Re: block join - search together at parent and childern

2017-03-16 Thread Mikhail Khludnev
Hello,

It's hard to get into the problem, but you probably want to have dismax on
the child level:
q={!parent ...}{!edismax qf='childF1 childF2' v=$chq}&chq=foo bar
It's usually broken because the child query might match parents, which is not
allowed. Thus, it can probably be solved by adding +type:child into chq.
IIRC edismax supports lucene syntax.

On Thu, Mar 16, 2017 at 4:47 PM, Jan Nekuda  wrote:

> Hi,
> I have a question for which I wasn't able to find a good solution.
> I have this structure of documents
>
> A
> |\
> | \
> B \
>  \
>   C
>\
> \
>  \
>   D
>
> Document type A has fields id_number, date_from, date_to
> Document type C  has fields first_name, surname, birthdate
> Document type D AND B has fields street_name, house_number, city
>
>
> I want to find *all parents with block join and edismax*.
> The problem is that I have found that possible is find children by parent,
> or parent by children.
> *I want to find parent by values in parent and in children*. I want to use
> edismax with all fields from all documents (id_number, date_from, date_to,
> has fields first_name, surname, birthdate,street_name, house_number, city).
> I want to write *Hynek* AND *Brojova* AND 14 and I expect that it returns
> document A because it found Hynek in surname, Brojova in street and 14 in
> house number.
> This is easy with {!parent which=type:A}
> the problem is, that I'm not able to find by condition 789 AND *Brojova*
> where 789 is id_number from type A and Brojova is Street from D.
>
> In short I need to find all parents of tree (parent and childern) in which
> are matched all the word which i send to condition
>
>
> My only solution is to make root type X. Then A will be its child. Then I
> can use {!parent which=type:X}.
> Than this will work:
>
> http://localhost:8983/solr/demo/select?q=*:*&fq={!parent
> which=type:X}brojova*&fq={!parent which=type:X}16&wt=json&
> indent=true&defType=edismax&qf=id_number date_from date_to has fields
> first_name surname birthdate street_name house_number city&stopwords=true&
> lowercaseOperators=true
>
>
> But I believe it can be solved much better.
>
> X
> |
> A
> |\
> | \
> B \
>  \
>   C
>\
> \
>  \
>   D
>
>
> Thanks for your help
> Jan
>



-- 
Sincerely yours
Mikhail Khludnev


Re: Get handler not working

2017-03-16 Thread Alexandre Rafalovitch
Well, only router.field is the valid parameter as per
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-CREATE:CreateaCollection

In the second case the parameter is ignored and the uniqueKey is used
instead, which is different for you.

But it is the first case that fails for you, so it sounds like maybe the
/get handler somehow does not route correctly. I wonder if there is
another parameter somewhere that should be set to match the field you
use, but is not.

Regards,
   Alex.


http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 16 March 2017 at 12:28, Chris Ulicny  wrote:
> I think I've figured out where the issue is, at least superficially. It's
> in what parameter is used to define the field to route on. I set up two
> collections to use the same configset but slightly altered calls to the
> Collections API.
>
> action=CREATE&name=CollectionOne&numShards=2&router.name=compositeId&*router.field*=iqroutingkey&maxShardsPerNode=2&collection.configName=RoutingTest
> action=CREATE&name=CollectionTwo&numShards=2&router.name=compositeId&*routerField*=iqroutingkey&maxShardsPerNode=2&collection.configName=RoutingTest
>
> The get handler returns null for CollectionOne (even with a _route_
> parameter), but it will return the document for CollectionTwo in any case.
> I will gather and post the trace logs when I get a chance.
>
>
>
> On Thu, Mar 16, 2017 at 10:52 AM Yonik Seeley  wrote:
>
>> Ah, yeah, if you're using a different route field it's highly likely
>> that's the issue.
>> I was always against that "feature", and this thread demonstrates part
>> of the problem (complicating clients, including us human clients
>> trying to make sense of what's going on).
>>
>> -Yonik
>>
>>
>> On Thu, Mar 16, 2017 at 10:31 AM, Chris Ulicny  wrote:
>> > Speaking of routing, I realized I completely forgot to add the routing
>> > setup to the test cloud, so it probably has something to do with the
>> issue.
>> > I'll add that in and report back.
>> >
>> > So the routing and uniqueKey setup is as follows:
>> >
>> > Schema setup:
>> > iqdocid > > multiValued="false" indexed="true" required="true" stored="true"/> > > name="iqdocid" type="string" multiValued="false" indexed="true" required=
>> > "true" stored="true"/>
>> >
>> > I don't think it's mentioned in the documentation about using routerField
>> > for the compositeId router, but based on the resolution of SOLR-5017
>> > , we decided to use the
>> > compositeId router with routerField set to 'iqroutingkey' which is using
>> > the "!" notation. In general, the iqroutingkey field is of the form:
>> > !!
>> >
>> > Unless I misunderstood what was changed with that patch, that form should
>> > still route appropriately, and it seems that it has distributed the
>> > documents appropriately from our basic testing.
>> >
>> > On Thu, Mar 16, 2017 at 9:42 AM David Hastings <
>> hastings.recurs...@gmail.com>
>> > wrote:
>> >
>> > i still would like to see an experiment where you change the field to id
>> > instead of iqdocid,
>> >
>> > On Thu, Mar 16, 2017 at 9:33 AM, Yonik Seeley  wrote:
>> >
>> >> Something to do with routing perhaps? (the mapping of ids to shards,
>> >> by default is based on hashes of the id)
>> >> -Yonik
>> >>
>> >>
>> >> On Thu, Mar 16, 2017 at 9:16 AM, Chris Ulicny  wrote:
>> >> > iqdocid is already set to be the uniqueKey value.
>> >> >
>> >> > I tried reindexing a few documents back into the problematic cloud and
>> > am
>> >> > getting the same behavior of no document found for get handler.
>> >> >
>> >> > I've also done some testing on standalone instances as well as some
>> > quick
>> >> > cloud setups (with embedded zk), and I cannot seem to replicate the
>> >> > problem. For each test, I used the exact same configset that is
>> causing
>> >> the
>> >> > issue for us and indexed a document from that instance as well. I can
>> >> > provide more details if that would be useful in anyway.
>> >> >
>> >> > Standalone instance worked
>> >> > Cloud mode worked regardless of the use of the security plugin
>> >> > Cloud mode worked regardless of explicit get handler definition
>> >> > Cloud mode consistently worked with explicitly defining the get
>> handler,
>> >> > then removing it and reloading the collection
>> >> >
>> >> > The only differences that I know of between the tests and the
>> > problematic
>> >> > cloud is that solr is running as a different user and using an
>> external
>> >> > zookeeper ensemble. The running user has ownership of the solr
>> >> > installation, log, and data directories.
>> >> >
>> >> > I'm going to keep trying different setups to see if I can replicate
>> the
>> >> > issue, but if anyone has any ideas on what direction might make the
>> most
>> >> > sense, please let me know.
>> >> >
>> >> > Thanks again
>> >> >
>> >> > On Wed, Mar 15, 2017 at 5:49 PM Erick Erickson <
>> erickerick...@gmail.com>
>> >> > wrote

Re: Solr shingles is not working in solr 6.4.0

2017-03-16 Thread Alexandre Rafalovitch
Oh. Try your query with quotes around the phone phrase:
q="one plus one"

My hypothesis is:
Query parser splits things on whitespace before passing it down into
analyzer chain as individual match attempts. The Analysis UI does not
take that into account and treats the whole string as phrase sent. You
say
outputUnigrams="false" outputUnigramsIfNoShingles="false"
So, every single token during the query gets ignored because there is
nothing for it to shingle with.

I am not sure why it would have worked in Solr 4.

Regards,
   Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 16 March 2017 at 13:06, Aman Deep Singh  wrote:
> For images dropbox url is
> https://www.dropbox.com/sh/6dy6a8ajabjtxrt/AAAoxhZQe2vp3sTl3Av71_eHa?dl=0
>
>
> On Thu, Mar 16, 2017 at 10:29 PM Aman Deep Singh 
> wrote:
>
>> Yes I have reloaded the core after config changes
>>
>>
>> On 16-Mar-2017 10:28 PM, "Alexandre Rafalovitch" 
>> wrote:
>>
>> Images do not come through.
>>
>> But I was wrong too. You use eDismax and pass "cust_shingle" in, so
>> the "df" value is irrelevant.
>>
>> You definitely reloaded the core after changing definitions?
>> 
>> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>>
>>
>> On 16 March 2017 at 12:37, Aman Deep Singh 
>> wrote:
>> > Already check that i am sending sceenshots of various senarios
>> >
>> >
>> > On Thu, Mar 16, 2017 at 7:46 PM Alexandre Rafalovitch <
>> arafa...@gmail.com>
>> > wrote:
>> >>
>> >> Sanity check. Is your 'df' pointing at the field you think it is
>> >> pointing at? It really does look like all tokens were eaten and
>> >> nothing was left. But you should have seen that in the Analysis screen
>> >> too, if you have the right field.
>> >>
>> >> Try adding echoParams=all to your request to see the full final
>> >> parameter list. Maybe some parameters in initParams sections override
>> >> your assumed config.
>> >>
>> >> Regards,
>> >>Alex.
>> >> 
>> >> http://www.solr-start.com/ - Resources for Solr users, new and
>> experienced
>> >>
>> >>
>> >> On 16 March 2017 at 08:30, Aman Deep Singh 
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > Recently I migrated from solr 4 to 6
>> >> > IN solr 4 shinglefilterfactory is working correctly
>> >> > my configration  i
>> >> >
>> >> > > >> > positionIncrementGap="100">
>> >> > 
>> >> >  
>> >> >  > >> > maxShingleSize="5"
>> >> >  outputUnigrams="false"
>> >> > outputUnigramsIfNoShingles="false" />
>> >> >   
>> >> > 
>> >> > 
>> >> >   
>> >> >  > >> > maxShingleSize="5"
>> >> >  outputUnigrams="false"
>> >> > outputUnigramsIfNoShingles="false" />
>> >> >   
>> >> >   
>> >> > 
>> >> >   
>> >> >
>> >> >
>> >> >
>> >> > But after updating to solr 6 shingles is not working ,schema is as
>> >> > below,
>> >> >
>> >> > > >> > positionIncrementGap="100">
>> >> > 
>> >> >  
>> >> >  > >> > maxShingleSize="5"
>> >> >  outputUnigrams="false"
>> >> > outputUnigramsIfNoShingles="false" />
>> >> >   
>> >> > 
>> >> > 
>> >> >   
>> >> >  > >> > maxShingleSize="5"
>> >> >  outputUnigrams="false"
>> >> > outputUnigramsIfNoShingles="false" />
>> >> >   
>> >> > 
>> >> >   
>> >> >
>> >> > Although in the Analysis tab is was showing proper shingle result but
>> >> > when
>> >> > using in the queryParser it was not giving proper results
>> >> >
>> >> > my sample hit is
>> >> >
>> >> >
>> >> >
>> http://localhost:8983/solr/shingel_test/select?q=one%20plus%20one&wt=xml&debugQuery=true&defType=edismax&qf=cust_shingle
>> >> >
>> >> > it create the parsed query as
>> >> >
>> >> > one plus one
>> >> > one plus one
>> >> > (+())/no_coord
>> >> > +()
>> >> > 
>> >> > ExtendedDismaxQParser
>>
>>
>>


Re: Exact match works only for some of the strings

2017-03-16 Thread Alvaro Cabrerizo
Hello,

I've tested on an old solr 4.3 instance and the schema and the field
definition are fine. I've also checked that only the
query nameExact:"Guardian EU-referendum" gives the result, the other one
you have commented (nameExact:"Guardian US") gives 0 hits. Maybe you
forgot to re-index after the schema modification. I mean, you indexed your
data, then changed the schema, and then started querying using the new
schema, which does not match your index.

Hope it helps.

On Thu, Mar 16, 2017 at 7:50 PM, Mikhail Khludnev  wrote:

> You can try to check debugQuery to understand how this query is parsed:
> double quotes hardly compatible with KeywordTokenizer. Also you can check
> which terms are indexed in SchemaBrowser. Also, there is Analysis page at
> Solr Admin.
>
> On Thu, Mar 16, 2017 at 8:55 PM, Gintautas Sulskus <
> gintautas.suls...@gmail.com> wrote:
>
> > Hi All,
> >
> > I am trying to figure out why Solr returns an empty result when searching
> > for the following query:
> >
> > nameExact:"Guardian EU-referendum"
> >
> >
> > The field definition:
> >
> >  />
> >
> >
> > The type definition:
> >
> >  > sortMissingLast="true" omitNorms="true">
> >
> > 
> >
> > 
> >
> > 
> >
> > 
> >
> > 
> >
> > The analysis, as expected, matches the query parameter against the stored
> > value. Please take a look at the attached image. I am using
> > KeywordTokenizer and LowerCaseFilter.
> > ​
> > What is more strange, the query below works just fine:
> >
> > nameExact:"Guardian US"
> >
> >
> > Could you please provide me with some clues on what could be wrong?
> >
> > Thanks,
> > Gintas
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


Re: block join - search together at parent and childern

2017-03-16 Thread Jan Nekuda

Hello Mikhail,

thanks for the fast answer. The problem is that I want to have the dismax
on child and parent together - to have the filter evaluated together.

I need to have documents:


path: car

type:car

color:red

first_country: CZ

name:seat



path: car\engine

type:engine

power:63KW



path: car\engine\manufacturer

type:manufacturer

name: xx

country:PL


path: car

type:car

color:green

first_country: CZ

name:skoda



path: car\engine

type:engine

power:88KW



path: car\engine\manufacturer

type:manufacturer

name: yy

country:PL


where car is the parent document, engine is its child, manufacturer is a child
of engine, and the structure can be deep.

I need to make a query with edismax over the fields color, first_country,
power, name, country across the parent and all children.

When I query "seat 63 kw" I need to get the seat car,

the same if I write only "seat", only "63kw" or only "xx",

but if I write "seat 88kw" I expect that I will get no result.

I need to return the parents in whose tree all the words that I put into
the query are matched.

As I wrote before, my solution was to split the query text, use q=*:*,
and for each word in the query add

fq={!parent which=type:car}word

with edismax and qf=color, first_country, power, name, country

Thank you for your time:)

Jan


Dne 16.03.2017 v 20:00 Mikhail Khludnev napsal(a):


Hello,

It's hard to get into the problem. but you probably want to have dismax on
child level:
q={!parent ...}{!edismax qf='childF1 childF2' v=$chq}&chq=foo bar
It's usually broken because child query might match parents which is not
allowed. Thus, it's probably can solved by adding +type:child into chq.
IIRC edismax supports lucene syntax.

On Thu, Mar 16, 2017 at 4:47 PM, Jan Nekuda  wrote:


Hi,
I have a question for which I wasn't able to find a good solution.
I have this structure of documents

A
|\
| \
B \
  \
   C
\
 \
  \
   D

Document type A has fields id_number, date_from, date_to
Document type C  has fields first_name, surname, birthdate
Document type D AND B has fields street_name, house_number, city


I want to find *all parents with block join and edismax*.
The problem is that I have found that possible is find children by parent,
or parent by children.
*I want to find parent by values in parent and in children*. I want to use
edismax with all fields from all documents (id_number, date_from, date_to,
has fields first_name, surname, birthdate,street_name, house_number, city).
I want to write *Hynek* AND *Brojova* AND 14 and I expect that it returns
document A because it found Hynek in surname, Brojova in street and 14 in
house number.
This is easy with {!parent which=type:A}
the problem is, that I'm not able to find by condition 789 AND *Brojova*
where 789 is id_number from type A and Brojova is Street from D.

In short I need to find all parents of tree (parent and childern) in which
are matched all the word which i send to condition


My only solution is to make root type X. Then A will be its child. Then I
can use {!parent which=type:X}.
Than this will work:

http://localhost:8983/solr/demo/select?q=*:*&fq={!parent
which=type:X}brojova*&fq={!parent which=type:X}16&wt=json&
indent=true&defType=edismax&qf=id_number date_from date_to has fields
first_name surname birthdate street_name house_number city&stopwords=true&
lowercaseOperators=true


But I believe it can be solved much better.

X
|
A
|\
| \
B \
  \
   C
\
 \
  \
   D


Thanks for your help
Jan








---
This message has been checked for viruses by Avast Antivirus.
https://www.avast.com/antivirus


Re: block join - search together at parent and childern

2017-03-16 Thread Mikhail Khludnev
Hello Jan,

What if you combine the child and parent dismaxes like below?
q={!edismax qf=$parentfields}foo bar {!parent ..}{!dismax qf=$childfields
v=$childclauses}&childclauses=foo bar +type:child&parentfields=...&
childfields=...

On Thu, Mar 16, 2017 at 10:54 PM, Jan Nekuda  wrote:

> Hello Mikhail,
>
> thanks for fast answer. The problem is, that I want to have the dismax on
> child and parent together - to have the filter evaluated together.
>
> I need to have documents:
>
>
> path: car
>
> type:car
>
> color:red
>
> first_country: CZ
>
> name:seat
>
>
>
> path: car\engine
>
> type:engine
>
> power:63KW
>
>
>
> path: car\engine\manufacturer
>
> type:manufacturer
>
> name: xx
>
> country:PL
>
>
> path: car
>
> type:car
>
> color:green
>
> first_country: CZ
>
> name:skoda
>
>
>
> path: car\engine
>
> type:engine
>
> power:88KW
>
>
>
> path: car\engine\manufacturer
>
> type:manufacturer
>
> name: yy
>
> country:PL
>
>
> where car is parent document engine is its child a manufacturer is child
> of engine and the structure can be deep.
>
> I need to make a query with edismax over fields color, first_country,
> power, name, country over parent and all childern.
>
> when I ask then "seat 63 kw" i need to get seat car
>
> the same if I will write only "seat" or only "63kw" or only "xx"
>
> but if I will write "seat 88kw" i expect that i will get no result
>
> I need to return parents in which tree are all the words which I wrote to
> query.
>
> How I wrote before my solution was to split the query text and use q:*:*
> and for each /word/ in query make
>
> fq={!parent which=type:car}/word//
> /
>
> //and edismax with qf=color, first_country, power, name, country
>
> Thank you for your time:)
>
> Jan
>
>
> Dne 16.03.2017 v 20:00 Mikhail Khludnev napsal(a):
>
>
> Hello,
>>
>> It's hard to get into the problem. but you probably want to have dismax on
>> child level:
>> q={!parent ...}{!edismax qf='childF1 childF2' v=$chq}&chq=foo bar
>> It's usually broken because child query might match parents which is not
>> allowed. Thus, it's probably can solved by adding +type:child into chq.
>> IIRC edismax supports lucene syntax.
>>
>> On Thu, Mar 16, 2017 at 4:47 PM, Jan Nekuda  wrote:
>>
>> Hi,
>>> I have a question for which I wasn't able to find a good solution.
>>> I have this structure of documents
>>>
>>> A
>>> |\
>>> | \
>>> B \
>>>   \
>>>C
>>> \
>>>  \
>>>   \
>>>D
>>>
>>> Document type A has fields id_number, date_from, date_to
>>> Document type C  has fields first_name, surname, birthdate
>>> Document type D AND B has fields street_name, house_number, city
>>>
>>>
>>> I want to find *all parents with block join and edismax*.
>>> The problem is that I have found that possible is find children by
>>> parent,
>>> or parent by children.
>>> *I want to find parent by values in parent and in children*. I want to
>>> use
>>> edismax with all fields from all documents (id_number, date_from,
>>> date_to,
>>> has fields first_name, surname, birthdate,street_name, house_number,
>>> city).
>>> I want to write *Hynek* AND *Brojova* AND 14 and I expect that it returns
>>> document A because it found Hynek in surname, Brojova in street and 14 in
>>> house number.
>>> This is easy with {!parent which=type:A}
>>> the problem is, that I'm not able to find by condition 789 AND *Brojova*
>>> where 789 is id_number from type A and Brojova is Street from D.
>>>
>>> In short I need to find all parents of tree (parent and childern) in
>>> which
>>> are matched all the word which i send to condition
>>>
>>>
>>> My only solution is to make root type X. Then A will be its child. Then I
>>> can use {!parent which=type:X}.
>>> Than this will work:
>>>
>>> http://localhost:8983/solr/demo/select?q=*:*&fq={!parent
>>> which=type:X}brojova*&fq={!parent which=type:X}16&wt=json&
>>> indent=true&defType=edismax&qf=id_number date_from date_to has fields
>>> first_name surname birthdate street_name house_number
>>> city&stopwords=true&
>>> lowercaseOperators=true
>>>
>>>
>>> But I believe it can be solved much better.
>>>
>>> X
>>> |
>>> A
>>> |\
>>> | \
>>> B \
>>>   \
>>>C
>>> \
>>>  \
>>>   \
>>>D
>>>
>>>
>>> Thanks for your help
>>> Jan
>>>
>>>
>>
>>
>
>
> ---
> This message has been checked for viruses by Avast Antivirus.
> https://www.avast.com/antivirus
>



-- 
Sincerely yours
Mikhail Khludnev


Re: SQL JOIN eta

2017-03-16 Thread Joel Bernstein
There isn't a Jira issue for this yet. For Solr 6.6 there are a few important
features lined up:

1) SELECT COUNT(DISTINCT)
2) Date/Time function support
3) Arithmetic function support
4) SELECT ... INTO ...

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Mar 14, 2017 at 10:53 PM, Damien Kamerman  wrote:

> Hi all, does anyone know roughly when the SQL JOIN functionally will be
> released? Is there a Jira for this? I'm guessing this might be on Solr 6.6.
>
> Cheers,
> Damien.
>


Re: Solr 6.3 will not stay connected to zookeeper

2017-03-16 Thread Shawn Heisey
On 3/15/2017 6:45 PM, Walter Underwood wrote:
> I have a pretty good guess what happened. I requested a Zookeeper
> 3.4.6 cluster, but they built a 3.4.9 cluster. 

This is the first I've heard of problems with ZK 3.4.9 on the server
side.  That would be the server version that I would recommend for a new
install -- the list of bugfixes on each 3.4.x ZK version is quite long. 
Without checking each one I cannot say whether any of them would affect
SolrCloud, but I'd rather be running a version with the fixes.

The log messages read like it's not able to establish a TCP connection,
which is a fairly low level problem.  The problems I consider to be most
likely are a firewall or a security layer like selinux/apparmor.
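
A quick way to check the TCP path from the Solr host (assuming the standard
client port and that ZooKeeper's four-letter-word commands are enabled) is
something like:

  echo ruok | nc zookeeper01.prod2.cloud.cheggnet.com 2181

which should print "imok" if a connection can be established at all.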

I would not expect a ZK downgrade to make any difference, unless another
change is made along with the downgrade.

Thanks,
Shawn



Re: Solr shingles is not working in solr 6.4.0

2017-03-16 Thread Aman Deep Singh
If I give the query in quotes it converts the query into a graph query:

Graph(cust_sh6:"one plus one" hasBoolean=false hasPhrase=false)


On 17-Mar-2017 1:38 AM, "Alexandre Rafalovitch"  wrote:

> Oh. Try your query with quotes around the phone phrase:
> q="one plus one"
>
> My hypothesis is:
> Query parser splits things on whitespace before passing it down into
> analyzer chain as individual match attempts. The Analysis UI does not
> take that into account and treats the whole string as phrase sent. You
> say
> outputUnigrams="false" outputUnigramsIfNoShingles="false"
> So, every single token during the query gets ignored because there is
> nothing for it to shingle with.
>
> I am not sure why it would have worked in Solr 4.
>
> Regards,
>Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 16 March 2017 at 13:06, Aman Deep Singh 
> wrote:
> > For images dropbox url is
> > https://www.dropbox.com/sh/6dy6a8ajabjtxrt/
> AAAoxhZQe2vp3sTl3Av71_eHa?dl=0
> >
> >
> > On Thu, Mar 16, 2017 at 10:29 PM Aman Deep Singh <
> amandeep.coo...@gmail.com>
> > wrote:
> >
> >> Yes I have reloaded the core after config changes
> >>
> >>
> >> On 16-Mar-2017 10:28 PM, "Alexandre Rafalovitch" 
> >> wrote:
> >>
> >> Images do not come through.
> >>
> >> But I was wrong too. You use eDismax and pass "cust_shingle" in, so
> >> the "df" value is irrelevant.
> >>
> >> You definitely reloaded the core after changing definitions?
> >> 
> >> http://www.solr-start.com/ - Resources for Solr users, new and
> experienced
> >>
> >>
> >> On 16 March 2017 at 12:37, Aman Deep Singh 
> >> wrote:
> >> > Already check that i am sending sceenshots of various senarios
> >> >
> >> >
> >> > On Thu, Mar 16, 2017 at 7:46 PM Alexandre Rafalovitch <
> >> arafa...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Sanity check. Is your 'df' pointing at the field you think it is
> >> >> pointing at? It really does look like all tokens were eaten and
> >> >> nothing was left. But you should have seen that in the Analysis
> screen
> >> >> too, if you have the right field.
> >> >>
> >> >> Try adding echoParams=all to your request to see the full final
> >> >> parameter list. Maybe some parameters in initParams sections override
> >> >> your assumed config.
> >> >>
> >> >> Regards,
> >> >>Alex.
> >> >> 
> >> >> http://www.solr-start.com/ - Resources for Solr users, new and
> >> experienced
> >> >>
> >> >>
> >> >> On 16 March 2017 at 08:30, Aman Deep Singh <
> amandeep.coo...@gmail.com>
> >> >> wrote:
> >> >> > Hi,
> >> >> >
> >> >> > Recently I migrated from solr 4 to 6
> >> >> > IN solr 4 shinglefilterfactory is working correctly
> >> >> > my configration  i
> >> >> >
> >> >> >  >> >> > positionIncrementGap="100">
> >> >> > 
> >> >> >  
> >> >> >   minShingleSize="2"
> >> >> > maxShingleSize="5"
> >> >> >  outputUnigrams="false"
> >> >> > outputUnigramsIfNoShingles="false" />
> >> >> >   
> >> >> > 
> >> >> > 
> >> >> >   
> >> >> >   minShingleSize="2"
> >> >> > maxShingleSize="5"
> >> >> >  outputUnigrams="false"
> >> >> > outputUnigramsIfNoShingles="false" />
> >> >> >   
> >> >> >   
> >> >> > 
> >> >> >   
> >> >> >
> >> >> >
> >> >> >
> >> >> > But after updating to solr 6 shingles is not working ,schema is as
> >> >> > below,
> >> >> >
> >> >> >  >> >> > positionIncrementGap="100">
> >> >> > 
> >> >> >  
> >> >> >   minShingleSize="2"
> >> >> > maxShingleSize="5"
> >> >> >  outputUnigrams="false"
> >> >> > outputUnigramsIfNoShingles="false" />
> >> >> >   
> >> >> > 
> >> >> > 
> >> >> >   
> >> >> >   minShingleSize="2"
> >> >> > maxShingleSize="5"
> >> >> >  outputUnigrams="false"
> >> >> > outputUnigramsIfNoShingles="false" />
> >> >> >   
> >> >> > 
> >> >> >   
> >> >> >
> >> >> > Although in the Analysis tab is was showing proper shingle result
> but
> >> >> > when
> >> >> > using in the queryParser it was not giving proper results
> >> >> >
> >> >> > my sample hit is
> >> >> >
> >> >> >
> >> >> >
> >> http://localhost:8983/solr/shingel_test/select?q=one%
> 20plus%20one&wt=xml&debugQuery=true&defType=edismax&qf=cust_shingle
> >> >> >
> >> >> > it create the parsed query as
> >> >> >
> >> >> > one plus one
> >> >> > one plus one
> >> >> > (+())/no_coord
> >> >> > +()
> >> >> > 
> >> >> > ExtendedDismaxQParser
> >>
> >>
> >>
>


Re: Finding time of last commit to index from SolrJ?

2017-03-16 Thread Damien Kamerman
I ended up doing something like this:

String core = "collection1_shard1_core1";
ModifiableSolrParams p = new ModifiableSolrParams();
p.set("show", "index");
GenericSolrRequest checkRequest = new GenericSolrRequest(POST, "/../" +
core + "/admin/luke", p);
NamedList checkResult = client.request("collection1", checkRequest);
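
To pull the commit time out of that response, something like this should work
(an untested sketch; note that with a stock SolrClient the arguments to
request() are the SolrRequest first and the collection second):

// the Luke handler reports the last commit time in the "index" section
NamedList<?> index = (NamedList<?>) checkResult.get("index");
Object lastModified = index.get("lastModified");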

On 16 March 2017 at 14:20, Phil Scadden  wrote:

> The admin gui displays the time of last commit to a core but how can this
> be queried from within SolrJ?
>
> Notice: This email and any attachments are confidential and may not be
> used, published or redistributed without the prior written consent of the
> Institute of Geological and Nuclear Sciences Limited (GNS Science). If
> received in error please destroy and immediately notify GNS Science. Do not
> copy or disclose the contents.
>


Re: Solr shingles is not working in solr 6.4.0

2017-03-16 Thread Alexandre Rafalovitch
Which is what I believe you had as a working example in your Dropbox images.

So, does it work now?

Regards,
   Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 16 March 2017 at 22:22, Aman Deep Singh  wrote:
> If I give query in quotes it converted query in to graph query as
>
> Graph(cust_sh6:"one plus one" hasBoolean=false hasPhrase=false)
>
>
> On 17-Mar-2017 1:38 AM, "Alexandre Rafalovitch"  wrote:
>
>> Oh. Try your query with quotes around the phone phrase:
>> q="one plus one"
>>
>> My hypothesis is:
>> Query parser splits things on whitespace before passing it down into
>> analyzer chain as individual match attempts. The Analysis UI does not
>> take that into account and treats the whole string as phrase sent. You
>> say
>> outputUnigrams="false" outputUnigramsIfNoShingles="false"
>> So, every single token during the query gets ignored because there is
>> nothing for it to shingle with.
>>
>> I am not sure why it would have worked in Solr 4.
>>
>> Regards,
>>Alex.
>> 
>> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>>
>>
>> On 16 March 2017 at 13:06, Aman Deep Singh 
>> wrote:
>> > For images dropbox url is
>> > https://www.dropbox.com/sh/6dy6a8ajabjtxrt/
>> AAAoxhZQe2vp3sTl3Av71_eHa?dl=0
>> >
>> >
>> > On Thu, Mar 16, 2017 at 10:29 PM Aman Deep Singh <
>> amandeep.coo...@gmail.com>
>> > wrote:
>> >
>> >> Yes I have reloaded the core after config changes
>> >>
>> >>
>> >> On 16-Mar-2017 10:28 PM, "Alexandre Rafalovitch" 
>> >> wrote:
>> >>
>> >> Images do not come through.
>> >>
>> >> But I was wrong too. You use eDismax and pass "cust_shingle" in, so
>> >> the "df" value is irrelevant.
>> >>
>> >> You definitely reloaded the core after changing definitions?
>> >> 
>> >> http://www.solr-start.com/ - Resources for Solr users, new and
>> experienced
>> >>
>> >>
>> >> On 16 March 2017 at 12:37, Aman Deep Singh 
>> >> wrote:
>> >> > Already check that i am sending sceenshots of various senarios
>> >> >
>> >> >
>> >> > On Thu, Mar 16, 2017 at 7:46 PM Alexandre Rafalovitch <
>> >> arafa...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Sanity check. Is your 'df' pointing at the field you think it is
>> >> >> pointing at? It really does look like all tokens were eaten and
>> >> >> nothing was left. But you should have seen that in the Analysis
>> screen
>> >> >> too, if you have the right field.
>> >> >>
>> >> >> Try adding echoParams=all to your request to see the full final
>> >> >> parameter list. Maybe some parameters in initParams sections override
>> >> >> your assumed config.
>> >> >>
>> >> >> Regards,
>> >> >>Alex.
>> >> >> 
>> >> >> http://www.solr-start.com/ - Resources for Solr users, new and
>> >> experienced
>> >> >>
>> >> >>
>> >> >> On 16 March 2017 at 08:30, Aman Deep Singh <
>> amandeep.coo...@gmail.com>
>> >> >> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > Recently I migrated from solr 4 to 6
>> >> >> > IN solr 4 shinglefilterfactory is working correctly
>> >> >> > my configration  i
>> >> >> >
>> >> >> > > >> >> > positionIncrementGap="100">
>> >> >> > 
>> >> >> >  
>> >> >> >  > minShingleSize="2"
>> >> >> > maxShingleSize="5"
>> >> >> >  outputUnigrams="false"
>> >> >> > outputUnigramsIfNoShingles="false" />
>> >> >> >   
>> >> >> > 
>> >> >> > 
>> >> >> >   
>> >> >> >  > minShingleSize="2"
>> >> >> > maxShingleSize="5"
>> >> >> >  outputUnigrams="false"
>> >> >> > outputUnigramsIfNoShingles="false" />
>> >> >> >   
>> >> >> >   
>> >> >> > 
>> >> >> >   
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > But after updating to solr 6 shingles is not working ,schema is as
>> >> >> > below,
>> >> >> >
>> >> >> > > >> >> > positionIncrementGap="100">
>> >> >> > 
>> >> >> >  
>> >> >> >  > minShingleSize="2"
>> >> >> > maxShingleSize="5"
>> >> >> >  outputUnigrams="false"
>> >> >> > outputUnigramsIfNoShingles="false" />
>> >> >> >   
>> >> >> > 
>> >> >> > 
>> >> >> >   
>> >> >> >  > minShingleSize="2"
>> >> >> > maxShingleSize="5"
>> >> >> >  outputUnigrams="false"
>> >> >> > outputUnigramsIfNoShingles="false" />
>> >> >> >   
>> >> >> > 
>> >> >> >   
>> >> >> >
>> >> >> > Although in the Analysis tab is was showing proper shingle result
>> but
>> >> >> > when
>> >> >> > using in the queryParser it was not giving proper results
>> >> >> >
>> >> >> > my sample hit is
>> >> >> >
>> >> >> >
>> >> >> >
>> >> http://localhost:8983/solr/shingel_test/select?q=one%
>> 20plus%20one&wt=xml&debugQuery=true&defType=edismax&qf=cust_shingle
>> >> >> >
>> >> >> > it create the parsed query as
>> >> >> >
>> >> >> > one plus one
>> >> >> > one plus one
>> >> >> > (+())/no_coord
>> >> >> > +()
>> >> >> > 
>> >> >> > ExtendedDismaxQParser
>> >>
>> >>
>> >>
>>


Re: Solr shingles is not working in solr 6.4.0

2017-03-16 Thread Aman Deep Singh
No it doesn't work

On 17-Mar-2017 8:38 AM, "Alexandre Rafalovitch"  wrote:

> Which is what I believe you had as a working example in your Dropbox
> images.
>
> So, does it work now?
>
> Regards,
>Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 16 March 2017 at 22:22, Aman Deep Singh 
> wrote:
> > If I give query in quotes it converted query in to graph query as
> >
> > Graph(cust_sh6:"one plus one" hasBoolean=false hasPhrase=false)
> >
> >
> > On 17-Mar-2017 1:38 AM, "Alexandre Rafalovitch" 
> wrote:
> >
> >> Oh. Try your query with quotes around the phone phrase:
> >> q="one plus one"
> >>
> >> My hypothesis is:
> >> Query parser splits things on whitespace before passing it down into
> >> analyzer chain as individual match attempts. The Analysis UI does not
> >> take that into account and treats the whole string as phrase sent. You
> >> say
> >> outputUnigrams="false" outputUnigramsIfNoShingles="false"
> >> So, every single token during the query gets ignored because there is
> >> nothing for it to shingle with.
> >>
> >> I am not sure why it would have worked in Solr 4.
> >>
> >> Regards,
> >>Alex.
> >> 
> >> http://www.solr-start.com/ - Resources for Solr users, new and
> experienced
> >>
> >>
> >> On 16 March 2017 at 13:06, Aman Deep Singh 
> >> wrote:
> >> > For images dropbox url is
> >> > https://www.dropbox.com/sh/6dy6a8ajabjtxrt/
> >> AAAoxhZQe2vp3sTl3Av71_eHa?dl=0
> >> >
> >> >
> >> > On Thu, Mar 16, 2017 at 10:29 PM Aman Deep Singh <
> >> amandeep.coo...@gmail.com>
> >> > wrote:
> >> >
> >> >> Yes I have reloaded the core after config changes
> >> >>
> >> >>
> >> >> On 16-Mar-2017 10:28 PM, "Alexandre Rafalovitch"  >
> >> >> wrote:
> >> >>
> >> >> Images do not come through.
> >> >>
> >> >> But I was wrong too. You use eDismax and pass "cust_shingle" in, so
> >> >> the "df" value is irrelevant.
> >> >>
> >> >> You definitely reloaded the core after changing definitions?
> >> >> 
> >> >> http://www.solr-start.com/ - Resources for Solr users, new and
> >> experienced
> >> >>
> >> >>
> >> >> On 16 March 2017 at 12:37, Aman Deep Singh <
> amandeep.coo...@gmail.com>
> >> >> wrote:
> >> >> > Already check that i am sending sceenshots of various senarios
> >> >> >
> >> >> >
> >> >> > On Thu, Mar 16, 2017 at 7:46 PM Alexandre Rafalovitch <
> >> >> arafa...@gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> Sanity check. Is your 'df' pointing at the field you think it is
> >> >> >> pointing at? It really does look like all tokens were eaten and
> >> >> >> nothing was left. But you should have seen that in the Analysis
> >> screen
> >> >> >> too, if you have the right field.
> >> >> >>
> >> >> >> Try adding echoParams=all to your request to see the full final
> >> >> >> parameter list. Maybe some parameters in initParams sections
> override
> >> >> >> your assumed config.
> >> >> >>
> >> >> >> Regards,
> >> >> >>Alex.
> >> >> >> 
> >> >> >> http://www.solr-start.com/ - Resources for Solr users, new and
> >> >> experienced
> >> >> >>
> >> >> >>
> >> >> >> On 16 March 2017 at 08:30, Aman Deep Singh <
> >> amandeep.coo...@gmail.com>
> >> >> >> wrote:
> >> >> >> > Hi,
> >> >> >> >
> >> >> >> > Recently I migrated from solr 4 to 6
> >> >> >> > IN solr 4 shinglefilterfactory is working correctly
> >> >> >> > my configration  i
> >> >> >> >
> >> >> >> >  >> >> >> > positionIncrementGap="100">
> >> >> >> > 
> >> >> >> >  
> >> >> >> >   >> minShingleSize="2"
> >> >> >> > maxShingleSize="5"
> >> >> >> >  outputUnigrams="false"
> >> >> >> > outputUnigramsIfNoShingles="false" />
> >> >> >> >   
> >> >> >> > 
> >> >> >> > 
> >> >> >> >   
> >> >> >> >   >> minShingleSize="2"
> >> >> >> > maxShingleSize="5"
> >> >> >> >  outputUnigrams="false"
> >> >> >> > outputUnigramsIfNoShingles="false" />
> >> >> >> >   
> >> >> >> >   
> >> >> >> > 
> >> >> >> >   
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> > But after updating to solr 6 shingles is not working ,schema is
> as
> >> >> >> > below,
> >> >> >> >
> >> >> >> >  >> >> >> > positionIncrementGap="100">
> >> >> >> > 
> >> >> >> >  
> >> >> >> >   >> minShingleSize="2"
> >> >> >> > maxShingleSize="5"
> >> >> >> >  outputUnigrams="false"
> >> >> >> > outputUnigramsIfNoShingles="false" />
> >> >> >> >   
> >> >> >> > 
> >> >> >> > 
> >> >> >> >   
> >> >> >> >   >> minShingleSize="2"
> >> >> >> > maxShingleSize="5"
> >> >> >> >  outputUnigrams="false"
> >> >> >> > outputUnigramsIfNoShingles="false" />
> >> >> >> >   
> >> >> >> > 
> >> >> >> >   
> >> >> >> >
> >> >> >> > Although in the Analysis tab is was showing proper shingle
> result
> >> but
> >> >> >> > when
> >> >> >> > using in the queryParser it was not giving proper results
> >> >> >

Re: Solr shingles is not working in solr 6.4.0

2017-03-16 Thread Shawn Heisey
On 3/16/2017 1:40 PM, Alexandre Rafalovitch wrote:
> Oh. Try your query with quotes around the phone phrase:
> q="one plus one"

That query with the fieldType the user supplied produces this, on 6.3.0
with the lucene parser:

"querystring":"test:\"one plus one\"",
"parsedquery":"MultiPhraseQuery(test:\"(one plus one plus one) plus
one\")",

Looks a little odd, but maybe it's correct.
> My hypothesis is:
> Query parser splits things on whitespace before passing it down into
> analyzer chain as individual match attempts. The Analysis UI does not
> take that into account and treats the whole string as phrase sent. You
> say
> outputUnigrams="false" outputUnigramsIfNoShingles="false"
> So, every single token during the query gets ignored because there is
> nothing for it to shingle with.

Might be that.

If I change both of those unigram options to "true" then this is what I
see (also on 6.3.0, q.op is AND):

"querystring":"test:(one plus one)", "parsedquery":"+test:one +test:plus
+test:one",

The really mystifying thing is ... it works on the analysis page.  The
whitespace tokenizer should (in theory at least) produce the same tokens
on the analysis page as the query parser does before analysis, so I have
no idea why analysis and query produce different results.  During query
analysis, the whitespace tokenizer should basically be a no-op, because
the input has already been tokenized.

If I change the analysis to this (keyword instead of whitespace):

<analyzer>
  <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="5"
          outputUnigrams="false" outputUnigramsIfNoShingles="false" />
</analyzer>

Then the behavior is unchanged:

"querystring":"test:(one plus one)", "parsedquery":"",

> I am not sure why it would have worked in Solr 4.

I just tried it on 4.9-SNAPSHOT, compiled 2015-05-20 from SVN
revision 1680667, and it doesn't work.  I don't remember whether this
was compiled from branch_4x or from the 4.9 branch.  Before that test, I
had tried back to 5.2.1 with the same results:

"querystring": "test:(one plus one)", "parsedquery": "", Thanks,
Shawn



Re: Solr shingles is not working in solr 6.4.0

2017-03-16 Thread Aman Deep Singh
I also tried in 5.2.1
for the query
http://localhost:8984/solr/test/select?q=TITLE_SH:one\%20plus\%20one&wt=xml&debugQuery=true



0
1

TITLE_SH:one\ plus\ one
xml
true




TITLE_SH:one\ plus\ one
TITLE_SH:one\ plus\ one

*((TITLE_SH:one plus TITLE_SH:one plus one)/no_coord) TITLE_SH:plus one*


(TITLE_SH:one plus TITLE_SH:one plus one) TITLE_SH:plus one


LuceneQParser


while in the solr 4.3.1
query
http://localhost:8983/solr/collection1/select?q=text_sh:one\%20plus\%20one&wt=xml&debugQuery=true

output is like


0
2

text_sh:one\ plus\ one
xml
true




text_sh:one\ plus\ one
text_sh:one\ plus\ one

(text_sh:one plus text_sh:one plus one text_sh:plus one)/no_coord


*text_sh:one plus text_sh:one plus one text_sh:plus one*


LuceneQParser

On Fri, Mar 17, 2017 at 9:50 AM Shawn Heisey  wrote:

> On 3/16/2017 1:40 PM, Alexandre Rafalovitch wrote:
> > Oh. Try your query with quotes around the phone phrase:
> > q="one plus one"
>
> That query with the fieldType the user supplied produces this, on 6.3.0
> with the lucene parser:
>
> "querystring":"test:\"one plus one\"",
> "parsedquery":"MultiPhraseQuery(test:\"(one plus one plus one) plus
> one\")", Looks a little odd, but maybe it's correct.
> > My hypothesis is:
> > Query parser splits things on whitespace before passing it down into
> > analyzer chain as individual match attempts. The Analysis UI does not
> > take that into account and treats the whole string as phrase sent. You
> > say
> > outputUnigrams="false" outputUnigramsIfNoShingles="false"
> > So, every single token during the query gets ignored because there is
> > nothing for it to shingle with.
>
> Might be that.
>
> If I change both of those unigram options to "true" then this is what I
> see (also on 6.3.0, q.op is AND):
>
> "querystring":"test:(one plus one)", "parsedquery":"+test:one +test:plus
> +test:one",
>
> The really mystifying thing is ... it works on the analysis page.  The
> whitespace tokenizer should (in theory at least) produce the same tokens
> on the analysis page as the query parser does before analysis, so I have
> no idea why analysis and query produce different results.  During query
> analysis, the whitespace tokenizer should basically be a no-op, because
> the input has already been tokenized.
>
> If I change the analysis to this (keyword instead of whitespace):
>
> 
>   
>   
>maxShingleSize="5"
>  outputUnigrams="false"
> outputUnigramsIfNoShingles="false" />
> 
>
> Then the behavior is unchanged:
>
> "querystring":"test:(one plus one)", "parsedquery":"",
>
> > I am not sure why it would have worked in Solr 4.
>
> I just tried it on on 4.9-SNAPSHOT, compiled 2015-05-20 from SVN
> revision 1680667, and it doesn't work.  I don't remember whether this
> was compiled from branch_4x or from the 4.9 branch.  Before that test, I
> had tried back to 5.2.1 with the same results:
>
> "querystring": "test:(one plus one)", "parsedquery": "", Thanks,
> Shawn
>
>


Re: Alphanumeric sort with alphabets first

2017-03-16 Thread Srinivasan Narayanan
Can someone please respond?

From: Srinivasan Narayanan 
Date: Monday, March 13, 2017 at 3:51 PM
To: "solr-user@lucene.apache.org" 
Subject: Alphanumeric sort with alphabets first


Hello SOLR experts,

I am new to SOLR and I am trying to do alphanumeric sort on string field(s). 
However, in my case, alphabets should come before numbers. I also have a large 
number of such fields (~2500), any of which can be alphanumerically sorted upon 
at runtime. I’ve explored the concepts below in SOLR to arrive at a solution:

1)  Custom similarity plugin: far-fetched, and probably not even
applicable to my use case

2)  Analyzer/tokenizer and regex magic to left pad number parts with 0s : 
two disadvantages – I believe this needs extra fields (copy) to be created 
which I cannot do (2500 more fields is too much) and this will still push 
numbers before alphabets

3)  Custom function (ValueSource) and regex magic to left pad numeric 
tokens with 0s, and invoke function for sorting only – a bit better than the 
previous one, but still numbers come before alphabets.

4)  Custom function (ValueSource) and regex magic to left pad numeric 
tokens with 0s, prefix numeric tokens with tilde (~), and invoke function for 
sorting only – this is where I stand right now. Very ugly, but it works. 
Because tilde has a very high ASCII value, it pushes numbers behind alphabets.
There should obviously be a better approach I am missing. Please help!
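
For what it's worth, here is a rough, purely illustrative sketch of the
padding/prefix normalization described in approach 4) above (class and method
names are made up, and the fixed width of 20 digits is arbitrary):

import java.math.BigInteger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlphaFirstSortKey {

    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Pads every digit run to a fixed width and prefixes it with '~' so that,
    // under plain lexicographic order, numeric tokens sort after alphabetic ones.
    static String sortKey(String value) {
        Matcher m = DIGITS.matcher(value.toLowerCase());
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String padded = String.format("%020d", new BigInteger(m.group()));
            m.appendReplacement(sb, Matcher.quoteReplacement("~" + padded));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sortKey("item2"));  // item~00000000000000000002
        System.out.println(sortKey("itemA"));  // itema (sorts before the padded numeric form)
    }
}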