Re: Problem with queries that include NOT

2015-02-26 Thread david . davila
Hi,

I thought that we were using the edismax query parser, but it seems that 
we had configured the dismax parser.
I have made some tests with the edismax parser and it works fine, so I'll 
change it in our production Solr.

Regards,

David Dávila
DIT - 915828763




From:    Alvaro Cabrerizo 
To:      "solr-user@lucene.apache.org" , 
Date:    25/02/2015 16:41
Subject: Re: Problem with queries that include NOT



Hi,

The edismax parser should be able to handle the query you want to run. I've
made a test, and both of the following queries give me the right result
(note the parentheses):

   - {!edismax}(NOT id:7 AND NOT id:8 AND id:9)   (gives 1 hit, the id:9)
   - {!edismax}((NOT id:7 AND NOT id:8) AND id:9) (gives 1 hit, the id:9)

In general, the issue appears when using the lucene query parser and mixing
different boolean clauses (including NOT). Thus, as you commented, the
following queries give different results:

   - NOT id:7 AND NOT id:8 AND id:9   (gives 1 hit, the id:9)
   - (NOT id:7 AND NOT id:8) AND id:9 (gives 0 hits when expecting 1)

Ever since I read the chapter "Limitations of prohibited clauses in
sub-queries" from "Apache Solr 3 Enterprise Search Server" many years ago, I
always add the all-documents clause *:* to purely negative sub-queries to
avoid the problem you mentioned. Thus I would recommend rewriting the
queries you showed us as:

   - (*:* AND NOT Proc:"ID01" AND NOT FileType:PDF_TEXT) AND
     sys_FileType:PROTOTIPE
   - (NOT id:7 AND NOT id:8 AND *:*) AND id:9 (gives 1 hit as expected)

The first query can then be read as: give me all the documents having
PROTOTIPE, except those having ID01 or PDF_TEXT.
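
For example, the rewritten query as a complete request would look something
like this (a sketch; the core name "collection1" and the default /select
handler are assumptions on my side):

http://localhost:8983/solr/collection1/select?q=(*:* AND NOT Proc:"ID01" AND NOT FileType:PDF_TEXT) AND sys_FileType:PROTOTIPE

(remember to URL-encode the spaces and quotes when sending it from curl or a
script)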

Regards.




On Wed, Feb 25, 2015 at 1:23 PM, Shawn Heisey  wrote:

> On 2/25/2015 4:04 AM, david.dav...@correo.aeat.es wrote:
> > We have problems with some queries. All of them include the NOT
> > operator, and in my opinion, the results don't make any sense.
> >
> > First problem:
> >
> > This query " NOT Proc:ID01 " returns 95806 results; however, this one
> > " NOT Proc:ID01 OR FileType:PDF_TEXT " returns 11484 results. But it's
> > impossible that adding an OR clause leaves the query with fewer results.
> >
> > Second problem. Here the problem is caused by the brackets and the NOT
> > operator:
> >
> >  This query:
> >
> > (NOT Proc:"ID01" AND NOT FileType:PDF_TEXT) AND sys_FileType:PROTOTIPE
> > returns 0 documents.
> >
> > But this query:
> >
> > (NOT Proc:"ID01" AND NOT FileType:PDF_TEXT AND sys_FileType:PROTOTIPE)
> > returns 53 documents, which is correct. So, the problem is the position
> > of the bracket. I have checked the same query without NOTs, and it works
> > fine, returning the same number of results in both cases. So, I think the
> > problem is the combination of the bracket positions and the NOT operator.
>
> For the first query, there is a difference between "NOT condition1 OR
> condition2" and "NOT (condition1 OR condition2)" ... I can imagine the
> first one increasing the document count compared to just "NOT
> condition1" ... the second one wouldn't increase it.
>
> Boolean queries in Solr (and very likely Lucene as well) do not always
> do what people expect.
>
> http://robotlibrarian.billdueber.com/2011/12/solr-and-boolean-operators/
> https://lucidworks.com/blog/why-not-and-or-and-not/
>
> As mentioned in the second link above, you'll get better results if you
> use the prefix operators with explicit parentheses.  One word of
> warning, though -- the prefix operators do not work correctly if you
> change the default operator to AND.
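>
> For example, with prefix operators, "NOT condition1 AND condition2"
> becomes (a sketch, with the default operator left at OR):
>
>   -condition1 +condition2
>
> where + marks a required clause and - marks a prohibited one, with no
> reliance on how AND/OR/NOT get translated into clauses.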
>
> Thanks,
> Shawn
>
>



Re: Facet By Distance

2015-02-26 Thread Ahmed Adel
Thank you for your replies; adding q made it work! I agree the examples are
a bit confusing. It also turned out that the points are clustered around the
center, so I had to increase d as well.
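
For reference, the request that works now looks roughly like this (a sketch;
the exact d value and facet ranges here are illustrative, not my real ones):

q=*:*&fq={!geofilt}&sfield=start_station&pt=40.71754834,-74.01322069&d=2&facet.query={!frange l=0.0 u=1.0}geodist()&facet.query={!frange l=1.0 u=2.0}geodist()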

On Wed, Feb 25, 2015 at 11:46 PM, Alexandre Rafalovitch 
wrote:

> In the examples it used to default to *:* with default params, which
> caused even more confusion.
>
> Regards,
>Alex.
> 
> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> http://www.solr-start.com/
>
>
> On 25 February 2015 at 15:21, david.w.smi...@gmail.com
>  wrote:
> > If 'q' is absent, then you always match nothing (there may be
> exceptions?);
> > so it's sort of required, in effect.  I wish it defaulted to *:*.
> >
> > ~ David Smiley
> > Freelance Apache Lucene/Solr Search Consultant/Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
> > On Wed, Feb 25, 2015 at 2:28 PM, Ahmed Adel 
> wrote:
> >
> >> Hi,
> >> Thank you for your reply. I added a filter query to the query in two
> ways
> >> as follows:
> >>
> >>
> >>
> >> fq={!geofilt}&sfield=start_station&pt=40.71754834,-74.01322069&facet.query={!frange l=0.0 u=0.1}geodist()&facet.query={!frange l=0.10001 u=0.2}geodist()&d=0.2
> >> --> returns 0 docs
> >>
> >> q=*:*&fq={!geofilt}&sfield=start_station&pt=40.71754834,-74.01322069&d=0.2
> >> --> returns 1484 docs
> >>
> >> Not sure why the first query returns 0 documents.
> >>
> >> On Wed, Feb 25, 2015 at 8:46 PM, david.w.smi...@gmail.com <
> >> david.w.smi...@gmail.com> wrote:
> >>
> >> > Hi,
> >> > This will "return all the documents in the index" because you did
> nothing
> >> > to filter them out.  Your query is *:* (everything) and there are no
> >> filter
> >> > queries.
> >> >
> >> > ~ David Smiley
> >> > Freelance Apache Lucene/Solr Search Consultant/Developer
> >> > http://www.linkedin.com/in/davidwsmiley
> >> >
> >> > On Wed, Feb 25, 2015 at 12:27 PM, Ahmed Adel 
> >> > wrote:
> >> >
> >> > > Hello,
> >> > >
> >> > > I'm trying to get Facet By Distance working on an index with
> LatLonType
> >> > > fields. The schema is as follows:
> >> > >
> >> > > [schema field definitions stripped by the list archiver;
> >> > > start_station is a stored, indexed LatLonType field]
> >> > >
> >> > >
> >> > > And the query I'm running is:
> >> > >
> >> > >
> >> > > q=*:*&sfield=start_station&pt=40.71754834,-74.01322069&facet.query={!frange l=0.0 u=0.1}geodist()&facet.query={!frange l=0.10001 u=0.2}geodist()
> >> > >
> >> > >
> >> > > But it returns all the documents in the index so it seems something
> is
> >> > > missing. I'm using Solr 4.9.0.
> >> > >
> >> > > --
> >> > >
> >> > > A. Adel
> >> > >
> >> >
> >>
> >> A. Adel
> >>
>



-- 
A. Adel


Getting started with Solr

2015-02-26 Thread Baruch Kogan
Hi, I've just installed Solr (I'll be controlling it with Solarium and using
it to search Nutch crawls). I'm working through the getting-started tutorials
described here:
https://cwiki.apache.org/confluence/display/solr/Running+Solr

When I try to run $ bin/post -c gettingstarted example/exampledocs/*.json,
I get a bunch of errors having to do
with there not being a gettingstarted folder in /solr/. Is this normal?
Should I create one?

Sincerely,

Baruch Kogan
Marketing Manager
Seller Panda 
+972(58)441-3829
baruch.kogan at Skype


Re: [ANNOUNCE] Luke 4.10.3 released

2015-02-26 Thread Tomoko Uchida
Thank you for checking it out!
Sorry, I forgot to note some important information...

An ivy jar is needed to compile. The packaging process needs to be organized,
but for now I'm borrowing the jar from lucene's tools/lib.
In my environment (Fedora 20, OpenJDK 1.7.0_71), it can be compiled and
run as follows.
If there are any problems, please let me know.



$ svn co http://svn.apache.org/repos/asf/lucene/sandbox/luke/
$ cd luke/

// copy ivy jar to lib/tools
$ cp /path/to/lucene_solr_4_10_3/lucene/tools/lib/ivy-2.3.0.jar lib/tools/
$ ls lib/tools/
ivy-2.3.0.jar

$ java -version
java version "1.7.0_71"
OpenJDK Runtime Environment (fedora-2.5.3.3.fc20-x86_64 u71-b14)
OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)

$ ant ivy-resolve
...
BUILD SUCCESSFUL

// compile and make jars and run
$ ant dist
...
BUILD SUCCESSFUL
$ java -cp "dist/*" org.apache.lucene.luke.ui.LukeApplication
...


Thanks,
Tomoko

2015-02-26 16:39 GMT+09:00 Dmitry Kan :

> Hi Tomoko,
>
> Thanks for the link. Do you have build instructions somewhere? When I
> executed ant with no params, I get:
>
> BUILD FAILED
> /home/dmitry/projects/svn/luke/build.xml:40:
> /home/dmitry/projects/svn/luke/lib-ivy does not exist.
>
>
> On Thu, Feb 26, 2015 at 2:27 AM, Tomoko Uchida <
> tomoko.uchida.1...@gmail.com
> > wrote:
>
> > Thanks!
> >
> > Would you announce at LUCENE-2562 to me and all watchers interested in
> this
> > issue, when the branch is ready? :)
> > As you know, current pivots's version (that supports Lucene 4.10.3) is
> > here.
> > http://svn.apache.org/repos/asf/lucene/sandbox/luke/
> >
> > Regards,
> > Tomoko
> >
> > 2015-02-25 18:37 GMT+09:00 Dmitry Kan :
> >
> > > Ok, sure. The plan is to make the pivot branch in the current github
> repo
> > > and update its structure accordingly.
> > > Once it is there, I'll let you know.
> > >
> > > Thank you,
> > > Dmitry
> > >
> > > On Tue, Feb 24, 2015 at 5:26 PM, Tomoko Uchida <
> > > tomoko.uchida.1...@gmail.com
> > > > wrote:
> > >
> > > > Hi Dmitry,
> > > >
> > > > Thank you for the detailed clarification!
> > > >
> > > > Recently, I've created a few patches for the Pivot version
> > > > (LUCENE-2562), so I'd like to do some more work and keep it up to
> > > > date.
> > > >
> > > > > If you would like to work on the Pivot version, may I suggest you
> to
> > > fork
> > > > > the github's version? The ultimate goal is to donate this to
> Apache,
> > > but
> > > > at
> > > > > least we will have the common plate. :)
> > > >
> > > > Yes, I love the idea of having a common code base.
> > > > I've looked at both code bases, github's (thinlet's) and Pivot's;
> > > > Pivot's version has a very different structure from github's (I think
> > > > that is mainly due to the UI framework's requirements).
> > > > So it seems difficult to directly fork github's version to develop
> > > > Pivot's version..., but I think I (or any other developers) could
> > > > catch up with changes in github's version.
> > > > There's a long way to go for Pivot's version; of course, I'd also like
> > > > to make pull requests to enhance github's version if I can.
> > > >
> > > > Thanks,
> > > > Tomoko
> > > >
> > > > 2015-02-24 23:34 GMT+09:00 Dmitry Kan :
> > > >
> > > > > Hi, Tomoko!
> > > > >
> > > > > Thanks for being a fan of luke!
> > > > >
> > > > > Current status of github's luke (https://github.com/DmitryKey/luke)
> > > > > is that it has releases for all the major lucene versions since
> > > > > 4.3.0, excluding 4.4.0 (luke 4.5.0 should be able to open indices of
> > > > > 4.4.0) and the latest -- 5.0.0.
> > > > >
> > > > > Porting github's luke to an ALv2-compliant framework (GWT or Pivot)
> > > > > is a long-standing goal. With GWT I had issues related to listing
> > > > > and reading the index directory, so that effort has been parked.
> > > > > Most recently I have been approaching Pivot. Mark Miller has done an
> > > > > initial port, which I took as the basis. I'm hoping to continue on
> > > > > this track as time permits.
> > > > >
> > > > >
> > > > > If you would like to work on the Pivot version, may I suggest that
> > > > > you fork the github version? The ultimate goal is to donate this to
> > > > > Apache, but at least we will have a common plate. :)
> > > > >
> > > > >
> > > > > Thanks,
> > > > > Dmitry
> > > > >
> > > > > On Tue, Feb 24, 2015 at 4:02 PM, Tomoko Uchida <
> > > > > tomoko.uchida.1...@gmail.com
> > > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'm a user / fan of Luke, so I deeply appreciate your work.
> > > > > >
> > > > > > I've carefully read the readme and noticed one of the project's
> > > > > > goals:
> > > > > > "To port the thinlet UI to an ASL compliant license framework so
> > that
> > > > it
> > > > > > can be contributed back to Apache Lucene. Current work is done
> with
> > > GWT
> > > > > > 2.5.1."
> > > > > >
> > > > > > There has been GWT based, ASL compliant 

Unable to find query result in solr 5.0.0

2015-02-26 Thread rupak
Hi,

I am new to Solr and am using the Solr 5.0.0 search server. After installing,
when I search for any keyword in Solr 5.0.0 it does not give any results
back. But when I was using a previous version of Solr (1.3.0, previously
installed), it gave results for every queried keyword.

For example: in the previous version (1.3.0), when searching for any keyword
like “Hotel”, “Motel”, “Television”, “i-pod”, “Books”, “cricket”, etc. in the
Query String section, it returned all search results, with a large number of
records, as XML output.

But in Solr 5.0.0 I start up with the techproducts core (bin/solr -e
techproducts), and then searching for keywords like “Television” or “i-pod”
gives only 2 or 3 results, while searching for any other keyword like “Hotel”
or “Motel” does not return any results. Also, if we start up with cloud via
bin/solr start -e cloud -noprompt, it does not give any results. Also, when
we try to use the ‘POST’ tool by executing post.jar, the command prompt says
that this is not a valid command.

Currently I’m unable to find any keyword. Please help me to query any string
keyword in Solr 5.0.0.

Thanks & Regards,
Rupak Das





Re: [ANNOUNCE] Luke 4.10.3 released

2015-02-26 Thread Dmitry Kan
Thanks, Tomoko, it compiles ok!

Now launching produces some errors:

$ java -cp "dist/*" org.apache.lucene.luke.ui.LukeApplication
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.lucene.luke.ui.LukeApplication.main(Unknown Source)
Caused by: java.lang.NumberFormatException: For input string: "3 1644336 "
at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Byte.parseByte(Byte.java:148)
at java.lang.Byte.parseByte(Byte.java:174)
at org.apache.pivot.util.Version.decode(Version.java:156)
at
org.apache.pivot.wtk.ApplicationContext.(ApplicationContext.java:1704)
... 1 more


On Thu, Feb 26, 2015 at 1:48 PM, Tomoko Uchida  wrote:

> Thank you for checking out it!
> Sorry, I've forgot to note important information...
>
> ivy jar is needed to compile. Packaging process needs to be organized, but
> for now, I'm borrowing it from lucene's tools/lib.
> In my environment, Fedora 20 and OpenJDK 1.7.0_71, it can be compiled and
> run as follows.
> If there are any problems, please let me know.
>
> 
>
> $ svn co http://svn.apache.org/repos/asf/lucene/sandbox/luke/
> $ cd luke/
>
> // copy ivy jar to lib/tools
> $ cp /path/to/lucene_solr_4_10_3/lucene/tools/lib/ivy-2.3.0.jar lib/tools/
> $ ls lib/tools/
> ivy-2.3.0.jar
>
> $ java -version
> java version "1.7.0_71"
> OpenJDK Runtime Environment (fedora-2.5.3.3.fc20-x86_64 u71-b14)
> OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
>
> $ ant ivy-resolve
> ...
> BUILD SUCCESSFUL
>
> // compile and make jars and run
> $ ant dist
> ...
> BUILD SUCCESSFULL
> $ java -cp "dist/*" org.apache.lucene.luke.ui.LukeApplication
> ...
> 
>
> Thanks,
> Tomoko
>
> 2015-02-26 16:39 GMT+09:00 Dmitry Kan :
>
> > Hi Tomoko,
> >
> > Thanks for the link. Do you have build instructions somewhere? When I
> > executed ant with no params, I get:
> >
> > BUILD FAILED
> > /home/dmitry/projects/svn/luke/build.xml:40:
> > /home/dmitry/projects/svn/luke/lib-ivy does not exist.
> >
> >
> > On Thu, Feb 26, 2015 at 2:27 AM, Tomoko Uchida <
> > tomoko.uchida.1...@gmail.com
> > > wrote:
> >
> > > Thanks!
> > >
> > > Would you announce at LUCENE-2562 to me and all watchers interested in
> > this
> > > issue, when the branch is ready? :)
> > > As you know, current pivots's version (that supports Lucene 4.10.3) is
> > > here.
> > > http://svn.apache.org/repos/asf/lucene/sandbox/luke/
> > >
> > > Regards,
> > > Tomoko
> > >
> > > 2015-02-25 18:37 GMT+09:00 Dmitry Kan :
> > >
> > > > Ok, sure. The plan is to make the pivot branch in the current github
> > repo
> > > > and update its structure accordingly.
> > > > Once it is there, I'll let you know.
> > > >
> > > > Thank you,
> > > > Dmitry
> > > >
> > > > On Tue, Feb 24, 2015 at 5:26 PM, Tomoko Uchida <
> > > > tomoko.uchida.1...@gmail.com
> > > > > wrote:
> > > >
> > > > > Hi Dmitry,
> > > > >
> > > > > Thank you for the detailed clarification!
> > > > >
> > > > > Recently, I've created a few patches to Pivot version(LUCENE-2562),
> > so
> > > > I'd
> > > > > like to some more work and keep up to date it.
> > > > >
> > > > > > If you would like to work on the Pivot version, may I suggest you
> > to
> > > > fork
> > > > > > the github's version? The ultimate goal is to donate this to
> > Apache,
> > > > but
> > > > > at
> > > > > > least we will have the common plate. :)
> > > > >
> > > > > Yes, I love to the idea about having common code base.
> > > > > I've looked at both codes of github's (thinlet's) and Pivot's,
> > Pivot's
> > > > > version has very different structure from github's (I think that is
> > > > mainly
> > > > > for UI framework's requirement.)
> > > > > So it seems to be difficult to directly fork github's version to
> > > develop
> > > > > Pivot's version..., but I think I (or any other developers) could
> > catch
> > > > up
> > > > > changes in github's version.
> > > > > There's long way to go for Pivot's version, of course, I'd like to
> > also
> > > > > make pull requests to enhance github's version if I can.
> > > > >
> > > > > Thanks,
> > > > > Tomoko
> > > > >
> > > > > 2015-02-24 23:34 GMT+09:00 Dmitry Kan :
> > > > >
> > > > > > Hi, Tomoko!
> > > > > >
> > > > > > Thanks for being a fan of luke!
> > > > > >
> > > > > > Current status of github's luke (
> https://github.com/DmitryKey/luke
> > )
> > > is
> > > > > > that
> > > > > > it has releases for all the major lucene versions since 4.3.0,
> > > > excluding
> > > > > > 4.4.0 (luke 4.5.0 should be able open indices of 4.4.0) and the
> > > latest
> > > > --
> > > > > > 5.0.0.
> > > > > >
> > > > > > Porting the github's luke to ALv2 compliant framework (GWT or
> > Pivot)
> > > > is a
> > > > > > long standing goal. With GWT I had issues related to listing and
> > > > reading
> > > > > > the index directory. So this effort has been parked. Most
> recently
> > I
> > > > have
> > > > 

Re: Unable to find query result in solr 5.0.0

2015-02-26 Thread Jack Krupansky
Does a query for *:* return all documents? Pick one of those documents and
try a query using a field name and the value of that field for one of the
documents and see if that document is returned.
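
For example (a sketch against the techproducts example; adjust the core name
and field to whatever your setup uses):

curl 'http://localhost:8983/solr/techproducts/select?q=*:*&wt=json&rows=1'
curl 'http://localhost:8983/solr/techproducts/select?q=name:ipod&wt=json'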

Maybe you skipped a step in the tutorial process or maybe there was an
error that you ignored.

Please confirm which doc you were reading for the tutorial steps.


-- Jack Krupansky

On Thu, Feb 26, 2015 at 6:17 AM, rupak  wrote:

> Hi,
>
> I am new in Solr and using Solr 5.0.0 search server. After installing when
> I’m going to search any keyword in solr 5.0.0 it dose not give any results
> back. But when I was using a previous version of Solr (1.3.0)(previously
> installed) it gives each and every results of the queried Keyword.
>
> For Example: In previous version (1.3.0) when I’m searching with any
> keyword
> like “Hotel”, “Motel”, “Television” , “i-pod” , “Books”, “cricket” etc in
> Query String section, it gives all search results with large number of
> records as a XML output.
>
> But in Solr 5.0.0 I start up with techproducts core (bin/solr -e
> techproducts) and then going to search keywords like “Television” , “i-pod”
> etc then it gives 2 or 3 results and also if we going to search any others
> keyword like “Hotel”, “Motel” it dose not return back any results. Also if
> we start up with cloud by bin/solr start -e cloud -noprompt it dose not
> gives any results. Also when we are going to use ‘POST’ tools by executing
> post.jar in command prompt says an error that this is not a valid command.
>
> Currently I’m unable to find any keyword. Please help me to query any
> string
> keyword from  solr 5.0.0.
>
> Thanks & Regards,
> Rupak Das
>
>
>
>


Solr 5.0.0 on Windows Server

2015-02-26 Thread John Jenniskens
Hello,

I'm deploying Solr 5.0.0 on a Windows 2008 server.
I'm planning to add a task to the Task Scheduler to start the Solr server at 
system boot time, so I call bin\solr.cmd start from the Task Scheduler.

Is this the preferred method on Windows, given that I read that running under 
an external Tomcat or Jetty is not supported any more?
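
For concreteness, the task I'm registering looks roughly like this (a sketch;
the C:\solr install path is just an example):

schtasks /create /tn "Apache Solr" /tr "C:\solr\bin\solr.cmd start" /sc onstart /ru SYSTEM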

Regards,

John Jenniskens
(fairly new to Solr)


Re: [ANNOUNCE] Luke 4.10.3 released

2015-02-26 Thread Tomoko Uchida
Sorry, I'm afraid I have not encountered such errors at launch.
It seems something is wrong around Pivot, but I have no idea about it.
Would you tell me the Java version you're using?

Tomoko

2015-02-26 21:15 GMT+09:00 Dmitry Kan :

> Thanks, Tomoko, it compiles ok!
>
> Now launching produces some errors:
>
> $ java -cp "dist/*" org.apache.lucene.luke.ui.LukeApplication
> Exception in thread "main" java.lang.ExceptionInInitializerError
> at org.apache.lucene.luke.ui.LukeApplication.main(Unknown Source)
> Caused by: java.lang.NumberFormatException: For input string: "3 1644336 "
> at
>
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Integer.parseInt(Integer.java:492)
> at java.lang.Byte.parseByte(Byte.java:148)
> at java.lang.Byte.parseByte(Byte.java:174)
> at org.apache.pivot.util.Version.decode(Version.java:156)
> at
>
> org.apache.pivot.wtk.ApplicationContext.(ApplicationContext.java:1704)
> ... 1 more
>
>
> On Thu, Feb 26, 2015 at 1:48 PM, Tomoko Uchida <
> tomoko.uchida.1...@gmail.com
> > wrote:
>
> > Thank you for checking out it!
> > Sorry, I've forgot to note important information...
> >
> > ivy jar is needed to compile. Packaging process needs to be organized,
> but
> > for now, I'm borrowing it from lucene's tools/lib.
> > In my environment, Fedora 20 and OpenJDK 1.7.0_71, it can be compiled and
> > run as follows.
> > If there are any problems, please let me know.
> >
> > 
> >
> > $ svn co http://svn.apache.org/repos/asf/lucene/sandbox/luke/
> > $ cd luke/
> >
> > // copy ivy jar to lib/tools
> > $ cp /path/to/lucene_solr_4_10_3/lucene/tools/lib/ivy-2.3.0.jar
> lib/tools/
> > $ ls lib/tools/
> > ivy-2.3.0.jar
> >
> > $ java -version
> > java version "1.7.0_71"
> > OpenJDK Runtime Environment (fedora-2.5.3.3.fc20-x86_64 u71-b14)
> > OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
> >
> > $ ant ivy-resolve
> > ...
> > BUILD SUCCESSFUL
> >
> > // compile and make jars and run
> > $ ant dist
> > ...
> > BUILD SUCCESSFULL
> > $ java -cp "dist/*" org.apache.lucene.luke.ui.LukeApplication
> > ...
> > 
> >
> > Thanks,
> > Tomoko
> >
> > 2015-02-26 16:39 GMT+09:00 Dmitry Kan :
> >
> > > Hi Tomoko,
> > >
> > > Thanks for the link. Do you have build instructions somewhere? When I
> > > executed ant with no params, I get:
> > >
> > > BUILD FAILED
> > > /home/dmitry/projects/svn/luke/build.xml:40:
> > > /home/dmitry/projects/svn/luke/lib-ivy does not exist.
> > >
> > >
> > > On Thu, Feb 26, 2015 at 2:27 AM, Tomoko Uchida <
> > > tomoko.uchida.1...@gmail.com
> > > > wrote:
> > >
> > > > Thanks!
> > > >
> > > > Would you announce at LUCENE-2562 to me and all watchers interested
> in
> > > this
> > > > issue, when the branch is ready? :)
> > > > As you know, current pivots's version (that supports Lucene 4.10.3)
> is
> > > > here.
> > > > http://svn.apache.org/repos/asf/lucene/sandbox/luke/
> > > >
> > > > Regards,
> > > > Tomoko
> > > >
> > > > 2015-02-25 18:37 GMT+09:00 Dmitry Kan :
> > > >
> > > > > Ok, sure. The plan is to make the pivot branch in the current
> github
> > > repo
> > > > > and update its structure accordingly.
> > > > > Once it is there, I'll let you know.
> > > > >
> > > > > Thank you,
> > > > > Dmitry
> > > > >
> > > > > On Tue, Feb 24, 2015 at 5:26 PM, Tomoko Uchida <
> > > > > tomoko.uchida.1...@gmail.com
> > > > > > wrote:
> > > > >
> > > > > > Hi Dmitry,
> > > > > >
> > > > > > Thank you for the detailed clarification!
> > > > > >
> > > > > > Recently, I've created a few patches to Pivot
> version(LUCENE-2562),
> > > so
> > > > > I'd
> > > > > > like to some more work and keep up to date it.
> > > > > >
> > > > > > > If you would like to work on the Pivot version, may I suggest
> you
> > > to
> > > > > fork
> > > > > > > the github's version? The ultimate goal is to donate this to
> > > Apache,
> > > > > but
> > > > > > at
> > > > > > > least we will have the common plate. :)
> > > > > >
> > > > > > Yes, I love to the idea about having common code base.
> > > > > > I've looked at both codes of github's (thinlet's) and Pivot's,
> > > Pivot's
> > > > > > version has very different structure from github's (I think that
> is
> > > > > mainly
> > > > > > for UI framework's requirement.)
> > > > > > So it seems to be difficult to directly fork github's version to
> > > > develop
> > > > > > Pivot's version..., but I think I (or any other developers) could
> > > catch
> > > > > up
> > > > > > changes in github's version.
> > > > > > There's long way to go for Pivot's version, of course, I'd like
> to
> > > also
> > > > > > make pull requests to enhance github's version if I can.
> > > > > >
> > > > > > Thanks,
> > > > > > Tomoko
> > > > > >
> > > > > > 2015-02-24 23:34 GMT+09:00 Dmitry Kan :
> > > > > >
> > > > > > > Hi, Tomoko!
> > > > > > >
> > > > > > > Thanks for being a fan of luke!
> > > > > > >
> > > > > > > Current status of github's luke (
> > https://githu

Re: Solr Document expiration with TTL

2015-02-26 Thread Makailol Charls
Hi

Thanks for your quick reply.

" since your time_to_live_s and expire_at_dt fields are both
stored, can you confirm that an expire_at_dt field is getting populated by
the update processor by doing a simple query for your doc (ie
q=id:10seconds) "

No, the expire_at_dt field does not get populated when we add a document
with the TTL defined in the TTL field, as with the following query:

curl -X POST -H 'Content-Type: application/json' '
http://localhost:8983/solr/collection1/update?commit=true' -d
'[{"id":"10seconds","time_to_live_s":"+10SECONDS"}]'

and when the document is retrieved, it gives the following result (you can
see that the expire_at_dt field is not showing at all):

curl -H 'Content-Type: application/json' '
http://localhost:8983/solr/collection1/select?q=id:10seconds&wt=json&indent=true
'

{
  "responseHeader":{
"status":0,
"QTime":19,
"params":{
  "indent":"true",
  "q":"id:10seconds",
  "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
  {
"id":"10seconds",
"time_to_live_s":"+10SECONDS",
"_version_":1494171978430414848}]
  }}


Whereas if the document is added with the expiration value defined
explicitly in the expire_at_dt field, like:

curl -X POST -H 'Content-Type: application/json' '
http://localhost:8983/solr/collection1/update?commit=true' -d
'[{"id":"10seconds","expire_at_dt":"NOW+10SECONDS"}]'

we can see the document with the expire_at_dt field populated:

curl -H 'Content-Type: application/json' '
http://localhost:8983/solr/collection1/select?q=id:10seconds&wt=json&indent=true
'
{
  "responseHeader":{
"status":0,
"QTime":2,
"params":{
  "indent":"true",
  "q":"id:10seconds",
  "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
  {
"id":"10seconds",
"expire_at_dt":"2015-02-26T12:27:31.983Z",
"_version_":1494172190095966208}]
  }}

Thanks,
Makailol

On Wed, Feb 25, 2015 at 10:00 PM, Chris Hostetter 
wrote:

>
> : Following query posts a document and sets "expire_at_dt" explicitly. That
> : is working perfectly ok and ducument expires at defined time.
>
> so the delete trigge logic is working correctly...
>
> : But when trying to post with TTL (following query), document does not
> : expire after given time.
>
> ...which suggests that the TTL->expire_at logic is not being applied
> properly.
>
> which is weird.
>
> since your time_to_live_s and expire_at_dt fields are both
> stored, can you confirm that an expire_at_dt field is getting populated by
> the update processor by doing a simple query for your doc (ie
> q=id:10seconds)
>
> (either way: i can't explain why it's not getting deleted, but it would
> help narrow down where the problem is)
>
>
> -Hoss
> http://www.lucidworks.com/
>


Re: Solr Document expiration with TTL

2015-02-26 Thread Makailol Charls
Hi Alex,

Thanks for the reply.

Yes, we have already tried setting the autoDeletePeriodSeconds period to
some low value like 5 seconds, and checking for document expiration
after 30 seconds, a minute, or even an hour. But the result is the same, and
the document does not get expired automatically.

Thanks,
Makailol

On Thu, Feb 26, 2015 at 6:22 PM, Makailol Charls <4extrama...@gmail.com>
wrote:

> Hi
>
> Thanks for your quick reply.
>
> " since your time_to_live_s and expire_at_dt fields are both
> stored, can you confirm that a expire_at_dt field is getting popularted by
> the update processor by doing as simple query for your doc (ie
> q=id:10seconds) "
>
> No, expire_at_dt field does not get populated when we have added document
> with the TTL defined in the TTL field. Like with following query,
>
> curl -X POST -H 'Content-Type: application/json' '
> http://localhost:8983/solr/collection1/update?commit=true' -d
> '[{"id":"10seconds","time_to_live_s":"+10SECONDS"}]'
>
> and when document retrieved, it gives following result (Can see that
> expire_at_dt field is not showing at all).
>
> curl -H 'Content-Type: application/json' '
> http://localhost:8983/solr/collection1/select?q=id:10seconds&wt=json&indent=true
> '
>
> {
>   "responseHeader":{
> "status":0,
> "QTime":19,
> "params":{
>   "indent":"true",
>   "q":"id:10seconds",
>   "wt":"json"}},
>   "response":{"numFound":1,"start":0,"docs":[
>   {
> "id":"10seconds",
> "time_to_live_s":"+10SECONDS",
> "_version_":1494171978430414848}]
>   }}
>
>
> While if document is added with the TTL value defined explicitly in
> expire_at_dt field, like,
>
> curl -X POST -H 'Content-Type: application/json' '
> http://localhost:8983/solr/collection1/update?commit=true' -d
> '[{"id":"10seconds","expire_at_dt":"NOW+10SECONDS"}]'
>
> We can see the document with expire_at_dt field populated.
>
> curl -H 'Content-Type: application/json' '
> http://localhost:8983/solr/collection1/select?q=id:10seconds&wt=json&indent=true
> '
> {
>   "responseHeader":{
> "status":0,
> "QTime":2,
> "params":{
>   "indent":"true",
>   "q":"id:10seconds",
>   "wt":"json"}},
>   "response":{"numFound":1,"start":0,"docs":[
>   {
> "id":"10seconds",
> "expire_at_dt":"2015-02-26T12:27:31.983Z",
> "_version_":1494172190095966208}]
>   }}
>
> Thanks,
> Makailol
>
> On Wed, Feb 25, 2015 at 10:00 PM, Chris Hostetter <
> hossman_luc...@fucit.org> wrote:
>
>>
>> : Following query posts a document and sets "expire_at_dt" explicitly.
>> That
>> : is working perfectly ok and ducument expires at defined time.
>>
>> so the delete trigge logic is working correctly...
>>
>> : But when trying to post with TTL (following query), document does not
>> : expire after given time.
>>
>> ...which suggests that the TTL->expire_at logic is not being applied
>> properly.
>>
>> which is weird.
>>
>> since your time_to_live_s and expire_at_dt fields are both
>> stored, can you confirm that a expire_at_dt field is getting popularted by
>> the update processor by doing as simple query for your doc (ie
>> q=id:10seconds)
>>
>> (either way: i can't explain why it's not getting deleted, but it would
>> help narrow down where the problem is)
>>
>>
>> -Hoss
>> http://www.lucidworks.com/
>>
>
>


Re: [ANNOUNCE] Luke 4.10.3 released

2015-02-26 Thread Dmitry Kan
Sure, it is:

java version "1.7.0_76"
Java(TM) SE Runtime Environment (build 1.7.0_76-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.76-b04, mixed mode)


On Thu, Feb 26, 2015 at 2:39 PM, Tomoko Uchida  wrote:

> Sorry, I'm afraid I have not encountered such errors when launch.
> Seems something wrong around Pivot's, but I have no idea about it.
> Would you tell me java version you're using ?
>
> Tomoko
>
> 2015-02-26 21:15 GMT+09:00 Dmitry Kan :
>
> > Thanks, Tomoko, it compiles ok!
> >
> > Now launching produces some errors:
> >
> > $ java -cp "dist/*" org.apache.lucene.luke.ui.LukeApplication
> > Exception in thread "main" java.lang.ExceptionInInitializerError
> > at org.apache.lucene.luke.ui.LukeApplication.main(Unknown Source)
> > Caused by: java.lang.NumberFormatException: For input string: "3 1644336
> "
> > at
> >
> >
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> > at java.lang.Integer.parseInt(Integer.java:492)
> > at java.lang.Byte.parseByte(Byte.java:148)
> > at java.lang.Byte.parseByte(Byte.java:174)
> > at org.apache.pivot.util.Version.decode(Version.java:156)
> > at
> >
> >
> org.apache.pivot.wtk.ApplicationContext.(ApplicationContext.java:1704)
> > ... 1 more
> >
> >
> > On Thu, Feb 26, 2015 at 1:48 PM, Tomoko Uchida <
> > tomoko.uchida.1...@gmail.com
> > > wrote:
> >
> > > Thank you for checking out it!
> > > Sorry, I've forgot to note important information...
> > >
> > > ivy jar is needed to compile. Packaging process needs to be organized,
> > but
> > > for now, I'm borrowing it from lucene's tools/lib.
> > > In my environment, Fedora 20 and OpenJDK 1.7.0_71, it can be compiled
> and
> > > run as follows.
> > > If there are any problems, please let me know.
> > >
> > > 
> > >
> > > $ svn co http://svn.apache.org/repos/asf/lucene/sandbox/luke/
> > > $ cd luke/
> > >
> > > // copy ivy jar to lib/tools
> > > $ cp /path/to/lucene_solr_4_10_3/lucene/tools/lib/ivy-2.3.0.jar
> > lib/tools/
> > > $ ls lib/tools/
> > > ivy-2.3.0.jar
> > >
> > > $ java -version
> > > java version "1.7.0_71"
> > > OpenJDK Runtime Environment (fedora-2.5.3.3.fc20-x86_64 u71-b14)
> > > OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
> > >
> > > $ ant ivy-resolve
> > > ...
> > > BUILD SUCCESSFUL
> > >
> > > // compile and make jars and run
> > > $ ant dist
> > > ...
> > > BUILD SUCCESSFULL
> > > $ java -cp "dist/*" org.apache.lucene.luke.ui.LukeApplication
> > > ...
> > > 
> > >
> > > Thanks,
> > > Tomoko
> > >
> > > 2015-02-26 16:39 GMT+09:00 Dmitry Kan :
> > >
> > > > Hi Tomoko,
> > > >
> > > > Thanks for the link. Do you have build instructions somewhere? When I
> > > > executed ant with no params, I get:
> > > >
> > > > BUILD FAILED
> > > > /home/dmitry/projects/svn/luke/build.xml:40:
> > > > /home/dmitry/projects/svn/luke/lib-ivy does not exist.
> > > >
> > > >
> > > > On Thu, Feb 26, 2015 at 2:27 AM, Tomoko Uchida <
> > > > tomoko.uchida.1...@gmail.com
> > > > > wrote:
> > > >
> > > > > Thanks!
> > > > >
> > > > > Would you announce at LUCENE-2562 to me and all watchers interested
> > in
> > > > this
> > > > > issue, when the branch is ready? :)
> > > > > As you know, current pivots's version (that supports Lucene 4.10.3)
> > is
> > > > > here.
> > > > > http://svn.apache.org/repos/asf/lucene/sandbox/luke/
> > > > >
> > > > > Regards,
> > > > > Tomoko
> > > > >
> > > > > 2015-02-25 18:37 GMT+09:00 Dmitry Kan :
> > > > >
> > > > > > Ok, sure. The plan is to make the pivot branch in the current
> > github
> > > > repo
> > > > > > and update its structure accordingly.
> > > > > > Once it is there, I'll let you know.
> > > > > >
> > > > > > Thank you,
> > > > > > Dmitry
> > > > > >
> > > > > > On Tue, Feb 24, 2015 at 5:26 PM, Tomoko Uchida <
> > > > > > tomoko.uchida.1...@gmail.com
> > > > > > > wrote:
> > > > > >
> > > > > > > Hi Dmitry,
> > > > > > >
> > > > > > > Thank you for the detailed clarification!
> > > > > > >
> > > > > > > Recently, I've created a few patches to Pivot
> > version(LUCENE-2562),
> > > > so
> > > > > > I'd
> > > > > > > like to some more work and keep up to date it.
> > > > > > >
> > > > > > > > If you would like to work on the Pivot version, may I suggest
> > you
> > > > to
> > > > > > fork
> > > > > > > > the github's version? The ultimate goal is to donate this to
> > > > Apache,
> > > > > > but
> > > > > > > at
> > > > > > > > least we will have the common plate. :)
> > > > > > >
> > > > > > > Yes, I love to the idea about having common code base.
> > > > > > > I've looked at both codes of github's (thinlet's) and Pivot's,
> > > > Pivot's
> > > > > > > version has very different structure from github's (I think
> that
> > is
> > > > > > mainly
> > > > > > > for UI framework's requirement.)
> > > > > > > So it seems to be difficult to directly fork github's version
> to
> > > > > develop
> > > > > > > Pivot's version..., but I think I (or any other developers)
> cou

Re: Can't index all docs in a local folder with DIH in Solr 5.0.0

2015-02-26 Thread Gary Taylor

Alex,

Same results on recursive=true / recursive=false.

I also tried importing plain text files instead of epub (still using
TikaEntityProcessor though) and get exactly the same result - i.e. all
files fetched, but only one document indexed in Solr.

With verbose output, I get a row for each file in the directory, but
only the first one has a non-empty documentImport entity. All
subsequent documentImport entities just have an empty document#2 entry, e.g.:


 
  "verbose-output": [
"entity:files",
[
  null,
  "--- row #1-",
  "fileSize",
  2609004,
  "fileLastModified",
  "2015-02-25T11:37:25.217Z",
  "fileAbsolutePath",
  "c:\\Users\\gt\\Documents\\epub\\issue018.epub",
  "fileDir",
  "c:\\Users\\gt\\Documents\\epub",
  "file",
  "issue018.epub",
  null,
  "-",
  "entity:documentImport",
  [
"document#1",
[
  "query",
  "c:\\Users\\gt\\Documents\\epub\\issue018.epub",
  "time-taken",
  "0:0:0.0",
  null,
  "--- row #1-",
  "text",
  "< ... parsed epub text - snip ... >"
  "title",
  "Issue 18 title",
  "Author",
  "Author text",
  null,
  "-"
],
"document#2",
[]
  ],
  null,
  "--- row #2-",
  "fileSize",
  4428804,
  "fileLastModified",
  "2015-02-25T11:37:36.399Z",
  "fileAbsolutePath",
  "c:\\Users\\gt\\Documents\\epub\\issue019.epub",
  "fileDir",
  "c:\\Users\\gt\\Documents\\epub",
  "file",
  "issue019.epub",
  null,
  "-",
  "entity:documentImport",
  [
"document#2",
[]
  ],
  null,
  "--- row #3-",
  "fileSize",
  2580266,
  "fileLastModified",
  "2015-02-25T11:37:41.188Z",
  "fileAbsolutePath",
  "c:\\Users\\gt\\Documents\\epub\\issue020.epub",
  "fileDir",
  "c:\\Users\\gt\\Documents\\epub",
  "file",
  "issue020.epub",
  null,
  "-",
  "entity:documentImport",
  [
"document#2",
[]
  ],

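
For reference, the data-config driving this is along the following lines (a
sketch from memory -- attribute values in my real config may differ slightly):

<dataConfig>
  <dataSource type="BinFileDataSource" name="bin"/>
  <document>
    <!-- outer entity: walks the directory and emits one row per file -->
    <entity name="files" processor="FileListEntityProcessor"
            rootEntity="false" dataSource="null"
            baseDir="c:/Users/gt/Documents/epub" fileName=".*epub"
            recursive="false" onError="skip">
      <!-- inner entity: hands each file to Tika for text extraction -->
      <entity name="documentImport" processor="TikaEntityProcessor"
              url="${files.fileAbsolutePath}" format="text"
              dataSource="bin" onError="skip">
        <field column="text" name="text"/>
        <field column="title" name="title"/>
        <field column="Author" name="author"/>
      </entity>
    </entity>
  </document>
</dataConfig>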





Re: Solr takes time to start

2015-02-26 Thread Shawn Heisey
On 2/26/2015 12:11 AM, Nitin Solanki wrote:
>  Why is Solr taking so much time to start all nodes/ports?

Very slow Solr startup is typically caused by one of two things.  Both
are described here:

https://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup

There could be other causes, but one of these two are usually the culprit.

Thanks,
Shawn



Re: Getting started with Solr

2015-02-26 Thread Erik Hatcher
How did you start Solr?   If you started with `bin/solr start -e cloud` you’ll 
have a gettingstarted collection created automatically, otherwise you’ll need 
to create it yourself with `bin/solr create -c gettingstarted`
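
In other words, one of these two routes (run from the Solr install directory):

  bin/solr start -e cloud -noprompt        # creates gettingstarted for you
  bin/post -c gettingstarted example/exampledocs/*.json

or, against an already-running standalone Solr:

  bin/solr create -c gettingstarted
  bin/post -c gettingstarted example/exampledocs/*.json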


—
Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com 




> On Feb 26, 2015, at 4:53 AM, Baruch Kogan  wrote:
> 
> Hi, I've just installed Solr (will be controlling with Solarium and using
> to search Nutch queries.)  I'm working through the starting tutorials
> described here:
> https://cwiki.apache.org/confluence/display/solr/Running+Solr
> 
> When I try to run $ bin/post -c gettingstarted example/exampledocs/*.json,
> I get a bunch of errors having to do
> with there not being a gettingstarted folder in /solr/. Is this normal?
> Should I create one?
> 
> Sincerely,
> 
> Baruch Kogan
> Marketing Manager
> Seller Panda 
> +972(58)441-3829
> baruch.kogan at Skype



Re: Can't index all docs in a local folder with DIH in Solr 5.0.0

2015-02-26 Thread Alexandre Rafalovitch
On 26 February 2015 at 08:32, Gary Taylor  wrote:
> Alex,
>
> Same results on recursive=true / recursive=false.
>
> I also tried importing plain text files instead of epub (still using
> TikeEntityProcessor though) and get exactly the same result - ie. all files
> fetched, but only one document indexed in Solr.

To me, this would indicate that the problem is with the inner
DIH entity then. As a next set of steps, I would probably:
1) remove both onError statements and see if there is an exception
that is being swallowed.
2) run the import under ProcessMonitor and see if the other files are
actually being read
https://technet.microsoft.com/en-us/library/bb896645.aspx
3) Assume a Windows bug and test this on Mac/Linux
4) File a JIRA with a reproduction case. If there is a full reproduction
setup, I'll test it on machines I have access to with full debugger
step-through.

For example, I wonder if BinFileDataSource is somehow not cleaning up
after the first file properly on Windows and fails to open the second
one.

Regards,
   Alex.


Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


Re: Can't index all docs in a local folder with DIH in Solr 5.0.0

2015-02-26 Thread Gary Taylor

Alex,

That's great.  Thanks for the pointers.  I'll try and get more info on 
this and file a JIRA issue.


Kind regards,
Gary.

On 26/02/2015 14:16, Alexandre Rafalovitch wrote:

On 26 February 2015 at 08:32, Gary Taylor  wrote:

Alex,

Same results on recursive=true / recursive=false.

I also tried importing plain text files instead of epub (still using
TikeEntityProcessor though) and get exactly the same result - ie. all files
fetched, but only one document indexed in Solr.

To me, this would indicate that something is a problem with the inner
DIH entity then. As a next set of steps, I would probably
1) remove both onError statements and see if there is an exception
that is being swallowed.
2) run the import under ProcessMonitor and see if the other files are
actually being read
https://technet.microsoft.com/en-us/library/bb896645.aspx
3) Assume a Windows bug and test this on Mac/Linux
4) File a JIRA with a replication case. If there is a full replication
setup, I'll test it machines I have access to with full debugger
step-through

For example, I wonder if FileBinDataSource is somehow not cleaning up
after the first file properly on Windows and fails to open the second
one.

Regards,
Alex.


Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/



--
Gary Taylor | www.inovem.com | www.kahootz.com

INOVEM Ltd is registered in England and Wales No 4228932
Registered Office 1, Weston Court, Weston, Berkshire. RG20 8JE
kahootz.com is a trading name of INOVEM Ltd.



shards.qt in solrconfig.xml

2015-02-26 Thread Benson Margulies
A query I posted yesterday amounted to me forgetting that I have to
set shards.qt when I use a URL other than plain old '/select' with
SolrCloud. Is there any way to configure a query handler to automate
this, so that all queries addressed to '/RNI' get that added in?


Re: Solr Document expiration with TTL

2015-02-26 Thread Alexandre Rafalovitch
If your expire_at_dt field is not populated automatically, let's step
back and do a sanity check. You said it is a managed schema? Is
it schemaless as well? With an explicit processor chain? If that's
the case, your default chain may not be running AT ALL.

So, recheck your solrconfig.xml. Or add another explicit field
population inside the chain, just like the example did with
TimestampUpdateProcessorFactory :
https://lucidworks.com/blog/document-expiration/
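
As a quick test you can also name the chain explicitly on the request, which
bypasses the default-chain wiring entirely (assuming the chain is called
"expire" in your solrconfig.xml - substitute your own chain name):

curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/collection1/update?commit=true&update.chain=expire' \
  -d '[{"id":"10seconds","time_to_live_s":"+10SECONDS"}]'

If expire_at_dt gets populated with that, the processor itself is fine and
only the default-chain configuration is at fault.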

Regards,
Alex.

On 26 February 2015 at 07:52, Makailol Charls <4extrama...@gmail.com> wrote:
> " since your time_to_live_s and expire_at_dt fields are both
> stored, can you confirm that a expire_at_dt field is getting popularted by
> the update processor by doing as simple query for your doc (ie
> q=id:10seconds) "
>
> No, expire_at_dt field does not get populated when we have added document
> with the TTL defined in the TTL field. Like with following query,




Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


Re: Getting started with Solr

2015-02-26 Thread Baruch Kogan
Oh, I see. I used the start -e cloud command, then ran through a setup with
one core and default options for the rest, then tried to post the json
example again, and got another error:
ubuntu@ubuntu-VirtualBox:~/crawler/solr$ bin/post -c gettingstarted
example/exampledocs/*.json
/usr/lib/jvm/java-7-oracle/bin/java -classpath
/home/ubuntu/crawler/solr/dist/solr-core-5.0.0.jar -Dauto=yes
-Dc=gettingstarted -Ddata=files org.apache.solr.util.SimplePostTool
example/exampledocs/books.json
SimplePostTool version 5.0.0
Posting files to [base] url
http://localhost:8983/solr/gettingstarted/update...
Entering auto mode. File endings considered are
xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file books.json (application/json) to [base]
SimplePostTool: WARNING: Solr returned an error #404 (Not Found) for url:
http://localhost:8983/solr/gettingstarted/update
SimplePostTool: WARNING: Response:

Error 404 Not Found

HTTP ERROR 404
Problem accessing /solr/gettingstarted/update. Reason:
    Not Found
Powered by Jetty://

Sincerely,

Baruch Kogan
Marketing Manager
Seller Panda 
+972(58)441-3829
baruch.kogan at Skype

On Thu, Feb 26, 2015 at 4:07 PM, Erik Hatcher 
wrote:

> How did you start Solr?   If you started with `bin/solr start -e cloud`
> you’ll have a gettingstarted collection created automatically, otherwise
> you’ll need to create it yourself with `bin/solr create -c gettingstarted`
>
>
> —
> Erik Hatcher, Senior Solutions Architect
> http://www.lucidworks.com 
>
>
>
>
> > On Feb 26, 2015, at 4:53 AM, Baruch Kogan 
> wrote:
> >
> > Hi, I've just installed Solr (will be controlling with Solarium and using
> > to search Nutch queries.)  I'm working through the starting tutorials
> > described here:
> > https://cwiki.apache.org/confluence/display/solr/Running+Solr
> >
> > When I try to run $ bin/post -c gettingstarted
> example/exampledocs/*.json,
> > I get a bunch of errors having to do
> > with there not being a gettingstarted folder in /solr/. Is this normal?
> > Should I create one?
> >
> > Sincerely,
> >
> > Baruch Kogan
> > Marketing Manager
> > Seller Panda 
> > +972(58)441-3829
> > baruch.kogan at Skype
>
>


Solr Backup Strategy

2015-02-26 Thread Siddharth Nayar
What is the best backup and restore strategy for Solr 3.6.1?


Re: New leader/replica solution for HDFS

2015-02-26 Thread Mark Miller
I’ll be working on this at some point: 
https://issues.apache.org/jira/browse/SOLR-6237

- Mark

http://about.me/markrmiller

> On Feb 25, 2015, at 2:12 AM, longsan  wrote:
> 
> We used HDFS as our Solr index storage and we really have a heavy update
> load. We have met many problems with the current leader/replica solution.
> There is duplicate index computing on the replica side, and the data sync
> between leader and replica is always a problem.
> 
> As HDFS already provides data replication on the data layer, could Solr
> provide just service-layer replication?
> 
> My thought is that the leader and the replica both bind to the same data
> index directory. The leader will build up the index for new requests; the
> replica will just keep its index version updated with the leader (such as
> via a periodic soft commit?). If the leader is lost, then the replica will
> take over immediately.
> 
> Thanks for any suggestion of this idea.
> 
> 
> 
> 
> 
> 
> 



Re: Unable to find query result in solr 5.0.0

2015-02-26 Thread Erick Erickson
What data did you have in the 1.3 version? Because the bin/solr -e techproducts
process only indexes 30+ docs total. So if your 1.3 installation is
returning more docs, as your note seems to imply, you somehow had a lot more
docs indexed.

There is no mention of "hotel" in any of the sample docs that are indexed by the
'techproducts' target, so I suspect you're comparing apples to oranges.

Not to mention that there are many, many changes to both the code and the
sample data since 1.3 so whether the results are exactly comparable or not
is highly questionable.

Best,
Erick

On Thu, Feb 26, 2015 at 4:30 AM, Jack Krupansky
 wrote:
> Does a query for *:* return all documents? Pick one of those documents and
> try a query using a field name and the value of that field for one of the
> documents and see if that document is returned.
>
> Maybe you skipped a step in the tutorial process or maybe there was an
> error that you ignored.
>
> Please confirm which doc you were reading for the tutorial steps.
>
>
> -- Jack Krupansky
>
> On Thu, Feb 26, 2015 at 6:17 AM, rupak  wrote:
>
>> Hi,
>>
>> I am new in Solr and using Solr 5.0.0 search server. After installing when
>> I’m going to search any keyword in solr 5.0.0 it dose not give any results
>> back. But when I was using a previous version of Solr (1.3.0)(previously
>> installed) it gives each and every results of the queried Keyword.
>>
>> For Example: In previous version (1.3.0) when I’m searching with any
>> keyword
>> like “Hotel”, “Motel”, “Television” , “i-pod” , “Books”, “cricket” etc in
>> Query String section, it gives all search results with large number of
>> records as a XML output.
>>
>> But in Solr 5.0.0 I start up with techproducts core (bin/solr -e
>> techproducts) and then going to search keywords like “Television” , “i-pod”
>> etc then it gives 2 or 3 results and also if we going to search any others
>> keyword like “Hotel”, “Motel” it dose not return back any results. Also if
>> we start up with cloud by bin/solr start -e cloud -noprompt it dose not
>> gives any results. Also when we are going to use ‘POST’ tools by executing
>> post.jar in command prompt says an error that this is not a valid command.
>>
>> Currently I’m unable to find any keyword. Please help me to query any
>> string
>> keyword from  solr 5.0.0.
>>
>> Thanks & Regards,
>> Rupak Das
>>
>>
>>
>>


Re: shards.qt in solrconfig.xml

2015-02-26 Thread Mikhail Khludnev
Hello,

Given
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201301.mbox/%3c711daae5-c366-4349-b644-8e29e80e2...@gmail.com%3E
you can add shards.qt into the handler defaults/invariants.
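
For example (a sketch - the handler name comes from your question, the class
is the standard SearchHandler):

<requestHandler name="/RNI" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="shards.qt">/RNI</str>
  </lst>
</requestHandler>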

On Thu, Feb 26, 2015 at 5:40 PM, Benson Margulies 
wrote:

> A query I posted yesterday amounted to me forgetting that I have to
> set shards.qt when I use a URL other than plain old '/select' with
> SolrCloud. Is there any way to configure a query handler to automate
> this, so that all queries addressed to '/RNI' get that added in?
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics





Re: Collations are not working fine.

2015-02-26 Thread Rajesh Hazari
Below is the field definition that we used; it's just a basic definition:

[field type definition stripped by the list archiver; it was a basic
text_general-style type with standard tokenizer and filter chains for
index and query]
*Rajesh.*


On Thu, Feb 26, 2015 at 2:03 AM, Nitin Solanki  wrote:

> Hi Rajesh,
> What configuration did you set in your schema.xml?
>
> On Sat, Feb 14, 2015 at 2:18 AM, Rajesh Hazari 
> wrote:
>
> > Hi Nitin,
> >
> > Can you try with the below config? This config seems to be working
> > for us.
> >
> > <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
> >
> >   <str name="queryAnalyzerFieldType">text_general</str>
> >
> >   <lst name="spellchecker">
> >     <str name="name">wordbreak</str>
> >     <str name="classname">solr.WordBreakSolrSpellChecker</str>
> >     <str name="field">textSpell</str>
> >     <str name="combineWords">true</str>
> >     <str name="breakWords">false</str>
> >     <int name="maxChanges">5</int>
> >   </lst>
> >
> >   <lst name="spellchecker">
> >     <str name="name">default</str>
> >     <str name="field">textSpell</str>
> >     <str name="classname">solr.IndexBasedSpellChecker</str>
> >     <str name="spellcheckIndexDir">./spellchecker</str>
> >     <float name="accuracy">0.75</float>
> >     <float name="thresholdTokenFrequency">0.01</float>
> >     <str name="buildOnCommit">true</str>
> >     <int name="spellcheck.count">5</int>
> >   </lst>
> > </searchComponent>
> >
> > (and the spellcheck defaults on our search handler:)
> >
> >     <str name="spellcheck">true</str>
> >     <str name="spellcheck.dictionary">default</str>
> >     <str name="spellcheck.dictionary">wordbreak</str>
> >     <str name="spellcheck.count">5</str>
> >     <str name="spellcheck.alternativeTermCount">15</str>
> >     <str name="spellcheck.collate">true</str>
> >     <str name="spellcheck.onlyMorePopular">false</str>
> >     <str name="spellcheck.collateExtendedResults">true</str>
> >     <str name="spellcheck.maxCollationTries">100</str>
> >     <str name="spellcheck.collateParam.mm">100%</str>
> >     <str name="spellcheck.collateParam.q.op">AND</str>
> >     <str name="spellcheck.maxCollationEvaluations">1000</str>
> >
> >
> > *Rajesh.*
> >
> > On Fri, Feb 13, 2015 at 1:01 PM, Dyer, James <
> james.d...@ingramcontent.com
> > >
> > wrote:
> >
> > > Nitin,
> > >
> > > Can you post the full spellcheck response when you query:
> > >
> > > q=gram_ci:"gone wthh thes wint"&wt=json&indent=true&shards.qt=/spell
> > >
> > > James Dyer
> > > Ingram Content Group
> > >
> > >
> > > -Original Message-
> > > From: Nitin Solanki [mailto:nitinml...@gmail.com]
> > > Sent: Friday, February 13, 2015 1:05 AM
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: Collations are not working fine.
> > >
> > > Hi James Dyer,
> > >   I did the same as you told me, and used
> > > WordBreakSolrSpellChecker instead of shingles. But collations are still
> > > not coming back or working.
> > > For instance, I tried to get the collation of "gone with the wind" by
> > > searching "gone wthh thes wint" on field=gram_ci, but didn't succeed,
> > > even though I am getting the suggestions wthh as *with*, thes as *the*,
> > > and wint as *wind*. Also, I have documents which contain "gone with the
> > > wind" 167 times. I don't know whether I am missing something or not.
> > > Please check my solr configuration below:
> > >
> > > *URL: *localhost:8983/solr/wikingram/spell?q=gram_ci:"gone wthh thes
> > > wint"&wt=json&indent=true&shards.qt=/spell
> > >
> > > *solrconfig.xml:*
> > >
> > > <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
> > >   <str name="queryAnalyzerFieldType">textSpellCi</str>
> > >   <lst name="spellchecker">
> > >     <str name="name">default</str>
> > >     <str name="field">gram_ci</str>
> > >     <str name="classname">solr.DirectSolrSpellChecker</str>
> > >     <str name="distanceMeasure">internal</str>
> > >     <float name="accuracy">0.5</float>
> > >     <int name="maxEdits">2</int>
> > >     <int name="minPrefix">0</int>
> > >     <int name="maxInspections">5</int>
> > >     <int name="minQueryLength">2</int>
> > >     <float name="maxQueryFrequency">0.9</float>
> > >     <str name="comparatorClass">freq</str>
> > >   </lst>
> > >   <lst name="spellchecker">
> > >     <str name="name">wordbreak</str>
> > >     <str name="classname">solr.WordBreakSolrSpellChecker</str>
> > >     <str name="field">gram</str>
> > >     <str name="combineWords">true</str>
> > >     <str name="breakWords">true</str>
> > >     <int name="maxChanges">5</int>
> > >   </lst>
> > > </searchComponent>
> > >
> > > <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
> > >   <lst name="defaults">
> > >     <str name="df">gram_ci</str>
> > >     <str name="spellcheck.dictionary">default</str>
> > >     <str name="spellcheck">on</str>
> > >     <str name="spellcheck.extendedResults">true</str>
> > >     <str name="spellcheck.count">25</str>
> > >     <str name="spellcheck.onlyMorePopular">true</str>
> > >     <str name="spellcheck.maxResultsForSuggest">1</str>
> > >     <str name="spellcheck.alternativeTermCount">25</str>
> > >     <str name="spellcheck.collate">true</str>
> > >     <str name="spellcheck.maxCollationTries">50</str>
> > >     <str name="spellcheck.maxCollations">50</str>
> > >     <str name="spellcheck.collateExtendedResults">true</str>
> > >   </lst>
> > >   <arr name="last-components">
> > >     <str>spellcheck</str>
> > >   </arr>
> > > </requestHandler>
> > >
> > > *Schema.xml: *
> > >
> > > <field name="gram_ci" type="textSpellCi" indexed="true" stored="true"
> > >        multiValued="false"/>
> > >
> > > <fieldType name="textSpellCi" class="solr.TextField"
> > >            positionIncrementGap="100">
> > >   [index and query analyzer definitions stripped by the list archiver]
> > > </fieldType>
> > >
> >
>


Re: shards.qt in solrconfig.xml

2015-02-26 Thread Jack Krupansky
I was hoping that Benson was hinting at adding a shards.qt.auto=true
parameter, so that it would magically use the path from the incoming
request - and that this would be the default, since that's what most people
would expect.

Or, maybe just add a commented-out custom handler that has the shards.qt
parameter as suggested, to re-emphasize to people that if they want to use
a custom handler in distributed mode, then they will most likely need this
parameter.

-- Jack Krupansky

On Thu, Feb 26, 2015 at 11:28 AM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> Hello,
>
> Given
>
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201301.mbox/%3c711daae5-c366-4349-b644-8e29e80e2...@gmail.com%3E
> you can add shards.qt into the handler defaults/invariants.
>
> On Thu, Feb 26, 2015 at 5:40 PM, Benson Margulies 
> wrote:
>
> > A query I posted yesterday amounted to me forgetting that I have to
> > set shards.qt when I use a URL other than plain old '/select' with
> > SolrCloud. Is there any way to configure a query handler to automate
> > this, so that all queries addressed to '/RNI' get that added in?
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> 
> 
>


Re: Solr Document expiration with TTL

2015-02-26 Thread Chris Hostetter

: If your expire_at_dt field is not populated automatically, let's step
: back and recheck a sanity setting. You said it is a managed schema? Is
: it a schemaless as well? With an explicit processor chain? If that's
: the case, your default chain may not be running AT ALL.

yeah ... my only guess here is that even though you posted before that you 
had this configured in your default chain...


<processor class="solr.processor.DocExpirationUpdateProcessorFactory">
  <int name="autoDeletePeriodSeconds">30</int>
  <str name="ttlFieldName">time_to_live_s</str>
  <str name="expirationFieldName">expire_at_dt</str>
</processor>


...perhaps you have an update.chain=foo type default param configured for 
your /update handler?

* what does your /update handler config look like?
* are you using the new <initParams> feature of Solr? what does its
config look like?

: So, recheck your solrconfig.xml. Or add another explicit field
: population inside the chain, just like the example did with
: TimestampUpdateProcessorFactory :
: https://lucidworks.com/blog/document-expiration/

yeah ... that would help as a sanity check as well ... point is: we need 
to verify which chain you are using when adding the doc.
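
for reference, a complete default chain along the lines of the blog post 
above would look like this (the chain name "expire" is arbitrary):

<updateRequestProcessorChain name="expire" default="true">
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <int name="autoDeletePeriodSeconds">30</int>
    <str name="ttlFieldName">time_to_live_s</str>
    <str name="expirationFieldName">expire_at_dt</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

...and/or you can take default-chain resolution out of the picture entirely 
by sending update.chain=expire as a param on the update request.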


-Hoss
http://www.lucidworks.com/


Re: New leader/replica solution for HDFS

2015-02-26 Thread Joseph Obernberger

Great!  Thank you!

I had a 4 shard setup - no replicas.  Index size was 2.0TBytes stored in 
HDFS with each node having approximately 500G of index.  I added four 
more shards on four other machines as replicas.  One thing that happened 
was the 4 replicas all ran out of HDFS cache size
 (SnapPull failed: java.lang.RuntimeException: The max direct memory is 
likely too low.  Either increase it (by adding 
-XX:MaxDirectMemorySize=<size>g -XX:+UseLargePages to your containers 
startup args) or disable direct allocation using 
solr.hdfs.blockcache.direct.memory.allocation=false in solrconfig.xml.  
If you are putting the block cache on the heap, your java heap size 
might not be large enough.  Failed allocating)


I was using 160 slabs (20GBytes of RAM).  I dropped the config to 80 
slabs and restarted the replicas.  Two of the replicas came up OK, but 
the other 2 have stayed in 'Recovering'.  I stopped those two and 
restarted them - now I have 3 OK, but one is still in Recovering.


Given that each replica does indexing as well, I was expecting the 
amount of HDFS disk usage to double, but that has not happened. Once I 
get the last replica to come up, I'll run some tests.
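
For reference, the knobs in play live on the directory factory in 
solrconfig.xml - a sketch (the HDFS path is an assumption; 80 slabs at the 
default 128MB slab size works out to roughly 10GB of direct memory):

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://namenode:8020/solr</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <!-- each slab is 128MB by default -->
  <int name="solr.hdfs.blockcache.slab.count">80</int>
  <!-- false moves the cache onto the java heap instead of direct memory -->
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
</directoryFactory>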


-Joe

On 2/26/2015 10:45 AM, Mark Miller wrote:

I’ll be working on this at some point: 
https://issues.apache.org/jira/browse/SOLR-6237

- Mark

http://about.me/markrmiller


On Feb 25, 2015, at 2:12 AM, longsan  wrote:

We use HDFS as our Solr index storage and we have a really heavy update
load. We have had many problems with the current leader/replica solution. There
is duplicate index computing on the replica side, and the data sync between
leader/replica is always a problem.

As HDFS already provides data replication on the data layer, could Solr provide
just service-layer replication?

My thought is that the leader and the replica all bind to the same data
index directory. The leader will build up the index for new requests, and the
replica will just keep updating its index version from the leader (such as via a
soft commit periodically?). If the leader is lost then the replica will take
over immediately.

Thanks for any suggestion of this idea.







--
View this message in context: 
http://lucene.472066.n3.nabble.com/New-leader-replica-solution-for-HDFS-tp4188735.html
Sent from the Solr - User mailing list archive at Nabble.com.






Solr collection unavailable after reload

2015-02-26 Thread akira...@libero.it
Hi everybody,
I have a very strange issue in my Solr (version 4.10.2) installation (on a 
Windows Server 2008, Java JDK 1.7).
It seems nobody has met this problem before (at least, googling around I found 
nothing). I have a simple configuration with a base "text_general" field. I 
was able to index any kind of text without any problem. After some days I got a 
strange behavior. After restarting my server I was not able to load my 
collection.
My server started without error messages in my log but, trying to select my 
collection, the web app hangs (as in the picture).
I googled around to find a way to debug such a situation, or a way to "fix" the 
collection itself (but found nothing).
I am able to reproduce the error by trying to index the word "wasting" in my 
text_general field (I know that the problem is not related to this word... but it 
is the only way to produce such behavior). The only thing I can do is stop Solr, 
delete the "index" folder content and restart Solr. Then I try again to index a 
document with the word "wasting", reload the collection, and the problem is 
there again.
Please, could someone at least tell me a way to investigate?


Re: Solr collection unavailable after reload

2015-02-26 Thread Erick Erickson
This is very, very strange. How are you indexing the docs? SolrJ? XML? DIH?

What happens if you _only_ index the doc?

You say nothing comes out in the log file indicating an error, but
what _does_ come out? Particularly at the end?

And in general note that attachments don't come through the Apache
mailing lists, to show them you usually have to post it somewhere and
provide a link.

You _might_ be hitting the problem where the suggester takes a long
time to build on startup, in which case you'll be seeing a message at
the end of your log when you can't start things about building. If you
wait _very_ patiently it'll eventually complete. This was commented
out of solrconfig.xml stock distros in 4.10.3, see:
https://issues.apache.org/jira/browse/SOLR-6679. And if your suggester
is configured with anything like "build on commit", it'll happen
whenever you index.
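
If that's what you're hitting, the relevant flags sit on the suggester
config - a sketch with made-up names (double-check which build* flags your
4.10.x honors; they were in flux around 4.10.3):

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="field">text_general</str>
    <!-- avoid rebuilding on startup/commit; build manually via suggest.build=true -->
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>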

Anyway, this is consistent with the problems you've reported, so at
least there's a chance it's the issue.

Best,
Erick

On Thu, Feb 26, 2015 at 10:40 AM, akira...@libero.it  wrote:
> Hi everybody,
>
> I have a very strange issue in my Solr (version 4.10.2) installation (on a
> Windows Server 2008, Java JDK 1.7).
>
> It seems nobody has met this problem before (at least, googling around I found
> nothing). I have a simple configuration with a base "text_general" field.
> I was able to index any kind of text without any problem. After some days I
> got a strange behavior. After restarting my server I was not able to load my
> collection.
> My server started without error messages in my log but, trying to select my
> collection, the web app hangs (as in the picture).
>
> I googled around to find a way to debug such a situation, or a way to "fix"
> the collection itself (but found nothing).
>
> I am able to reproduce the error by trying to index the word "wasting" in my
> text_general field (I know that the problem is not related to this word...
> but it is the only way to produce such behavior).
> The only thing I can do is stop Solr, delete the "index" folder content and
> restart Solr. Then I try again to index a document with the word "wasting",
> reload the collection, and the problem is there again.
>
> Please, could someone at least tell me a way to investigate?
>


Leading Wildcard Support (ReversedWildcardFilterFactory)

2015-02-26 Thread jaime spicciati
All,
I am currently using 4.10.3 running Solr Cloud.

I have configured my index analyzer to leverage the
solr.ReversedWildcardFilterFactory with various settings for the
maxFractionAsterisk, maxPosAsterisk,etc. Currently I am running with the
defaults (ie not configured)

Using the Analysis capability in the Solr admin I see the "Field Value
(Index)" fields going in correctly, both normal order and reversed order.
However, on the "Field Value (Query)" side it is not generating a token
that is reversed as expected (no matter where I place the * in the leading
position of the search term). I also confirmed through the Query capability
with debugQuery turned on that the parsed query is not reversed as expected.

From my current understanding you do not need to have anything configured
on the index analyzer to make leading wildcards work as expected with the
reversedwildcardfilterfactory. The default query parser will know to look
at the index analyzer and leverage the ReversedWildcardFilterFactory
configuration if the term contains a leading wildcard. (This is what I have
read)

Without uploading my entire configuration to this email I was hoping
someone could point me in the right direction because I am at a loss at
this point.

Thanks!


Re: Leading Wildcard Support (ReversedWildcardFilterFactory)

2015-02-26 Thread Jack Krupansky
Please post your field type... or at least confirm a comparison to the
example in the javadoc:
http://lucene.apache.org/solr/4_10_3/solr-core/org/apache/solr/analysis/ReversedWildcardFilterFactory.html
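
For convenience, the javadoc example is along these lines (quoted from
memory - verify against the link; note the filter sits only in the
index-time analyzer):

<fieldType name="text_rev" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
            maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>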

-- Jack Krupansky

On Thu, Feb 26, 2015 at 2:38 PM, jaime spicciati 
wrote:

> All,
> I am currently using 4.10.3 running Solr Cloud.
>
> I have configured my index analyzer to leverage the
> solr.ReversedWildcardFilterFactory with various settings for the
> maxFractionAsterisk, maxPosAsterisk,etc. Currently I am running with the
> defaults (ie not configured)
>
> Using the Analysis capability in the Solr admin I see the "Field Value
> (Index)" fields going in correctly, both normal order and reversed order.
> However, on the "Field Value (Query)" side it is not generating a token
> that is reversed as expected (no matter where I place the * in the leading
> position of the search term). I also confirmed through the Query capability
> with debugQuery turned on that the parsed query is not reversed as
> expected.
>
> From my current understanding you do not need to have anything configured
> on the index analyzer to make leading wildcards work as expected with the
> reversedwildcardfilterfactory. The default query parser will know to look
> at the index analyzer and leverage the ReversedWildcardFilterFactory
> configuration if the term contains a leading wildcard. (This is what I have
> read)
>
> Without uploading my entire configuration to this email I was hoping
> someone could point me in the right direction because I am at a loss at
> this point.
>
> Thanks!
>


Re: Leading Wildcard Support (ReversedWildcardFilterFactory)

2015-02-26 Thread jaime spicciati
Thanks for the quick response.

The index I am currently testing with has the following configuration which
is the default for the text_general_rev

The field type is solr.TextField

maxFractionAsterisk=.33
maxPosAsterisk=3
maxPosQuestion=2
withOriginal=true

Through additional review I think it *might* be working as expected even
though the Analysis tab and debugQuery parsed query led me to think
otherwise. If I look at the explain plan from the debugQuery when I actually
get a hit, I see word(s) that actually come back in reversed order
with the "\u0001" prefix character, so the actual hit against the inverted
index appears to be correct even though the parsed query doesn't reflect
this. Is it safe to say that things are in fact working correctly?

Thanks again



On Thu, Feb 26, 2015 at 3:34 PM, Jack Krupansky 
wrote:

> Please post your field type... or at least confirm a comparison to the
> example in the javadoc:
>
> http://lucene.apache.org/solr/4_10_3/solr-core/org/apache/solr/analysis/ReversedWildcardFilterFactory.html
>
> -- Jack Krupansky
>
> On Thu, Feb 26, 2015 at 2:38 PM, jaime spicciati <
> jaime.spicci...@gmail.com>
> wrote:
>
> > All,
> > I am currently using 4.10.3 running Solr Cloud.
> >
> > I have configured my index analyzer to leverage the
> > solr.ReversedWildcardFilterFactory with various settings for the
> > maxFractionAsterisk, maxPosAsterisk,etc. Currently I am running with the
> > defaults (ie not configured)
> >
> > Using the Analysis capability in the Solr admin I see the "Field Value
> > (Index)" fields going in correctly, both normal order and reversed order.
> > However, on the "Field Value (Query)" side it is not generating a token
> > that is reversed as expected (no matter where I place the * in the
> leading
> > position of the search term). I also confirmed through the Query
> capability
> > with debugQuery turned on that the parsed query is not reversed as
> > expected.
> >
> > From my current understanding you do not need to have anything configured
> > on the index analyzer to make leading wildcards work as expected with the
> > reversedwildcardfilterfactory. The default query parser will know to look
> > at the index analyzer and leverage the ReversedWildcardFilterFactory
> > configuration if the term contains a leading wildcard. (This is what I
> have
> > read)
> >
> > Without uploading my entire configuration to this email I was hoping
> > someone could point me in the right direction because I am at a loss at
> > this point.
> >
> > Thanks!
> >
>


Re: Leading Wildcard Support (ReversedWildcardFilterFactory)

2015-02-26 Thread Jack Krupansky
Most of the magic is done internal to the query parser which actually
inspects the index analyzer chain when a leading wildcard is present. Look
at the parsed_query in the debug response, and you should see that special
prefix query.
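
For example, with a reversible field a leading-wildcard query should show up
reversed in the debug output, something like this (illustrative output, not
copied from a real response):

q=gram:*tion&debugQuery=true
  ...
  "parsedquery": "gram:\u0001noit*"

The \u0001 is the start-of-reversed-token marker the filter writes into the
index.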

-- Jack Krupansky

On Thu, Feb 26, 2015 at 3:49 PM, jaime spicciati 
wrote:

> Thanks for the quick response.
>
> The index I am currently testing with has the following configuration which
> is the default for the text_general_rev
>
> The field type is solr.TextField
>
> maxFractionAsterisk=.33
> maxPosAsterisk=3
> maxPosQuestion=2
> withOriginal=true
>
> Through additional review I think it *might* be working as expected even
> though the Analysis tab and debugQuery parsed query led me to think
> otherwise. If I look at the explain plan from the debugQuery when I actually
> get a hit, I see word(s) that actually come back in reversed order
> with the "\u0001" prefix character, so the actual hit against the inverted
> index appears to be correct even though the parsed query doesn't reflect
> this. Is it safe to say that things are in fact working correctly?
>
> Thanks again
>
>
>
> On Thu, Feb 26, 2015 at 3:34 PM, Jack Krupansky 
> wrote:
>
> > Please post your field type... or at least confirm a comparison to the
> > example in the javadoc:
> >
> >
> http://lucene.apache.org/solr/4_10_3/solr-core/org/apache/solr/analysis/ReversedWildcardFilterFactory.html
> >
> > -- Jack Krupansky
> >
> > On Thu, Feb 26, 2015 at 2:38 PM, jaime spicciati <
> > jaime.spicci...@gmail.com>
> > wrote:
> >
> > > All,
> > > I am currently using 4.10.3 running Solr Cloud.
> > >
> > > I have configured my index analyzer to leverage the
> > > solr.ReversedWildcardFilterFactory with various settings for the
> > > maxFractionAsterisk, maxPosAsterisk,etc. Currently I am running with
> the
> > > defaults (ie not configured)
> > >
> > > Using the Analysis capability in the Solr admin I see the "Field Value
> > > (Index)" fields going in correctly, both normal order and reversed
> order.
> > > However, on the "Field Value (Query)" side it is not generating a token
> > > that is reversed as expected (no matter where I place the * in the
> > leading
> > > position of the search term). I also confirmed through the Query
> > capability
> > > with debugQuery turned on that the parsed query is not reversed as
> > > expected.
> > >
> > > From my current understanding you do not need to have anything
> configured
> > > on the index analyzer to make leading wildcards work as expected with
> the
> > > reversedwildcardfilterfactory. The default query parser will know to
> look
> > > at the index analyzer and leverage the ReversedWildcardFilterFactory
> > > configuration if the term contains a leading wildcard. (This is what I
> > have
> > > read)
> > >
> > > Without uploading my entire configuration to this email I was hoping
> > > someone could point me in the right direction because I am at a loss at
> > > this point.
> > >
> > > Thanks!
> > >
> >
>


Re: Basic Multilingual search capability

2015-02-26 Thread Rishi Easwaran
Hi Tom,

Thanks for your inputs. 
I was planning to use a stopword filter, but will definitely make sure the lists are 
unique and do not step over each other.  I think for our system even going with a 
length of 50-75 should be fine; I will definitely up that number after doing some 
analysis on our input.
Just one clarification: when you say ICUFilterFactory, am I correct in thinking 
it's ICUFoldingFilterFactory?
 
Thanks,
Rishi.

 

 

-Original Message-
From: Tom Burton-West 
To: solr-user 
Sent: Wed, Feb 25, 2015 4:33 pm
Subject: Re: Basic Multilingual search capability


Hi Rishi,

As others have indicated Multilingual search is very difficult to do well.

At HathiTrust we've been using the ICUTokenizer and ICUFilterFactory to
deal with having materials in 400 languages.  We also added the
CJKBigramFilter to get better precision on CJK queries.  We don't use stop
words because stop words in one language are content words in another.  For
example "die" in German is a stopword but it is a content word in English.

Putting multiple languages in one index can affect word frequency
statistics which make relevance ranking less accurate.  So for example for
the English query "Die Hard" the word "die" would get a low idf score
because it occurs so frequently in German.  We realize that our  approach
does not produce the best results, but given the 400 languages, and limited
resources, we do our best to make search "not suck" for non-English
languages.   When we have the resources we are thinking about doing special
processing for a small fraction of the top 20 languages.  We plan to select
those languages  that most need special processing and relatively easy to
disambiguate from other languages.


If you plan on identifying languages (rather than scripts), you should be
aware that most language detection libraries don't work well on short texts
such as queries.

If you know that you have scripts for which you have content in only one
language, you can use script detection instead of language detection.


If you have German, a filter length of 25 might be too low (Because of
compounding). You might want to analyze a sample of your German text to
find a good length.
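
For what it's worth, that setup comes out to roughly this in schema.xml (a
sketch; the type name and the 50-character cap are assumptions to adapt, and
the ICU factories need the analysis-extras ICU jars on the classpath):

<fieldType name="text_multi" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Unicode-aware tokenization across scripts -->
    <tokenizer class="solr.ICUTokenizerFactory"/>
    <!-- case folding and normalization -->
    <filter class="solr.ICUFoldingFilterFactory"/>
    <!-- bigram CJK tokens for better precision -->
    <filter class="solr.CJKBigramFilterFactory"/>
    <!-- guard against unbounded tokens from no-space scripts -->
    <filter class="solr.LengthFilterFactory" min="1" max="50"/>
  </analyzer>
</fieldType>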

Tom

http://www.hathitrust.org/blogs/Large-scale-Search


On Wed, Feb 25, 2015 at 10:31 AM, Rishi Easwaran 
wrote:

> Hi Alex,
>
> Thanks for the suggestions. These steps will definitely help out with our
> use case.
> Thanks for the idea about the lengthFilter to protect our system.
>
> Thanks,
> Rishi.
>
>
>
>
>
>
>
> -Original Message-
> From: Alexandre Rafalovitch 
> To: solr-user 
> Sent: Tue, Feb 24, 2015 8:50 am
> Subject: Re: Basic Multilingual search capability
>
>
> Given the limited needs, I would probably do something like this:
>
> 1) Put a language identifier in the UpdateRequestProcessor chain
> during indexing and route out at least known problematic languages,
> such as Chinese, Japanese, Arabic into individual fields
> 2) Put everything else together into one field with ICUTokenizer,
> maybe also ICUFoldingFilter
> 3) At the very end of that joint filter, stick in LengthFilter with
> some high number, e.g. 25 characters max. This will ensure that
> super-long words from non-space languages and edge conditions do not
> break the rest of your system.
>
>
> Regards,
>Alex.
> 
> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> http://www.solr-start.com/
>
>
> On 23 February 2015 at 23:14, Walter Underwood 
> wrote:
> >> I understand relevancy, stemming etc becomes extremely complicated with
> multilingual support, but our first goal is to be able to tokenize and
> provide
> basic search capability for any language. Ex: When the document contains
> hello
> or здравствуйте, the analyzer creates tokens and provides exact match
> search
> results.
>
>
>

 


Re: Getting started with Solr

2015-02-26 Thread Erik Hatcher
I’m sorry, I’m not following exactly.   

Somehow you no longer have a gettingstarted collection, but it is not clear how 
that happened.  

Could you post the exact script steps you used that got you this error?

What collections/cores does the Solr admin show you have?  What are the 
results of http://localhost:8983/solr/admin/cores ?

—
Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com 




> On Feb 26, 2015, at 9:58 AM, Baruch Kogan  wrote:
> 
> Oh, I see. I used the start -e cloud command, then ran through a setup with
> one core and default options for the rest, then tried to post the json
> example again, and got another error:
> buntu@ubuntu-VirtualBox:~/crawler/solr$ bin/post -c gettingstarted
> example/exampledocs/*.json
> /usr/lib/jvm/java-7-oracle/bin/java -classpath
> /home/ubuntu/crawler/solr/dist/solr-core-5.0.0.jar -Dauto=yes
> -Dc=gettingstarted -Ddata=files org.apache.solr.util.SimplePostTool
> example/exampledocs/books.json
> SimplePostTool version 5.0.0
> Posting files to [base] url
> http://localhost:8983/solr/gettingstarted/update...
> Entering auto mode. File endings considered are
> xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
> POSTing file books.json (application/json) to [base]
> SimplePostTool: WARNING: Solr returned an error #404 (Not Found) for url:
> http://localhost:8983/solr/gettingstarted/update
> SimplePostTool: WARNING: Response: 
> 
> 
> Error 404 Not Found
> 
> HTTP ERROR 404
> Problem accessing /solr/gettingstarted/update. Reason:
> Not Found
> Powered by Jetty://
> 
> Sincerely,
> 
> Baruch Kogan
> Marketing Manager
> Seller Panda 
> +972(58)441-3829
> baruch.kogan at Skype
> 
> On Thu, Feb 26, 2015 at 4:07 PM, Erik Hatcher 
> wrote:
> 
>> How did you start Solr?   If you started with `bin/solr start -e cloud`
>> you’ll have a gettingstarted collection created automatically, otherwise
>> you’ll need to create it yourself with `bin/solr create -c gettingstarted`
>> 
>> 
>> —
>> Erik Hatcher, Senior Solutions Architect
>> http://www.lucidworks.com 
>> 
>> 
>> 
>> 
>>> On Feb 26, 2015, at 4:53 AM, Baruch Kogan 
>> wrote:
>>> 
>>> Hi, I've just installed Solr (will be controlling with Solarium and using
>>> to search Nutch queries.)  I'm working through the starting tutorials
>>> described here:
>>> https://cwiki.apache.org/confluence/display/solr/Running+Solr
>>> 
>>> When I try to run $ bin/post -c gettingstarted
>> example/exampledocs/*.json,
>>> I get a bunch of errors having to do
>>> with there not being a gettingstarted folder in /solr/. Is this normal?
>>> Should I create one?
>>> 
>>> Sincerely,
>>> 
>>> Baruch Kogan
>>> Marketing Manager
>>> Seller Panda 
>>> +972(58)441-3829
>>> baruch.kogan at Skype
>> 
>> 



Re: shards.qt in solrconfig.xml

2015-02-26 Thread Benson Margulies
I apparently am feeling dense; the following does not work.

 

<requestHandler name="/RNI" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="shards.qt">/RNI</str>
  </lst>
  <arr name="components">
    <str>name-indexing-query</str>
    <str>name-indexing-rescore</str>
    <str>facet</str>
    <str>mlt</str>
    <str>highlight</str>
    <str>stats</str>
    <str>debug</str>
  </arr>
</requestHandler>


On Thu, Feb 26, 2015 at 11:33 AM, Jack Krupansky
 wrote:
> I was hoping that Benson was hinting at adding a shards.qt.auto=true
> parameter, so that Solr would magically use the path from the incoming
> request - and that this would be the default, since that's what most people
> would expect.
>
> Or, maybe just add a commented-out custom handler that has the shards.qt
> parameter as suggested, to re-emphasize to people that if they want to use
> a custom handler in distributed mode, then they will most likely need this
> parameter.
>
> -- Jack Krupansky
>
> On Thu, Feb 26, 2015 at 11:28 AM, Mikhail Khludnev <
> mkhlud...@griddynamics.com> wrote:
>
>> Hello,
>>
>> Giving
>>
>> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201301.mbox/%3c711daae5-c366-4349-b644-8e29e80e2...@gmail.com%3E
>> you can add shards.qt into handler defaults/invariants.
>>
>> On Thu, Feb 26, 2015 at 5:40 PM, Benson Margulies 
>> wrote:
>>
>> > A query I posted yesterday amounted to me forgetting that I have to
>> > set shards.qt when I use a URL other than plain old '/select' with
>> > SolrCloud. Is there any way to configure a query handler to automate
>> > this, so that all queries addressed to '/RNI' get that added in?
>> >
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> Principal Engineer,
>> Grid Dynamics
>>
>> 
>> 
>>


Re: shards.qt in solrconfig.xml

2015-02-26 Thread Shalin Shekhar Mangar
If I'm reading your suggestion right, Tim fixed this for 5.1 with
http://issues.apache.org/jira/browse/SOLR-6311

On Thu, Feb 26, 2015 at 10:03 PM, Jack Krupansky 
wrote:

> I was hoping that Benson was hinting at adding a shards.qt.auto=true
> parameter, so that Solr would magically use the path from the incoming
> request - and that this would be the default, since that's what most people
> would expect.
>
> Or, maybe just add a commented-out custom handler that has the shards.qt
> parameter as suggested, to re-emphasize to people that if they want to use
> a custom handler in distributed mode, then they will most likely need this
> parameter.
>
> -- Jack Krupansky
>
> On Thu, Feb 26, 2015 at 11:28 AM, Mikhail Khludnev <
> mkhlud...@griddynamics.com> wrote:
>
> > Hello,
> >
> > Giving
> >
> >
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201301.mbox/%3c711daae5-c366-4349-b644-8e29e80e2...@gmail.com%3E
> > you can add shards.qt into handler defaults/invariants.
> >
> > On Thu, Feb 26, 2015 at 5:40 PM, Benson Margulies  >
> > wrote:
> >
> > > A query I posted yesterday amounted to me forgetting that I have to
> > > set shards.qt when I use a URL other than plain old '/select' with
> > > SolrCloud. Is there any way to configure a query handler to automate
> > > this, so that all queries addressed to '/RNI' get that added in?
> > >
> >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> > Principal Engineer,
> > Grid Dynamics
> >
> > 
> > 
> >
>



-- 
Regards,
Shalin Shekhar Mangar.


Re: shards.qt in solrconfig.xml

2015-02-26 Thread Shalin Shekhar Mangar
Hi Benson,

Do not use shards.qt with a leading '/'. See
https://issues.apache.org/jira/browse/SOLR-3161 for details. Also note that
shards.qt will not be necessary with 5.1 and beyond because of SOLR-6311
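
In other words, something like this should work on 4.x/5.0 (a sketch of the
handler from the earlier message with the leading '/' dropped from the
parameter value):

<requestHandler name="/RNI" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="shards.qt">RNI</str>
  </lst>
  <!-- components as before -->
</requestHandler>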

On Fri, Feb 27, 2015 at 8:16 AM, Benson Margulies 
wrote:

> I apparently am feeling dense; the following does not work.
>
> <requestHandler name="/RNI" class="solr.SearchHandler">
>   <lst name="defaults">
>     <str name="shards.qt">/RNI</str>
>   </lst>
>   <arr name="components">
>     <str>name-indexing-query</str>
>     <str>name-indexing-rescore</str>
>     <str>facet</str>
>     <str>mlt</str>
>     <str>highlight</str>
>     <str>stats</str>
>     <str>debug</str>
>   </arr>
> </requestHandler>
>
>
> On Thu, Feb 26, 2015 at 11:33 AM, Jack Krupansky
>  wrote:
> > I was hoping that Benson was hinting at adding a shards.qt.auto=true
> > parameter, so that Solr would magically use the path from the incoming
> > request - and that this would be the default, since that's what most
> > people would expect.
> >
> > Or, maybe just add a commented-out custom handler that has the shards.qt
> > parameter as suggested, to re-emphasize to people that if they want to
> use
> > a custom handler in distributed mode, then they will most likely need
> this
> > parameter.
> >
> > -- Jack Krupansky
> >
> > On Thu, Feb 26, 2015 at 11:28 AM, Mikhail Khludnev <
> > mkhlud...@griddynamics.com> wrote:
> >
> >> Hello,
> >>
> >> Giving
> >>
> >>
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201301.mbox/%3c711daae5-c366-4349-b644-8e29e80e2...@gmail.com%3E
> >> you can add shards.qt into handler defaults/invariants.
> >>
> >> On Thu, Feb 26, 2015 at 5:40 PM, Benson Margulies <
> bimargul...@gmail.com>
> >> wrote:
> >>
> >> > A query I posted yesterday amounted to me forgetting that I have to
> >> > set shards.qt when I use a URL other than plain old '/select' with
> >> > SolrCloud. Is there any way to configure a query handler to automate
> >> > this, so that all queries addressed to '/RNI' get that added in?
> >> >
> >>
> >>
> >>
> >> --
> >> Sincerely yours
> >> Mikhail Khludnev
> >> Principal Engineer,
> >> Grid Dynamics
> >>
> >> 
> >> 
> >>
>



-- 
Regards,
Shalin Shekhar Mangar.


Solr logs encoding

2015-02-26 Thread Moshe Recanati
Hi,
I have a weird situation. Starting with yesterday's restart I have an issue with log 
encoding.
My log looks like:
DEBUG - 2015-02-27 10:47:01.432; << 
"[0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xc7]8[0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0x89][0x5][0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0x97][0x4][0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xa4][0x6][0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xfc]b[0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xfc]F[0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xfb]:[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]a[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]v[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]Y[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]Y[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]V[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]H[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]U[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]\[0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xe4][0x96][0x1][0x4][0xfc][0xff][0xff][0xff][0xf][0x4]`[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]j[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]l[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]j[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]][0x4][0xfc][0xff][0xff][0xff][0xf][0x4]X[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]e[0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xdd][0xba][0x1][0x4][0xfc][0xff][0xff][0xff][0xf][0x4]h[0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xb5]<[0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xee][0x3][0x4][0xfc][0xff][0xff][0xff][0xf][0x4]\[0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xe2][0x1d][0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xbb][0x1a][0x4][0xfc][0xff][0xff][0xff][0xf][0x4]c[0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xd2]%[0x4][0xfc][0xff][0xff][0xff][0xf][0x4]b[0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0x92][0x1a][0x4][0xfc][0xff][0xff][0xff][0xf][0x4][0xa3][0x4][0x4][0xfc][0xff][0xff][0xff]"

Anyone familiar with this? How to fix it?


Regards,
Moshe Recanati
SVP Engineering
Office + 972-73-2617564
Mobile  + 972-52-6194481
Skype:  recanati
More at:  www.kmslh.com | 
LinkedIn | 
FB




Dependency Need to include for embedded solr.

2015-02-26 Thread Danesh Kuruppu
Hi all,

I need to embed a Solr server into my Maven project. I am going to
use the latest Solr 5.0.0.

I need to know which dependencies to include in my project. As I
understand it, I need solr-core[1] and solr-solrj[2]. Do I need to
include Lucene dependencies in my project? If so, which dependencies do we
need to include to enable all indexing capabilities?

1. http://mvnrepository.com/artifact/org.apache.solr/solr-core/5.0.0
2. http://mvnrepository.com/artifact/org.apache.solr/solr-solrj/5.0.0
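
In pom.xml terms, the artifacts in question would be (a sketch for 5.0.0):

<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-core</artifactId>
  <version>5.0.0</version>
</dependency>
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>5.0.0</version>
</dependency>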

Please help
Thanks
Danesh


Re: Dependency Need to include for embedded solr.

2015-02-26 Thread Shawn Heisey
On 2/26/2015 10:07 PM, Danesh Kuruppu wrote:
> I need to embed a Solr server into my Maven project. I am going to
> use the latest Solr 5.0.0.
> 
> I need to know which dependencies to include in my project. As I
> understand it, I need solr-core[1] and solr-solrj[2]. Do I need to
> include Lucene dependencies in my project? If so, which dependencies do we
> need to include to enable all indexing capabilities?
> 
> 1. http://mvnrepository.com/artifact/org.apache.solr/solr-core/5.0.0
> 2. http://mvnrepository.com/artifact/org.apache.solr/solr-solrj/5.0.0

Using the embedded server may not be the best idea.  A lot of Solr
functionality is not available in the embedded server.  You can't use
SolrCloud, which is a relatively easy way to provide high availability.
 The legacy method for redundancy, master-slave replication, also does
not work in the embedded server.  The admin UI is not available.

If you choose to go ahead with the embedded server ... for complete
safety, you should probably extract the war file and copy all the jars
from WEB-INF/lib.  If you want to take a more minimalistic approach, I
think these are the Lucene jars you will need for minimum functionality:

lucene-analyzers-common-5.0.0.jar
lucene-codecs-5.0.0.jar
lucene-core-5.0.0.jar
lucene-expressions-5.0.0.jar
lucene-queries-5.0.0.jar
lucene-queryparser-5.0.0.jar

There are quite a few Lucene jars, and I'm not overly familiar with
everything that Solr uses, so I might have left some out that would be
required for very basic functionality.  For more advanced functionality,
additional Lucene jars will definitely be required.

There are also third-party jars that are required, such as slf4j jars
for logging.  The codebase as a whole has dependencies on things like
google guava, several apache commons jars, and other pieces ... I have
no idea which of those can be left out when using the embedded server.
I tried to find a definitive list of required jars, and was not able to
locate one.

Thanks,
Shawn



Get suggestion for each term in the query

2015-02-26 Thread Nitin Solanki
Hi,
  I want to get a suggestion for each term/word in the query, under either
condition:
i) the word/term is correct or incorrect;
ii) the word/term has high frequency or low frequency.

Whatever the condition of the term/word, I need suggestions every time.


solr cloud does not start with many collections

2015-02-26 Thread Damien Kamerman
I've run into an issue with starting my solr cloud with many collections.
My setup is:
3 nodes (solr 4.10.3 ; 64GB RAM each ; jdk1.8.0_25) running on a single
server (256GB RAM).
5,000 collections (1 x shard ; 2 x replica) = 10,000 cores
1 x Zookeeper 3.4.6
Java arg -Djute.maxbuffer=67108864 added to solr and ZK.

Then I stop all nodes, then start all nodes. All replicas are in the down
state, some have no leader. At times I have seen some (12 or so) leaders in
the active state. In the solr logs I see lots of:

org.apache.solr.cloud.ZkController; Still seeing conflicting information
about the leader of shard shard1 for collection DD-4351 after 30
seconds; our state says http://ftea1:8001/solr/DD-4351_shard1_replica1/,
but ZooKeeper says http://ftea1:8000/solr/DD-4351_shard1_replica2/

org.apache.solr.common.SolrException;
:org.apache.solr.common.SolrException: Error getting leader from zk for
shard shard1
at
org.apache.solr.cloud.ZkController.getLeader(ZkController.java:910)
at
org.apache.solr.cloud.ZkController.register(ZkController.java:822)
at
org.apache.solr.cloud.ZkController.register(ZkController.java:770)
at org.apache.solr.core.ZkContainer$2.run(ZkContainer.java:221)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: There is conflicting
information about the leader of shard: shard1 our state says:
http://ftea1:8001/solr/DD-1564_shard1_replica2/ but zookeeper says:
http://ftea1:8000/solr/DD-1564_shard1_replica1/
at
org.apache.solr.cloud.ZkController.getLeader(ZkController.java:889)
... 6 more

I've tried staggering the starts (1min) but it does not help.
I've reproduced with zero documents.
Restarts are OK up to around 3,000 cores.
Should this work?

Damien.


Confusion in making true or false in spellcheck.onlymorepopular

2015-02-26 Thread Nitin Solanki
Hi,
"Only return suggestions that result in more hits for the query
than the existing query"

What does "the existing query" mean in the above sentence for
"spellcheck.onlyMorePopular"?

What happens when I set "spellcheck.onlyMorePopular" to true or to
false? Is there any difference?


Unsubscribing MAIL

2015-02-26 Thread Kishan Parmar
Hi,
Please unsubscribe me from the Solr and Lucene mailing lists.



Regards,

Kishan Parmar
Software Developer
+91 95 100 77394
Jay Shree Krishnaa !!


Re: Dependency Need to include for embedded solr.

2015-02-26 Thread Danesh Kuruppu
Thanks Shawn,

My application is a standalone application. I thought of embedding the Solr
server, so I can pack it inside my application.

In Solr 5.0.0, Solr is no longer distributed as a "war" file. How can I
find the war file in the distribution?

I need some advanced features like synonym search, stop words, wildcard
search, etc. It would be great if you could provide some references to get
an idea of which dependencies need to be added to get those features.

Thanks
Danesh

On Fri, Feb 27, 2015 at 11:32 AM, Shawn Heisey  wrote:

> On 2/26/2015 10:07 PM, Danesh Kuruppu wrote:
> > I need to embed a Solr server into my Maven project. I am going to
> > use the latest Solr 5.0.0.
> >
> > I need to know which dependencies to include in my project. As I
> > understand it, I need solr-core[1] and solr-solrj[2]. Do I need to
> > include Lucene dependencies in my project? If so, which dependencies do
> > we need to include to enable all indexing capabilities?
> >
> > 1. http://mvnrepository.com/artifact/org.apache.solr/solr-core/5.0.0
> > 2. http://mvnrepository.com/artifact/org.apache.solr/solr-solrj/5.0.0
>
> Using the embedded server may not be the best idea.  A lot of Solr
> functionality is not available in the embedded server.  You can't use
> SolrCloud, which is a relatively easy way to provide high availability.
>  The legacy method for redundancy, master-slave replication, also does
> not work in the embedded server.  The admin UI is not available.
>
> If you choose to go ahead with the embedded server ... for complete
> safety, you should probably extract the war file and copy all the jars
> from WEB-INF/lib.  If you want to take a more minimalistic approach, I
> think these are the Lucene jars you will need for minimum functionality:
>
> lucene-analyzers-common-5.0.0.jar
> lucene-codecs-5.0.0.jar
> lucene-core-5.0.0.jar
> lucene-expressions-5.0.0.jar
> lucene-queries-5.0.0.jar
> lucene-queryparser-5.0.0.jar
>
> There are quite a few Lucene jars, and I'm not overly familiar with
> everything that Solr uses, so I might have left some out that would be
> required for very basic functionality.  For more advanced functionality,
> additional Lucene jars will definitely be required.
>
> There are also third-party jars that are required, such as slf4j jars
> for logging.  The codebase as a whole has dependencies on things like
> google guava, several apache commons jars, and other pieces ... I have
> no idea which of those can be left out when using the embedded server.
> I tried to find a definitive list of required jars, and was not able to
> locate one.
>
> Thanks,
> Shawn
>
>


Re: solr cloud does not start with many collections

2015-02-26 Thread Shawn Heisey
On 2/26/2015 11:14 PM, Damien Kamerman wrote:
> I've run into an issue with starting my solr cloud with many collections.
> My setup is:
> 3 nodes (solr 4.10.3 ; 64GB RAM each ; jdk1.8.0_25) running on a single
> server (256GB RAM).
> 5,000 collections (1 x shard ; 2 x replica) = 10,000 cores
> 1 x Zookeeper 3.4.6
> Java arg -Djute.maxbuffer=67108864 added to solr and ZK.
> 
> Then I stop all nodes, then start all nodes. All replicas are in the down
> state, some have no leader. At times I have seen some (12 or so) leaders in
> the active state. In the solr logs I see lots of:
> 
> org.apache.solr.cloud.ZkController; Still seeing conflicting information
> about the leader of shard shard1 for collection DD-4351 after 30
> seconds; our state says http://ftea1:8001/solr/DD-4351_shard1_replica1/,
> but ZooKeeper says http://ftea1:8000/solr/DD-4351_shard1_replica2/



> I've tried staggering the starts (1min) but does not help.
> I've reproduced with zero documents.
> Restarts are OK up to around 3,000 cores.
> Should this work?

This is going to push SolrCloud beyond its limits.  Is this just an
exercise to see how far you can push Solr, or are you looking at setting
up a production install with several thousand collections?

In Solr 4.x, the clusterstate is one giant JSON structure containing the
state of the entire cloud.  With 5000 collections, the entire thing
would need to be downloaded and uploaded at least 5000 times during the
course of a successful full system startup ... and I think with
replicationFactor set to 2, that might actually be 10,000 times. The
best-case scenario is that it would take a VERY long time, the
worst-case scenario is that concurrency problems would lead to a
deadlock.  A deadlock might be what is happening here.

In Solr 5.x, the clusterstate is broken up so there's a separate state
structure for each collection.  This setup allows for faster and safer
multi-threading and far less data transfer.  Assuming I understand the
implications correctly, there might not be any need to increase
jute.maxbuffer with 5.x ... although I have to assume that I might be
wrong about that.

I would very much recommend that you set your scenario up from scratch
in Solr 5.0.0, to see if the new clusterstate format can eliminate the
problem you're seeing.  If it doesn't, then we can pursue it as a likely
bug in the 5.x branch and you can file an issue in Jira.

Thanks,
Shawn



Re: solr cloud does not start with many collections

2015-02-26 Thread Damien Kamerman
>
> This is going to push SolrCloud beyond its limits.  Is this just an
> exercise to see how far you can push Solr, or are you looking at setting
> up a production install with several thousand collections?
>
>
I'm looking towards production.


> In Solr 4.x, the clusterstate is one giant JSON structure containing the
> state of the entire cloud.  With 5000 collections, the entire thing
> would need to be downloaded and uploaded at least 5000 times during the
> course of a successful full system startup ... and I think with
> replicationFactor set to 2, that might actually be 10,000 times. The
> best-case scenario is that it would take a VERY long time, the
> worst-case scenario is that concurrency problems would lead to a
> deadlock.  A deadlock might be what is happening here.
>
>
Yes, clusterstate.json is 3.3M. At times on startup I think it does
deadlock; log shows after 1min:
org.apache.solr.cloud.ZkController; Timed out waiting to see all nodes
published as DOWN in our cluster state.


> In Solr 5.x, the clusterstate is broken up so there's a separate state
> structure for each collection.  This setup allows for faster and safer
> multi-threading and far less data transfer.  Assuming I understand the
> implications correctly, there might not be any need to increase
> jute.maxbuffer with 5.x ... although I have to assume that I might be
> wrong about that.
>
> I would very much recommend that you set your scenario up from scratch
> in Solr 5.0.0, to see if the new clusterstate format can eliminate the
> problem you're seeing.  If it doesn't, then we can pursue it as a likely
> bug in the 5.x branch and you can file an issue in Jira.
>
>
Thanks, will test in Solr 5.0.0.


Re: solr cloud does not start with many collections

2015-02-26 Thread Damien Kamerman
Oh, and I was wondering if 'leaderVoteWait' might help in Solr4.

On 27 February 2015 at 18:04, Damien Kamerman  wrote:

> This is going to push SolrCloud beyond its limits.  Is this just an
>> exercise to see how far you can push Solr, or are you looking at setting
>> up a production install with several thousand collections?
>>
>>
> I'm looking towards production.
>
>
>> In Solr 4.x, the clusterstate is one giant JSON structure containing the
>> state of the entire cloud.  With 5000 collections, the entire thing
>> would need to be downloaded and uploaded at least 5000 times during the
>> course of a successful full system startup ... and I think with
>> replicationFactor set to 2, that might actually be 10,000 times. The
>> best-case scenario is that it would take a VERY long time, the
>> worst-case scenario is that concurrency problems would lead to a
>> deadlock.  A deadlock might be what is happening here.
>>
>>
> Yes, clusterstate.json is 3.3M. At times on startup I think it does
> deadlock; log shows after 1min:
> org.apache.solr.cloud.ZkController; Timed out waiting to see all nodes
> published as DOWN in our cluster state.
>
>
>> In Solr 5.x, the clusterstate is broken up so there's a separate state
>> structure for each collection.  This setup allows for faster and safer
>> multi-threading and far less data transfer.  Assuming I understand the
>> implications correctly, there might not be any need to increase
>> jute.maxbuffer with 5.x ... although I have to assume that I might be
>> wrong about that.
>>
>> I would very much recommend that you set your scenario up from scratch
>> in Solr 5.0.0, to see if the new clusterstate format can eliminate the
>> problem you're seeing.  If it doesn't, then we can pursue it as a likely
>> bug in the 5.x branch and you can file an issue in Jira.
>>
>>
> Thanks, will test in Solr 5.0.0.
>



-- 
Damien Kamerman


Re: Dependency Need to include for embedded solr.

2015-02-26 Thread Shawn Heisey
On 2/26/2015 11:41 PM, Danesh Kuruppu wrote:
> My application is a standalone application. I thought of embedding the Solr
> server, so I can pack it inside my application.
> 
> In Solr 5.0.0, Solr is no longer distributed as a "war" file. How can I
> find the war file in the distribution?

I am glad to see that people are actually reading documentation that is
included with the release.

With 5.0.0 (and probably the next few 5.x releases), Solr actually still
is a war file.  You can find it in server/webapps in the binary
download.  There are two reasons we are telling everyone it's not a war
file:  1) We now have very capable scripts to start and stop Solr with
optimal java options, so there's no longer any need to rely on scripts
packaged with a servlet container.  2) In a future 5.x release, Solr
actually will become a standalone application, not a war ... preparing
users in advance is a good idea.

> I need some advanced features like synonyms search, stop words, wild card
> search etc. It would be great, if you can provide some references to get
> idea which dependencies need to add to get those features.

If you don't want to simply add every dependency included in the war,
then you can use the tried and true method for finding the minimum set
of jars:  Try to get it running.  If it fails, look at the log and see
which class it was unable to find.  Add the relevant jar to the
classpath and try again.

Thanks,
Shawn



Re: Unsubscribing MAIL

2015-02-26 Thread Gora Mohanty
On 27 February 2015 at 12:10, Kishan Parmar  wrote:
>
> Hi,
> Please unsubscribe me from the Solr and Lucene mailing lists.

Please follow the standard procedure for unsubscribing from most
mailing lists, and send a mail to
solr-user-unsubscr...@lucene.apache.org . For other lists, you might
want to take a look at the addresses listed under
http://lucene.apache.org/core/discussion.html

Regards,
Gora


Re: Dependency Need to include for embedded solr.

2015-02-26 Thread Danesh Kuruppu
Thanks Shawn,
I am doing some feasibility studies for moving directly to Solr 5.0.0.

One more thing, related to the standalone server.

How is security handled in a standalone Solr server? Let's say I configured my
application to use a remote standalone Solr server.

1. How would I enable secure communication between my application and the Solr
server?
2. How does the Solr server authenticate users?

Thanks
Danesh

On Fri, Feb 27, 2015 at 12:35 PM, Shawn Heisey  wrote:

> On 2/26/2015 11:41 PM, Danesh Kuruppu wrote:
> > My application is a standalone application. I thought of embedding the Solr
> > server, so I can pack it inside my application.
> >
> > In Solr 5.0.0, Solr is no longer distributed as a "war" file. How can I
> > find the war file in the distribution?
>
> I am glad to see that people are actually reading documentation that is
> included with the release.
>
> With 5.0.0 (and probably the next few 5.x releases), Solr actually still
> is a war file.  You can find it in server/webapps in the binary
> download.  There are two reasons we are telling everyone it's not a war
> file:  1) We now have very capable scripts to start and stop Solr with
> optimal java options, so there's no longer any need to rely on scripts
> packaged with a servlet container.  2) In a future 5.x release, Solr
> actually will become a standalone application, not a war ... preparing
> users in advance is a good idea.
>
> > I need some advanced features like synonym search, stop words, wildcard
> > search, etc. It would be great if you could provide some references to get
> > an idea of which dependencies need to be added to get those features.
>
> If you don't want to simply add every dependency included in the war,
> then you can use the tried and true method for finding the minimum set
> of jars:  Try to get it running.  If it fails, look at the log and see
> which class it was unable to find.  Add the relevant jar to the
> classpath and try again.
>
> Thanks,
> Shawn
>
>