when transaction logs are closing?

2017-10-09 Thread Bernd Fehling
I'm trying to figure out when transaction logs are closing.
Unfortunately the docs and guides are not very clear about this.

I tried every combination of commits with waitSearcher true/false,
expungeDeletes true/false, and openSearcher true/false.
And also optimize with maxSegments=1.
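For illustration, the kinds of requests I tried look like this (sketches;
"mycollection" is a placeholder and the default port is assumed):

curl "http://localhost:8983/solr/mycollection/update?commit=true&openSearcher=true&waitSearcher=false&expungeDeletes=false"
curl "http://localhost:8983/solr/mycollection/update?optimize=true&maxSegments=1"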

The stats of my updateHandler say
transaction_logs_total_number: 2
transaction_logs_total_size:   59287641

A "lsof | grep tlog" reports still many open tlog files.
Actually there are only 2 tlog files but each java process has handles
open to tlog.

But why are they not closing even after a hard commit and an optimize
with maxSegments=1?
There is no need to keep the tlogs open. Everything is flushed to disk,
optimized, all nodes are up, running and in sync.

Can someone explain the rules when tlogs are closing?

Regards
Bernd


Re: upgrade to 7.0.0

2017-10-09 Thread Stefano Mancini
thanks a lot,

I’ve just installed 7.0.1 and it works fine with all the indexes I’ve created
with 6.x.

Stefano Mancini






> On 30 Sep 2017, at 00:14, Shawn Heisey wrote:
> 
> On 9/27/2017 4:36 AM, Stefano Mancini wrote:
>> I’ve just installed Solr 7.0.0 and I get an error opening an index created
>> with 6.6.1.
>> 
>> The server works fine if I start it with an empty index, so I suppose that
>> the configuration is ok.
> 
> I thought that I replied to this, turns out that I didn't.  I replicated
> the error a couple of days ago and opened an issue.  You (Stefano) are
> the mailing list user that I mentioned in the issue.
> 
> https://issues.apache.org/jira/browse/SOLR-11406
> 
> The problem has been located and Steve Rowe has figured out how to fix
> it.  He has also volunteered to be the release manager for the 7.0.1
> version that will contain the fix.  It's impossible to predict when that
> release will be ready, but a preliminary estimate is about a week, maybe
> two.
> 
> You have the option of grabbing the branch_7_0 source code and building
> a SNAPSHOT package yourself if you want it right now.
> 
> Thanks,
> Shawn
> 



Re: when transaction logs are closing?

2017-10-09 Thread Emir Arnautović
Hi Bernd,
I did not look at the code, but I would guess never. Solr tends to keep a file
handle for each file that it uses, and it keeps the last N transaction logs.
A transaction log file is flushed and a new one is created when you issue a
hard commit - with or without opening a searcher. At that moment it will delete
the oldest one, but the number of open tlogs will remain the same.

Hope that someone more into transaction logs will jump in and correct me if I 
am wrong.

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 9 Oct 2017, at 09:21, Bernd Fehling  wrote:
> 
> I'm trying to figure out when transaction logs are closing.
> Unfortunately the docs and guides are not very clear about this.
> 
> I tried every combination of commits with waitSearcher true/false,
> expungeDeletes true/false, and openSearcher true/false.
> And also optimize with maxSegments=1.
> 
> The stats of my updateHandler say
> transaction_logs_total_number: 2
> transaction_logs_total_size:   59287641
> 
> An "lsof | grep tlog" still reports many open tlog files.
> Actually there are only 2 tlog files, but each Java process has open
> handles to them.
> 
> But why are they not closing even after a hard commit and an optimize
> with maxSegments=1?
> There is no need to keep the tlogs open. Everything is flushed to disk,
> optimized, all nodes are up, running and in sync.
> 
> Can someone explain the rules when tlogs are closing?
> 
> Regards
> Bernd



Re: when transaction logs are closing?

2017-10-09 Thread alessandro.benedetti
In addition to what Emir mentioned, when Solr opens a new transaction log
file it will delete the older ones, subject to some conditions:
keep at least N records [1] and at most K files [2].
N is specified in solrconfig.xml (in the update handler section) and
the limits can be expressed in documents, in files, or both.
So, potentially, it could delete none of them.

This blog post from Erick explains it well [3].
If you would like to take a look at the code, this class should help [4].



[1] <int name="numRecordsToKeep">${solr.ulog.numRecordsToKeep:100}</int>
[2] <int name="maxNumLogsToKeep">${solr.ulog.maxNumLogsToKeep:10}</int>
[3]
https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
[4] org.apache.solr.update.UpdateLog
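For context, a minimal sketch of where these settings live in solrconfig.xml
(the values shown are the usual defaults; adjust to taste):

<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
    <int name="numRecordsToKeep">${solr.ulog.numRecordsToKeep:100}</int>
    <int name="maxNumLogsToKeep">${solr.ulog.maxNumLogsToKeep:10}</int>
  </updateLog>
</updateHandler>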




-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: when transaction logs are closing?

2017-10-09 Thread Bernd Fehling
Thanks a lot Alessandro and Emir.

On 09.10.2017 at 13:40, alessandro.benedetti wrote:
> In addition to what Emir mentioned, when Solr opens a new transaction log
> file it will delete the older ones, subject to some conditions:
> keep at least N records [1] and at most K files [2].
> N is specified in solrconfig.xml (in the update handler section) and
> the limits can be expressed in documents, in files, or both.
> So, potentially, it could delete none of them.
> 
> This blog post from Erick explains it well [3].
> If you would like to take a look at the code, this class should help [4].
> 
> 
> 
> [1] <int name="numRecordsToKeep">${solr.ulog.numRecordsToKeep:100}</int>
> [2] <int name="maxNumLogsToKeep">${solr.ulog.maxNumLogsToKeep:10}</int>
> [3]
> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> [4] org.apache.solr.update.UpdateLog
> 
> 
> 
> 
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> 


Re: Complexphrase treats wildcards differently than other query parsers

2017-10-09 Thread Bjarke Buur Mortensen
Thanks again, Tim,
following your recipe, I was able to write a failing test:

assertQ(req("q", "{!complexphrase} iso-latin1:cr\u00E6zy*")
, "//result[@numFound='1']"
, "//doc[./str[@name='id']='1']"
);

Notice how cr\u00E6zy* is used as a query term which mimics the behaviour I
originally reported, namely that CPQP does not analyse it because of the
wildcard and thus does not hit the charfilter from the query side.


2017-10-06 20:54 GMT+02:00 Allison, Timothy B. :

> That could be it.  I'm not able to reproduce this with trunk.  More next
> week.
>
> In trunk, if I add this to schema15.xml:
>   <fieldType name="iso-latin1" class="solr.TextField">
>     <analyzer>
>       <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>       <tokenizer class="..."/>
>     </analyzer>
>   </fieldType>
>   <field name="iso-latin1" type="iso-latin1" ... stored="true"/>
>
> This test passes.
>
>   @Test
>   public void testCharFilter() {
> assertU(adoc("iso-latin1", "cr\u00E6zy tr\u00E6n", "id", "1"));
> assertU(commit());
> assertU(optimize());
>
> assertQ(req("q", "{!complexphrase} iso-latin1:craezy")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:traen")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:caezy~1")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:crae*")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:*aezy")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:crae*y")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:\"craezy traen\"")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:\"caezy~1 traen\"")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:\"craez* traen\"")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:\"*aezy traen\"")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:\"crae*y traen\"")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>   }
>
>
>
> -Original Message-
> From: Bjarke Buur Mortensen [mailto:morten...@eluence.com]
> Sent: Friday, October 6, 2017 6:46 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Complexphrase treats wildcards differently than other query
> parsers
>
> Thanks a lot for your effort, Tim.
>
> Looking at it from the Solr side, I see some use of local classes. The
> snippet below in particular caught my eye (in solr/core/src/java/org/apache/
> solr/search/ComplexPhraseQParserPlugin.java).
> The instance of ComplexPhraseQueryParser is not the clean one from Lucene,
> but a modified one. If any of the modifications messes with the analysis
> logic, well then that might answer it.
>
> What do you make of it?
>
> lparser = new ComplexPhraseQueryParser(defaultField,
>     getReq().getSchema().getQueryAnalyzer()) {
> 
>   protected Query newWildcardQuery(org.apache.lucene.index.Term t) {
>     try {
>       org.apache.lucene.search.Query wildcardQuery =
>           reverseAwareParser.getWildcardQuery(t.field(), t.text());
>       setRewriteMethod(wildcardQuery);
>       return wildcardQuery;
>     } catch (SyntaxError e) {
>       throw new RuntimeException(e);
>     }
>   }
> 
>   private Query setRewriteMethod(org.apache.lucene.search.Query query) {
>     if (query instanceof MultiTermQuery) {
>       ((MultiTermQuery) query).setRewriteMethod(
>           org.apache.lucene.search.MultiTermQuery.SCORING_BOOLEAN_REWRITE);
>     }
>     return query;
>   }
> 
>   protected Query newRangeQuery(String field, String part1, String part2,
>       boolean startInclusive, boolean endInclusive) {
>     boolean reverse =
>         reverseAwareParser.isRangeShouldBeProtectedFromReverse(field, part1);
>     return super.newRangeQuery(field,
>         reverse ? reverseAwareParser.getLowerBoundForReverse() : part1,
>         part2, startInclusive || reverse, endInclusive);
>   }
> };
>
> Thanks,
> Bjarke
>
>
>


Re: Rescoring from 0 - full

2017-10-09 Thread alessandro.benedetti
The weights you express could suggest a probabilistic view of your final score.
The model you quoted will calculate the final score as:
0.9 * scorePersonalId + 0.1 * originalScore

The final score will NOT necessarily be 0 < score < 1.
https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html#the-dismax-query-parser
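To make that concrete (made-up numbers): with scorePersonalId = 1.0 and
originalScore = 12.4, the combined score is 0.9 * 1.0 + 0.1 * 12.4 = 2.14,
already greater than 1, because the original score is not normalized.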





-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: spell-check does not return collations when using search query with filter

2017-10-09 Thread alessandro.benedetti
Does spellcheck.q=polt help?
How do your queries normally look?
How would you like the collation to be returned?
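For example, something along these lines (core, field and filter names are
placeholders; a sketch, assuming a configured spellcheck component):

curl "http://localhost:8983/solr/mycore/select?q=title:polt&fq=type:book&spellcheck=true&spellcheck.q=polt&spellcheck.collate=true&spellcheck.maxCollations=5"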



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Semantic Knowledge Graph

2017-10-09 Thread David Hastings
Hey All, slides from the 2017 Lucene Revolution were put up recently, but
unfortunately, the one I have the most interest in, the semantic knowledge
graph, has not been put up:

https://lucenesolrrevolution2017.sched.com/event/BAwX/the-apache-solr-semantic-knowledge-graph?iframe=no&w=100%&sidebar=yes&bg=no


Don't suppose anyone knows where I may be able to find them, or can point me
in a direction to get more information about this tool.

Thanks - dave


Need help with Slow Query Logging

2017-10-09 Thread Atita Arora
Hi,

I have a situation here where I am required to log the slow queries into a
separate log file which can then be used for optimization purposes.
For now these entries are aggregated into the mainstream log, marked with
[slow:..].
I looked into the code and the configuration and I am really clueless as to
how to go about separating the slow query logs, as it needs another file
appender to be created besides the one already present in the log4j config.
If I create another appender, I can only do so by segregating through log
levels, and that moves all the WARN logs to another file (which is not what
I am looking for).
Also, from the code perspective, how about if I introduce another config
setting alongside the slowQueryThresholdMillis value, something like

slowQueryLogFile = get("query/slowQueryLogFile", logfilepath);


where, if slowQueryLogFile is present, Solr logs slow queries into that
file, and otherwise it behaves as it does now, along with

slowQueryThresholdMillis = getInt("query/slowQueryThresholdMillis", -1);


or should I tweak log4j?
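If tweaking log4j, a sketch of what I have in mind for the log4j 1.2 that
ships with Solr (filters are only configurable in the XML form of the log4j
config, and this assumes the slow-query lines keep their "slow:" prefix):

<appender name="SLOW" class="org.apache.log4j.RollingFileAppender">
  <param name="File" value="${solr.log}/solr_slow_queries.log"/>
  <param name="MaxFileSize" value="4MB"/>
  <layout class="org.apache.log4j.PatternLayout">
    <param name="ConversionPattern" value="%d{ISO8601} %p %c %m%n"/>
  </layout>
  <!-- accept only messages containing "slow:", drop everything else -->
  <filter class="org.apache.log4j.varia.StringMatchFilter">
    <param name="StringToMatch" value="slow:"/>
    <param name="AcceptOnMatch" value="true"/>
  </filter>
  <filter class="org.apache.log4j.varia.DenyAllFilter"/>
</appender>

The SLOW appender would then be attached to the root logger (or to
org.apache.solr.core.SolrCore) alongside the existing file appender.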
I am not sure if anyone has done this before or has any pointers to guide
me on this.
Please help.

Thanks in advance,
Atita


Re: Semantic Knowledge Graph

2017-10-09 Thread Atita Arora
Hi,

Is this the one you're looking for :

https://www.slideshare.net/treygrainger/leveraging-lucenesolr-as-a-knowledge-graph-and-intent-engine

-Atita

On Mon, Oct 9, 2017 at 7:44 PM, David Hastings  wrote:

> Hey All, slides from the 2017 Lucene Revolution were put up recently, but
> unfortunately, the one I have the most interest in, the semantic knowledge
> graph, has not been put up:
>
> https://lucenesolrrevolution2017.sched.com/event/BAwX/the-
> apache-solr-semantic-knowledge-graph?iframe=no&w=100%&sidebar=yes&bg=no
>
>
> Don't suppose anyone knows where I may be able to find them, or can point
> me in a direction to get more information about this tool.
>
> Thanks - dave
>


RE: Complexphrase treats wildcards differently than other query parsers

2017-10-09 Thread Allison, Timothy B.
  Right.  Sorry.

Despite appearances to the contrary, I'm not a bot designed to lead you down 
the garden path of debugging for yourself with the goal of increasing the size 
of the Solr contributor pool...

I confirmed the failure in 6.x, but all seems to work in 7.x and trunk.  I 
opened SOLR-11450 and attached a unit test based on your correction of mine. 😊

Thank you, again!


-Original Message-
From: Bjarke Buur Mortensen [mailto:morten...@eluence.com] 
Sent: Monday, October 9, 2017 8:39 AM
To: solr-user@lucene.apache.org
Subject: Re: Complexphrase treats wildcards differently than other query parsers

Thanks again, Tim,
following your recipe, I was able to write a failing test:

assertQ(req("q", "{!complexphrase} iso-latin1:cr\u00E6zy*")
, "//result[@numFound='1']"
, "//doc[./str[@name='id']='1']"
);

Notice how cr\u00E6zy* is used as a query term which mimics the behaviour I 
originally reported, namely that CPQP does not analyse it because of the 
wildcard and thus does not hit the charfilter from the query side.


2017-10-06 20:54 GMT+02:00 Allison, Timothy B. :

> That could be it.  I'm not able to reproduce this with trunk.  More 
> next week.
>
> In trunk, if I add this to schema15.xml:
>   <fieldType name="iso-latin1" class="solr.TextField">
>     <analyzer>
>       <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>       <tokenizer class="..."/>
>     </analyzer>
>   </fieldType>
>   <field name="iso-latin1" type="iso-latin1" ... stored="true"/>
>
> This test passes.
>
>   @Test
>   public void testCharFilter() {
> assertU(adoc("iso-latin1", "cr\u00E6zy tr\u00E6n", "id", "1"));
> assertU(commit());
> assertU(optimize());
>
> assertQ(req("q", "{!complexphrase} iso-latin1:craezy")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:traen")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:caezy~1")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:crae*")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:*aezy")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:crae*y")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:\"craezy traen\"")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:\"caezy~1 traen\"")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:\"craez* traen\"")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:\"*aezy traen\"")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>
> assertQ(req("q", "{!complexphrase} iso-latin1:\"crae*y traen\"")
> , "//result[@numFound='1']"
> , "//doc[./str[@name='id']='1']"
> );
>   }
>
>
>
> -Original Message-
> From: Bjarke Buur Mortensen [mailto:morten...@eluence.com]
> Sent: Friday, October 6, 2017 6:46 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Complexphrase treats wildcards differently than other 
> query parsers
>
> Thanks a lot for your effort, Tim.
>
> Looking at it from the Solr side, I see some use of local classes. The 
> snippet below in particular caught my eye (in 
> solr/core/src/java/org/apache/ solr/search/ComplexPhraseQParserPlugin.java).
> The instance of ComplexPhraseQueryParser is not the clean one from 
> Lucene, but a modified one. If any of the modifications messes with 
> the analysis logic, well then that might answer it.
>
> What do you make of it?
>
> lparser = new ComplexPhraseQueryParser(defaultField,
>     getReq().getSchema().getQueryAnalyzer()) {
> 
>   protected Query newWildcardQuery(org.apache.lucene.index.Term t) {
>     try {
>       org.apache.lucene.search.Query wildcardQuery =
>           reverseAwareParser.getWildcardQuery(t.field(), t.text());
>       setRewriteMethod(wildcardQuery);
>       return wildcardQuery;
>     } catch (SyntaxError e) {
>       throw new RuntimeException(e);
>     }
>   }
> 
>   private Query setRewriteMethod(org.apache.lucene.search.Query query) {
>     if (query instanceof MultiTermQuery) {
>       ((MultiTermQuery) query).setRewriteMethod(
>           org.apache.lucene.search.MultiTermQuery.SCORING_BOOLEAN_REWRITE);
>     }
>     return query;
>   }
> 
>   protected Query newRangeQuery(String field, String part1, String part2,
>       boolean startInclusive, boolean endInclusive) {
>     boolean reverse =
>         reverseAwareParser.isRangeShouldBeProtectedFromReverse(field, part1);
>     return super.newRangeQuery(field,
>         reverse ? reverseAwareParser.getLowerBoundForReverse() : part1,
>         part2, startInclusive || reverse, endInclusive);
>   }
> };

Re: Semantic Knowledge Graph

2017-10-09 Thread David Hastings
Thank you!
From digging around I found the one from 2016:
https://www.slideshare.net/treygrainger/the-semantic-knowledge-graph

which seems very close to the presentation this past October. Thanks again!
-David

On Mon, Oct 9, 2017 at 10:34 AM, Atita Arora  wrote:

> Hi,
>
> Is this the one you're looking for :
>
> https://www.slideshare.net/treygrainger/leveraging-
> lucenesolr-as-a-knowledge-graph-and-intent-engine
>
> -Atita
>
> On Mon, Oct 9, 2017 at 7:44 PM, David Hastings <
> hastings.recurs...@gmail.com
> > wrote:
>
> > Hey All, slides from the 2017 Lucene Revolution were put up recently, but
> > unfortunately, the one I have the most interest in, the semantic
> > knowledge graph, has not been put up:
> >
> > https://lucenesolrrevolution2017.sched.com/event/BAwX/the-
> > apache-solr-semantic-knowledge-graph?iframe=no&w=100%&sidebar=yes&bg=no
> >
> >
> > Don't suppose anyone knows where I may be able to find them, or can
> > point me in a direction to get more information about this tool.
> >
> > Thanks - dave
> >
>


Re: Semantic Knowledge Graph

2017-10-09 Thread alessandro.benedetti
I expect the slides to be published here :


https://www.slideshare.net/lucidworks?utm_campaign=profiletracking&utm_medium=sssite&utm_source=ssslideview

The one you are looking for is not there yet, but keep an eye on it.

Regards



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Semantic Knowledge Graph

2017-10-09 Thread Trey Grainger
Hi David, that's my fault. I need to do a final proofread through them
before they get posted (and may have to push one quick code change, as
well). I'll try to get that done within the next few days.

All the best,

Trey Grainger
SVP of Engineering @ Lucidworks
Co-author, Solr in Action 
http://www.treygrainger.com


On Mon, Oct 9, 2017 at 10:14 AM, David Hastings <
hastings.recurs...@gmail.com> wrote:

> Hey All, slides from the 2017 Lucene Revolution were put up recently, but
> unfortunately, the one I have the most interest in, the semantic knowledge
> graph, has not been put up:
>
> https://lucenesolrrevolution2017.sched.com/event/BAwX/the-
> apache-solr-semantic-knowledge-graph?iframe=no&w=100%&sidebar=yes&bg=no
>
>
> Don't suppose anyone knows where I may be able to find them, or can point
> me in a direction to get more information about this tool.
>
> Thanks - dave
>


Re: Semantic Knowledge Graph

2017-10-09 Thread Dave
Thanks Trey. Also thanks for the presentation. It was for me the best one I 
attended. Really looking forward to experimenting with it. Are there any plans 
of it getting into the core distribution?

> On Oct 9, 2017, at 12:30 PM, Trey Grainger  wrote:
> 
> Hi David, that's my fault. I need to do a final proofread through them
> before they get posted (and may have to push one quick code change, as
> well). I'll try to get that done within the next few days.
> 
> All the best,
> 
> Trey Grainger
> SVP of Engineering @ Lucidworks
> Co-author, Solr in Action 
> http://www.treygrainger.com
> 
> 
> On Mon, Oct 9, 2017 at 10:14 AM, David Hastings <
> hastings.recurs...@gmail.com> wrote:
> 
>> Hey All, slides from the 2017 Lucene Revolution were put up recently, but
>> unfortunately, the one I have the most interest in, the semantic knowledge
>> graph, has not been put up:
>> 
>> https://lucenesolrrevolution2017.sched.com/event/BAwX/the-
>> apache-solr-semantic-knowledge-graph?iframe=no&w=100%&sidebar=yes&bg=no
>> 
>> 
>> Don't suppose anyone knows where I may be able to find them, or can point
>> me in a direction to get more information about this tool.
>> 
>> Thanks - dave
>> 


Re: Learning to rank - Bad Request

2017-10-09 Thread sophia250
What missing steps did you fix to solve the issue? I am facing exactly the
same problem you had.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: when transaction logs are closing?

2017-10-09 Thread Erick Erickson
bq: Actually there are only 2 tlog files, but each Java process has
handles open to tlog.

I'm a little confused by this. Each core in each Solr JVM should have
a handle to its _own_ tlog open. So if JVM1 has 10 cores (replicas),
I'd expect 10 different tlogs to be open, one for each core. But I
don't expect two processes to have a handle open to the _same_ tlog
unless something's horribly wrong.
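As a quick sanity check, it can help to scope lsof to a single Solr process
rather than grepping the global list (substitute the real pid):

lsof -p <solr-pid> | grep tlog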

And as Emir and Alessandro point out, every hard commit of any flavor
should close the current tlog (for the replica/core) and open a new
one. Soft commits have no effect.

Oh, and please don't expungeDeletes or optimize if you can possibly
help it, there'll be a blog coming soon on why this is A Bad Idea
unless you have a pattern where you periodically (I'm thinking daily)
update your index and optimize as part of your process.

Best,
Erick

P.S. The reference guide is freely editable. Well, actually the pages are
just files in asciidoc format. It'd be great if you wanted to edit
them. I use Atom to edit them as it's free. If you do feel moved to do
this, just raise a JIRA and add the diff as a patch. There's also an
IntelliJ plugin that will do the job. I'm heavily editing the Near Real Time
page...

On Mon, Oct 9, 2017 at 5:35 AM, Bernd Fehling
 wrote:
> Thanks a lot Alessandro and Emir.
>
> On 09.10.2017 at 13:40, alessandro.benedetti wrote:
>> In addition to what Emir mentioned, when Solr opens a new transaction log
>> file it will delete the older ones, subject to some conditions:
>> keep at least N records [1] and at most K files [2].
>> N is specified in solrconfig.xml (in the update handler section) and
>> the limits can be expressed in documents, in files, or both.
>> So, potentially, it could delete none of them.
>>
>> This blog post from Erick explains it well [3].
>> If you would like to take a look at the code, this class should help [4].
>>
>>
>>
>> [1] <int name="numRecordsToKeep">${solr.ulog.numRecordsToKeep:100}</int>
>> [2] <int name="maxNumLogsToKeep">${solr.ulog.maxNumLogsToKeep:10}</int>
>> [3]
>> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>> [4] org.apache.solr.update.UpdateLog
>>
>>
>>
>>
>> -
>> ---
>> Alessandro Benedetti
>> Search Consultant, R&D Software Engineer, Director
>> Sease Ltd. - www.sease.io
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>


Re: upgrade to 7.0.0

2017-10-09 Thread Erick Erickson
Thanks for verifying!

Erick

On Mon, Oct 9, 2017 at 2:27 AM, Stefano Mancini 
wrote:

> thanks a lot,
>
> I’ve just installed 7.0.1 and it works fine with all the indexes I’ve
> created with 6.x.
>
> Stefano Mancini
>
>
> On 30 Sep 2017, at 00:14, Shawn Heisey wrote:
>
> On 9/27/2017 4:36 AM, Stefano Mancini wrote:
>
> I’ve just installed Solr 7.0.0 and I get an error opening an index created
> with 6.6.1.
>
> The server works fine if I start it with an empty index, so I suppose that
> the configuration is ok.
>
>
> I thought that I replied to this, turns out that I didn't.  I replicated
> the error a couple of days ago and opened an issue.  You (Stefano) are
> the mailing list user that I mentioned in the issue.
>
> https://issues.apache.org/jira/browse/SOLR-11406
>
> The problem has been located and Steve Rowe has figured out how to fix
> it.  He has also volunteered to be the release manager for the 7.0.1
> version that will contain the fix.  It's impossible to predict when that
> release will be ready, but a preliminary estimate is about a week, maybe
> two.
>
> You have the option of grabbing the branch_7_0 source code and building
> a SNAPSHOT package yourself if you want it right now.
>
> Thanks,
> Shawn
>
>
>


Re: Learning to rank - Bad Request

2017-10-09 Thread sophia250
I posted each named feature one by one, and then it worked.
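For anyone hitting this later, the kind of per-feature upload I mean looks
roughly like this (a sketch; collection name and feature definition are
placeholders, one such request per feature):

curl -XPUT 'http://localhost:8983/solr/mycollection/schema/feature-store' \
  -H 'Content-type:application/json' \
  --data-binary '[{"name":"originalScore","class":"org.apache.solr.ltr.feature.OriginalScoreFeature","params":{}}]'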



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


HTTP HEAD method is gone?

2017-10-09 Thread Xie, Sean
After upgrading from 6.5.1 to 6.6.1, an HTTP HEAD request for /favicon.ico
returns 404. An HTTP GET request still works and returns 200 OK.

Before the upgrade, both HEAD and GET gave an HTTP 200 OK response.
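For reference, a minimal reproduction (default port assumed):

curl -I "http://localhost:8983/solr/favicon.ico"                                    # HEAD: 404 after the upgrade
curl -s -o /dev/null -w "%{http_code}\n" "http://localhost:8983/solr/favicon.ico"   # GET: 200 OK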

Any settings to adjust?

Thanks
Sean



Newbie question about why represent timestamps as "float" values

2017-10-09 Thread William Torcaso


I have inherited a working Solr installation that has not been upgraded since
Solr 4.0.  My task is to bring it forward (at least 6.x, maybe 7.x).  I am
brand new to Solr.

Here is my question.  In schema.xml, there is this field:

<field name="unixdate" type="float" ... />

Question:  why is this declared as a float datatype?  I'm just looking for an 
explanation of what is there – any changes come later, after I understand 
things better.

I understand about milliseconds from the epoch.  I would expect that the author 
would have used an integer or a long integer to hold such a millisecond count, 
or a DateField or TrieDateField.
I wonder if there is some Solr magic at work.

Thanks,

  ---  Bill


Re: Newbie question about why represent timestamps as "float" values

2017-10-09 Thread Chris Hostetter

: Here is my question.  In schema.xml, there is this field:
: 
: <field name="unixdate" type="float" ... />
: 
: Question:  why is this declared as a float datatype?  I'm just looking 
: for an explanation of what is there – any changes come later, after I 
: understand things better.

You would have to ask the creator of that schema.xml file why they made 
that choice ... to the best of my knowledge, no sample/example schema that 
has ever shipped with any version of Solr has ever included a "unixdate" 
field -- let alone one that suggested "float" would be a logically correct 
data type for storing that type of information.


-Hoss
http://www.lucidworks.com/

Query to obtain count of term vocabulary

2017-10-09 Thread Reth RM
Dear Solr-User Group,

   Can you please suggest an API to query the *count* of the total term
vocabulary in a given shard index for a specified field? For example, the
count of total terms in the "terms" column on the left hand side of the
reference image.

Thank you.


Re: Query to obtain count of term vocabulary

2017-10-09 Thread Mikhail Khludnev
https://lucene.apache.org/solr/guide/6_6/the-terms-component.html
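For example, a per-shard terms request might look like this (core and field
names are placeholders; the component returns the terms themselves, so the
vocabulary size is the number of entries returned; distrib=false keeps the
query on the core you hit):

curl "http://localhost:8983/solr/mycore/terms?terms=true&terms.fl=myfield&terms.limit=-1&distrib=false"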

On Mon, Oct 9, 2017 at 6:03 PM, Reth RM  wrote:

> Dear Solr-User Group,
>
>    Can you please suggest an API to query the *count* of the total term
> vocabulary in a given shard index for a specified field? For example, the
> count of total terms in the "terms" column on the left hand side of the
> reference image.
>
> Thank you.
>



-- 
Sincerely yours
Mikhail Khludnev


Re: Newbie question about why represent timestamps as "float" values

2017-10-09 Thread Erick Erickson
What Hoss said, and in addition somewhere some
custom code has to be translating things back and
forth. For dates, Solr wants YYYY-MM-DDTHH:MM:SSZ
as a date string it knows how to deal with. That simply
couldn't parse as a float type, so there's some custom
code that transforms dates into a float at ingest
time and converts from float to something recognizable
as a date on output.
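As a side note (my own illustration, not anything from that schema): a 32-bit
float cannot even represent current epoch-millisecond values exactly, so such
a round-trip necessarily loses precision:

long millis = 1507569600000L;           // some instant in October 2017
float f = millis;                       // widening conversion rounds to a 24-bit mantissa
System.out.println((long) f - millis);  // non-zero: the round-trip loses the low bits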



On Mon, Oct 9, 2017 at 2:06 PM, Chris Hostetter
 wrote:
>
> : Here is my question.  In schema.xml, there is this field:
> :
> : <field name="unixdate" type="float" ... />
> :
> : Question:  why is this declared as a float datatype?  I'm just looking
> : for an explanation of what is there – any changes come later, after I
> : understand things better.
>
> You would have to ask the creator of that schema.xml file why they made
> that choice ... to the best of my knowledge, no sample/example schema that
> has ever shipped with any version of Solr has ever included a "unixdate"
> field -- let alone one that suggested "float" would be a logically correct
> data type for storing that type of information.
>
>
> -Hoss
> http://www.lucidworks.com/


Solr staying constant on popularity indexes

2017-10-09 Thread Tech Id
Hi,

So I was a bit frustrated the other day when all of a sudden my Solr nodes
started going into recovery.
Everything became normal after a rolling restart, but when I looked at the
logs, I was surprised to see --- nothing!
The Solr UI gave me no information during recovery.
The Solr logs gave me no information as to what really happened.

And though I have not had the time to use Elasticsearch yet, a couple of

Here is a graph that shows a 30% gain of ES over Solr in less than 2 years:

Reference: https://db-engines.com/en/ranking_trend/search+engine

Being a long term Solr user, I tried to do a little comparison myself and
actually found some interesting features in ES.

1. No ZooKeeper - I have burnt my hands on some ZooKeeper issues in the
past and it is no fun to deal with. Kafka and Storm are also trying to
burden ZooKeeper less and less because ZK cannot handle heavy traffic.
2. REST APIs - this is a big wow over the complicated syntax Solr uses. I
think V2 APIs are coming to address this, but they did come a bit late in
the game.
3. Client nodes - No such equivalent in Solr. All nodes do scatter-gather
in Solr which adds scalability problems.
4. Much better logs in ES
5. Cluster level stats and hot-threads etc APIs make monitoring easy.

So I just wanted to discuss some of these important points about ES vs Solr.

At the very least, we should try to improve our logs.
When a node is behaving badly, Solr gives absolutely no information why it
is behaving the way it is.
In the same debugging spirit, the Solr UI could also be improved to show the
number of cores per node, the total number of down/recovering nodes,
memory/CPU/disk used by each node, etc., which would make engineers' jobs a
bit easier.


Cheers,
TI