Re: Commit required after delete ?

2017-01-06 Thread Mikhail Khludnev
Hello, Friend!
You absolutely need to commit to make deletes visible. What's more, when a
"softCommit" is issued, at the Lucene level there is a flag which ignores
deletes for the sake of performance.

On 6 Jan 2017 at 10:55, "Dorian Hoxha" wrote:

Hello friends,

Based on what I've read, I think "commit" isn't needed to make deletes
active (the way it is for index/update), right?

Since a delete just marks an in-memory deleted-id bitmap, right?

Thank You
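A toy model of the behavior Mikhail describes (purely illustrative, not Lucene's actual implementation): deletes only mark a bitmap, and searchers see them once a commit opens a new view.

```python
# Toy model: deletes mark a bitmap, but an open "searcher" only sees the
# snapshot taken at the last commit. Not Lucene's real code.
class ToyIndex:
    def __init__(self):
        self.docs = {}           # id -> doc
        self.deleted = set()     # pending delete marks (the "bitmap")
        self.searcher_view = {}  # snapshot visible to queries

    def add(self, doc_id, doc):
        self.docs[doc_id] = doc

    def delete(self, doc_id):
        self.deleted.add(doc_id)  # marked, but not yet visible

    def commit(self):
        # A commit opens a new searcher over the current live docs.
        self.searcher_view = {i: d for i, d in self.docs.items()
                              if i not in self.deleted}

    def search_all(self):
        return sorted(self.searcher_view)

idx = ToyIndex()
idx.add("a", {}); idx.add("b", {})
idx.commit()
idx.delete("a")
print(idx.search_all())  # still ['a', 'b'] -- the delete is not visible yet
idx.commit()
print(idx.search_all())  # ['b'] after the commit
```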


Re: CDCR logging is Needlessly verbose, fills up the file system fast

2017-01-06 Thread Webster Homer
I figured out our problem with the filesystem: by default the root logger is
configured with the CONSOLE appender, which is NOT rotated and eventually
filled up the file system. That doesn't exonerate the CDCR logging problem,
though. The thing writes a huge amount of junk to the logs, information
that looks more like DEBUG and even TRACE level statements. I hope that
this is addressed, and soon! There doesn't seem to be a way to turn it off
either. We want INFO level logging.

On Tue, Jan 3, 2017 at 5:10 PM, Shawn Heisey  wrote:

> On 1/3/2017 1:12 PM, Webster Homer wrote:
> > We use the default log4j.properties file which rolls the log file to
> > solr.log.1, solr.log.2 ... which isn't really the problem. What is
> > also happening is that solr.log.1 gets renamed to
> > solr_log_20170103_1110 with a timestamp as the file name. How do I
> > turn off this behavior? It is not obvious in the log4j.properties file.
>
> That rename is not done by log4j.  It is done by the bin/solr or
> bin\solr.cmd script just before Solr is started.  Unlike the solr.log.N
> files, those files are not subject to automatic deletion.
>
> My best guess is that you've got a situation where something external to
> Solr is restarting Solr frequently.  Solr's start script renames
> solr.log to solr_log_DATE_TIME each time it starts.  As far as I am
> aware, Solr does NOT have the ability to restart itself, and the CDCR
> page in the reference guide doesn't mention anything about processes
> being restarted as part of its operation.
>
> Because the logfiles that you are seeing accumulate are not handled by
> log4j, switching to log4j2 and adding compression will not help.  Moving
> to log4j2 requires a few things to be done on the development side, and
> won't be trivial.
>
> If your system requires frequent Solr restarts for some reason, then
> you're going to have to take over management of the renamed solr
> logfiles as well, or edit the start script so that it doesn't rename the
> main logfile.  SolrCloud does not deal well with frequent restarts, so
> avoid doing them.
>
> Thanks,
> Shawn
>
>
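If you take over management of the renamed logfiles, as Shawn suggests, a small cron-able sketch like this can prune them. The directory and the 7-day retention are assumptions; adjust them to your install.

```shell
#!/bin/sh
# Prune the renamed Solr logs (solr_log_DATE_TIME) after a retention period.
# SOLR_LOGS_DIR and RETENTION_DAYS are assumptions -- point them at your setup.
SOLR_LOGS_DIR="${SOLR_LOGS_DIR:-/var/solr/logs}"
RETENTION_DAYS="${RETENTION_DAYS:-7}"
if [ -d "$SOLR_LOGS_DIR" ]; then
    find "$SOLR_LOGS_DIR" -maxdepth 1 -type f -name 'solr_log_*' \
         -mtime +"$RETENTION_DAYS" -print -delete
fi
```

Run it from cron daily; `-print` gives you an audit trail of what was removed.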

-- 


This message and any attachment are confidential and may be privileged or 
otherwise protected from disclosure. If you are not the intended recipient, 
you must not copy this message or attachment or disclose the contents to 
any other person. If you have received this transmission in error, please 
notify the sender immediately and delete the message and any attachment 
from your system. Merck KGaA, Darmstadt, Germany and any of its 
subsidiaries do not accept liability for any omissions or errors in this 
message which may arise as a result of E-Mail-transmission or for damages 
resulting from any unauthorized changes of the content of this message and 
any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its 
subsidiaries do not guarantee that this message is free of viruses and does 
not accept liability for any damages caused by any virus transmitted 
therewith.

Click http://www.emdgroup.com/disclaimer to access the German, French, 
Spanish and Portuguese versions of this disclaimer.


Strange Index not mutable error

2017-01-06 Thread Webster Homer
I occasionally see this error in our logs:
2017-01-05 20:48:46.664 ERROR (SolrConfigHandler-refreshconf)
[c:sial-catalog-material s:shard2 r:core_node2
x:sial-catalog-material_shard2_replica1] o.a.s.s.IndexSchema This
IndexSchema is not mutable.

None of our indexes are mutable, nor are we trying to use the managed
schema features.

It doesn't seem to cause a problem, but I want to understand where this
ERROR is coming from.

It may be related to the use of the Config API which we use to dynamically
set properties and to configure the CDCR Request handlers. I don't see
anything in the Config API docs that limits its use to collections with
managed schemas, moreover the API seems to work fine.

I don't see SolrConfigHandler defined in the solrconfig.xml file

We have been using the same basic schema file since Solr 3.* upgrading the
schema and config as needed. I've only seen this error since migrating to
Solr 6.2



Re: Subqueries

2017-01-06 Thread Peter Matthew Eichman
Hi Mikhail,

I've turned on DEBUG level logging, but I still only see the main request
logged, and no requests for the subqueries.

Could it be a version issue? We are running Solr 4.10.

Thanks,
-Peter
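If memory serves, the [subquery] document transformer only shipped in the Solr 6.x line, so 4.10 is a plausible culprit. For reference, the parameter set from the earlier message can be assembled and URL-encoded like this (the host and field names are the thread's own; this just shows the request shape):

```python
from urllib.parse import urlencode

# The document id and field names come from the thread itself.
doc_id = ("https://fcrepolocal/fcrepo/rest/pcdm/19/31/3c/1a/"
          "19313c1a-6ab4-4305-93ec-12dfdf01ba74")
params = {
    "q": 'id:"%s"' % doc_id,
    # keep the parent field plus the subquery pseudo-field in fl
    "fl": "pcdm_members,members:[subquery]",
    # the subquery matches docs whose id is in the parent's pcdm_members
    "members.q": "{!terms f=id v=$row.pcdm_members}",
    "members.fl": "id,title",
    "members.logParamsList": "q,fl,rows,row.pcdm_members",
    "wt": "json",
}
query_string = urlencode(params)
print(query_string)
```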

On Fri, Jan 6, 2017 at 1:56 AM, Mikhail Khludnev  wrote:

> Peter,
> Subquery should also log its request. Can you find it in the log?
>
> On Fri, Jan 6, 2017 at 1:19 AM, Peter Matthew Eichman 
> wrote:
>
> > Hello Mikhail,
> >
> > I put pcdm_members into the fl, and it is definitely stored. I tried
> adding
> > the logParamsList, but all I see in the log is
> > 183866104 [qtp1778535015-14] INFO  org.apache.solr.core.SolrCore  –
> > [fedora4] webapp=/solr path=/select params={q=id:"https://
> > fcrepolocal/fcrepo/rest/pcdm/19/31/3c/1a/19313c1a-6ab4-
> > 4305-93ec-12dfdf01ba74"&members.logParamsList=q,fl,
> > rows,row.pcdm_members&indent=true&fl=members:[subquery]&
> > members.fl=id,title&members.q={!terms+f%3Did+v%3D$row.pcdm_
> > members}&wt=json&_=1483654385162} hits=1 status=0 QTime=0
> >
> > Still getting no members key in the output:
> >
> > {
> >   "responseHeader": {
> > "status": 0,
> > "QTime": 1,
> > "params": {
> >   "q": "id:\"https://fcrepolocal/fcrepo/rest/pcdm/19/31/3c/1a/
> > 19313c1a-6ab4-4305-93ec-12dfdf01ba74\"",
> >   "members.logParamsList": "q,fl,rows,row.pcdm_members",
> >   "indent": "true",
> >   "fl": "pcdm_members,members:[subquery]",
> >   "members.fl": "id,title",
> >   "members.q": "{!terms f=id v=$row.pcdm_members}",
> >   "wt": "json",
> >   "_": "1483654538166"
> > }
> >   },
> >   "response": {
> > "numFound": 1,
> > "start": 0,
> > "docs": [
> >   {
> > "pcdm_members": [
> >   "https://fcrepolocal/fcrepo/rest/pcdm/28/2e/5b/f5/
> > 282e5bf5-74c8-4148-9c1a-4ebead6435cb",
> >   "https://fcrepolocal/fcrepo/rest/pcdm/6e/7c/36/2f/
> > 6e7c362f-d239-4534-abd7-28caa24a134c",
> >   "https://fcrepolocal/fcrepo/rest/pcdm/6e/e3/a6/33/
> > 6ee3a633-998e-4f36-b80f-d76bcbe0d352",
> >   "https://fcrepolocal/fcrepo/rest/pcdm/8a/d9/c7/62/
> > 8ad9c762-4391-428d-b1ad-be5ac3e06c42"
> > ]
> >   }
> > ]
> >   }
> > }
> >
> > Is $row.pcdm_members the right way to refer to the pcdm_members field
> > of the current document in the subquery? Is the multivalued nature of
> > the field a problem? I have tried adding separator=' ' to both the
> > [subquery] and {!terms}, but to no avail.
> >
> > Thanks,
> > -Peter
> >
> > On Thu, Jan 5, 2017 at 4:38 PM, Mikhail Khludnev 
> wrote:
> >
> > > Hello,
> > >
> > > Can you add pcdm_members into fl to make sure it's stored?
> > > Also please add the following param
> > > members.logParamsList=q,fl,rows,row.pcdm_members,
> > > and check logs then.
> > >
> > > On Thu, Jan 5, 2017 at 9:46 PM, Peter Matthew Eichman <
> peich...@umd.edu>
> > > wrote:
> > >
> > > > Hello all,
> > > >
> > > > I am attempting to use a subquery to enrich a query with the titles
> of
> > > > related objects. Each document in my index may have 1 or more
> > > pcdm_members
> > > > and pcdm_related_objects fields, whose values are ids of other
> > documents
> > > in
> > > > the index. Those documents in turn have reciprocal pcdm_member_of and
> > > > pcdm_related_object_of fields.
> > > >
> > > > In the Blacklight app I am working on, we want to enrich the display
> > of a
> > > > document with the titles of its members and related objects using a
> > > > subquery. However, this is our first foray into subqueries and things
> > > > aren't working as expected.
> > > >
> > > > I expected the following query to return a "members" key with a
> > document
> > > > list of documents with "id" and "title" keys, but I am getting
> nothing:
> > > >
> > > > {
> > > >   "responseHeader": {
> > > > "status": 0,
> > > > "QTime": 1,
> > > > "params": {
> > > >   "q": "id:\"https://fcrepolocal/fcrepo/rest/pcdm/19/31/3c/1a/
> > > > 19313c1a-6ab4-4305-93ec-12dfdf01ba74\"",
> > > >   "indent": "true",
> > > >   "fl": "members:[subquery]",
> > > >   "members.fl": "id,title",
> > > >   "members.q": "{!terms f=id v=$row.pcdm_members}",
> > > >   "wt": "json",
> > > >   "_": "1483641932207"
> > > > }
> > > >   },
> > > >   "response": {
> > > > "numFound": 1,
> > > > "start": 0,
> > > > "docs": [
> > > >   {}
> > > > ]
> > > >   }
> > > > }
> > > >
> > > > Any pointers on what I am missing? Are there any configuration
> settings
> > > in
> > > > solrconfig.xml that I need to be aware of for subqueries to work?
> > > >
> > > > Thanks,
> > > > -Peter
> > > >
> > > > --
> > > > Peter Eichman
> > > > Senior Software Developer
> > > > University of Maryland Libraries
> > > > peich...@umd.edu
> > > >
> > >
> > >
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> > >
> >
> >
> >
> > --
> > Peter Eichman
> > Senior Software Developer
> > University of Maryland Libraries
> > peich...@umd.edu
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev

Re: CDCR logging is Needlessly verbose, fills up the file system fast

2017-01-06 Thread Shawn Heisey
On 1/6/2017 8:21 AM, Webster Homer wrote:
> I figured out our problem with the filesystem, by default the root logger
> is configured with the CONSOLE logger, which is NOT rotated and
> eventually filled up the file system. That doesn't exonerate the CDCR
> logging problem though. The thing writes a huge amount of junk to the
> logs, information that looks more like Debug and even TRACE level
> statements. I hope that this is addressed, and soon! There doesn't
> seem to be a way to turn it off either. We want INFO level logging

The logging that goes to the console and solr.log by default is INFO. 
Even without CDCR, INFO logging is fairly verbose, but normally doesn't
chew up a whole bunch of gigabytes over a short timeframe.

CDCR is a pretty new feature, and the authors likely needed a lot of
information in the logs while they were developing it so they could
chase down bugs.  I have never used the feature and haven't looked at
the logging, but it sounds like much of it could be changed to DEBUG or
lower priority now that primary development is done.  If you haven't
already opened an issue in Jira to reduce what CDCR logs at INFO, that
would be a good idea.

To get rid of the console logging entirely, which is a good idea, edit
server/resources/log4j.properties.  Change the "log4j.rootLogger" line
to remove CONSOLE and the comma.  Optionally, you can also remove all
the other lines that contain CONSOLE.
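Concretely, Shawn's edit amounts to something like this in server/resources/log4j.properties (appender names may vary slightly across Solr versions):

```properties
# Before (default): root logger writes to both the logfile and the console
log4j.rootLogger=INFO, file, CONSOLE

# After: remove CONSOLE (and its comma) so only the rotated solr.log is written
log4j.rootLogger=INFO, file
```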

Starting in version 6.3, you can put "-Dsolr.log.muteconsole" on the
java commandline (normally you'd do this in the solr.in.* script) to
accomplish this without changing the logging config.  I personally think
that Solr shouldn't log to the console at all unless it is running in
the foreground and the console isn't being redirected to a file.

https://issues.apache.org/jira/browse/SOLR-8186

Thanks,
Shawn



Re: SolrCloud different score for same document on different replicas.

2017-01-06 Thread Webster Homer
I was seeing something like this, and it turned out to be a problem with
our autoCommit and autoSoftCommit settings. We had overly aggressive
settings that eventually started failing with errors around too many
warming searchers etc...

You can test this by doing a commit and seeing if the replicas start
returning consistent results

On Thu, Jan 5, 2017 at 10:31 AM, Charlie Hull  wrote:

> On 05/01/2017 13:30, Morten Bøgeskov wrote:
>
>>
>>
>> Hi.
>>
>> We've got a SolrCloud which is sharded and has a replication factor of
>> 2.
>>
>> The 2 replicas of a shard may look like this:
>>
>> Num Docs:5401023
>> Max Doc:6388614
>> Deleted Docs:987591
>>
>>
>> Num Docs:5401023
>> Max Doc:5948122
>> Deleted Docs:547099
>>
>> We've seen >10% difference in Max Doc at times with same Num Docs.
>> Our use case is a few documents that are searched and many small ones that
>> are filtered against (often updated multiple times a day), so the
>> difference in deleted docs isn't surprising.
>>
>> This results in a different score for a document depending on which
>> replica it comes from. As I see it: it has to do with the different
>> maxDoc value when calculating idf.
>>
>> This in turn alters a specific document's position in the search
>> result over reloads. This is quite confusing (duplicates in pagination).
>>
>> What is the trick to get a homogeneous score from different replicas?
>> We've tried using ExactStatsCache & ExactSharedStatsCache, but that
>> didn't seem to make any difference.
>>
>> Any hints to this will be greatly appreciated.
>>
>>
> This was one of things we looked at during our recent Lucene London
> Hackday (see item 3) https://github.com/flaxsearch/london-hackday-2016
>
> I'm not sure there is a way to get a homogeneous score - this patch tries
> to keep you connected to the same replica during a session so you don't see
> results jumping over pagination.
>
> Cheers
>
> Charlie
>
>
> --
> Charlie Hull
> Flax - Open Source Enterprise Search
>
> tel/fax: +44 (0)8700 118334
> mobile:  +44 (0)7767 825828
> web: www.flax.co.uk
>
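The maxDoc effect Morten describes is easy to quantify with Lucene's classic idf formula, idf = 1 + ln(maxDoc / (docFreq + 1)); BM25's idf differs in form but depends on the doc counts the same way. A sketch using the Max Doc numbers from the thread and an assumed docFreq:

```python
import math

def classic_idf(max_doc, doc_freq):
    # Lucene ClassicSimilarity-style idf: 1 + ln(maxDoc / (docFreq + 1)).
    # Deleted docs still count toward maxDoc until they are merged away.
    return 1.0 + math.log(max_doc / (doc_freq + 1))

doc_freq = 10_000                                # assumed term docFreq
replica_a = classic_idf(6_388_614, doc_freq)    # Max Doc from the thread
replica_b = classic_idf(5_948_122, doc_freq)    # same shard, other replica
print(replica_a, replica_b)
# Same term, same docFreq, yet the idf -- and therefore the score --
# differs between replicas because their deleted-doc counts differ.
```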



Re: error during running my code java.lang.VerifyError: Bad type on operand stack

2017-01-06 Thread Susheel Kumar
Which SolrJ version are you using, and can you point to the exact line that
throws the error?

Thnx

On Fri, Jan 6, 2017 at 2:04 AM, gayathri...@tcs.com 
wrote:

> Hi
>
> I'm using Solr 5.4.0. While running my code I get the error below; please
> suggest what has to be done:
>
> public static void main(String[] args) throws SolrServerException,
> IOException {
>
>
> String urlString = "http://localhost:8983/solr/";
> SolrClient client = new HttpSolrClient(urlString);
> }
>
> Error :
>
> java.lang.VerifyError: Bad type on operand stack
> Exception Details:
>   Location:
>
> org/apache/http/impl/client/DefaultHttpClient.setDefaultHttpParams(Lorg/
> apache/http/params/HttpParams;)V
> @4: invokestatic
>   Reason:
> Type 'org/apache/http/HttpVersion' (current frame, stack[1]) is not
> assignable to 'org/apache/http/ProtocolVersion'
>
> please suggest what has to be done
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/error-during-running-my-code-java-lang-VerifyError-Bad-type-on-
> operand-stack-tp4312690.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Need help for this scenario

2017-01-06 Thread Susheel Kumar
Hello Shashi, would you clarify your requirement or issue from a Solr
perspective? From the above it's not clear what you are asking.

You use Solr to index data which you can later search. Keeping this in
mind, can you elaborate on what kind of data you are indexing, what you
are trying to search, and what the issue is?

Thanks,
Susheel

On Thu, Jan 5, 2017 at 11:43 PM,  wrote:

> Hello Team,
>
> I am looking for your valuable suggestions/solutions for the below
> scenario:
>
> >  Scenario :
> When a user sends a request giving the name of a filename.zip, he wants
> that "filename.zip" zip file back.
>
> >  Description:
> *The data is a collection.zip which consists of many inner
> zip files (that is, the filename.zip files).
> *Here we have a hive table; the first column
> holds the har location, which in turn points to the collection.zip file
> path, and the next column holds the filename.zip.
> hive> select * from hvtb_xdlogfileanalysis_logfilewithworkshopinfo_ext
> limit 1;
> OK
> har://hdfs-DBDPInnovationLab/org/itpgm/XDLogFileAnalysis/archived/data/
> AfterSalesAndService_AS/APP-16151/XentryInDia-XentryDiagnostics/Central_Y/
> XDlogfiles/2016_05_30/2016_05_30.har/10_52_33/INDIA_archive_LTA1_1_20160330-153020236.zip
> H_8658E990C2F3_20160330_110145.zip  WDC1660041A050241   M/GLE
> (166) 8658E990C2F32   2016-03-30 09:01:45 2016-03-30
> 12:50:34  MPC212  12/15   104172.0km  5131265
> CENTRETOILE SA  4500avenue de l'Industrie 24Huy 513
>  126531  2016-04-19
> Time taken: 0.609 seconds, Fetched: 1 row(s)
>
> >  Tasks carried out:
> *We need to use a REST API to send a request to get a
> filename.zip.
> *Then we need to query the hive table, get the har location, and
> from there get the collection.zip.
> *Unzip the collection.zip and compare the list of inner zip files
> with the filename.zip.
> *After comparison, we need to put the filename.zip in a particular
> location.
> *Provide the webhdfs path of the filename.zip via the REST API.
>
> Now the problem I am facing is:
> This is the first time I am using Solr, and I am not sure how
> to point to the inner zip file (which is in the collection.zip).
> Does Solr support this? If yes, can you explain it briefly?
> How do I point to the inner zip files?
> Or if there is a workaround, please let me know.
>
> All suggestion/solutions are welcomed.
> Thank you in advance.
>
> Emailed : kshashi...@gmail.com
>
> Thanks,
> ShashiKumar
>
>
> If you are not the addressee, please inform us immediately that you have
> received this e-mail by mistake, and delete it. We thank you for your
> support.
>
>
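Independent of the Solr side, the inner-zip extraction step described above is ordinary application code. A self-contained sketch (file names are hypothetical, modeled on the thread's example):

```python
import io
import zipfile

def extract_inner_zip(collection_zip, wanted_name):
    """Return the bytes of wanted_name (an inner zip) from a collection.zip,
    or None if it is not present. Names here are hypothetical."""
    with zipfile.ZipFile(collection_zip) as collection:
        for name in collection.namelist():
            if name.endswith(wanted_name):
                return collection.read(name)
    return None

# Build a small collection.zip in memory to demonstrate:
inner = io.BytesIO()
with zipfile.ZipFile(inner, "w") as z:
    z.writestr("log.txt", "hello")

outer = io.BytesIO()
with zipfile.ZipFile(outer, "w") as z:
    z.writestr("H_8658E990C2F3_20160330_110145.zip", inner.getvalue())

outer.seek(0)
data = extract_inner_zip(outer, "H_8658E990C2F3_20160330_110145.zip")
with zipfile.ZipFile(io.BytesIO(data)) as z:
    print(z.namelist())  # ['log.txt']
```

In practice the hive query would supply the har/collection path and the wanted inner name, and the extracted bytes would be written to the webhdfs location to serve back.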


Re: error during running my code java.lang.VerifyError: Bad type on operand stack

2017-01-06 Thread Atita Arora
Hi,

I found the same thing listed here:

http://googleweblight.com/?lite_url=http://stackoverflow.com/questions/32105513/solr-bad-return-type-error&ei=xQ0ZJDXt&lc=en-IN&s=1&m=940&host=www.google.co.in&ts=1483722084&sig=AF9NedkxZ3LIU1o5BOd8inhSmW5Q5azbHA

HttpSolrClient has a constructor that accepts an HttpClient. When one is not
passed, it creates an internalClient that is a CloseableHttpClient.

So you can create a default client and pass it as follows:

SystemDefaultHttpClient httpClient = new SystemDefaultHttpClient();
HttpSolrClient client = new HttpSolrClient(url, httpClient);


I think the problem is the incorrect usage.
Can you try this?

Thanks,
Atita

On Jan 6, 2017 10:19 PM, "Susheel Kumar"  wrote:

Which SolrJ version are you using, and can you point to the exact line that
throws the error?

Thnx

On Fri, Jan 6, 2017 at 2:04 AM, gayathri...@tcs.com 
wrote:

> Hi
>
> I'm using Solr 5.4.0. While running my code I get the error below; please
> suggest what has to be done:
>
> public static void main(String[] args) throws SolrServerException,
> IOException {
>
>
> String urlString = "http://localhost:8983/solr/";
> SolrClient client = new HttpSolrClient(urlString);
> }
>
> Error :
>
> java.lang.VerifyError: Bad type on operand stack
> Exception Details:
>   Location:
>
> org/apache/http/impl/client/DefaultHttpClient.setDefaultHttpParams(Lorg/
> apache/http/params/HttpParams;)V
> @4: invokestatic
>   Reason:
> Type 'org/apache/http/HttpVersion' (current frame, stack[1]) is not
> assignable to 'org/apache/http/ProtocolVersion'
>
> please suggest what has to be done
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/error-during-running-my-code-java-lang-VerifyError-Bad-type-on-
> operand-stack-tp4312690.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


CDCR How to recover from Corrupted transaction log

2017-01-06 Thread Webster Homer
We recently had a problem where the solr transaction log became corrupted
during a data load. The collection was a CDCR source and we started seeing
problems on both source and target.

This happened while testing and was not in a production system. So we just
deleted both collections and recreated them after fixing the root cause.

If this had been a production system that would not have been acceptable.

What is the best way to recover from a problem like this? Stop CDCR and
delete the tlog files?



Re: CDCR How to recover from Corrupted transaction log

2017-01-06 Thread Erick Erickson
If you just deleted the tlog files on the source, you'd likely miss
updates, right?
I can think of two ways to get source and target back in sync:

1> go ahead and delete the tlogs, then re-index from some point guaranteed
to have been propagated to the target _before_ the tlog went wonky.

2> rebuild the target collection as though you were just starting CDCR on an
existing collection like the "initial startup" here:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=62687462#CrossDataCenterReplication(CDCR)-InitialStartup


Best,
Erick

On Fri, Jan 6, 2017 at 9:09 AM, Webster Homer  wrote:
> We recently had a problem where the solr transaction log became corrupted
> during a data load. The collection was a CDCR source and we started seeing
> problems on both source and target.
>
> This happened while testing and was not in a production system. So we just
> deleted both collections and recreated them after fixing the root cause.
>
> If this had been a production system that would not have been acceptable.
>
> What is the best way to recover from a problem like this? Stop cdcr and
> delete the tlog files?
>


Re: CDCR How to recover from Corrupted transaction log

2017-01-06 Thread Shawn Heisey
On 1/6/2017 10:09 AM, Webster Homer wrote:
> This happened while testing and was not in a production system. So we
> just deleted both collections and recreated them after fixing the root
> cause. If this had been a production system that would not have been
> acceptable. What is the best way to recover from a problem like this?
> Stop cdcr and delete the tlog files?

What was the root cause?  Need to know that before anyone can tell you
whether or not you've run into a bug.

If it was the problem you've separately described where CDCR logging
filled up your disk ... handling that gracefully in a program is very
difficult.  It's possible, but there's very little incentive for anyone
to attempt it.  Lucene and Solr have a general requirement of plenty of
free disk space (enough for the index to triple in size temporarily)
just for normal operation, so coding for disk space exhaustion isn't
likely to happen.  Server monitoring should send an alarm when disk
space gets low so you can fix it before it causes real problems.

Thanks,
Shawn



Re: How to train the model using user clicks when use ltr(learning to rank) module?

2017-01-06 Thread Diego Ceccarelli (BLOOMBERG/ LONDON)
Hi Jeffery, 
I submitted a patch to the README of the learning to rank example folder, 
trying to explain better how to produce a training set given a log with 
interaction data. 

Patch is available here: https://issues.apache.org/jira/browse/SOLR-9929
And you can see the new version of the README here:  
https://github.com/bloomberg/lucene-solr/blob/master-ltr/solr/contrib/ltr/example/README.md

Please let me know if you have comments or more questions.
Cheers
Diego


From: solr-user@lucene.apache.org At: 01/06/17 03:57:29
To: solr-user@lucene.apache.org
Subject: Re: How to train the model using user clicks when use ltr(learning to 
rank) module?

In the "Assemble training data" part, the third column indicates the relative
importance or relevance of that doc.
Could you please give more info about how to assign a score based on what the
user clicks?

Hi Jeffery,

Give your questions more detail and there may be more feedback; just a 
suggestion.
About above,

some examples of assigning "relative" weighting to training data
user click info gathered (all assumed but similar to omniture monitoring)
- position in the result list
- above/below the fold
- result page number
As an information engineer, you might see two attributes here: a) user
perseverance, b) effort to find the result.

From there, the attributes have a correlation relationship that is not
linear and directly proportional, I think:
easy to find outweighs user perseverance every time, because it
reduces the need for such extensive perseverance; page #3, for example,
doesn't mitigate effort, it drives effort towards lower user-perseverance
values.
OK, that is damn confusing. But it's what I would want to do: use the pair
in a manner that reranks a document as if the perseverance and effort were
balanced and positioned "relative" to the other training data. What that
equation is will take some more effort.

I'm not sure this response is helpful at all, but I'm going to go with it
because I recognize all of it from AOL, Microsoft and Comcast work, before the
days of ML in search.

On 1/5/2017 3:33 PM, Jeffery Yuan wrote:

Thanks , Will Martin.

I checked the pdf it's great. but seems not very useful for my question: How
to train the model using user clicks when use ltr(learning to rank) module.

I know the concept after reading these papers. But still not sure how to
code them.


--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-train-the-model-using-user-clicks-when-use-ltr-learning-to-rank-module-tp4312462p4312592.html
Sent from the Solr - User mailing list archive at Nabble.com.
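One toy recipe for the "relative importance" column: discount each click by the rank it occurred at (a cheap proxy for the effort/perseverance pair discussed above), then bucket the per-query totals into 0-4 grades. This is an illustration only, with hypothetical data, not the LTR module's API:

```python
import math
from collections import defaultdict

# Click log: (query_id, doc_id, rank_clicked). Hypothetical data.
clicks = [
    ("q1", "docA", 1), ("q1", "docA", 1), ("q1", "docB", 4),
    ("q1", "docC", 9), ("q2", "docA", 2),
]

# Discount each click by the rank it happened at, so a click deep in the
# results still counts, but less than a click near the top.
score = defaultdict(float)
for query, doc, rank in clicks:
    score[(query, doc)] += 1.0 / math.log2(rank + 1)

def grade(s, max_s):
    # Bucket the discounted click mass into 0..4 relevance grades,
    # normalized per query.
    return round(4 * s / max_s)

max_per_query = defaultdict(float)
for (query, doc), s in score.items():
    max_per_query[query] = max(max_per_query[query], s)

labels = {k: grade(s, max_per_query[k[0]]) for k, s in score.items()}
print(labels)
```

The resulting grades go in the third column of the training file; any position-bias model (this log-discount is the crudest possible one) can be swapped in.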




Re: Empty facets on TextField

2017-01-06 Thread John Davis
We've hit this issue again, since Solr defaults new fields to the string type,
which has docValues. Changing those to lowercase text does not remove
the docValues and breaks faceting. Is there a way to remove docValues for a
field without starting fresh?

On Tue, Oct 18, 2016 at 8:19 PM, Yonik Seeley  wrote:

> Actually, a delete-by-query of *:* may also be hit-or-miss on replicas
> in a solr cloud setup because of reorders.
> If it does work, you should see something in the logs at the INFO
> level like "REMOVING ALL DOCUMENTS FROM INDEX"
>
> -Yonik
>
> On Tue, Oct 18, 2016 at 11:02 PM, Yonik Seeley  wrote:
> > A delete-by-query of *:* may do it (because it special cases to
> > removing the index).
> > The underlying issue is when lucene merges a segment without docvalues
> > with a segment that has them.
> > -Yonik
> >
> >
> > On Tue, Oct 18, 2016 at 10:09 PM, John Davis 
> wrote:
> >> Thanks. Is there a way around to not starting fresh and forcing the
> reindex
> >> to remove docValues?
> >>
> >> On Tue, Oct 18, 2016 at 6:56 PM, Yonik Seeley 
> wrote:
> >>>
> >>> This sounds like you didn't actually start fresh, but just reindexed
> your
> >>> data.
> >>> This would mean that docValues would still exist in the index for this
> >>> field (just with no values), and that normal faceting would use those.
> >>> Forcing facet.method=enum forces the use of the index instead of
> >>> docvalues (or the fieldcache if the field is configured w/o
> >>> docvalues).
> >>>
> >>> -Yonik
> >>>
> >>> On Tue, Oct 18, 2016 at 9:43 PM, John Davis  >
> >>> wrote:
> >>> > Hi,
> >>> >
> >>> > I have converted one of my fields from StrField to TextField and am
> not
> >>> > getting back any facets for that field. Here's the exact
> configuration
> >>> > of
> >>> > the TextField. I have tested it with 6.2.0 on a fresh instance and it
> >>> > repros consistently. From reading through past archives and
> >>> > documentation,
> >>> > it feels like this should just work. I would appreciate any input.
> >>> >
> >>> > [The fieldType XML here was stripped by the mailing-list archiver. The
> >>> > surviving attributes were: omitTermFreqAndPositions="true"
> >>> > indexed="true" stored="true" positionIncrementGap="100"
> >>> > sortMissingLast="true" multiValued="true". The analyzer definition
> >>> > did not survive.]
> >>> >
> >>> >
> >>> > Search
> >>> > query:
> >>> > /select/?facet.field=FACET_FIELD_NAME&facet=on&indent=on&
> q=QUERY_STRING&wt=json
> >>> >
> >>> > Interestingly facets are returned if I change facet.method to enum
> >>> > instead
> >>> > of default fc.
> >>> >
> >>> > John
> >>
> >>
>
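For reference, the delete-everything special case Yonik mentions is triggered by a delete-by-query of *:*. As JSON against the update handler it looks like this (the endpoint path is a hypothetical sketch; substitute your collection name):

```json
{ "delete": { "query": "*:*" } }
```

POSTed to /solr/<collection>/update?commit=true with Content-Type: application/json. Check the logs for "REMOVING ALL DOCUMENTS FROM INDEX" to confirm it took the special-case path rather than a normal delete.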


Re: Strange Index not mutable error

2017-01-06 Thread Erick Erickson
I'm assuming you're using classic schema factory?

Does this error show up in the log correlated with your use of the
Config API to configure CDCR? If so, please do three things:
1> raise a JIRA
2> paste the command you use to configure CDCR
3> paste in any more info from the ERROR in the log, especially
 if there's a stack trace.


It is certainly possible that CDCR is piggy-backing on some of the
configuration code. I don't know of any CDCR tests that specifically
mix classic schema with CDCR config commands. If you'd like to
take a stab at a test case illustrating this that would be way cool.
I can easily imagine these two things not being tested together
though.

Erick

On Fri, Jan 6, 2017 at 7:15 AM, Webster Homer  wrote:
> I occasionally see this error in our logs:
> 2017-01-05 20:48:46.664 ERROR (SolrConfigHandler-refreshconf)
> [c:sial-catalog-material s:shard2 r:core_node2
> x:sial-catalog-material_shard2_replica1] o.a.s.s.IndexSchema This
> IndexSchema is not mutable.
>
> None of our indexes are mutable, nor are we trying to use the managed
> schema features.
>
> It doesn't seem to cause a problem, but I want to understand where this
> ERROR is coming from.
>
> It may be related to the use of the Config API which we use to dynamically
> set properties and to configure the CDCR Request handlers. I don't see
> anything in the Config API docs that limits its use to collections with
> managed schemas, moreover the API seems to work fine.
>
> I don't see SolrConfigHandler defined in the solrconfig.xml file
>
> We have been using the same basic schema file since Solr 3.* upgrading the
> schema and config as needed. I've only seen this error since migrating to
> Solr 6.2
>
> --
>
>
> This message and any attachment are confidential and may be privileged or
> otherwise protected from disclosure. If you are not the intended recipient,
> you must not copy this message or attachment or disclose the contents to
> any other person. If you have received this transmission in error, please
> notify the sender immediately and delete the message and any attachment
> from your system. Merck KGaA, Darmstadt, Germany and any of its
> subsidiaries do not accept liability for any omissions or errors in this
> message which may arise as a result of E-Mail-transmission or for damages
> resulting from any unauthorized changes of the content of this message and
> any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its
> subsidiaries do not guarantee that this message is free of viruses and does
> not accept liability for any damages caused by any virus transmitted
> therewith.
>
> Click http://www.emdgroup.com/disclaimer to access the German, French,
> Spanish and Portuguese versions of this disclaimer.


Re: Subqueries

2017-01-06 Thread Mikhail Khludnev
https://issues.apache.org/jira/browse/SOLR-8208 is resolved for 6.1.
I don't know why 4.10 didn't throw an exception on referring to [subquery],
which is absent there.

On Fri, Jan 6, 2017 at 6:23 PM, Peter Matthew Eichman 
wrote:

> Hi Mikhail,
>
> I've turned on DEBUG level logging, but I still only see the main request
> logged, and no requests for the subqueries.
>
> Could it be a version issue? We are running Solr 4.10.
>
> Thanks,
> -Peter
>
> On Fri, Jan 6, 2017 at 1:56 AM, Mikhail Khludnev  wrote:
>
> > Peter,
> > Subquery should also log its' request. Can't you find it in log?
> >
> > On Fri, Jan 6, 2017 at 1:19 AM, Peter Matthew Eichman 
> > wrote:
> >
> > > Hello Mikhail,
> > >
> > > I put pcdm_members into the fl, and it is definitely stored. I tried
> > adding
> > > the logParamsList, but all I see in the log is
> > > 183866104 [qtp1778535015-14] INFO  org.apache.solr.core.SolrCore  –
> > > [fedora4] webapp=/solr path=/select params={q=id:"https://
> > > fcrepolocal/fcrepo/rest/pcdm/19/31/3c/1a/19313c1a-6ab4-
> > > 4305-93ec-12dfdf01ba74"&members.logParamsList=q,fl,
> > > rows,row.pcdm_members&indent=true&fl=members:[subquery]&
> > > members.fl=id,title&members.q={!terms+f%3Did+v%3D$row.pcdm_
> > > members}&wt=json&_=1483654385162} hits=1 status=0 QTime=0
> > >
> > > Still getting no members key in the output:
> > >
> > > {
> > >   "responseHeader": {
> > > "status": 0,
> > > "QTime": 1,
> > > "params": {
> > >   "q": "id:\"https://fcrepolocal/fcrepo/rest/pcdm/19/31/3c/1a/
> > > 19313c1a-6ab4-4305-93ec-12dfdf01ba74\"",
> > >   "members.logParamsList": "q,fl,rows,row.pcdm_members",
> > >   "indent": "true",
> > >   "fl": "pcdm_members,members:[subquery]",
> > >   "members.fl": "id,title",
> > >   "members.q": "{!terms f=id v=$row.pcdm_members}",
> > >   "wt": "json",
> > >   "_": "1483654538166"
> > > }
> > >   },
> > >   "response": {
> > > "numFound": 1,
> > > "start": 0,
> > > "docs": [
> > >   {
> > > "pcdm_members": [
> > >   "https://fcrepolocal/fcrepo/rest/pcdm/28/2e/5b/f5/
> > > 282e5bf5-74c8-4148-9c1a-4ebead6435cb",
> > >   "https://fcrepolocal/fcrepo/rest/pcdm/6e/7c/36/2f/
> > > 6e7c362f-d239-4534-abd7-28caa24a134c",
> > >   "https://fcrepolocal/fcrepo/rest/pcdm/6e/e3/a6/33/
> > > 6ee3a633-998e-4f36-b80f-d76bcbe0d352",
> > >   "https://fcrepolocal/fcrepo/rest/pcdm/8a/d9/c7/62/
> > > 8ad9c762-4391-428d-b1ad-be5ac3e06c42"
> > > ]
> > >   }
> > > ]
> > >   }
> > > }
> > >
> > > Is $row.pcdm_members the right way to refer to the pcdm_members field
> > > of the current document in the subquery? Is the multivalued nature of
> > > the field a problem? I have tried adding separator=' ' to both the
> > > [subquery] and {!terms}, but to no avail.
> > >
> > > Thanks,
> > > -Peter
> > >
> > > On Thu, Jan 5, 2017 at 4:38 PM, Mikhail Khludnev 
> > wrote:
> > >
> > > > Hello,
> > > >
> > > > Can you add pcdm_members into fl to make sure it's stored?
> > > > Also please add the following param
> > > > members.logParamsList=q,fl,rows,row.pcdm_members,
> > > > and check logs then.
> > > >
> > > > On Thu, Jan 5, 2017 at 9:46 PM, Peter Matthew Eichman <
> > peich...@umd.edu>
> > > > wrote:
> > > >
> > > > > Hello all,
> > > > >
> > > > > I am attempting to use a subquery to enrich a query with the titles
> > of
> > > > > related objects. Each document in my index may have 1 or more
> > > > pcdm_members
> > > > > and pcdm_related_objects fields, whose values are ids of other
> > > documents
> > > > in
> > > > > the index. Those documents in turn have reciprocal pcdm_member_of
> and
> > > > > pcdm_related_object_of fields.
> > > > >
> > > > > In the Blacklight app I am working on, we want to enrich the
> display
> > > of a
> > > > > document with the titles of its members and related objects using a
> > > > > subquery. However, this is out first foray into subqueries and
> things
> > > > > aren't working as expected.
> > > > >
> > > > > I expected the following query to return a "members" key with a
> > > document
> > > > > list of documents with "id" and "title" keys, but I am getting
> > nothing:
> > > > >
> > > > > {
> > > > >   "responseHeader": {
> > > > > "status": 0,
> > > > > "QTime": 1,
> > > > > "params": {
> > > > >   "q": "id:\"https://fcrepolocal/fcrepo/rest/pcdm/19/31/3c/1a/
> > > > > 19313c1a-6ab4-4305-93ec-12dfdf01ba74\"",
> > > > >   "indent": "true",
> > > > >   "fl": "members:[subquery]",
> > > > >   "members.fl": "id,title",
> > > > >   "members.q": "{!terms f=id v=$row.pcdm_members}",
> > > > >   "wt": "json",
> > > > >   "_": "1483641932207"
> > > > > }
> > > > >   },
> > > > >   "response": {
> > > > > "numFound": 1,
> > > > > "start": 0,
> > > > > "docs": [
> > > > >   {}
> > > > > ]
> > > > >   }
> > > > > }
> > > > >
> > > > > Any pointers on what I am missing? Are there any configuration
> > settings
>
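For reference, on Solr 6.1 or later (where the [subquery] transformer exists), a request along the lines Peter is attempting could be assembled like this. It is a sketch under the assumption that the terms parser's default comma separator matches how $row.pcdm_members joins the multivalued field; the collection and field names are taken from the thread.

```python
from urllib.parse import urlencode

# [subquery] doc transformer request: the subquery's q refers back to the
# parent row's multivalued pcdm_members field via $row.pcdm_members.
params = {
    "q": 'id:"https://fcrepolocal/fcrepo/rest/pcdm/19/31/3c/1a/'
         '19313c1a-6ab4-4305-93ec-12dfdf01ba74"',
    "fl": "pcdm_members,members:[subquery]",
    "members.q": "{!terms f=id v=$row.pcdm_members}",
    "members.fl": "id,title",
    "wt": "json",
}
query_string = urlencode(params)
print("/solr/fedora4/select?" + query_string)
```

On 4.10 the same request silently returns nothing, which is consistent with Mikhail's point that the transformer simply is not there.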

Re: Subqueries

2017-01-06 Thread Peter Matthew Eichman
Thanks, we will look into the feasibility of a Solr upgrade. If not, is
there anything in 4.10 that would allow us to do something similar, or
would we be stuck with denormalizing our data at index time?

-Peter

On Fri, Jan 6, 2017 at 4:03 PM, Mikhail Khludnev  wrote:

> https://issues.apache.org/jira/browse/SOLR-8208 is resolved for 6.1.
> I don't know why 4.10 didn't throw exception on referring to [subquery],
> which is absent there.
>
> On Fri, Jan 6, 2017 at 6:23 PM, Peter Matthew Eichman 
> wrote:
>
> > Hi Mikhail,
> >
> > I've turned on DEBUG level logging, but I still only see the main request
> > logged, and no requests for the subqueries.
> >
> > Could it be a version issue? We are running Solr 4.10.
> >
> > Thanks,
> > -Peter
> >
> > On Fri, Jan 6, 2017 at 1:56 AM, Mikhail Khludnev 
> wrote:
> >
> > > Peter,
> > > Subquery should also log its' request. Can't you find it in log?
> > >
> > > On Fri, Jan 6, 2017 at 1:19 AM, Peter Matthew Eichman <
> peich...@umd.edu>
> > > wrote:
> > >
> > > > Hello Mikhail,
> > > >
> > > > I put pcdm_members into the fl, and it is definitely stored. I tried
> > > adding
> > > > the logParamsList, but all I see in the log is
> > > > 183866104 [qtp1778535015-14] INFO  org.apache.solr.core.SolrCore  –
> > > > [fedora4] webapp=/solr path=/select params={q=id:"https://
> > > > fcrepolocal/fcrepo/rest/pcdm/19/31/3c/1a/19313c1a-6ab4-
> > > > 4305-93ec-12dfdf01ba74"&members.logParamsList=q,fl,
> > > > rows,row.pcdm_members&indent=true&fl=members:[subquery]&
> > > > members.fl=id,title&members.q={!terms+f%3Did+v%3D$row.pcdm_
> > > > members}&wt=json&_=1483654385162} hits=1 status=0 QTime=0
> > > >
> > > > Still getting no members key in the output:
> > > >
> > > > {
> > > >   "responseHeader": {
> > > > "status": 0,
> > > > "QTime": 1,
> > > > "params": {
> > > >   "q": "id:\"https://fcrepolocal/fcrepo/rest/pcdm/19/31/3c/1a/
> > > > 19313c1a-6ab4-4305-93ec-12dfdf01ba74\"",
> > > >   "members.logParamsList": "q,fl,rows,row.pcdm_members",
> > > >   "indent": "true",
> > > >   "fl": "pcdm_members,members:[subquery]",
> > > >   "members.fl": "id,title",
> > > >   "members.q": "{!terms f=id v=$row.pcdm_members}",
> > > >   "wt": "json",
> > > >   "_": "1483654538166"
> > > > }
> > > >   },
> > > >   "response": {
> > > > "numFound": 1,
> > > > "start": 0,
> > > > "docs": [
> > > >   {
> > > > "pcdm_members": [
> > > >   "https://fcrepolocal/fcrepo/rest/pcdm/28/2e/5b/f5/
> > > > 282e5bf5-74c8-4148-9c1a-4ebead6435cb",
> > > >   "https://fcrepolocal/fcrepo/rest/pcdm/6e/7c/36/2f/
> > > > 6e7c362f-d239-4534-abd7-28caa24a134c",
> > > >   "https://fcrepolocal/fcrepo/rest/pcdm/6e/e3/a6/33/
> > > > 6ee3a633-998e-4f36-b80f-d76bcbe0d352",
> > > >   "https://fcrepolocal/fcrepo/rest/pcdm/8a/d9/c7/62/
> > > > 8ad9c762-4391-428d-b1ad-be5ac3e06c42"
> > > > ]
> > > >   }
> > > > ]
> > > >   }
> > > > }
> > > >
> > > > Is $row.pcdm_members the right way to refer to the pcdm_members field
> > > > of the current document in the subquery? Is the multivalued nature of
> > > > the field a problem? I have tried adding separator=' ' to both the
> > > > [subquery] and {!terms}, but to no avail.
> > > >
> > > > Thanks,
> > > > -Peter
> > > >
> > > > On Thu, Jan 5, 2017 at 4:38 PM, Mikhail Khludnev 
> > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > Can you add pcdm_members into fl to make sure it's stored?
> > > > > Also please add the following param
> > > > > members.logParamsList=q,fl,rows,row.pcdm_members,
> > > > > and check logs then.
> > > > >
> > > > > On Thu, Jan 5, 2017 at 9:46 PM, Peter Matthew Eichman <
> > > peich...@umd.edu>
> > > > > wrote:
> > > > >
> > > > > > Hello all,
> > > > > >
> > > > > > I am attempting to use a subquery to enrich a query with the
> titles
> > > of
> > > > > > related objects. Each document in my index may have 1 or more
> > > > > pcdm_members
> > > > > > and pcdm_related_objects fields, whose values are ids of other
> > > > documents
> > > > > in
> > > > > > the index. Those documents in turn have reciprocal pcdm_member_of
> > and
> > > > > > pcdm_related_object_of fields.
> > > > > >
> > > > > > In the Blacklight app I am working on, we want to enrich the
> > display
> > > > of a
> > > > > > document with the titles of its members and related objects
> using a
> > > > > > subquery. However, this is out first foray into subqueries and
> > things
> > > > > > aren't working as expected.
> > > > > >
> > > > > > I expected the following query to return a "members" key with a
> > > > document
> > > > > > list of documents with "id" and "title" keys, but I am getting
> > > nothing:
> > > > > >
> > > > > > {
> > > > > >   "responseHeader": {
> > > > > > "status": 0,
> > > > > > "QTime": 1,
> > > > > > "params": {
> > > > > >   "q": "id:\"https://fcrepolocal/
> fcrepo/rest/pcdm/19/31/3c/1a/
> > > > > > 19313c1a-6ab4-4305-93ec-12d

Re: MLT Performance Degraded Between 4.6.1 and 5.5.2 Solr

2017-01-06 Thread Ivan Provalov
After some more digging, I narrowed it down to filtering. Without any filters,
the MLT is back to its normal performance (8ms average response time for our
case). The issue goes away with the 6.0 upgrade.
The hot method is Lucene's DisiPriorityQueue downHeap(), which gets 5X more
calls in 5.5.2 compared to 6.0. I am guessing that some of the Solr filter
refactoring fixed it for the 6.0 release, but I am not sure which change.
As a work-around, for now I just refactored the custom MLT handler to convert
the filters into boolean clauses, which takes care of the issue.
Any insights into why this is happening in Solr 5.5.2?

Our configuration:
1. mlt.maxqt=100
2. There is an additional filter passed as a parameter
3. ...
4. text_en is a pretty standard text fieldType

Thanks,
Ivan
 

On Monday, October 31, 2016 5:10 PM, Ivan Provalov  
wrote:
 

 I noticed a 3X performance degradation for MoreLikeThis between 4.6.1 and 
5.5.2.  Our configuration: 
   
where text_en is a pretty standard text fieldType.
Any pointers?
Thanks,
Ivan Provalov



   
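The workaround Ivan describes, folding filter queries into the main query as required boolean clauses instead of passing them as separate filters, can be sketched at the query-string level like this. It illustrates the idea only, not his custom MLT handler.

```python
def fold_filters_into_query(mlt_query: str, filters: list) -> str:
    """Rewrite (q, fq...) as one boolean query: each former filter becomes
    a required (+) clause alongside the MLT query, so matching walks a
    single combined conjunction instead of per-filter iterator heaps."""
    clauses = ["+({})".format(mlt_query)]
    clauses += ["+({})".format(f) for f in filters]
    return " ".join(clauses)

q = fold_filters_into_query("body:(lucene solr search)",
                            ["category:books", "lang:en"])
print(q)
```

Note the trade-off: unlike fq, clauses folded into q participate in scoring and bypass the filter cache, so this is only a stopgap until the upgrade.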

Re: Subqueries

2017-01-06 Thread Mikhail Khludnev
Denormalising works on small numbers, but it hits a ceiling quite soon, because
it scales poorly.
People have done this kind of snippet enrichment in applications for ages; there
is nothing special about it.
Someone could probably port it to 4.10 as a plugin.

On Sat, Jan 7, 2017 at 12:08 AM, Peter Matthew Eichman 
wrote:

> Thanks, we will look into the feasibility of a Solr upgrade. If not, is
> there anything in 4.10 that would allow us to do something similar, or
> would we be stuck with denormalizing our data at index time?
>
> -Peter
>
> On Fri, Jan 6, 2017 at 4:03 PM, Mikhail Khludnev  wrote:
>
> > https://issues.apache.org/jira/browse/SOLR-8208 is resolved for 6.1.
> > I don't know why 4.10 didn't throw exception on referring to [subquery],
> > which is absent there.
> >
> > On Fri, Jan 6, 2017 at 6:23 PM, Peter Matthew Eichman 
> > wrote:
> >
> > > Hi Mikhail,
> > >
> > > I've turned on DEBUG level logging, but I still only see the main
> request
> > > logged, and no requests for the subqueries.
> > >
> > > Could it be a version issue? We are running Solr 4.10.
> > >
> > > Thanks,
> > > -Peter
> > >
> > > On Fri, Jan 6, 2017 at 1:56 AM, Mikhail Khludnev 
> > wrote:
> > >
> > > > Peter,
> > > > Subquery should also log its' request. Can't you find it in log?
> > > >
> > > > On Fri, Jan 6, 2017 at 1:19 AM, Peter Matthew Eichman <
> > peich...@umd.edu>
> > > > wrote:
> > > >
> > > > > Hello Mikhail,
> > > > >
> > > > > I put pcdm_members into the fl, and it is definitely stored. I
> tried
> > > > adding
> > > > > the logParamsList, but all I see in the log is
> > > > > 183866104 [qtp1778535015-14] INFO  org.apache.solr.core.SolrCore  –
> > > > > [fedora4] webapp=/solr path=/select params={q=id:"https://
> > > > > fcrepolocal/fcrepo/rest/pcdm/19/31/3c/1a/19313c1a-6ab4-
> > > > > 4305-93ec-12dfdf01ba74"&members.logParamsList=q,fl,
> > > > > rows,row.pcdm_members&indent=true&fl=members:[subquery]&
> > > > > members.fl=id,title&members.q={!terms+f%3Did+v%3D$row.pcdm_
> > > > > members}&wt=json&_=1483654385162} hits=1 status=0 QTime=0
> > > > >
> > > > > Still getting no members key in the output:
> > > > >
> > > > > {
> > > > >   "responseHeader": {
> > > > > "status": 0,
> > > > > "QTime": 1,
> > > > > "params": {
> > > > >   "q": "id:\"https://fcrepolocal/fcrepo/rest/pcdm/19/31/3c/1a/
> > > > > 19313c1a-6ab4-4305-93ec-12dfdf01ba74\"",
> > > > >   "members.logParamsList": "q,fl,rows,row.pcdm_members",
> > > > >   "indent": "true",
> > > > >   "fl": "pcdm_members,members:[subquery]",
> > > > >   "members.fl": "id,title",
> > > > >   "members.q": "{!terms f=id v=$row.pcdm_members}",
> > > > >   "wt": "json",
> > > > >   "_": "1483654538166"
> > > > > }
> > > > >   },
> > > > >   "response": {
> > > > > "numFound": 1,
> > > > > "start": 0,
> > > > > "docs": [
> > > > >   {
> > > > > "pcdm_members": [
> > > > >   "https://fcrepolocal/fcrepo/rest/pcdm/28/2e/5b/f5/
> > > > > 282e5bf5-74c8-4148-9c1a-4ebead6435cb",
> > > > >   "https://fcrepolocal/fcrepo/rest/pcdm/6e/7c/36/2f/
> > > > > 6e7c362f-d239-4534-abd7-28caa24a134c",
> > > > >   "https://fcrepolocal/fcrepo/rest/pcdm/6e/e3/a6/33/
> > > > > 6ee3a633-998e-4f36-b80f-d76bcbe0d352",
> > > > >   "https://fcrepolocal/fcrepo/rest/pcdm/8a/d9/c7/62/
> > > > > 8ad9c762-4391-428d-b1ad-be5ac3e06c42"
> > > > > ]
> > > > >   }
> > > > > ]
> > > > >   }
> > > > > }
> > > > >
> > > > > Is $row.pcdm_members the right way to refer to the pcdm_members
> field
> > > > > of the current document in the subquery? Is the multivalued nature
> of
> > > > > the field a problem? I have tried adding separator=' ' to both the
> > > > > [subquery] and {!terms}, but to no avail.
> > > > >
> > > > > Thanks,
> > > > > -Peter
> > > > >
> > > > > On Thu, Jan 5, 2017 at 4:38 PM, Mikhail Khludnev 
> > > > wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > Can you add pcdm_members into fl to make sure it's stored?
> > > > > > Also please add the following param
> > > > > > members.logParamsList=q,fl,rows,row.pcdm_members,
> > > > > > and check logs then.
> > > > > >
> > > > > > On Thu, Jan 5, 2017 at 9:46 PM, Peter Matthew Eichman <
> > > > peich...@umd.edu>
> > > > > > wrote:
> > > > > >
> > > > > > > Hello all,
> > > > > > >
> > > > > > > I am attempting to use a subquery to enrich a query with the
> > titles
> > > > of
> > > > > > > related objects. Each document in my index may have 1 or more
> > > > > > pcdm_members
> > > > > > > and pcdm_related_objects fields, whose values are ids of other
> > > > > documents
> > > > > > in
> > > > > > > the index. Those documents in turn have reciprocal
> pcdm_member_of
> > > and
> > > > > > > pcdm_related_object_of fields.
> > > > > > >
> > > > > > > In the Blacklight app I am working on, we want to enrich the
> > > display
> > > > > of a
> > > > > > > document with the titles of its members and related objects
> > using a
> > > > > > > subquery. However, th

Shards not working with basic authentication

2017-01-06 Thread Reagan Philip
Hello! We are on Solr 5.4.1 and basic authentication is enabled. We are trying
to use shards to combine 2 Solr cores but are getting a "401 Unauthorized" error
when querying from the admin UI. Basically the internal connection errors out
with shards. We also tried using "shardcredentials" as mentioned in
https://issues.apache.org/jira/browse/SOLR-1861, but no luck. Below is the
security config in webdefault.xml:

  <security-constraint>
    <web-resource-collection>
      <web-resource-name>Solr authenticated application</web-resource-name>
      <url-pattern>/*</url-pattern>
    </web-resource-collection>
    <auth-constraint>
      <role-name>admin</role-name>
    </auth-constraint>
  </security-constraint>

Please help. Thanks in advance!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Shards-not-working-with-basic-authentication-tp4312824.html
Sent from the Solr - User mailing list archive at Nabble.com.

use pseudo-field in json.facet api

2017-01-06 Thread radha krishnan
Hi,

can we use a pseudo-field in the json.facet API? Is there any example like the
one below?

json.facet={
my_histogram: {
type: terms,
field: new_field:my_function(my_solr_field, "return_as_integer")
}
}



Thanks,
Radhakrishnan D
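For comparison, the shapes json.facet requests usually take: a terms facet is given a plain indexed field, and computed values appear as aggregation functions nested under the facet. Whether a function can stand in for the terms field itself, as asked above, is exactly the open question; the field and function names in this sketch are illustrative only.

```python
import json

# Typical json.facet shapes: a terms facet over a real indexed field,
# plus a nested aggregation computing a value per bucket.
facet = {
    "my_histogram": {
        "type": "terms",
        "field": "my_solr_field",              # a plain indexed field
        "facet": {
            "bucket_sum": "sum(my_numeric_field)"  # per-bucket aggregation
        },
    }
}
request_param = "json.facet=" + json.dumps(facet)
print(request_param)
```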


Re: How to train the model using user clicks when use ltr(learning to rank) module?

2017-01-06 Thread Will Martin
ah. very nice Diego. Thanks.

On 1/6/2017 1:52 PM, Diego Ceccarelli (BLOOMBERG/ LONDON) wrote:

Hi Jeffery,
I submitted a patch to the README of the learning to rank example folder, 
trying to explain better how to produce a training set given a log with 
interaction data.

Patch is available here: https://issues.apache.org/jira/browse/SOLR-9929
And you can see the new version of the README here:  
https://github.com/bloomberg/lucene-solr/blob/master-ltr/solr/contrib/ltr/example/README.md

Please let me know if you have comments or more questions.
Cheers
Diego


From: solr-user@lucene.apache.org At: 
01/06/17 03:57:29
To: solr-user@lucene.apache.org
Subject: Re: How to train the model using user clicks when use ltr(learning to 
rank) module?

In the "Assemble training data" part: "the third column indicates the relative
importance or relevance of that doc".
Could you please give more info about how to assign a score based on what the
user clicks?

Hi Jeffery,

Give your questions more detail and there may be more feedback; just a 
suggestion.
About above,

some examples of assigning "relative" weighting to training data
user click info gathered (all assumed but similar to omniture monitoring)
- position in the result list
- above/below the fold
- result page number
As an information engineer, you might see two attributes here: a) user
perseverance, b) effort to find the result.

From there, the attributes have a correlation relationship that is not
linear and directly proportional, I think:
easy to find outweighs user perseverance every time, because it
reduces the need for such extensive perseverance; page #3, for example,
doesn't mitigate effort, it drives effort towards lower user-perseverance
need value pairs.
Ok. That is damn confusing. But it's what I would want to do: use the pair
in a manner that reranks a document as if the perseverance and effort were
balanced and positioned ... "relative" to the other training data. What that
equation is will take some more effort.

I'm not sure this response is helpful at all, but I'm going to go with it
because I recognize all of it from AOL, Microsoft and Comcast work, before the
days of ML in search.

On 1/5/2017 3:33 PM, Jeffery Yuan wrote:

Thanks, Will Martin.

I checked the pdf; it's great, but it seems not very useful for my question: how
to train the model using user clicks when using the ltr (learning to rank) module.

I know the concepts after reading these papers, but I am still not sure how to
code them.


--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-train-the-model-using-user-clicks-when-use-ltr-learning-to-rank-module-tp4312462p4312592.html
Sent from the Solr - User mailing list archive at Nabble.com.
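To make the click-to-relevance mapping discussed in this thread concrete, here is a heuristic sketch of turning a logged click into a graded training example in the LibSVM/RankLib style that learning-to-rank toolchains consume. The grading rules (discounting by position and crediting later-page clicks, in the spirit of the perseverance/effort discussion above) are my own assumptions, not the ltr contrib's logic.

```python
def grade_click(clicked: bool, position: int, page: int) -> int:
    """Heuristic relevance grade from click context: a click found despite
    high effort (deep position, later page) suggests strong intent."""
    if not clicked:
        return 0
    if page > 1:
        return 3                     # user persevered past page one
    return 2 if position <= 3 else 1  # easy finds get a modest grade

def to_ranklib(qid: int, grade: int, features: dict, docid: str) -> str:
    """Emit one training line: <grade> qid:<n> <fid>:<val>... # <docid>."""
    feats = " ".join(f"{i}:{v}" for i, v in sorted(features.items()))
    return f"{grade} qid:{qid} {feats} # {docid}"

line = to_ranklib(1, grade_click(True, 5, 1), {1: 0.5, 2: 1.0}, "doc42")
print(line)
```

Aggregating such lines over many sessions (and normalizing per query) produces the third-column relevance values the README's training step expects.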






Solr 5.2+ using SSL and non-SSL ports

2017-01-06 Thread dhelm
Previously I have configured Solr 4.x deployments with both SSL (https) and
non-SSL (http) via Jetty configurations.  I know the way to configure SSL in
Solr 5.2+ has changed.  I followed these instructions and was able to
successfully configure a standalone Solr instance for SSL on port 8984: 

https://cwiki.apache.org/confluence/display/solr/Enabling+SSL

But I am curious if there is a way to run Solr 5.2+ with both SSL (on port
8984, for example) and non-SSL (on the standard port 8983). I have not come
across instructions saying how this can be done.

Thanks




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-5-2-using-SSL-and-non-SSL-ports-tp4312859.html
Sent from the Solr - User mailing list archive at Nabble.com.