Re: Backup not working

2017-04-21 Thread vrindavda
I realized that Segments_1 is getting created in Shard2 and Segments_2 in
Shard1.

The Backup API is looking for Segments_1 in Shard1. Please correct me if I have
configured something wrongly. I created the collection using the Collections API
and am using the data_driven_schema_configs config set.





Graph traversal

2017-04-21 Thread Ganesh M
Hi

I am trying graph traversal based on the documentation available over here

http://solr.pl/en/2016/04/18/solr-6-0-and-graph-traversal-support/

But it's not working as expected.

For this query

http://localhost:8983/solr/graph/query?q=*:*&fq={!graph%20from=parent_id%20to=id}id:1

(which is meant to return all nodes reachable by traversing from node 1)

I get the result as
"docs":[
  {
"id":"1"},
  {
"id":"11"},
  {
"id":"12"},
  {
"id":"13"},
  {
"id":"122"}]

Whereas I expect the result to be 1, 11, 12, 13, 121, 122, 131.

What's going wrong?

Can anybody help us with this?

Is graph traversal stable enough in Solr 6.5?
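
For reference, a minimal SolrJ sketch (not from the original post) of issuing the
same traversal against the local "graph" collection; the sample documents assume
each doc's parent_id holds the id of its parent, which is my reading of the
expected result above:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrInputDocument;

public class GraphTraversalSketch {
  public static void main(String[] args) throws Exception {
    HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/graph").build();

    // Tiny edge set: child -> parent via parent_id (assumed data model).
    String[][] nodes = {{"1", null}, {"11", "1"}, {"12", "1"}, {"13", "1"},
                        {"121", "12"}, {"122", "12"}, {"131", "13"}};
    for (String[] n : nodes) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", n[0]);
      if (n[1] != null) doc.addField("parent_id", n[1]);
      client.add(doc);
    }
    client.commit();

    // Same query as the URL above, as a filter query.
    SolrQuery q = new SolrQuery("*:*");
    q.addFilterQuery("{!graph from=parent_id to=id}id:1");
    QueryResponse rsp = client.query(q);
    for (SolrDocument d : rsp.getResults()) {
      System.out.println(d.getFieldValue("id"));
    }
    client.close();
  }
}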

Regards,
Ganesh


RE: DistributedUpdateProcessorFactory was explicitly disabled from this updateRequestProcessorChain

2017-04-21 Thread alessandro.benedetti
Let's make a quick distinction between PRE and POST processors in a SolrCloud
architecture:

 "In a single node, stand-alone Solr, each update is run through all the
update processors in a chain exactly once. But the behavior of update
request processors in SolrCloud deserves special consideration. " cit. wiki

*PRE PROCESSORS*
All the processors defined BEFORE the DistributedUpdateProcessor run ONLY
on the first node that receives the update (regardless of whether it is a
leader or a replica).

*POST PROCESSORS*
The DistributedUpdateProcessor forwards the update request to the correct
leader (or to multiple leaders if the request spans several shards), and each
leader then forwards it to its replicas.
At this point the leaders and replicas execute all the update request
processors defined AFTER the DistributedUpdateProcessor.

" Pre-processors and Atomic Updates
Because DistributedUpdateProcessor is responsible for processing Atomic
Updates into full documents on the leader node, this means that
pre-processors which are executed only on the forwarding nodes can only
operate on the partial document. If you have a processor which must process
a full document then the only choice is to specify it as a post-processor."
wiki

In your example, your chain is definitely misconfigured: the order is important,
and you want your heavy processing to happen only on the first node.

For better info and clarification:
https://cwiki.apache.org/confluence/display/solr/Schemaless+Mode ( you can
find here a working alternative to your chain)
https://cwiki.apache.org/confluence/display/solr/Update+Request+Processors



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io


Re: extract multi-features for one solr feature extractor in solr learning to rank

2017-04-21 Thread alessandro.benedetti
Hi Jianxiong, this is definitely interesting.
Briefly reviewing the paper you linked, the use case seems clear:
you want a similar "family" of features to be calculated on each field.
Let's take the TF feature as an example: you may want to define only one
feature in features.json that includes all the fields involved:

{
  "store" : "MyFeatureStore",
  "name" : "query_term_frequency",
  "class" : "com.apache.solr.ltr.feature.TermCountFeature",
  "params" : {
    "fields" : ["field1","field2","field3"],
    "terms" : "${user_terms}"
  }
}

And then, under the hood, you would like this feature to be translated into N
features in the feature vector.

You have a few options here:

1) Out of the box: when you create the features.json, you do it
programmatically; your client app takes a simplified features.json as input
and extends it automatically based on your custom config (I was using this
approach to encode categorical features as N binary features). A rough sketch
of this expansion follows below.

2) You dive into the code and add this flexibility to the plugin; this will
involve a modification to how the feature vector is currently generated.
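
For option 1, a rough client-side sketch (my own illustration, not existing
plugin behaviour) of expanding the simplified feature above into one concrete
feature per field; the TermCountFeature class name is the hypothetical one from
the example, and the per-field params are only illustrative:

import java.util.Arrays;
import java.util.List;

public class FeatureExpansionSketch {
  public static void main(String[] args) {
    // Simplified spec: one logical feature over several fields.
    List<String> fields = Arrays.asList("field1", "field2", "field3");
    String store = "MyFeatureStore";
    String featureClass = "com.apache.solr.ltr.feature.TermCountFeature"; // hypothetical

    // Expand into N features, one per field, ready to upload to the feature store.
    StringBuilder json = new StringBuilder("[\n");
    for (int i = 0; i < fields.size(); i++) {
      String field = fields.get(i);
      json.append("  { \"store\": \"").append(store).append("\",\n")
          .append("    \"name\": \"query_term_frequency_").append(field).append("\",\n")
          .append("    \"class\": \"").append(featureClass).append("\",\n")
          .append("    \"params\": { \"field\": \"").append(field)
          .append("\", \"terms\": \"${user_terms}\" } }")
          .append(i < fields.size() - 1 ? ",\n" : "\n");
    }
    json.append("]");
    System.out.println(json);
  }
}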

Cheers



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io


Re: prefix facet performance

2017-04-21 Thread alessandro.benedetti
Hi Maria,
If you have 100-500.000 unique values for the field you are interested in,
and the cardinality of your search results is actually quite small in
comparison, I am not that sure term enum will help you that much ...

To simplify, with the term enum approach, you iterate over each unique
value, if it matches the prefix and then you count the intersection of the
result set with the posting list for that term.
In your case, your result set is likely to be much smaller than the number
of unique values.
I would assume you are using the fc approach, which in my opinion was not a
bad idea.
Let's start from the algorithm you are using and the schema config for your
field,

Cheers



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io


Re: Graph traversal

2017-04-21 Thread Ganesh M
I also tried with the sample data mentioned in this link.

https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-GraphQueryParser

Even for that, after loading the data and running the query

http://localhost:8983/solr/graph/query?q={!graph%20from=in_edge%20to=out_edge}id:A&fl=id

I got the response as


{
  "responseHeader":{
"zkConnected":true,
"status":0,
"QTime":7,
"params":{
  "q":"{!graph from=in_edge to=out_edge}id:A",
  "fl":"id"}},
  "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
  {
"id":"A"}]
  }}

instead of

"response":{"numFound":6,"start":0,"docs":[
   { "id":"A" },
   { "id":"B" },
   { "id":"C" },
   { "id":"D" },
   { "id":"E" },
   { "id":"F" } ]
}




Are there any settings that have to be enabled for graph traversal?

Kindly let me know

Regards,

On Fri, Apr 21, 2017 at 1:20 PM Ganesh M 
mailto:mgane...@live.in>> wrote:
Hi

I am trying graph traversal based on the documentation available over here

http://solr.pl/en/2016/04/18/solr-6-0-and-graph-traversal-support/

But it's not working as expected.

For this query

http://localhost:8983/solr/graph/query?q=*:*&fq={!graph%20from=parent_id%20to=id}id:1

(which is meant to return all nodes reachable by traversing from node 1)

I get the result as
"docs":[
  {
"id":"1"},
  {
"id":"11"},
  {
"id":"12"},
  {
"id":"13"},
  {
"id":"122"}]

Whereas I expect the result to be 1, 11, 12, 13, 121, 122, 131.

What's going wrong?

Can anybody help us with this?

Is graph traversal stable enough in Solr 6.5?

Regards,
Ganesh


Re: Update schema.xml without restarting Solr?

2017-04-21 Thread Lingeshm
Hello Team ,

I can’t change the schema name of an existing index.

I want to change the schema name from "schemaV1" to "schemaV2" for one of the
existing indexes:

curl -XPUT http://localhost:8098/search/index/my_idx-H "Content-Type:
application/json" -d '{"schema":"schemaV2"}'

The funny part is that curl is not returning anything, neither an error nor
a success. I expect the next GET to return {"schema":" schemaV2"} but it still
returns {"schema":" schemaV1"}.

I am using solr 4.10.4 version.

Have you faced any such issue ??

Regards
Lingesh M





Re: Advice on how to work with pure JSON data.

2017-04-21 Thread Mikhail Khludnev
Hello,
See below.

On Fri, Apr 21, 2017 at 8:21 AM,  wrote:

> One thing I forgot to mention in my original post is that I wish to do
> this using the SolrJ client.
> I have my own rest server that presents a common API to our users, but the
> back-end can be
> anything I wish. I have been using "that other Lucene based product" :),
> but I wish to stick to
> a product that is more open and that perhaps I can contribute to.
>
> I've searched for SolrJ examples for child documents and unfortunately
> there are far too
> many references to implementations based off of older versions of Solr.
> Specifically, I would
> like to insert beans with multiple child collections in them, but the
> latest I've read says this
> is not currently possible. Is that still true?
>
Right. That is how it was done in SOLR-1945.
Now it throws "cannot have more than one Field with child=true".
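
For what it's worth, a minimal SolrJ sketch of one workaround: building
SolrInputDocuments directly and nesting children with addChildDocument instead
of annotated beans. The ids and field names echo the product/suppliers example
quoted further down; the type_s discriminator field is my own addition:

import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class NestedDocsSketch {
  public static void main(String[] args) throws Exception {
    HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/products").build();

    // Parent document.
    SolrInputDocument product = new SolrInputDocument();
    product.addField("id", "bb903493-55b0-421f-a83e-2199ea11e136");
    product.addField("productName_s", "UsefulWidget");

    // Children are kept as one flat list on the parent, so a discriminator
    // field is commonly added to tell "suppliers" from "resellers".
    SolrInputDocument supplier = new SolrInputDocument();
    supplier.addField("id", "bb903493-55b0-421f-a83e-2199ea11e221");
    supplier.addField("type_s", "supplier");
    supplier.addField("name_s", "Acme Tools");

    SolrInputDocument reseller = new SolrInputDocument();
    reseller.addField("id", "bc903493-55b0-421a-a83e-2199ea11e445");
    reseller.addField("type_s", "reseller");
    reseller.addField("name_s", "Wal-Mart");

    product.addChildDocument(supplier);
    product.addChildDocument(reseller);

    client.add(product);
    client.commit();
    client.close();
  }
}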

>
> In short, It isn't so important that REST based requests / responses from
> Solr are pure JSON
> so long as I can do what I want from the java client.
>
> Do you know if there have been recent additions / enhancements up through
> 6.5 that make
> this more straight-forward?
>
Nothing new there.


>
> Thanks
>
>
> - Original Message -
>
> From: "Mikhail Khludnev" 
> To: "solr-user" 
> Sent: Thursday, April 20, 2017 3:38:11 PM
> Subject: Re: Advice on how to work with pure JSON data.
>
> This is one of the features of the epic
> https://issues.apache.org/jira/browse/SOLR-10144.
> Until it's done the only way to achieve this is to properly set many params
> for
> https://cwiki.apache.org/confluence/display/solr/
> Transforming+Result+Documents#TransformingResultDocuments-[subquery]
>
> Note, here I assume that children mapping is static ie there is a limited
> list of optional scopes.
> Indexing and searching arbitrary JSON is esoteric (XML DB like) problem.
> Also, beware of https://issues.apache.org/jira/browse/SOLR-10500. I hope
> to
> fix it soon.
>
> On Thu, Apr 20, 2017 at 10:15 PM,  wrote:
>
> >
> > I have looked at many examples on how to do what I want, but they tend to
> > only show fragments or they
> > are based on older versions of Solr. I'm hoping there are new features
> > that make what I'm doing easier.
> >
> > I am running version 6.5 and am testing by running in cloud mode but only
> > on a single machine.
> >
> > Basically, I have a large number of documents stored as JSON in
> individual
> > files. I want to take that JSON
> > document and index it without having to do any pre-processing, etc. I
> also
> > need to be able to write newly indexed
> > JSON data back to individual files in the same format.
> >
> > For example, let's say I have a json document that looks like the
> > following:
> >
> > {
> > "id" : "bb903493-55b0-421f-a83e-2199ea11e136",
> > "productName_s" : "UsefulWidget",
> > "productCategory_s" : "tool",
> > "suppliers" : [
> > {
> > "id" : " bb903493-55b0-421f-a83e-2199ea11e221",
> > "name_s" : "Acme Tools",
> > "productNumber_s" : "10342UW"
> > }, {
> > "id" : " bb903493-55b0-421a-a83e-2199ea11e445",
> > "name_s" : "Snappy Tools",
> > "productNumber_s" : "ST-X100023"
> > }
> > ],
> > "resellers" : [
> > {
> > "id" : "cc 903493-55b0-421f-a83e-2199ea11e221",
> > "name_s" : "Target",
> > "productSKU_s" : "TA092310342UW"
> > }, {
> > "id" : "bc903493-55b0-421a-a83e-2199ea11e445",
> > "name_s" : "Wal-Mart",
> > "productSKU_s" : "029342ABLSWM"
> > }
> > ]
> > }
> >
> > I know I can use the /update/json/docs handler to insert the above but
> > from what I understand, I'd have to set up parameters
> > telling it how to split the children, etc. Though that is a bit of a
> pain,
> > I can make that happen.
> >
> > The problem is that, when I then try to query for the data, it comes back
> > with _childDocuments_ instead of the names of the
> > child document lists. So, how can I have Solr return the document as it
> > was originally indexed (I know it would be embedded
> > in the results structure, but I can deal with that)?
> >
> > I am running version 6.5 and I am hoping there is a method I haven't seen
> > documented that can do this. If not, can someone
> > point me to some examples of how to do this another way.
> >
> > If there is no easy way to do this with the current version, can someone
> > point me to a good resource for writing my own
> > handlers?
> >
> > Thank you.
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>
>


-- 
Sincerely yours
Mikhail Khludnev


Re: Update schema.xml without restarting Solr?

2017-04-21 Thread Mikhail Khludnev
> the funny part is curl is not returning anything such either error or
success

can you add -v or so to curl, to see http status code at least?

On Fri, Apr 21, 2017 at 11:20 AM, Lingeshm  wrote:

> Hello Team ,
>
> I can’t change the schema name of an existing index.
>
> I want to change the schema name from “schemaV1" to “schemaV2 for one of
> the
> existing index
>
> curl -XPUT http://localhost:8098/search/index/my_idx-H "Content-Type:
> application/json" -d '{"schema":"schemaV2"}'
>
> the funny part is curl is not returning anything such either error or
> success. I expect the next GET to return {"schema":" schemaV2"} but it
> still
> returns {"schema":" schemaV1"}
>
> I am using solr 4.10.4 version.
>
> Have you faced any such issue ??
>
> Regards
> Lingesh M
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/Update-schema-xml-without-restarting-Solr-tp484263p4331225.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Sincerely yours
Mikhail Khludnev


Re: Running Solr6 on Tomcat7

2017-04-21 Thread ankur.168
As Shawn said, it is not recommended; still, if you want to do this you can
follow these steps (picked from the following post:
http://lucene.472066.n3.nabble.com/Running-Solr-6-3-on-Tomcat-Help-Please-td4320874.html)

The following instructions work with Solr 6.2 + Tomcat 8.5: 
1. Copy solr-6.2.0/server/solr-webapp/webapp directory to tomcat/webapps 
and rename it to 'solr'. 
2. Copy solr-6.2.0/server/lib/ext/*.jar and 
solr-6.2.0/dist/solr-dataimporthandler-*.jar to solr/WEB-INF/lib 
3. Uncomment env-entry for solr/home in web.xml and set the value to 
solr-6.2.0/server/solr 
4. Copy solr-6.2.0/server/WEB-INF/resources/log4j.properties to 
solr/WEB-INF/classes 

I have tried this and was able to use basic admin UI functions; I haven't tried
any versions later than 6.2 with this.





Re: Update schema.xml without restarting Solr?

2017-04-21 Thread Shawn Heisey
 On 4/21/2017 2:20 AM, Lingeshm wrote:
> I can’t change the schema name of an existing index.
>
> I want to change the schema name from “schemaV1" to “schemaV2 for one of the
> existing index
>
> curl -XPUT http://localhost:8098/search/index/my_idx-H "Content-Type:
> application/json" -d '{"schema":"schemaV2"}'
>
> the funny part is curl is not returning anything such either error or
> success. I expect the next GET to return {"schema":" schemaV2"} but it still
> returns {"schema":" schemaV1"}
>
> I am using solr 4.10.4 version.

I have never seen a URL path like "/search/index/my_idx-H" for Solr. 
Typically any URL path for Solr will start with "/solr".  With a 4.x
version, you might have changed the context path in the container, in
newer versions the admin UI will break if that is changed.  If you are
attempting to use the Schema API, then your URL path would contain
"/schema" somewhere, which is missing.

Can you point to a location in the Solr documentation that describes
what you are trying to do?

I have never seen anything like the JSON that you are sending.  The
Schema API documentation for version 6.6 (not yet released) doesn't
mention any ability to change the name.  Such a change would cause no
difference in how the schema works even if it were possible -- it's
cosmetic, and something somebody who uses Solr will never see.

Thanks,
Shawn



Overseer session expires on multiple collection creation

2017-04-21 Thread apoorvqwerty
Hi,
I am trying to create multiple collections with 2 shards and 2 replicas
each.
After 5-6 successful creations, the overseer status response for those
creations shows 40k requests for collection_operations => am_i_leader, which
is a bit odd.
Am I not supposed to create 8-10 collections one after the other, or is there
some configuration that I'm missing?
On creation of the 8th collection I get the following overseer session
expired exception:

org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for
/overseer/collection-queue-work/qnr-24
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
at
org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:322)
at
org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:319)
at
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
at 
org.apache.solr.common.cloud.SolrZkClient.exists(SolrZkClient.java:319)
at
org.apache.solr.cloud.OverseerTaskQueue.remove(OverseerTaskQueue.java:93)
at
org.apache.solr.cloud.OverseerTaskProcessor$Runner.markTaskComplete(OverseerTaskProcessor.java:525)
at
org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:483)
at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)








Re: Update schema.xml without restarting Solr?

2017-04-21 Thread Alexandre Rafalovitch
I would say that all of this points to the existence of middleware in front
of Solr. Therefore, the next action would be to identify the
middleware and ask this question on _their_ mailing list.

Regards,
   Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 21 April 2017 at 08:38, Shawn Heisey  wrote:
>  On 4/21/2017 2:20 AM, Lingeshm wrote:
>> I can’t change the schema name of an existing index.
>>
>> I want to change the schema name from “schemaV1" to “schemaV2 for one of the
>> existing index
>>
>> curl -XPUT http://localhost:8098/search/index/my_idx-H "Content-Type:
>> application/json" -d '{"schema":"schemaV2"}'
>>
>> the funny part is curl is not returning anything such either error or
>> success. I expect the next GET to return {"schema":" schemaV2"} but it still
>> returns {"schema":" schemaV1"}
>>
>> I am using solr 4.10.4 version.
>
> I have never seen a URL path like "/search/index/my_idx-H" for Solr.
> Typically any URL path for Solr will start with "/solr".  With a 4.x
> version, you might have changed the context path in the container, in
> newer versions the admin UI will break if that is changed.  If you are
> attempting to use the Schema API, then your URL path would contain
> "/schema" somewhere, which is missing.
>
> Can you point to a location in the Solr documentation that describes
> what you are trying to do?
>
> I have never seen anything like the JSON that you are sending.  The
> Schema API documentation for version 6.6 (not yet released) doesn't
> mention any ability to change the name.  Such a change would cause no
> difference in how the schema works even if it were possible -- it's
> cosmetic, and something somebody who uses Solr will never see.
>
> Thanks,
> Shawn
>


Re: Running Solr6 on Tomcat7

2017-04-21 Thread rgummadi
Thanks. I will try your steps.




Re: prefix facet performance

2017-04-21 Thread Maria Muslea
Actually using facet.method=enum made a HUGE difference even in my case
where I have many unique values. I am happy with the query response time
now.

Is there a way in SOLR to count the unique values for a field? If not, I
could run the reindexing and count the unique values while I add them to
give you a more accurate count of how many I have (there is a good chance
that I have more than 500K).

Thanks,
Maria

On Fri, Apr 21, 2017 at 1:16 AM, alessandro.benedetti 
wrote:

> Hi Maria,
> If you have 100-500.000 unique values for the field you are interested in,
> and the cardinality of your search results is actually quite small in
> comparison, I am not that sure term enum will help you that much ...
>
> To simplify, with the term enum approach, you iterate over each unique
> value, if it matches the prefix and then you count the intersection of the
> result set with the posting list for that term.
> In your case, your result set is likely to be much smaller than the number
> of unique values.
> I would assume you are using the fc approach, which in my opinion was not a
> bad idea.
> Let's start from the algorithm you are using and the schema config for your
> field,
>
> Cheers
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/prefix-facet-performance-tp4330684p4331221.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: HttpSolrServer commit is taking more time

2017-04-21 Thread Venkateswarlu Bommineni
Thanks for the reply.

I can see the same configuration as given in the mail in my Solr configuration
file, but I also see the same performance issues while querying through SolrJ.

Thanks,
Venkat.

On 21 Apr 2017 9:30 am, "Shawn Heisey"  wrote:

> On 4/20/2017 9:23 PM, Venkateswarlu Bommineni wrote:
> > I am new to Solr so need your help in solving below issue.
> >
> > I am using SolrJ to add and commit the files into Solr.
> >
> > But Solr commit is taking a long time.
> >
> > for example: for 14000 records it is taking 4 min.
>
> Usually, extreme commit times like this have one of two causes:
>
> 1) The caches are very large and have a large autowarmCount.
>
> 2) The Solr heap is way too small, and the JVM is doing constant garbage
> collections.
>
> If queries are not having slowness issues, I would bet on option 1,
> although you may in fact be running into BOTH problems.
>
> This is what a typical example config looks like for one of the Solr
> caches:
>
>  <filterCache class="solr.FastLRUCache"
>               size="512"
>               initialSize="512"
>               autowarmCount="0"/>
>
>
> This cache has a size of 512, but autowarmCount is zero.  This means
> that when a new searcher is created by a commit, none of the entries in
> the filterCache on the old searcher will make it to the cache on the new
> searcher.  If you change the autowarmCount value to say 4, then the top
> 4 filter queries in the cache will be re-executed on the new searcher,
> prepopulating the new cache with four entries.  If each of those four
> filters takes ten seconds to run, then warming that cache will take 40
> seconds.  I'm betting that somebody changed the autowarmCount values on
> the Solr caches to a high number in your configuration.  If that's the
> case, lower the number and reload/restart.
>
> Thanks,
> Shawn
>
>
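
As a reference for the original question (indexing ~14,000 records from SolrJ),
a minimal sketch of batching adds and committing once at the end; the collection
and field names are placeholders, not from the original post:

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexSketch {
  public static void main(String[] args) throws Exception {
    HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build();

    List<SolrInputDocument> batch = new ArrayList<>();
    for (int i = 0; i < 14000; i++) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-" + i);
      doc.addField("title_s", "record " + i);
      batch.add(doc);

      // Send documents in batches instead of one add() call per document.
      if (batch.size() == 1000) {
        client.add(batch);
        batch.clear();
      }
    }
    if (!batch.isEmpty()) {
      client.add(batch);
    }

    // A single explicit commit at the end; every commit triggers searcher
    // reopening and cache autowarming, which is what the advice above is about.
    client.commit();
    client.close();
  }
}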


Re: Enable https for Solr

2017-04-21 Thread Steve Rowe
Hi Edwin,

See .

--
Steve
www.lucidworks.com

> On Apr 21, 2017, at 12:03 AM, Zheng Lin Edwin Yeo  
> wrote:
> 
> Hi,
> 
> I would like to find out, how can we allow Solr to accept secure
> connections via https?
> 
> I am using SolrCloud on Solr 6.4.2
> 
> Regards,
> Edwin



How to use Wordnet in solr?

2017-04-21 Thread Pablo Anzorena
Hey,

I'm planning to use Wordnet and I want to know how.

There's a class called *WordnetSynonymParser*; does anybody use it? It
says it is experimental...

I'm using Solr 5.2.1.

Briefly speaking about my needs:
I have different collections in different languages (fr, pr, sp, en).
When the user searches, for example, in the English collection for the word
"furnitures", I want to look for "table", "chair", "furniture" (without the
plural) and all the synonyms of "furnitures". WordNet already provides me
all this, and in different languages; that's why it would be great to have
Solr using it.

Regards,
Pablo.


Re: How to use Wordnet in solr?

2017-04-21 Thread Alexandre Rafalovitch
I am not sure WordnetSynonymParser is accessible from Solr. At least
I have never heard anybody mention it.

I am also aware of https://github.com/nicholasding/solr-lemmatizer, but
that's a lemmatizer, not a synonym builder. Perhaps there are some
lessons/code in there that could be useful, though.

Regards,
   Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 21 April 2017 at 10:08, Pablo Anzorena  wrote:
> Hey,
>
> I'm planning to use Wordnet and I want to know how.
>
> There's a class called *WordnetSynonymParser *, does anybody use it? It
> says it is experimental...
>
> I'm using solr 5.2.1
>
> Briefly speaking about my needs:
> I have different collections in different languages (fr, pr, sp, en).
> When the user search for example in the english collection the word
> "furnitures" I want to look for "table", "chair", "furniture"(without the
> plural) and all the synonyms of "furnitures". Wordnet already provides me
> all this and in different languages, that's why it would be great to have
> solr using it.
>
> Regards,
> Pablo.


Re: How to use Wordnet in solr?

2017-04-21 Thread Steve Rowe
From the SynonymGraphFilter documentation (also applies to SynonymFilter):

-
format: (optional; default: solr) Controls how the synonyms will be parsed. The 
short names solr (for SolrSynonymParser) and wordnet (for WordnetSynonymParser 
) are supported, or you may alternatively supply the name of your own  
SynonymMap.Builder  subclass.
-

--
Steve
www.lucidworks.com

> On Apr 21, 2017, at 10:28 AM, Alexandre Rafalovitch  
> wrote:
> 
> I am not sure WordnetSynonymParser is accessible from Solr. At least
> I never heard anybody mention it.
> 
> I am also aware of https://github.com/nicholasding/solr-lemmatizer but
> that's lematizer, not a synonym builder. But perhaps there are some
> lessons/code in there that could be useful.
> 
> Regards,
>   Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
> 
> 
> On 21 April 2017 at 10:08, Pablo Anzorena  wrote:
>> Hey,
>> 
>> I'm planning to use Wordnet and I want to know how.
>> 
>> There's a class called *WordnetSynonymParser *, does anybody use it? It
>> says it is experimental...
>> 
>> I'm using solr 5.2.1
>> 
>> Briefly speaking about my needs:
>> I have different collections in different languages (fr, pr, sp, en).
>> When the user search for example in the english collection the word
>> "furnitures" I want to look for "table", "chair", "furniture"(without the
>> plural) and all the synonyms of "furnitures". Wordnet already provides me
>> all this and in different languages, that's why it would be great to have
>> solr using it.
>> 
>> Regards,
>> Pablo.



Modify solr score

2017-04-21 Thread tstusr
Hi.

We are making an application that searches for certain specific topics: the
more captured words in a document, the higher the score should be.

We have two test scenarios. The first one uses documents that users tag
as relevant; the other one contains documents outside our domain.

In the first scenario, we report ratios of 1-2% of captured terms
against all document words. For the second scenario, we report ratios of
less than 0.005%.

Nevertheless, the scores remain almost equal: ~0.85 for the first scenario and
~0.8 for the latter one.


So what we want is to decrease the score we report for this latter scenario
according to the percentage of words captured, in some way.


Is there any way to store those values in a field in order to use them as a
query boost? Or any way to override the default score calculation to change
relevancy?


Thanks in advance...





DateRangeField and Faceting

2017-04-21 Thread Stephen Weiss
Hi everyone,

Just trying to do a sense check on this.  I'm trying to do a facet based off a 
DateRangeField and I'm hitting this error:

Error from server at 
http://172.20.141.150:8983/solr/instock_au_shard1_replica0: Unable to range 
facet on 
field:sku_history.date_range{type=daterange,properties=indexed,stored,omitTermFreqAndPositions}


Now I read through FacetRange.java and it seems like only TrieFields are 
accepted, while DateRangeField is a spatial type, so I suppose that makes 
sense. However, elsewhere in the codebase under DateCalc (which is essentially 
the same set of restrictions) it says:

if (! (field.getType() instanceof TrieDateField) ) { throw new 
IllegalArgumentException("SchemaField must use field type extending 
TrieDateField or DateRangeField"); }

Is this for some reason assuming that DateRangeField is a subclass of 
TrieDateField (it isn't)? I also see a mention here of someone doing a facet on 
a DateRangeField and having it work:

https://wiki.apache.org/solr/DateRangeField

In his case, his daterangefield is multivalued and it seemed to work for him - 
mine is simpler than that yet it doesn't work. I don't really understand what 
we're doing differently that matters, and reading the codebase, it really 
doesn't seem like this was ever possible - but that comment under DateCalc 
makes me wonder.

If we could facet on the daterangefield, it would be very helpful, so any 
pointers on how to do that would be welcome.

--
Steve



Re: How to use Wordnet in solr?

2017-04-21 Thread alessandro.benedetti
Hi Pablo,
with wordnet format , Solr will just parse synonyms from a different file
format [1] .
The rest will work exactly the same.
You will use a managed resource to load the file and then potentially update
it.
If you were thinking to use directly the online resource, you may need to
customize it a bit.

Cheers

[1] http://wordnet.princeton.edu/man/prologdb.5WN.html



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io


Re: prefix facet performance

2017-04-21 Thread alessandro.benedetti
That is quite interesting!
You can use the stats module (in association with the JSON facets if you
need it) to calculate an accurate approximation of the number of unique
values [1] [2].

Good to know it improved your scenario; I may need to update my knowledge of
term enum internals!
Can you describe your schema configuration for the field and the way you
were faceting before, in comparison to the way you facet now (with the
related benefit)?

[1] https://cwiki.apache.org/confluence/display/solr/The+Stats+Component
[2] http://yonik.com/solr-count-distinct/
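
For example, a minimal SolrJ sketch of asking for the unique-value count via a
JSON facet; the field name is from the thread, the collection name is a
placeholder:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;

public class UniqueCountSketch {
  public static void main(String[] args) throws Exception {
    HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build();

    SolrQuery q = new SolrQuery("*:*");
    q.setRows(0);
    // unique() is exact up to a threshold and approximated beyond it;
    // hll() is the explicitly approximate (HyperLogLog) variant - see [2].
    q.set("json.facet", "{concept_count: \"unique(concept)\"}");

    QueryResponse rsp = client.query(q);
    NamedList<?> facets = (NamedList<?>) rsp.getResponse().get("facets");
    System.out.println("unique concepts: " + facets.get("concept_count"));
    client.close();
  }
}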



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io


Re: Modify solr score

2017-04-21 Thread alessandro.benedetti
It has been discussed countless times: never rely on score values.
Rely on the ranking of your results.
It seems you model a topic as a list of keywords and then you just run a
query for each topic.
Essentially, for you, a topic is a query.

The ranking of your results will already be affected by how many times
(term frequency) such keywords appear in the results.
You can also play with different query parsers (such as dismax/edismax) and
with the mm percentage to establish how strict you want your results
to be in relation to the input query [1].
Can you elaborate on the way you would like to customize the score?
Which factor would you like to modify?

Cheers

[1]
https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser#TheDisMaxQueryParser-Themm(MinimumShouldMatch)Parameter
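
To illustrate the suggestion, a minimal SolrJ sketch of an edismax query with an
mm threshold; the collection, field, and keyword list are hypothetical:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TopicQuerySketch {
  public static void main(String[] args) throws Exception {
    HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/topics").build();

    // A "topic" expressed as a list of keywords, run as one edismax query.
    SolrQuery q = new SolrQuery("water pollution river contamination");
    q.set("defType", "edismax");
    q.set("qf", "text");   // field(s) to search
    q.set("mm", "75%");    // require at least 75% of the keywords to match
    q.setRows(10);

    QueryResponse rsp = client.query(q);
    System.out.println("matches: " + rsp.getResults().getNumFound());
    client.close();
  }
}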



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io


Re: Modify solr score

2017-04-21 Thread tstusr
Since we report the score, we think there will be some relation between them.
As far as we know, scoring (and thus ranking) is calculated based on tf-idf.

What we want to do is make a qualitative ranking; that is, for a given topic
we will tag documents as "very related", "fairly related" or "poorly
related". For this, we selected some documents completely unrelated to a topic.

On a very related document we found a ratio of ~2% of captured words, which
reports a score of ~0.85 (which we think is related to ranking). On a test
document we found a ratio of less than 0.01%, and the score is higher than the
first one. What we expect is that unrelated documents (those with a lower
ratio) report lower scores, so we can then use them as a minimum and create
the scale.

We came up with multiplying (or affecting in some way) the default rank Solr
provides by the ratio of the document, so unrelated documents will be penalized
while those with higher ratio values will be boosted.

Greetings, and thanks for your help.






Re: Modify solr score

2017-04-21 Thread Walter Underwood
It isn’t going to work. The score is not an absolute relevance measurement. It 
only says that the first document is more relevant than the second, and so on.

Scores are not comparable between different queries. The score cannot be used 
to say that the first hit for query A is a better match than the first hit for 
query B.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Apr 21, 2017, at 9:35 AM, tstusr  wrote:
> 
> Since we report the score, we think there will be some relation between them.
> As far as we know scoring (and then ranking) are calculated based on tf-idf.
> 
> What we want to do is to make a qualitative ranking, it means, according to
> one topic we will tag documents as "very related", "fairly related" or "poor
> related". So, we select some documents completely unrelated to a topic.
> 
> On a very related document we found a ratio of ~2% of words that reports
> ~0.85 of score (what we think is related to ranking). On a test document we
> found a ratio of less than 0.01% and the score is heigher than the first
> one. What we expect is that documents not related (those ones with less
> ratio) report lower scores so we can then use them as minimum and create the
> scale.
> 
> We came with multiply (of affect in some way) the default rank solr provide
> us with the ratio of documents so unrelated documents will be penalized
> while those with higher ratio values will be overrated.
> 
> Greetings, and thanks for your help.
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331315.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Modify solr score

2017-04-21 Thread tstusr
Well, maybe I explained it wrong.

We have entry points, each of them related to a topic. It means that when
we select the first topic, all information has to be related in some way to
its vocabulary. So it could work, since we select documents not related to
the vocabulary of each entry point in order to establish a threshold of
minimums; that is why we are trying to use the hit ratio to modify the score.

After we rank on those topics, all the work that follows is about faceting,
word selection and so on.

Greetings





Re: Modify solr score

2017-04-21 Thread Walter Underwood
Using a minimum score cut off does not work. The score is not an absolute 
estimate of relevance.

The idf component of the score is a whole-corpus metric. When you add or delete 
documents, the scores for the exact same query can change.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Apr 21, 2017, at 10:18 AM, tstusr  wrote:
> 
> Well, maybe I explain it wrong.
> 
> We have entry points, each of them are related to a topic. It mens that when
> we select the first topic all information has to be related in some way to
> this vocabulary. So, it can work since we select documents not related to
> each vocabulary of every entry point. To establish a threshold of minimums,
> so that, we are trying to use hit ratio to modify score.
> 
> After we rank on that topics, all work after that is about faceting, word
> selection and so on.
> 
> Greeting
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331331.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Enable https for Solr

2017-04-21 Thread Zheng Lin Edwin Yeo
Thank you Steve.

I have managed to set up SSL, and querying via https is working now.

However, I am getting this error when I try to do indexing using SolrJ. I
have already changed the URL to use https.

What could be the reason for this?

javax.net.ssl.SSLHandshakeException: sun.security.validator.
ValidatorException:
PKIX path building failed: sun.security.provider.certpath.
SunCertPathBuilderExce
ption: unable to find valid certification path to requested target
at sun.security.ssl.Alerts.getSSLException(Unknown Source)
at sun.security.ssl.SSLSocketImpl.fatal(Unknown Source)
at sun.security.ssl.Handshaker.fatalSE(Unknown Source)
at sun.security.ssl.Handshaker.fatalSE(Unknown Source)
at sun.security.ssl.ClientHandshaker.serverCertificate(Unknown
Source)
at sun.security.ssl.ClientHandshaker.processMessage(Unknown Source)
at sun.security.ssl.Handshaker.processLoop(Unknown Source)
at sun.security.ssl.Handshaker.process_record(Unknown Source)
at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown
Source
)
at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
at sun.net.www.protocol.https.HttpsClient.afterConnect(Unknown
Source)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnec
tion.connect
(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown
S
ource)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown
So
urce)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.
getInputStream(Unkn
own Source)
at org.apache.solr.util.SimplePostTool.doHttpGet(
SimplePostTool.java:1702)
at org.apache.solr.util.SimplePostTool.main(SimplePostTool.j
ava:256)
Caused by: sun.security.validator.ValidatorException: PKIX path building
failed:
 sun.security.provider.certpath.SunCertPathBuilderException: unable to find
vali
d certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(Unknown Source)
at sun.security.validator.PKIXValidator.engineValidate(Unknown
Source)
at sun.security.validator.Validator.validate(Unknown Source)
at sun.security.ssl.X509TrustManagerImpl.validate(Unknown Source)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(Unknown
Source)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(Unknown
Sour
ce)
... 15 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException:
unable to
 find valid certification path to requested target
at sun.security.provider.certpath.SunCertPathBuilder.build(Unknown
Sourc
e)
at sun.security.provider.certpath.SunCertPathBuilder.
engineBuild(Unknown
 Source)
at java.security.cert.CertPathBuilder.build(Unknown Source)
... 21 more
javax.net.ssl.SSLHandshakeException: sun.security.validator.
ValidatorException:
PKIX path building failed: sun.security.provider.certpath.
SunCertPathBuilderExce
ption: unable to find valid certification path to requested target
at sun.security.ssl.Alerts.getSSLException(Unknown Source)
at sun.security.ssl.SSLSocketImpl.fatal(Unknown Source)
at sun.security.ssl.Handshaker.fatalSE(Unknown Source)
at sun.security.ssl.Handshaker.fatalSE(Unknown Source)
at sun.security.ssl.ClientHandshaker.serverCertificate(Unknown
Source)
at sun.security.ssl.ClientHandshaker.processMessage(Unknown Source)
at sun.security.ssl.Handshaker.processLoop(Unknown Source)
at sun.security.ssl.Handshaker.process_record(Unknown Source)
at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown
Source
)
at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
at sun.net.www.protocol.https.HttpsClient.afterConnect(Unknown
Source)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnec
tion.connect
(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown
S
ource)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown
So
urce)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.
getInputStream(Unkn
own Source)
at org.apache.solr.util.SimplePostTool.doHttpGet(
SimplePostTool.java:1702)
at org.apache.solr.util.SimplePostTool.main(SimplePostTool.j
ava:256)
Caused by: sun.security.validator.ValidatorException: PKIX path building
failed:
 sun.security.provider.certpath.SunCertPathBuilderException: unable to find
vali
d certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(Unknown Source)
at su

Re: Enable https for Solr

2017-04-21 Thread Steve Rowe
Edwin,

Did you set the required keystore/truststore/password system properties?  See 
the example at 


--
Steve
www.lucidworks.com

> On Apr 21, 2017, at 1:44 PM, Zheng Lin Edwin Yeo  wrote:
> 
> Thank you Steve.
> 
> I have managed to set up the SSL, and the query via https is working now.
> 
> However, I am getting this error when I tried to do indexing using SolrJ. I
> have already changed the URL to pass using https.
> 
> What could be the reason that causes this?
> 
> javax.net.ssl.SSLHandshakeException: sun.security.validator.
> ValidatorException:
> PKIX path building failed: sun.security.provider.certpath.
> SunCertPathBuilderExce
> ption: unable to find valid certification path to requested target
>at sun.security.ssl.Alerts.getSSLException(Unknown Source)
>at sun.security.ssl.SSLSocketImpl.fatal(Unknown Source)
>at sun.security.ssl.Handshaker.fatalSE(Unknown Source)
>at sun.security.ssl.Handshaker.fatalSE(Unknown Source)
>at sun.security.ssl.ClientHandshaker.serverCertificate(Unknown
> Source)
>at sun.security.ssl.ClientHandshaker.processMessage(Unknown Source)
>at sun.security.ssl.Handshaker.processLoop(Unknown Source)
>at sun.security.ssl.Handshaker.process_record(Unknown Source)
>at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
>at sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown
> Source
> )
>at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
>at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
>at sun.net.www.protocol.https.HttpsClient.afterConnect(Unknown
> Source)
>at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnec
> tion.connect
> (Unknown Source)
>at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown
> S
> ource)
>at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown
> So
> urce)
>at sun.net.www.protocol.https.HttpsURLConnectionImpl.
> getInputStream(Unkn
> own Source)
>at org.apache.solr.util.SimplePostTool.doHttpGet(
> SimplePostTool.java:1702)
>at org.apache.solr.util.SimplePostTool.main(SimplePostTool.j
> ava:256)
> Caused by: sun.security.validator.ValidatorException: PKIX path building
> failed:
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find
> vali
> d certification path to requested target
>at sun.security.validator.PKIXValidator.doBuild(Unknown Source)
>at sun.security.validator.PKIXValidator.engineValidate(Unknown
> Source)
>at sun.security.validator.Validator.validate(Unknown Source)
>at sun.security.ssl.X509TrustManagerImpl.validate(Unknown Source)
>at sun.security.ssl.X509TrustManagerImpl.checkTrusted(Unknown
> Source)
>at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(Unknown
> Sour
> ce)
>... 15 more
> Caused by: sun.security.provider.certpath.SunCertPathBuilderException:
> unable to
> find valid certification path to requested target
>at sun.security.provider.certpath.SunCertPathBuilder.build(Unknown
> Sourc
> e)
>at sun.security.provider.certpath.SunCertPathBuilder.
> engineBuild(Unknown
> Source)
>at java.security.cert.CertPathBuilder.build(Unknown Source)
>... 21 more
> javax.net.ssl.SSLHandshakeException: sun.security.validator.
> ValidatorException:
> PKIX path building failed: sun.security.provider.certpath.
> SunCertPathBuilderExce
> ption: unable to find valid certification path to requested target
>at sun.security.ssl.Alerts.getSSLException(Unknown Source)
>at sun.security.ssl.SSLSocketImpl.fatal(Unknown Source)
>at sun.security.ssl.Handshaker.fatalSE(Unknown Source)
>at sun.security.ssl.Handshaker.fatalSE(Unknown Source)
>at sun.security.ssl.ClientHandshaker.serverCertificate(Unknown
> Source)
>at sun.security.ssl.ClientHandshaker.processMessage(Unknown Source)
>at sun.security.ssl.Handshaker.processLoop(Unknown Source)
>at sun.security.ssl.Handshaker.process_record(Unknown Source)
>at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
>at sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown
> Source
> )
>at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
>at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
>at sun.net.www.protocol.https.HttpsClient.afterConnect(Unknown
> Source)
>at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnec
> tion.connect
> (Unknown Source)
>at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown
> S
> ource)
>at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown
> So
> urce)
>at sun.net.www.protocol.https.HttpsURLConnectionImpl.
> getInputStre

php apache solr client - Solr HTTP Error 58: 'Problem with the local SSL certificate'

2017-04-21 Thread bay chae
Hi,

Apologies if this is an inappropriate place to ask; please redirect me if that
is the case.

I have successfully set up Solr (6.5.0) with SSL in my dev environment and can
get a proper response using the following curl request:

curl -E ./etc/solr-ssl.keystore.p12:secret --cacert ./etc/solr-ssl.cacert.pem 
"https://localhost:8984/solr/mycollection/select?q=*:*&wt=json&indent=on"

...as advised in the official solr docs: 

https://cwiki.apache.org/confluence/display/solr/Enabling+SSL#EnablingSSL-ExampleClientActions
 


I have also installed into my php environment the official php client for 
apache solr (2.4.0) using pecl to install.

I have tested that the client works in non-SSL mode, when the Solr server does
not force SSL on the client.

The problem I am having is:

[21-Apr-2017 17:00:36 UTC] Solr HTTP Error 58: 'Problem with the local SSL 
certificate' 
#0 Controller.php(37): SolrClient->ping()

With options:

$options = array
(
'hostname' => "localhost",
'port' => 8984,
'timeout'  => 10,
'secure'   => true,
'path' => 'solr/mycollection',
'ssl_cert' => SITE_ROOT . 'apps/config/solr-ssl.crt',   
'ssl_key'  => SITE_ROOT . 'apps/config/solr-ssl.keystore.pem', 
'ssl_keypassword' => 'secret',
 'ssl_cainfo' => SITE_ROOT . 'apps/config/solr-ssl.cacert.pem'   
);

These options, while advised in the docs, appear to be incompatible with the
example curl usage advised for OS X Mavericks+.

I was hoping you might be able to shed some light on the problem I am having and
how I might be able to remedy it.

As far as I am aware I have added all certs into OS X keychain with access to 
all applications.

Any help would be gratefully received.

Baychae

Re: How to use Wordnet in solr?

2017-04-21 Thread Pablo Anzorena
Thanks to everybody.

I will try first Alessandro and Steve recommendation.

If i don't misunderstood, you are telling me that I have to customize the
prolog files to "solr txt synonyms syntax"? If that is correct, what is the
point of format:wordnet ?

2017-04-21 12:52 GMT-03:00 alessandro.benedetti :

> Hi Pablo,
> with wordnet format , Solr will just parse synonyms from a different file
> format [1] .
> The rest will work exactly the same.
> You will use a managed resource to load the file and then potentially
> update
> it.
> If you were thinking to use directly the online resource, you may need to
> customize it a bit.
>
> Cheers
>
> [1] http://wordnet.princeton.edu/man/prologdb.5WN.html
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/How-to-use-Wordnet-in-solr-tp4331273p4331306.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Modify solr score

2017-04-21 Thread tstusr
Well, I know they can change.

I think, the main problem here it that (in this point) documents completely
unrelated to a topic are being ranked as high as documents related. So, in
order to penalize them we are trying to use the ratio or term frequency/word
length.

Nevertheless we aren't able to find a practical way to make it.

Greetings.





Re: DateRangeField and Faceting

2017-04-21 Thread Stephen Weiss
One small detail - I just realized I've been doing JSON faceting and the wiki 
refers to old-school faceting.  Old-school faceting indeed does work but the 
problem is the facet is ultimately one of a whole tree of stats I'm collecting, 
so JSON facet is far more convenient for my use case (I don't think I can even 
do what I'm doing with old facets).  Why would daterangefield work with the old 
faceting system and not the new?
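
For reference, a minimal SolrJ sketch of the old-school range facet that does
work here; the field name is taken from the error above, while the collection
name, date range, and gap are placeholders:

import java.util.Date;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.RangeFacet;

public class DateRangeFacetSketch {
  public static void main(String[] args) throws Exception {
    HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/instock_au").build();

    SolrQuery q = new SolrQuery("*:*");
    q.setRows(0);
    // Old-school range facet on the DateRangeField from the error message.
    q.addDateRangeFacet("sku_history.date_range",
        new Date(1483228800000L),   // 2017-01-01T00:00:00Z
        new Date(1493596800000L),   // 2017-05-01T00:00:00Z
        "+1MONTH");

    QueryResponse rsp = client.query(q);
    for (RangeFacet<?, ?> rf : rsp.getFacetRanges()) {
      for (RangeFacet.Count c : rf.getCounts()) {
        System.out.println(c.getValue() + " -> " + c.getCount());
      }
    }
    client.close();
  }
}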

--
Steve

On Fri, Apr 21, 2017 at 11:50 AM, Stephen Weiss 
mailto:steve.we...@wgsn.com>> wrote:
Hi everyone,

Just trying to do a sense check on this.  I'm trying to do a facet based off a 
DateRangeField and I'm hitting this error:

Error from server at 
http://172.20.141.150:8983/solr/instock_au_shard1_replica0: Unable to range 
facet on 
field:sku_history.date_range{type=daterange,properties=indexed,stored,omitTermFreqAndPositions}


Now I read through FacetRange.java and it seems like only TrieFields are 
accepted, while DateRangeField is a spatial type, so I suppose that makes 
sense. However, elsewhere in the codebase under DateCalc (which is essentially 
the same set of restrictions) it says:

if (! (field.getType() instanceof TrieDateField) ) { throw new 
IllegalArgumentException("SchemaField must use field type extending 
TrieDateField or DateRangeField"); }

Is this for some reason assuming that DateRangeField is a subclass of 
TrieDateField (it isn't)? I also see a mention here of someone doing a facet on 
a DateRangeField and having it work:

https://wiki.apache.org/solr/DateRangeField

In his case, his daterangefield is multivalued and it seemed to work for him - 
mine is simpler than that yet it doesn't work. I don't really understand what 
we're doing differently that matters, and reading the codebase, it really 
doesn't seem like this was ever possible - but that comment under DateCalc 
makes me wonder.

If we could facet on the daterangefield, it would be very helpful, so any 
pointers on how to do that would be welcome.

--
Steve




Re: Modify solr score

2017-04-21 Thread Rick Leir
Ulf: Maybe there is a way you could filter out the unrelated documents. Qf?
Rick

On April 21, 2017 2:18:59 PM EDT, tstusr  wrote:
>Well, I know they can change.
>
>I think, the main problem here it that (in this point) documents
>completely
>unrelated to a topic are being ranked as high as documents related. So,
>in
>order to penalize them we are trying to use the ratio or term
>frequency/word
>length.
>
>Nevertheless we aren't able to find a practical way to make it.
>
>Greetings.
>
>
>
>--
>View this message in context:
>http://lucene.472066.n3.nabble.com/Modify-solr-score-tp4331300p4331342.html
>Sent from the Solr - User mailing list archive at Nabble.com.

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: prefix facet performance

2017-04-21 Thread Maria Muslea
The field is:



and using unique() I found that it has 700K+ unique values.

The query before (that takes ~10s):

wt=json&indent=true&q=*:*&rows=0&facet=true&facet.field=concept&facet.prefix=A/

the query after (that is almost instant):

wt=json&indent=true&q=*:*&rows=0&facet=true&facet.field=concept&facet.prefix=A/&facet.method=enum'

Maria

On Fri, Apr 21, 2017 at 8:59 AM, alessandro.benedetti 
wrote:

> That is quite interesting !
> You can use the stats module ( in association with the Json facets if you
> need it) to calculate an accurate approximation of the unique values [1]
> [2]
> .
>
> Good to know it improved your scenario, I may need to update my knowledge
> of
> term enum internals!
> Can you describe your schema configuration for the field and the way you
> were faceting before in comparison to the way you facet now ( with the
> related benefit)
>
> [1] https://cwiki.apache.org/confluence/display/solr/The+Stats+Component
> [2] http://yonik.com/solr-count-distinct/
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/prefix-facet-performance-tp4330684p4331309.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: prefix facet performance

2017-04-21 Thread Yonik Seeley
On Fri, Apr 21, 2017 at 4:25 PM, Maria Muslea  wrote:
> The field is:
>
> 
>
> and using unique() I found that it has 700K+ unique values.
>
> The query before (that takes ~10s):
>
> wt=json&indent=true&q=*:*&rows=0&facet=true&facet.field=concept&facet.prefix=A/
>
> the query after (that is almost instant):
>
> wt=json&indent=true&q=*:*&rows=0&facet=true&facet.field=concept&facet.prefix=A/&facet.method=enum'

Ah, the fact that you specify a facet.prefix makes this perfectly
aligned for the "enum" method, which can skip directly to the first
term on-or-after "A/".

facet.method=enum goes term-by-term, calculating the intersection with
the facet domain.
In this case, it's the number of terms that start with "A/" that
matters, not the number of terms in the entire field (hence the
speedup).
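
For completeness, a SolrJ sketch of the same prefix facet with the enum method;
the field name is from the thread, the collection name is a placeholder:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class PrefixFacetSketch {
  public static void main(String[] args) throws Exception {
    HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build();

    SolrQuery q = new SolrQuery("*:*");
    q.setRows(0);
    q.setFacet(true);
    q.addFacetField("concept");
    q.setFacetPrefix("concept", "A/");   // only terms starting with "A/" are visited
    q.set("facet.method", "enum");       // term-by-term, skipping straight to the prefix

    QueryResponse rsp = client.query(q);
    FacetField ff = rsp.getFacetField("concept");
    for (FacetField.Count c : ff.getValues()) {
      System.out.println(c.getName() + " -> " + c.getCount());
    }
    client.close();
  }
}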

-Yonik


Re: prefix facet performance

2017-04-21 Thread Maria Muslea
I see. Once I specify a prefix the number of terms is MUCH smaller.

Thank you again for all your help.

Maria

On Fri, Apr 21, 2017 at 1:46 PM, Yonik Seeley  wrote:

> On Fri, Apr 21, 2017 at 4:25 PM, Maria Muslea 
> wrote:
> > The field is:
> >
> > 
> >
> > and using unique() I found that it has 700K+ unique values.
> >
> > The query before (that takes ~10s):
> >
> > wt=json&indent=true&q=*:*&rows=0&facet=true&facet.field=
> concept&facet.prefix=A/
> >
> > the query after (that is almost instant):
> >
> > wt=json&indent=true&q=*:*&rows=0&facet=true&facet.field=
> concept&facet.prefix=A/&facet.method=enum'
>
> Ah, the fact that you specify a facet.prefix makes this perfectly
> aligned for the "enum" method, which can skip directly to the first
> term on-or-after "A/"
> facet.method=enum goes term-by-term, calculating the intersection with
> the facet domain.
> In this case, it's the number of terms that start with "A/" that
> matters, not the number of terms in the entire field (hence the
> speedup).
>
> -Yonik
>