Re: help with DIH transformer to add a suffix to column names

2016-08-23 Thread Emir Arnautovic

Hi Wendy,

Why don't you simply specify the column names in your query? Do you have 
that many columns that "SELECT *" is THE way to go?


For the transformer: you changed the row, but the fields in the context are 
still using the old names. Maybe try setting the field names in the context 
(if possible; I did not look at the code).


Emir

On 22.08.2016 21:47, Wendy wrote:

Hi Emir,

I used the example of "A General TrimTransformer" from the following link:

https://wiki.apache.org/solr/DIHCustomTransformer#transformer

But instead of trimming the field value, I wanted to change the table column
name to columnName_stem, so that I can use *_stem to copy all fields.

Here is my code, but it is just not working. I don't know what the problem
with the code is. Any ideas? Thanks!

import java.util.List;
import java.util.Map;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.DataImporter;
import org.apache.solr.handler.dataimport.Transformer;

public class RowTransformer extends Transformer {
    public Map<String, Object> transformRow(Map<String, Object> row, Context context) {
        List<Map<String, String>> fields = context.getAllEntityFields();

        System.out.println("fields = " + fields.size());

        for (Map<String, String> field : fields) {
            String columnName = field.get(DataImporter.COLUMN);
            System.out.println("columnName = " + columnName);
            // Get this field's value from the current row
            Object value = row.get(columnName);
            if (value != null && !value.toString().trim().equals("")) {
                // Add the value again under a "_stem"-suffixed key
                row.put(columnName + "_stem", value.toString().trim());
                System.out.println("positive columnName = " + columnName);
                System.out.println("positive columnValue = " + value.toString());
            }
        }
        return row;
    }
}





--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



SolrCore is loading in the middle of indexing.

2016-08-23 Thread Pranaya Behera

Hi,
In the middle of indexing, the SolrCore gets reloaded, causing a 503 
error. Here is the stack trace of the issue.


[main] ERROR org.apache.solr.client.solrj.impl.CloudSolrClient - Request 
to collection product failed due to (503) 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: 
Error from server at http://x.x.x.x:8983/solr/product_shard3_replica1: 
Expected mime type application/octet-stream but got text/html. 



Error 503 
{metadata={error-class=org.apache.solr.common.SolrException,root-error-class=org.apache.solr.common.SolrException},msg=SolrCore 
is loading,code=503}


HTTP ERROR 503
Problem accessing /solr/product_shard3_replica1/update. Reason:
 
{metadata={error-class=org.apache.solr.common.SolrException,root-error-class=org.apache.solr.common.SolrException},msg=SolrCore 
is loading,code=503}



, retry? 0
[main] ERROR com.igp.solrindex.ProductIndex - Exception is
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error 
from server at http://10.0.2.6:8983/solr/product_shard3_replica1: 
Expected mime type application/octet-stream but got text/html. 



Error 503 
{metadata={error-class=org.apache.solr.common.SolrException,root-error-class=org.apache.solr.common.SolrException},msg=SolrCore 
is loading,code=503}


HTTP ERROR 503
Problem accessing /solr/product_shard3_replica1/update. Reason:
 
{metadata={error-class=org.apache.solr.common.SolrException,root-error-class=org.apache.solr.common.SolrException},msg=SolrCore 
is loading,code=503}




at 
org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:697) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1109) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:998) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:934) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:71) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:85) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at com.igp.solrindex.ProductIndex.index(ProductIndex.java:225) 
[solrindex-1.0-SNAPSHOT.jar:?]
at com.igp.solrindex.App.main(App.java:17) 
[solrindex-1.0-SNAPSHOT.jar:?]
Caused by: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: 
Error from server at http://x.x.x.x:8983/solr/product_shard3_replica1: 
Expected mime type application/octet-stream but got text/html. 



Error 503 
{metadata={error-class=org.apache.solr.common.SolrException,root-error-class=org.apache.solr.common.SolrException},msg=SolrCore 
is loading,code=503}


HTTP ERROR 503
Problem accessing /solr/product_shard3_replica1/update. Reason:
 
{metadata={error-class=org.apache.solr.common.SolrException,root-error-class=org.apache.solr.common.SolrException},msg=SolrCore 
is loading,code=503}




at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:558) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:259) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:404) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:357) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$directUpdate$14(CloudSolrClient.java:674) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_91]
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229) 
~[solrindex-1.0-SNAPSHOT.jar:?]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[?:1.8.0_91]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[?:1.8.0_91]

at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_91]


Because of this issue, the indexing never completes.
What could be the cause here for the mime type error and the core loading?
I am using 6.1.0 with SolrCloud on 3 instances, with a ZooKeeper on each 
instance.




Range Filter for Multi-Valued Date Fields

2016-08-23 Thread Iana Bondarska
Hi All,

could you please help me with multiple range filters on multi-valued fields?
I have the following dataset:
{
"p_happyDates":[
"1986-05-16T20:00:00Z",
"1875-04-29T21:57:56Z",
"1906-07-04T21:57:56Z"]
},
{
"p_happyDates":[
"1986-05-16T20:00:00Z",
"1975-10-31T21:57:56Z",
"1966-12-28T21:00:00Z"]
}
I apply filters:
&fq=(p_happyDates:[1975-10-31T00:00:00.000Z+TO+*]+AND+p_happyDates:[*+TO+1975-10-31T23:59:59.999Z])
I expect to see only the second record.
Actually, I see both records. Even if I add the parameter q.op=AND, the result
is the same.
Is this expected behavior or known issue for multivalued fields?

Best Regards,
Iana Bondarska


Re: Range Filter for Multi-Valued Date Fields

2016-08-23 Thread Mikhail Khludnev
Hello Iana,

I consider it expected behavior; usually it's done as
&fq=p_happyDates:[1975-10-31T00:00:00.000Z+TO+1975-10-31T23:59:59.999Z],
which is not equivalent to combining half-closed ranges with a boolean query.
I wonder why you did it like that?

On Tue, Aug 23, 2016 at 2:33 PM, Iana Bondarska  wrote:

> Hi All,
>
> could you help me please with multiple range filters on multi valued
> fields:
> I have following dataset:
> {
> "p_happyDates":[
> "1986-05-16T20:00:00Z",
> "1875-04-29T21:57:56Z",
> "1906-07-04T21:57:56Z"]
> },
> {
> "p_happyDates":[
> "1986-05-16T20:00:00Z",
> "1975-10-31T21:57:56Z",
> "1966-12-28T21:00:00Z"]
> }
> I apply filters:
> &fq=(p_happyDates:[1975-10-31T00:00:00.000Z+TO+*]+AND+p_
> happyDates:[*+TO+1975-10-31T23:59:59.999Z])
> I expect to see only second record.
> Actually I see both records. Even if I add parameter q.op=AND - result is
> the same.
> Is this expected behavior or known issue for multivalued fields?
>
> Best Regards,
> Iana Bondarska
>



-- 
Sincerely yours
Mikhail Khludnev


Re: Error upgrading from 6.0 to 6.1

2016-08-23 Thread Shawn Heisey
On 8/22/2016 9:18 PM, Stephen Lewis wrote:
> Oops, apologies for my confusing grammar and for missing the
> attachment. The intro sentence should have read "I have a question
> about upgrading a solr cloud cluster in place." I've actually attached
> the log below this time.

The mailing list eats most attachments.  Sometimes they make it through,
usually they don't, and I can never see what causes the difference. 
Your attachment did not make it through.

For us to see it, you will need to place the log somewhere on the
Internet and provide a URL to access it.

When you get a message saying that application/octet-stream was expected
but text/html is received instead, it usually means what was received
from the remote server was an error page, instead of the javabin
response that was expected.  To see what that error is, you'll need to
check the log on the remote server -- in this case, the server with IP
address 172.18.6.68.

Further down in the thread you did mention a NoSuchMethodError.  If
that's the error message from 172.18.6.68, then I agree with Erick's
assessment.  You've probably got multiple versions of Solr jars on your
classpath.

Best guess is that your bootstrapping step copies the install directory
without deleting anything from the target, which would *add* jars to
server/solr-webapp/webapp/WEB-INF/lib.  The jars in the two versions of
Solr do not have the same names -- the full version number is part of
the filename.

I can anticipate a possible next question:  Why did it work when
upgrading from 6.0.0 to 6.0.1?  The answer:  Mixing jars would most
likely work with those two versions, because that's a bugfix release,
and bugfix releases are usually 100 percent API/ABI compatible with the
previous version.  Breaks in that compatibility are expected in minor
version upgrades on the server side, especially in code relating to
SolrCloud.  That code is evolving *VERY* rapidly.

If you do not see evidence supporting the multiple-jar-version idea,
then you may need to provide access to the logfile.

We do *try* to ensure API/ABI compatibility on the *client* side so
user programs can update SolrJ to a new minor version without
recompiling ... but even that is not guaranteed.

Thanks,
Shawn



AW: Re: Tagging and excluding Filters with BlockJoin Queries and BlockJoin Faceting

2016-08-23 Thread Tobias Lorenz
Hi Mikhail,

Thanks for replying so quickly with a suggestion.

I'm a colleague of Stefan and working with him on our project.

We tried composing our Solr query with exclusion instructions, and the result 
was that the facet excluded by tag no longer showed up in the result at all, 
instead of showing all values.

Your example from the last comment, completed by our exclusion instruction:

json.facet={
  filter_by_children: {
type: query,
q: "isparent:false",
domain: {
  blockChildren: "isparent:true"
},
facet: {
  colors: {
type: terms,
field: color,
domain:{
  excludeTags:myTag
},
facet: {
  productsCount: "unique(_root_)"
}
  }
}
  }
}


and the corresponding filter query:

fq={!parent which='isparent:true'}{!tag=myTag}color:blue


Either this feature is not working yet, or we are making a mistake using it.
Of course we know it's still in development right now.

Could you please have a look at whether we are doing something obviously wrong?

Thanks,
Tobias



>The last comment at https://issues.apache.org/jira/browse/SOLR-8998 shows
>the current verbose json.facet syntax which provides aggregated facet
>counts already. It's a little bit slower than child.facet.field.
>Nevertheless, you can take this sample and add exclusion instructions into.
>It should work. Let me know how it goes, please.
>
>On Wed, Aug 17, 2016 at 5:35 PM, Stefan Moises  wrote:
>
>> Hi Mikhail,
>>
>> thanks for the info ... what is the advantage of using the JSON FACET API
>> compared to the standard BlockJoinQuery features?
>>
>> Is there already anybody working on the tagging/exclusion feature or is
>> there any timeframe for it? There wasn't any discussion yet in SOLR-8998
>> about exclusions, was there?
>>
>> Thank you very much,
>>
>> best,
>>
>> Stefan
>>
>>
>> Am 17.08.16 um 15:26 schrieb Mikhail Khludnev:
>>
>> Stefan,
>>> child.facet.field never intend to support exclusions. My preference is to
>>> implement it under json.facet that's discussed under
>>> https://issues.apache.org/jira/browse/SOLR-8998.
>>>
>>> On Wed, Aug 17, 2016 at 3:52 PM, Stefan Moises 
>>> wrote:
>>>
>>> Hey girls and guys,

 for a long time we have been using our own BlockJoin Implementation,
 because for our Shop Systems a lot of requirements that we had were not
 implemented in solr.

 As we now had a deeper look into how far the standard has come, we saw
 that BlockJoin and faceting on children is now part of the standard,
 which
 is pretty cool.
 When I tried to refactor our external code to use that now, I stumbled
 upon one non-working feature with BlockJoins that still keeps us from
 using
 it:

 It seems that tagging and excluding Filters with BlockJoin Faceting
 simply
 does not work yet.

 Simple query:

 &qt=products
 &q={!parent which='isparent:true'}shirt AND isparent:false
 &facet=true
 &fq={!parent which='isparent:true'}{!tag=myTag}color:grey
 &child.facet.field={!ex=myTag}color


 Gives us:
 o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException:
 undefined field: "{!ex=myTag}color"
  at org.apache.solr.schema.IndexSchema.getField(IndexSchema.
 java:1231)


 Does somebody have an idea?


 Best,
 Stefan

 --
 --
 
 Stefan Moises
 Manager Research & Development
 shoptimax GmbH
 Ulmenstraße 52 H
 90443 Nürnberg
 Tel.: 0911/25566-0
 Fax: 0911/25566-29
 moi...@shoptimax.de
 http://www.shoptimax.de

 Geschäftsführung: Friedrich Schreieck
 Ust.-IdNr.: DE 814340642
 Amtsgericht Nürnberg HRB 21703




>>>
>> --
>> --
>> 
>> Stefan Moises
>> Manager Research & Development
>> shoptimax GmbH
>> Ulmenstraße 52 H
>> 90443 Nürnberg
>> Tel.: 0911/25566-0
>> Fax: 0911/25566-29
>> moi...@shoptimax.de
>> http://www.shoptimax.de
>>
>> Geschäftsführung: Friedrich Schreieck
>> Ust.-IdNr.: DE 814340642
>> Amtsgericht Nürnberg HRB 21703
>>   
>>
>>
>
>
>-- 
>Sincerely yours
>Mikhail Khludnev
>
>


Re: help with DIH transformer to add a suffix to column names

2016-08-23 Thread Wendy
Hi Emir,
I have many tables and columns to index. One of the requirements is to
dynamically index columns without knowing the column names. In this way, if a
new column is added later on, we don't need to change the configurations; we
just need to do a delta-import. I did use Solr with MongoDB and
mongo-connector for a couple of applications. We were happy with the results
and performance. Now I am trying to use Solr with MySQL on a much larger
scale. I am kind of stuck at this step for several days :-( Will try again
today. Thanks again for your response!




Re: SolrCore is loading in the middle of indexing.

2016-08-23 Thread Shawn Heisey
On 8/23/2016 4:49 AM, Pranaya Behera wrote:
> In the middle of indexing solrcore gets reloaded and causing 503
> error. Here is the stack trace of the issue.

> Error 503
> {metadata={error-class=org.apache.solr.common.SolrException,root-error-class=org.apache.solr.common.SolrException},msg=SolrCore
> is loading,code=503}

> By raising this issue the indexing never completes.
> What could be the issue here for the mime type and the core loading.
> I am using 6.1.0 with sorlcloud in 3 instances with 3 zookeeper in
> each instance.

Core reloads don't just happen.  Something is requesting the reload.

If you are using options like the Managed Schema API, that can cause a
core reload, because a reload is necessary in order for a modified
config/schema to become active.

I've checked the update processor used by the data-driven example config
to modify the schema during indexing when unknown fields are
encountered.  I did not see anything in that code that would cause a
reload, which makes sense, because a core reload in the middle of
indexing is a bad thing.  Somebody would have noticed that, and we would
have fixed it.

There is some code or perhaps a person making a request that results in
a core or collection reload.  A bug in Solr is *possible*, but I don't
think that's the problem here.

Another possibility is that one of your SolrCloud instances is getting
completely restarted ... but I think if that were happening, you'd
probably know about it.

Thanks,
Shawn



Indexing (x,y) points representing characteristics

2016-08-23 Thread marotosg
Hi.
I have a use case I am trying to solve and am stuck for ideas.
I would need to index one field in my collection with x,y values which
represent where a person is located on an axis, based on some characteristics
of his. x and y go from 0 to 1 in 0.1 steps.

For instance, a person can have (0.5,0.1) or (0.7,0.7), etc.  Never negative
values.

I need to achieve two things.
1) When I have a search result, facet on this field to group the documents and
get the counts for each identical x,y.
I would need to plot them on an SVG file with the point and count.
2) Draw a box or circle to filter out the results.

Which field should I use to do this?
I see PointType is an option, but I can't get results faceting on it.
I am testing LatLonType or BBoxField, but it looks like when you try to
search they work with km, so they are more geo-oriented.

I appreciate any tips.

sergio






Re: help with DIH transformer to add a suffix to column names

2016-08-23 Thread Alexandre Rafalovitch
I am still not sure it is the right approach, as opposed to a managed schema,
etc. But...

If you add a dynamic field "*", it will accept any field. And the DIH check
for skipping fields unknown to the schema during automatic name matching
should then accept them.

Do that as step one, and see that you get fields with content into the index
at all.

Then try your renaming magic to match the more specific dynamic field with
the suffix. If it still fails, try doing the renaming in an
UpdateRequestProcessor chain.

If this works, you can then set the * dynamicField to ignored (stored,
indexed, docValues all false), and it should still work for your renamed
fields while ignoring the others; see the sketch below.
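
For illustration only, the three stages could look roughly like this in a
managed-schema (a sketch; the field types are assumed, not taken from the
actual schema):

   <!-- step one: a catch-all so every column survives name matching -->
   <dynamicField name="*" type="text_general" indexed="true" stored="true"/>

   <!-- later: the specific suffixed field, plus an ignoring catch-all -->
   <dynamicField name="*_stem" type="text_general" indexed="true" stored="true"/>
   <dynamicField name="*" type="ignored"/>
   <fieldType name="ignored" class="solr.StrField" indexed="false"
              stored="false" docValues="false" multiValued="true"/>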

If this does all work together, please report back to the list (I am
curious) AND document this really well for the poor next maintainer of your
configuration.

Good luck,
Alex

On 23 Aug 2016 8:12 PM, "Wendy"  wrote:

Hi Emir,
I have many tables and columns to index. One of the requirements is to
dynamically index columns without knowing the column names. In this way, if a
new column is added later on, we don't need to change the configurations; we
just need to do a delta-import. I did use Solr with MongoDB and
mongo-connector for a couple of applications. We were happy with the results
and performance. Now I am trying to use Solr with MySQL on a much larger
scale. I am kind of stuck at this step for several days :-( Will try again
today. Thanks again for your response!





Re: Indexing (x,y) points representing characteristics

2016-08-23 Thread Alexandre Rafalovitch
You could use copyField for faceting if that's the only limiting factor, or
even one copy per separate use case. Just don't store the copied values (see
the sketch below).
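
For instance (all field and type names here are assumed, not from the
original schema), assuming the x,y pair is indexed as a PointType field:

   <!-- the source field used for filtering -->
   <field name="coords" type="point" indexed="true" stored="true"/>
   <!-- facet-only copy; note stored="false" -->
   <field name="coords_facet" type="string" indexed="true" stored="false"/>
   <copyField source="coords" dest="coords_facet"/>

Since copyField copies the raw incoming value, the string copy would hold the
literal "x,y" text, which facets as a single term per point.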

Regards,
Alex

On 23 Aug 2016 8:36 PM, "marotosg"  wrote:

> Hi.
> I have a use case I am trying to solve and stuck with some ideas.
> I would need to index one field in my collection with x,y values which
> represents how a person is located on an axis based on some characteristics
> of him. x and y go from 0 to 1 in 0.1 gaps.
>
> For instance a person can have (0.5,0.1) or (0.7,0.7) etc.  Never negative
> values.
>
> I need to achieve to things.
> 1) When i have a search result. Facet on this field to group them and get
> the counts with the same x,y.
> I would need to plot them on SVG file with the plot and count.
> 2) Draw a box or circle to filter out the results.
>
> Which field should I use to do it?
> I see PointType is an option but I can't get results faceting on it.
> I am testing LatLonType  or BBbox field but looks like when you try to
> search they work with km so they are more geo orientated.
>
> I appreciate any tips.
>
> sergio
>
>
>
>
>


Re: Range Filter for Multi-Valued Date Fields

2016-08-23 Thread Iana Bondarska
Hello Mikhail,
I convert filters that come from another part of the application and in
general cannot combine many filters into one, since the conditions can be
quite complex.
Could you please provide more details on why this is expected behavior?
(p_happyDates:[1975-10-31T00:00:00.000Z+TO+*]+AND+p_happyDates:[*+TO+1975-10-31T23:59:59.999Z])
is an AND filter with 2 conditions, date>="1975-10-31T00:00:00.000Z" and
date<="1975-10-31T23:59:59.999Z", so it seems that it should return the same
results as
&fq=p_happyDates:[1975-10-31T00:00:00.000Z+TO+1975-10-31T23:59:59.999Z]



2016-08-23 15:00 GMT+03:00 Mikhail Khludnev :

> Hello Iana,
>
> I consider is as expected behavior, perhaps usually it's done as
> &fq=p_happyDates:[1975-10-31T00:00:00.000Z+TO+1975-10-31T23:59:59.999Z],
> which is not equivalent to combining half closed ranges with boolean query.
> I wonder why did you do like that?
>
> On Tue, Aug 23, 2016 at 2:33 PM, Iana Bondarska 
> wrote:
>
> > Hi All,
> >
> > could you help me please with multiple range filters on multi valued
> > fields:
> > I have following dataset:
> > {
> > "p_happyDates":[
> > "1986-05-16T20:00:00Z",
> > "1875-04-29T21:57:56Z",
> > "1906-07-04T21:57:56Z"]
> > },
> > {
> > "p_happyDates":[
> > "1986-05-16T20:00:00Z",
> > "1975-10-31T21:57:56Z",
> > "1966-12-28T21:00:00Z"]
> > }
> > I apply filters:
> > &fq=(p_happyDates:[1975-10-31T00:00:00.000Z+TO+*]+AND+p_
> > happyDates:[*+TO+1975-10-31T23:59:59.999Z])
> > I expect to see only second record.
> > Actually I see both records. Even if I add parameter q.op=AND - result is
> > the same.
> > Is this expected behavior or known issue for multivalued fields?
> >
> > Best Regards,
> > Iana Bondarska
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


Language support

2016-08-23 Thread Bradley Belyeu
Hi, I’m trying to find a synonym list for any of the following languages:
Catalan, Farsi, Hindi, Korean, Latvian, Dutch, Romanian, Thai, and Turkish
Does anyone know of resources where I can get a synonym list for these 
languages?


Re: Language support

2016-08-23 Thread Walter Underwood
Synonyms are also domain specific. A synonym set for one area may be completely 
wrong in another.

In cooking, arugula and rocket are the same thing. In military or aerospace, 
missile and rocket are very similar.

I would start with librarians. They maintain controlled vocabularies (called 
“thesauri”). Usually, a thesaurus has the official classification terms but 
also has “entry terms”. The entry terms are alternate terms that are used to 
get to the primary term.

For example, the category might be “electric vehicle”, but an entry term could 
be “zero emission vehicle”.

Good luck. I had a hard time finding thesauri online a few years ago.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Aug 23, 2016, at 7:38 AM, Bradley Belyeu  
> wrote:
> 
> Hi, I’m trying to find a synonym list for any of the following languages:
> Catalan, Farsi, Hindi, Korean, Latvian, Dutch, Romanian, Thai, and Turkish
> Does anyone know of resources where I can get a synonym list for these 
> languages?



Settings for DocValues

2016-08-23 Thread Zheng Lin Edwin Yeo
Hi,

I would like to find out: if we set docValues="true" in the configuration of
the fieldType in schema.xml, must we set the corresponding indexed="false"
and stored="false"?

Will there be any implication if I set indexed="true" and stored="true"?

I'm using Solr 6.1.0

Regards,
Edwin


Re: help with DIH transformer to add a suffix to column names

2016-08-23 Thread Wendy
 

Hi Alex,

It kind of worked out. I have to specify the table column names. Using a
custom transformer allowed me to change a column name to columnName_stem. In
this way, it simplifies the field ranking in the solrconfig.xml file and
simplifies the field specification in the managed-schema file. I listed the
steps below.


Steps:
---
1. sample of db-data-config.xml file

[the XML was stripped by the mailing-list archive]
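
For reference, a generic sketch of what such a DIH config looks like when
wired to the custom transformer from step 4 below (the connection details,
table and column names here are hypothetical placeholders):

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb" user="..." password="..."/>
  <document>
    <entity name="item" query="SELECT ... FROM my_table"
            transformer="my.solr.transformer.FieldTransformer">
      <field column="pdb_id" name="pdb_id"/>
    </entity>
  </document>
</dataConfig>
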
2. Modification of solrconfig.xml file: 
Add the following lines:

[most of the XML was stripped by the mailing-list archive; the surviving
fragments show a DataImportHandler request handler pointing at
db-data-config.xml, plus a search handler using edismax with the qf boosts
pdb_id^20.0 author_list_stem^20.0 header^10.0 reflns.resolution^5.0
keywords_stem^10.0 rest_field_stem^0.3, and the bare values true, explicit,
7, 1000 and text, whose enclosing tags were lost]

3. Modification of managed-schema file:

[the field and dynamicField definitions were stripped by the mailing-list
archive]

4. Java class of FieldTransformer:
package my.solr.transformer;

import java.util.List;
import java.util.Map;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.DataImporter;
import org.apache.solr.handler.dataimport.Transformer;

public class FieldTransformer extends Transformer {
    public Map<String, Object> transformRow(Map<String, Object> row, Context context) {
        List<Map<String, String>> fields = context.getAllEntityFields();

        for (Map<String, String> field : fields) {
            String columnName = field.get(DataImporter.COLUMN);
            // Get this field's value from the current row
            Object value = row.get(columnName);
            if (value != null && !value.toString().trim().equals("")) {
                // Re-add the value under the "_stem"-suffixed column name
                row.put(columnName + "_stem", value.toString().trim());
            }
        }
        return row;
    }
}

5. NOTES:

1. When writing a custom transformer, you need to copy the following files:

cp /opt/solr-6.1.0/dist/solr-dataimporthandler-extras-6.1.0.jar

 /opt/solr-6.1.0/server/solr-webapp/webapp/WEB-INF/lib/

cp /opt/solr-6.1.0/dist/solr-dataimporthandler-6.1.0.jar

 /opt/solr-6.1.0/server/solr-webapp/webapp/WEB-INF/lib/

2. Put the custom transformer jar file in the following directory and
specify it in the solrconfig.xml file (see step 2 above):

/opt/solr-6.1.0/dist/solr-rcsb-plugin.jar 







Re: Settings for DocValues

2016-08-23 Thread Erick Erickson
These are orthogonal. The indexed and docvalues
structures are very different. One is useful for
searching and the other is useful for faceting and the like.

If you set indexed="false" and docValues="true" and then try
to search on the field, you roughly do a "table scan", which
is very slow.

Rule of thumb:
If you're searching set indexed="true".
If you're faceting/grouping/sorting set docValues="true".
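
For example, a field that is both searched and faceted/sorted would be
declared along these lines (a sketch; the field name is assumed):

   <field name="category" type="string" indexed="true" stored="true"
          docValues="true"/>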

Best,
Erick

On Tue, Aug 23, 2016 at 8:14 AM, Zheng Lin Edwin Yeo
 wrote:
> Hi,
>
> Would like to find out, if we set docValues="true" in my configuration of
> the fieldType in schema,xml, must we set the corresponding indexed="false"
> and stored="false"
>
> Will there be any implication if I set my indexed="true and stored="true"?
>
> I'm using Solr 6.1.0
>
> Regards,
> Edwin


Re: Error upgrading from 6.0 to 6.1

2016-08-23 Thread Stephen Lewis
Erick, Shawn, you're right on both counts. Mixed jar versions are happening
in both cases, and only lead to a fatal error on the upgrade to 6.1.0.
So there was a big gap in my upgrading methodology. I've confirmed that
fixing the bootstrapping script allows the upgrade and that the correct jar
files are being loaded after the fix.

Sorry for the confusion with the attachment. I've included the log below
just in case it is helpful to anyone listening.

Thanks again!

Best,
Stephen

{msg=org.apache.solr.common.cloud.ZkStateReader.getClusterProps()Ljava/util/
Map;,trace=java.lang.NoSuchMethodError: org.apache.solr.common.cloud.
ZkStateReader.getClusterProps()Ljava/util/Map;
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:743)
at org.apache.solr.handler.admin.CoreAdminOperation$1.call(
CoreAdminOperation.java:134)
at org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.
call(CoreAdminHandler.java:351)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(
CoreAdminHandler.java:153)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(
RequestHandlerBase.java:155)
at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(
HttpSolrCall.java:658)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:441)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
SolrDispatchFilter.java:229)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
SolrDispatchFilter.java:184)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.
doFilter(ServletHandler.java:1668)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(
ServletHandler.java:581)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(
ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(
SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.
doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.
doHandle(ContextHandler.java:1160)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
at org.eclipse.jetty.server.session.SessionHandler.
doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.
doScope(ContextHandler.java:1092)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(
ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(
ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.
handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:518)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
at org.eclipse.jetty.server.HttpConnection.onFillable(
HttpConnection.java:244)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(
AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(
SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.
produceAndRun(ExecuteProduceConsume.java:246)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(
ExecuteProduceConsume.java:156)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(
QueuedThreadPool.java:654)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(
QueuedThreadPool.java:572)
at java.lang.Thread.run(Thread.java:745)
,code=500}

On Tue, Aug 23, 2016 at 6:00 AM, Shawn Heisey  wrote:

> On 8/22/2016 9:18 PM, Stephen Lewis wrote:
> > Oops, apologies for my confusing grammar and for missing the
> > attachment. The intro sentence should have read "I have a question
> > about upgrading a solr cloud cluster in place." I've actually attached
> > the log below this time.
>
> The mailing list eats most attachments.  Sometimes they make it through,
> usually they don't, and I never can see what causes the difference.
> Your attachment did not make it through.
>
> For us to see it, you will need to to place the log somewhere on the
> Internet and provide a URL to access it.
>
> When you get a message saying that application/octet-stream was expected
> but text/html is received instead, it usually means what was received
> from the remote server was an error page, instead of the javabin
> response that was expected.  To see what that error is, you'll need to
> check the log on the remote server -- in this case, the server with IP
> address 172.18.6.68.
>
> Further down in the thread you did mention a NoSuchMethodError.  If
> that's the error message from 172.18.6.68, then I agree with Erick's
> assessment.  You've probably got multiple versions of Solr jars on your
> classpath.
>
> Best guess is that your bootstrapping step copies the install directory
> without deleting anything from the target, which would *add* jars to
> server/solr-webapp/webapp/WEB-INF/lib.  The jars in the two versions of
> Solr do not have the same names -- the full version number is part of
> the filename.

Is it safe to upgrade an existing field to docvalues?

2016-08-23 Thread Ronald Wood
We are planning to migrate from Solr 4.10.4 to 5.5.2 in the next couple of 
months. We do not use SolrCloud.

When doing initial testing in our dev and qa environments we ran into cases 
where we got errors for fields that had docvalues newly enabled, but not 
re-indexed. Mixed docvalues/non-docvalues was possible due to ongoing indexing.

Specifically, when we tried to sort or facet we sometimes got errors like:

“IllegalStateException: unexpected docvalues type NONE for field 'id' 
(expected=SORTED). Use UninvertingReader or index with docvalues.” Id is a 
string field with docValues=true.

This did not always consistently happen, but any occurrence of this is 
troublesome.

My reading of tickets like https://issues.apache.org/jira/browse/SOLR-7190 is 
that when docvalues is not fully available, Solr will fall back to the 
UninvertingReader. But the error message seems to indicate this is not done 
automatically for a sort.

In general, is there a way to migrate existing indexes (we have petabytes of 
data) by enabling docvalues and incrementally re-indexing? We expect the latter 
would take a month using an atomic update process.

Could this be an artifact of having old 4.x indexes, and it would be wiser to 
first migrate the indexes to 5.x format before enabling docvalues? (We expect 
that would also take us a month using incremental optimize.)

Can someone clarify what migration paths to docvalues are likely to succeed?

Thanks!

-Ronald Wood.





Custom handler/content stream loader

2016-08-23 Thread Jamie Johnson
I have a need to build custom field types that store additional metadata at
the field level in a payload.  I was thinking that I could satisfy this by
building a custom UpdateRequest that captured this additional information
in XML, but I am not really sure how to get at this additional information
on the server side.  Would I need to implement a custom RequestHandler to
handle the update, could I add a custom ContentStreamLoader to parse the
XML, how do I customize the creation of the lucene document once I have the
XML?  Any help/direction would really be appreciated.

-Jamie


Graph Query Parser

2016-08-23 Thread Jigar Shah
Hello,

I am trying to get "path" from root node to leaves using Graph Query
Parser. Graph Query Parser gives me all child nodes from root but not
specific paths. Can someone suggest how to get that?

e.g., if I have a parent-to-child relation as follows:

A -> B,C -> D

D is a child of B and C, and they are children of A.

Expected result set:

A/B/D
A/C/D

The result set should be an ordered list from root to leaf, like we get in a
hierarchical facet.

Thanks,
Jigar Shah.


Re: Graph Query Parser

2016-08-23 Thread Joel Bernstein
If you're using Solr Cloud, the shortestPath streaming expression will do
that for you:

https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions#StreamingExpressions-shortestPath
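
A sketch for the A/B/C/D example above, assuming each document models one
edge with (hypothetical) from_s and to_s fields:

   shortestPath(mycollection,
                from="A",
                to="D",
                edge="from_s=to_s",
                threads="6",
                partitionSize="250",
                maxDepth="4")

Since A/B/D and A/C/D are both of length two, both shortest paths come back
as result tuples.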

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Aug 23, 2016 at 4:34 PM, Jigar Shah  wrote:

> Hello,
>
> I am trying to get "path" from root node to leaves using Graph Query
> Parser. Graph Query Parser gives me all child nodes from root but not
> specific paths. Can someone suggest how to get that?
>
> e.g: If i have parent to child relation as follows.
>
> A -> B,C -> D
>
> D is child of B and C and they are child of A.
>
> Expected result set:
>
> A/B/D
> A/C/D
>
> Resultset should have ordered list from root to leaf, like we get in
> hierarchical facet.
>
> Thanks,
> Jigar Shah.
>


Re: Custom handler/content stream loader

2016-08-23 Thread Jamie Johnson
Ok, did a bit more digging.  It looks like if I build a custom
ContentStreamLoader I can create a custom AddUpdateCommand that is
ultimately responsible for building the Lucene document.  So it looks like if
I build a custom UpdateRequestHandler I can register my custom
ContentStreamLoader and I'll be set.  Is this the appropriate course of
action?
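
A minimal sketch of that wiring, assuming the 6.x UpdateRequestHandler API
(PayloadXmlLoader is a hypothetical custom loader, not an existing class):

import java.util.Map;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.handler.UpdateRequestHandler;
import org.apache.solr.handler.loader.ContentStreamLoader;

public class PayloadUpdateRequestHandler extends UpdateRequestHandler {
    @Override
    protected Map<String, ContentStreamLoader> createDefaultLoaders(NamedList args) {
        // Keep the stock loaders, but swap in a custom XML loader that can
        // parse the extra payload metadata and build the AddUpdateCommand.
        Map<String, ContentStreamLoader> loaders = super.createDefaultLoaders(args);
        loaders.put("application/xml", new PayloadXmlLoader()); // hypothetical
        loaders.put("text/xml", new PayloadXmlLoader());        // hypothetical
        return loaders;
    }
}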


Lastly, I always want to use my custom UpdateRequest when adding data to
Solr from SolrJ, but I don't see an easy way of doing this.  Really, what I
need is to control the XML generated and sent to the server, and it looks
like this is the best way, but I wonder about it given the inability to plug
in a custom request writer (or something similar).  Am I barking up the wrong
tree?

On Aug 23, 2016 5:22 PM, "Jamie Johnson"  wrote:

> I have a need to build custom field types that store additional metadata
> at the field level in a payload.  I was thinking that I could satisfy this
> by building a custom UpdateRequest that captured this additional
> information in XML, but I am not really sure how to get at this additional
> information on the server side.  Would I need to implement a custom
> RequestHandler to handle the update, could I add a custom
> ContentStreamLoader to parse the XML, how do I customize the creation of
> the lucene document once I have the XML?  Any help/direction would really
> be appreciated.
>
> -Jamie
>


Re: Custom handler/content stream loader

2016-08-23 Thread Alexandre Rafalovitch
Have you tried starting with the DelimitedPayloadTokenFilterFactory?
There is a sample configuration in the shipped examples:
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/6.1.0/solr/example/example-DIH/solr/db/conf/managed-schema#L625
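
For reference, the shipped example declares it roughly like this (a sketch
based on that file; the field name is assumed):

   <fieldType name="payloads" class="solr.TextField" stored="false" indexed="true">
     <analyzer>
       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
       <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
     </analyzer>
   </fieldType>
   <field name="payloads" type="payloads" indexed="true" stored="true"/>

With that in place, indexing a value like "one|1.0 two|2.0" attaches the
number after each "|" to the token as a float payload, which custom code can
then read back per term.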

Regards,
Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 24 August 2016 at 04:22, Jamie Johnson  wrote:
> I have a need to build custom field types that store additional metadata at
> the field level in a payload.  I was thinking that I could satisfy this by
> building a custom UpdateRequest that captured this additional information
> in XML, but I am not really sure how to get at this additional information
> on the server side.  Would I need to implement a custom RequestHandler to
> handle the update, could I add a custom ContentStreamLoader to parse the
> XML, how do I customize the creation of the lucene document once I have the
> XML?  Any help/direction would really be appreciated.
>
> -Jamie


Re: Settings for DocValues

2016-08-23 Thread Zheng Lin Edwin Yeo
Hi Erick,

Thanks for the explanation.

Regards,
Edwin


On 24 August 2016 at 02:27, Erick Erickson  wrote:

> These are orthogonal. The indexed and docvalues
> structures are very different. One is useful for
> searching and the other is useful for faceting and the like.
>
> If you set indexed="false" and docValues="true", then try
> to search on the field you roughly do a "table scan" which
> is very slow.
>
> Rule of thumb:
> If you're searching set indexed="true".
> If you're faceting/grouping/sorting set docValues="true".
>
> Best,
> Erick
>
> On Tue, Aug 23, 2016 at 8:14 AM, Zheng Lin Edwin Yeo
>  wrote:
> > Hi,
> >
> > Would like to find out, if we set docValues="true" in my configuration of
> > the fieldType in schema,xml, must we set the corresponding
> indexed="false"
> > and stored="false"
> >
> > Will there be any implication if I set my indexed="true and
> stored="true"?
> >
> > I'm using Solr 6.1.0
> >
> > Regards,
> > Edwin
>


Re: Range Filter for Multi-Valued Date Fields

2016-08-23 Thread Mikhail Khludnev
It executes both half-closed ranges first, and the undesired first doc
matches each of them: its value 1986-05-16 satisfies
[1975-10-31T00:00:00.000Z TO *], while 1875-04-29 satisfies
[* TO 1975-10-31T23:59:59.999Z]. Then it intersects the two document sets,
and since the first doc is in both, it again comes through. The two clauses
are evaluated per document, not per value, so nothing requires a single value
to fall inside both bounds.

On Tue, Aug 23, 2016 at 5:15 PM, Iana Bondarska  wrote:

> Hello Mikhail,
> I convert filters that come from other part of application and in general
> cannot combine many filters into one , since conditions can be quite
> complex.
> Could you please provide more details why is this expected behavior -
> (p_happyDates:[1975-10-31T00:00:00.000Z+TO+*]+AND+p_
> happyDates:[*+TO+1975-10-31T23:59:59.999Z]) is  AND filter with 2
> conditions date>="1975-10-31T00:00:00.000Z" and  date<="1975-10-
> 31T23:59:59.999Z" , seems that it should return same results that
> &fq=p_happyDates:[1975-10-31T00:00:00.000Z+TO+1975-10-31T23:59:59.999Z]
>
>
>
> 2016-08-23 15:00 GMT+03:00 Mikhail Khludnev :
>
> > Hello Iana,
> >
> > I consider is as expected behavior, perhaps usually it's done as
> > &fq=p_happyDates:[1975-10-31T00:00:00.000Z+TO+1975-10-31T23:59:59.999Z],
> > which is not equivalent to combining half closed ranges with boolean
> query.
> > I wonder why did you do like that?
> >
> > On Tue, Aug 23, 2016 at 2:33 PM, Iana Bondarska 
> > wrote:
> >
> > > Hi All,
> > >
> > > could you help me please with multiple range filters on multi valued
> > > fields:
> > > I have following dataset:
> > > {
> > > "p_happyDates":[
> > > "1986-05-16T20:00:00Z",
> > > "1875-04-29T21:57:56Z",
> > > "1906-07-04T21:57:56Z"]
> > > },
> > > {
> > > "p_happyDates":[
> > > "1986-05-16T20:00:00Z",
> > > "1975-10-31T21:57:56Z",
> > > "1966-12-28T21:00:00Z"]
> > > }
> > > I apply filters:
> > > &fq=(p_happyDates:[1975-10-31T00:00:00.000Z+TO+*]+AND+p_
> > > happyDates:[*+TO+1975-10-31T23:59:59.999Z])
> > > I expect to see only second record.
> > > Actually I see both records. Even if I add parameter q.op=AND - result
> is
> > > the same.
> > > Is this expected behavior or known issue for multivalued fields?
> > >
> > > Best Regards,
> > > Iana Bondarska
> > >
> >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> >
>



-- 
Sincerely yours
Mikhail Khludnev