Re: KeepwordFilter issue

2015-06-12 Thread Upayavira
Facets use indexed terms; search results return stored values, which are
captured before analysis. If you want to *store* the results of analysis,
then you'll need to do it in an analysis chain. I do recall Chris Hostetter
(Hossman) giving a presentation about how to do this, but I don't recall
where that was :-( Hopefully he is listening.

Upayavira

On Fri, Jun 12, 2015, at 07:25 AM, vineet yadav wrote:
> Hi,
> 
> I am using the KeepWord filter to identify key phrases. I have made the
> following changes in schema.xml:
> 
> 
>  indexed="true" multiValued="true"/>
> 
> 
>  
> 
> 
> 
>  
> 
>  ignoreCase="true"/>
> 
> 
> 
>  ignoreCase="true"/>
> 
> 
> 
> When I run a facet query on the keyphrase field (
> http://localhost:8983/solr/core1/select?q=*%3A*&wt=json&indent=true&facet=true&facet.field=keyphrase_words
> ), I get only the filtered words. But when I run a general Solr query,
> 
> http://localhost:8983/solr/core1/select?q=*%3A*&wt=json&indent=true
> 
> the content field and the keyphrase field both have the same content.
> I want to get only the filtered words in the general Solr query
> http://localhost:8983/solr/core1/select?q=*%3A*&wt=json&indent=true as well.
> Can you please tell me how I can achieve this?


Re: KeepwordFilter issue

2015-06-12 Thread Ahmet Arslan
Hi,

Then, you need to perform filtering in an update processor for example.
https://cwiki.apache.org/confluence/display/solr/Update+Request+Processors
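
A rough illustration of the idea, in plain Python rather than Solr's update processor API: the filtering runs on the document before it is stored, so the stored value already holds only the keep words. The field names `content` and `keyphrase_words` come from the thread; the `KEEP_WORDS` set, the regex tokenizer, and both functions are hypothetical stand-ins.

```python
import re

# Hypothetical keep-word list; in Solr this would come from the words file
# referenced by the KeepWord filter configuration.
KEEP_WORDS = {"solr", "lucene", "facet"}


def keep_words(text, keep=KEEP_WORDS, ignore_case=True):
    """Return the tokens of `text` that appear in the keep list."""
    tokens = re.findall(r"\w+", text)
    if ignore_case:
        return [t for t in tokens if t.lower() in keep]
    return [t for t in tokens if t in keep]


def preprocess(doc):
    """Mimic an update-processor step: derive the stored keyphrase values
    from the content field before the document is indexed."""
    out = dict(doc)
    out["keyphrase_words"] = keep_words(doc.get("content", ""))
    return out


doc = preprocess({"id": "1", "content": "Solr builds a facet from Lucene terms"})
print(doc["keyphrase_words"])  # ['Solr', 'facet', 'Lucene']
```

In a real setup the same logic would live in the update chain, so that a plain select query returns the filtered values from the stored field.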

Ahmet



On Friday, June 12, 2015 9:26 AM, vineet yadav  
wrote:
Hi,

I am using the KeepWord filter to identify key phrases. I have made the
following changes in schema.xml:









 









When I run a facet query on the keyphrase field (
http://localhost:8983/solr/core1/select?q=*%3A*&wt=json&indent=true&facet=true&facet.field=keyphrase_words
), I get only the filtered words. But when I run a general Solr query,

http://localhost:8983/solr/core1/select?q=*%3A*&wt=json&indent=true

the content field and the keyphrase field both have the same content.
I want to get only the filtered words in the general Solr query
http://localhost:8983/solr/core1/select?q=*%3A*&wt=json&indent=true as well.
Can you please tell me how I can achieve this?


Issues with using Paoding to index Chinese characters

2015-06-12 Thread Zheng Lin Edwin Yeo
I'm trying to use Paoding to index Chinese characters in Solr.

I'm using Solr 5.1, have downloaded the dictionary to shard1\dic and
shard2\dic, and have configured the following in schema.xml





I've also included -DPAODING_DIC_HOME=/dic during my startup of Solr

However, when I tried to start Solr, I get the following error:

java.lang.VerifyError: class net.paoding.analysis.analyzer.PaodingAnalyzerBean overrides final method tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(Unknown Source)
    at java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.access$100(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:421)
    at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:383)
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(Unknown Source)
    at java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.defineClass(Unknown Source)
    at java.net.URLClassLoader.access$100(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:421)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Unknown Source)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:476)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:423)
    at org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:262)
    at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:94)
    at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:42)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
    at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:489)
    at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:175)
    at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
    at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
    at org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:102)
    at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:74)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:516)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:283)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:277)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)


Is there anything which I've done wrong or missed out?


Regards,
Edwin


Solr result with Intersects QUery is unexpected

2015-06-12 Thread Novin

Hello,

I am running an Intersects query and I am getting unexpected results. Details
are below.


Below is the solr schema definition I am using for indexing polygons.

class="solr.SpatialRecursivePrefixTreeFieldType"

spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
geo="true" distErrPct="0.010" maxDistErr="0.05" 
distanceUnits="degrees" />


stored="false" required="false" multiValued="true"/>


The indexed polygons have this format:

POLYGON((-0.154839 51.508315,-0.15265 51.506365,-0.151019 
51.503267,-0.161984 51.502732,-0.173357 51.502385,-0.184085 
51.502291,-0.190759 51.501984,-0.192325 51.504095,-0.192926 
51.504161,-0.193591 51.504509,-0.194299 51.505724,-0.194879 
51.508248,-0.195329 51.509223,-0.197046 51.508889,-0.197325 
51.511039,-0.197346 51.513203,-0.195029 51.515072,-0.196273 
51.518878,-0.198441 51.518571,-0.201037 51.521455,-0.201681 
51.524218,-0.194085 51.530827,-0.191917 51.53387,-0.189557 
51.535098,-0.188398 51.534244,-0.175095 51.523511,-0.167842 
51.518304,-0.160246 51.513336,-0.154839 51.508315))


POLYGON((-0.354266 51.429023,-0.34976 51.428059,-0.35182 
51.42426,-0.352077 51.419416,-0.352721 51.410771,-0.360103 
51.412778,-0.370145 51.410771,-0.379157 51.407826,-0.388041 
51.409299,-0.393233 51.409218,-0.400186 51.406354,-0.409455 
51.404346,-0.428295 51.386325,-0.432587 51.387825,-0.432458 
51.38978,-0.43469 51.391574,-0.436277 51.393556,-0.43323 
51.398777,-0.434217 51.402927,-0.43263 51.4044,-0.431514 
51.410021,-0.430355 51.411841,-0.430441 51.413876,-0.442929 
51.415749,-0.448594 51.422868,-0.451598 51.423457,-0.474343 
51.425919,-0.472798 51.426802,-0.481296 51.430441,-0.488591 
51.43127,-0.491509 51.433063,-0.451641 51.438948,-0.40 
51.448551,-0.396602 51.450236,-0.386002 51.451867,-0.381453 
51.452108,-0.377548 51.452028,-0.358236 51.44946,-0.347743 
51.448003,-0.342507 51.447762,-0.338044 51.448016,-0.338902 
51.443603,-0.339353 51.439082,-0.340383 51.436527,-0.343173 
51.435337,-0.351927 51.43309,-0.354266 51.429023))


This is the Intersects query I am using:
geopolygon:"Intersects(ENVELOPE(7.14111, -7.89917, 54.36776, 49.47526))" 
AND id:xyz


The query above does not return any results, but the query below does, and
I am not sure why: the bbox of the query above contains the bbox of the query below.


geopolygon:"Intersects(ENVELOPE(0.42572, -1.45432, 51.97304, 51.3572))" 
AND id:xyz
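
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: the ENVELOPE syntax takes its arguments in the order (minX, maxX, maxY, minY), and with geo="true" a first value larger than the second is read as a box that wraps across the dateline. Both queries above pass the larger longitude first. The Python sketch below only does the arithmetic; the helper names are hypothetical and nothing here calls Solr.

```python
import re


def parse_envelope(expr):
    """Parse 'ENVELOPE(a, b, c, d)' as (min_x, max_x, max_y, min_y)."""
    m = re.fullmatch(r"\s*ENVELOPE\(([^)]*)\)\s*", expr)
    if not m:
        raise ValueError("not an ENVELOPE expression: %r" % expr)
    min_x, max_x, max_y, min_y = (float(v) for v in m.group(1).split(","))
    return min_x, max_x, max_y, min_y


def crosses_dateline(env):
    """In geo mode, min_x > max_x means the box wraps past +/-180."""
    min_x, max_x, _, _ = env
    return min_x > max_x


def lon_in_envelope(lon, env):
    """Longitude containment, honouring dateline wrap."""
    min_x, max_x, _, _ = env
    if min_x <= max_x:
        return min_x <= lon <= max_x
    return lon >= min_x or lon <= max_x


def swap_x(env):
    """Return the envelope with its two X values exchanged."""
    min_x, max_x, max_y, min_y = env
    return max_x, min_x, max_y, min_y


wide = parse_envelope("ENVELOPE(7.14111, -7.89917, 54.36776, 49.47526)")
lon = -0.154839  # first longitude of the first polygon above

print(crosses_dateline(wide), lon_in_envelope(lon, wide))  # True False
print(lon_in_envelope(lon, swap_x(wide)))                  # True
```

If this is the cause, ENVELOPE(-7.89917, 7.14111, 54.36776, 49.47526) would be the intended box. The narrower query has the same ordering, so this may not be the whole story; grid approximation of a near-world-spanning shape could be involved there.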


Am I missing something, or is my query wrong?

I've also tried Contains, but it does not fit the requirement, and it has the
same problem as Intersects.


Any kind of help would be appreciated. Let me know if you need more details.


Thanks for your time,
Novin


Solr Exact match boost Reduce the results

2015-06-12 Thread JACK
I have two fields, one of which is a copy field. I need the exact-match results
first, followed by the entire result set of the fuzzy search.





Its field definition is given below



  


















 


  



 
 






   


The dummy field is there to get the exact-match results.
1. To get exact results first, I put quotes around the search words. That
gives me the exact results first, but far too few results overall, around
8000. The query is given below:
q="laptop+bag"&df=product_name&defType=edismax&qf=product_name^0.01+dummy_name^200

2. The query without quotes returns a huge number of results, around
2, but does not put the exact ones first. The query is below:
q=laptop+bag&df=product_name&defType=edismax&qf=product_name^0.01+dummy_name^200

I need the large result set of the second option, with the exact results first.
Is this the right way to do it, or is there a problem in my query?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Exact-match-boost-Reduce-the-results-tp4211352.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Merging Sets of Data from Two Different Sources

2015-06-12 Thread Alessandro Benedetti
Hi Paden,
you probably don't even need to customise the DIH.
I would give it a try.

To answer your question: yes, federated search is definitely possible with
Solr, in simple ways.

Cheers

2015-06-11 22:36 GMT+01:00 Reitzel, Charles :

> Yes.  Typically, the content file is used to populate a single field in
> each document, e.g. "content".  Typically, this field is the primary target
> for searches.  Sometimes, additional metadata (title, author, etc.) can
> be extracted from the source files.   But the idea remains the same: the
> two sources (database record + file) are merged into single searchable
> document in solr.
>
> If you write your own indexer using SolrJ, you have more control over the
> loading process and, imo, the approach is clearer.  All the pieces come
> together in one place.
>
> But Alessandro says the same result is achievable using
> DataImportHandler.   Probably worth a try before writing code...
>
> -Original Message-
> From: Paden [mailto:rumsey...@gmail.com]
> Sent: Thursday, June 11, 2015 4:14 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Merging Sets of Data from Two Different Sources
>
> So you're saying I could merge both the metadata in the database and their
> files in the file system into one  query-able item in solr by just
> customizing the DIH correctly and getting the right schema?
>
> (I'm sorry this sounds like a redundant question but I've been trying to
> find an answer for the past couple of days and it seems like people
> sometimes misunderstand what I'm asking)
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Merging-Sets-of-Data-from-Two-Different-Sources-tp4211166p4211248.html


-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Re: Issues with using Paoding to index Chinese characters

2015-06-12 Thread Upayavira
Not knowing anything about Paoding, it seems that this library isn't
compatible with the current version of Solr/Lucene. Have a look at the
version that it was compiled for. Judging by the date of the latest
download (2008), it is old: Lucene has changed a LOT since then, so some
conversion work will definitely be needed to make it work.

Upayavira

On Fri, Jun 12, 2015, at 08:28 AM, Zheng Lin Edwin Yeo wrote:
> I'm trying to use Paoding to index Chinese characters in Solr.
> 
> I'm using Solr 5.1, have downloaded the dictionary to shard1\dic and
> shard2\dic, and have configured the following in schema,xml
> 
> 
> 
> 
> 
> I've also included -DPAODING_DIC_HOME=/dic during my startup of Solr
> 
> However, when I tried to start Solr, I get the following error:
> 
> java.lang.VerifyError: class
> net.paoding.analysis.analyzer.PaodingAnalyzerBean overrides final
> method
> tokenStream.(Ljava/lang/String;Ljava/io/Reader;)Lorg/apache/lucene/analysis/TokenStream;
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(Unknown Source)
>   at java.security.SecureClassLoader.defineClass(Unknown Source)
>   at java.net.URLClassLoader.defineClass(Unknown Source)
>   at java.net.URLClassLoader.access$100(Unknown Source)
>   at java.net.URLClassLoader$1.run(Unknown Source)
>   at java.net.URLClassLoader$1.run(Unknown Source)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(Unknown Source)
>   at 
> org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:421)
>   at 
> org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:383)
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(Unknown Source)
>   at java.security.SecureClassLoader.defineClass(Unknown Source)
>   at java.net.URLClassLoader.defineClass(Unknown Source)
>   at java.net.URLClassLoader.access$100(Unknown Source)
>   at java.net.URLClassLoader$1.run(Unknown Source)
>   at java.net.URLClassLoader$1.run(Unknown Source)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(Unknown Source)
>   at 
> org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:421)
>   at java.lang.ClassLoader.loadClass(Unknown Source)
>   at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
>   at java.lang.ClassLoader.loadClass(Unknown Source)
>   at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
>   at java.lang.ClassLoader.loadClass(Unknown Source)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Unknown Source)
>   at 
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:476)
>   at 
> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:423)
>   at 
> org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:262)
>   at 
> org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:94)
>   at 
> org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:42)
>   at 
> org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
>   at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:489)
>   at org.apache.solr.schema.IndexSchema.(IndexSchema.java:175)
>   at 
> org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
>   at 
> org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
>   at 
> org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:102)
>   at 
> org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:74)
>   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:516)
>   at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:283)
>   at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:277)
>   at java.util.concurrent.FutureTask.run(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>   at java.lang.Thread.run(Unknown Source)
> 
> 
> Is there anything which I've done wrong or missed out?
> 
> 
> Regards,
> Edwin


Re: KeepwordFilter issue

2015-06-12 Thread Erik Hatcher
I detailed how to use a JavaScript update processor to run text through
analysis (including the KeepWordFilter) and set the stored field values to
the extracted tokens here:

http://www.slideshare.net/erikhatcher/solr-indexing-and-analysis-tricks 



—
Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com 




> On Jun 12, 2015, at 3:04 AM, Upayavira  wrote:
> 
> Facets use indexed terms, search results return stored terms before
> analysis. If you want to *store* the results of analysis, then you'll
> need to do it in an analysis chain. I do recall Chris Hosstetter
> (Hossman) giving a presentation about how to do this, but I don't recall
> where that was :-( Hopefully he is listening.
> 
> Upayavira
> 
> On Fri, Jun 12, 2015, at 07:25 AM, vineet yadav wrote:
>> Hi,
>> 
>> I am using keepword filter to identify key phrases. I have made following
>> schema changes in schema.xml
>> 
>> 
>> > indexed="true" multiValued="true"/>
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>
>>> ignoreCase="true"/>
>>
>>
>>
>>> ignoreCase="true"/>
>>
>>
>> 
>> When I am using facet query on keyphrase field(
>> http://localhost:8983/solr/core1/select?q=*%3A*&wt=json&indent=true&facet=true&facet.field=keyphrase_words)
>> , I am getting only filtered words. But When I use solr general query
>> 
>> http://localhost:8983/solr/core1/select?q=*%3A*&wt=json&indent=true,
>> 
>> Both content field and keyphrase field has same content.
>> I want to get only filter words in solr general query
>> http://localhost:8983/solr/core1/select?q=*%3A*&wt=json&indent=true. Can
>> you please tell me how can I achieve this requirement.



Re: Solr Exact match boost Reduce the results

2015-06-12 Thread Alessandro Benedetti
Hi Jack, do you mean exact match over the synonyms?
In that case, with your analyzer you are not going to be able to see that.
You apply synonym expansion at index time,
so for your index there is no difference between the synonyms; there is no
"exact match".

In this case I would suggest using a custom synonym query parser that
actually solves this problem by boosting the original terms and providing nice
query-time synonyms:

http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/

If it's not related to the synonyms, you should take a look at the edismax
query parser.
I think it boosts exact phrase matches out of the box.
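
As a hedged sketch of what that could look like: edismax has a `pf` (phrase fields) parameter that adds score when all the query terms appear together as a phrase, without requiring the user to type quotes. The field names and the boost of 200 are taken from the original question; the helper function and the choice to move `dummy_name` from `qf` to `pf` are illustrative assumptions, not a tested configuration.

```python
from urllib.parse import urlencode


def exact_first_params(user_query, rows=10):
    """Build edismax parameters for an unquoted query with a phrase boost."""
    return {
        "q": user_query,         # unquoted, so the result set stays broad
        "defType": "edismax",
        "qf": "product_name",    # field(s) the individual terms must match
        "pf": "dummy_name^200",  # extra score when the whole phrase matches
        "rows": rows,
    }


params = exact_first_params("laptop bag")
print(urlencode(params))
```

The intent is that the number of hits is decided by `q` and `qf` alone, while `pf` only reorders them.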

Cheers



2015-06-12 10:38 GMT+01:00 JACK :

> I have two fields, one is copy field. I have to get Exact match results
> first
> along with entire result of fuzzy search.
>
>  stored="true"
> required="true" multiValued="false"/>
>  required="true" />
> 
>
> Its filed definition is given below
>
>  positionIncrementGap="100">
> 
>class="solr.WhitespaceTokenizerFactory"/>
>  synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
>  class="solr.WordDelimiterFilterFactory"
>  generateWordParts="1"
>  generateNumberParts="1"
>  catenateWords="1"
>  catenateNumbers="1"
>  catenateAll="1"
>  preserveOriginal="1"
>  />
>  class="solr.LowerCaseFilterFactory"/>
>  class="solr.SnowballPorterFilterFactory" language="English" />
>  class="solr.PorterStemFilterFactory"/>
> 
>  class="solr.EnglishMinimalStemFilterFactory"/>
> 
> 
>  class="solr.WhitespaceTokenizerFactory"/>
>  synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
>  class="solr.WordDelimiterFilterFactory"
>  generateWordParts="1"
>  generateNumberParts="1"
>  catenateWords="1"
>  catenateNumbers="1"
>  catenateAll="1"
>  preserveOriginal="1"
>  />
>  class="solr.LowerCaseFilterFactory"/>
>  class="solr.SnowballPorterFilterFactory" language="English" />
>  class="solr.PorterStemFilterFactory"/>
> 
>  class="solr.EnglishMinimalStemFilterFactory"/>
> 
>  
>
>  omitNorms="true">
>   
>  class="solr.WhitespaceTokenizerFactory"/>
>  maxTokenCount="20"/>
> 
>  
>  
>  class="solr.WhitespaceTokenizerFactory"/>
>  maxTokenCount="20"/>
> 
>  language="English" />
> 
>  class="solr.EnglishMinimalStemFilterFactory"/>
>
> 
>
> The dummy filed is to get the exact match results.
> 1. To get exact results first just use quotes around the search words. So i
> am getting the exact results first. But the result is too less.Around
> 8000.Query is given below
>
> q="laptop+bag"&df=product_name&defType=edismax&qf=product_name^0.01+dummy_name^200
>
> 2. But for the query without quotes gives huge amount of results around
> 2, but won't give exact one first. Its query is below
>
> q=laptop+bag&df=product_name&defType=edismax&qf=product_name^0.01+dummy_name^200
>
> I have to get huge results like my second option with exact results first.
> Is this the way to do it or any problem in my query?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-Exact-match-boost-Reduce-the-results-tp4211352.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>





Query regarding scorejoin query parser

2015-06-12 Thread Neeraj Lajpal
Hi,
I am using the scorejoin query parser:
https://issues.apache.org/jira/browse/SOLR-6234

I have a doubt regarding query-time join. Example: I have a document
structure like this:

parent1: id, important_category
child1: id, _root_, brandid
child2: id, _root_, brandid

The _root_ field of a child doc contains the value of the parent's id.

I want to use query-time join. I want all the parent documents after applying
boosts on the children's fields and averaging their scores, and then to apply
a boost on the important_category field of the parent. My query is like:

{!scorejoin from=_root_ to=id score=avg multiVals=false}(brandid:1398^4 OR brandid:237^4.5)

But I want to apply a boost to the important_category field of the parent as
well. If I use fq it will not impact the score, so I have to use q only.
If I make q like:

important_category:322^4{!scorejoin from=_root_ to=id score=avg multiVals=false}(brandid:1398^4 OR brandid:237^4.5)

it does not show any results. How can I do this?

Thanks,
Neeraj

Re: How to index/search without whitespace but hightlight with whitespace?

2015-06-12 Thread Alessandro Benedetti
Can you show a practical example of what you get?
Highlighting is calculated on stored fields, and the term offsets and
positions, if calculated, should be OK.
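
To illustrate why offsets should make this work, here is a minimal Python sketch (not Solr's highlighter): tokens are matched on a normalized form, but the markup is applied at character offsets into the untouched stored text, so newlines and bullets survive. Lowercasing stands in for the real analysis chain, and the function name and `<em>` tags are assumptions for the example.

```python
import re


def highlight(stored_text, query_terms, pre="<em>", post="</em>"):
    """Wrap matching tokens in `stored_text` using character offsets.

    Matching is done on a lowercased copy of each token (a stand-in for
    real analysis), but the offsets always point into the original stored
    text, so its whitespace is preserved.
    """
    wanted = {t.lower() for t in query_terms}
    pieces = []
    last = 0
    for m in re.finditer(r"\w+", stored_text):
        if m.group(0).lower() in wanted:
            pieces.append(stored_text[last:m.start()])
            pieces.append(pre + m.group(0) + post)
            last = m.end()
    pieces.append(stored_text[last:])
    return "".join(pieces)


text = "MEDICATIONS:\n* Xanax\n* Phenobritrol\n\nDIAGNOSIS:\n"
print(highlight(text, ["xanax"]))
```

The same principle suggests searching and highlighting on one stored, analyzed field rather than projecting matches onto a separate string field.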
Let me know

Cheers

2015-06-11 18:50 GMT+01:00 Travis :

> Hey everyone!
>
> I'm trying to setup a Solr instance on some free text clinical data.
> This data has a lot of white space formatting, for example, I might have a
> document that contains unstructured bulleted lists or section titles.
>
> For example,
>
> blah blah blah...
> MEDICATIONS:
> * Xanax
> * Phenobritrol
>
> DIAGNOSIS:
> blah blah blah...
>
> When indexing (and thus querying) this document, I use a text field with
> tokenization, stemming, etc, lets call it "text".
>
> Unfortunately, when I try to print highlighted results, the newlines and
> whitespace are obviously not preserved. In an attempt to get around this, I
> created a second field in the index that stores the full content of each
> document as a string, thus preserving the whitespace, called "raw_text".
>
> If I set up the search page to search on the text field but highlight on
> the raw_text field, then the highlighted matches don't always line up. Is
> there a way to somehow project the stemmed matches from the text field
> onto the raw_text field when displaying highlighting?
>
> Thank you for your time,
> Travis
>





Re: Solr Exact match boost Reduce the results

2015-06-12 Thread JACK
Hi Alessandro Benedetti ,

What I meant is this: suppose I have items like these:

dell laptop with bag
dell laptop
dell laptop without bag
dell inspiron laptop with bag

If I query for "dell laptop", the result should be like this:

dell laptop
dell laptop with bag
dell laptop without bag
dell inspiron laptop with bag

The exact match should come first; the rest can be in any order,
but I should get the same number of results.






Re: Solr Exact match boost Reduce the results

2015-06-12 Thread Alessandro Benedetti
Are we talking about a title field?
Are you querying only that title field?
And are the texts you listed the titles?

If this is the case and you are not omitting norms (norms boost
short fields containing the query terms),
then the second results should be the ones expected.

Can you describe it a little more? Does the field you are querying actually
contain more text than what you quoted?

Cheers

2015-06-12 14:25 GMT+01:00 JACK :

> Hi Alessandro Benedetti ,
>
> What i meant is that suppose if i have items like this
>
> dell laptop with bag
> dell laptop
> dell laptop without bag
> dell inspiron laptop with bag
> if i query for "dell laptop", the result should be like this
> dell laptop
> dell laptop with bag
> dell laptop without bag
> dell inspiron laptop with bag
> Exact match should come first, rest of the things will be in the any order,
> but should get the same number of results
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-Exact-match-boost-Reduce-the-results-tp4211352p4211377.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>





Re: Have anyone used Automatic Phrase Tokenization (AutoPhrasingTokenFilterFactory) ?

2015-06-12 Thread larry
Is anyone successfully using AutoPhrasingTokenFilterFactory on Solr 5?







Division with Stats Component when Grouping in Solr

2015-06-12 Thread kingofhypocrites
I am migrating a database from SQL Server to Cassandra. Currently I have a
setup as follows:

- Log data in Cassandra
- Summarize data in Spark and put into Cassandra summary tables
- Query data in Solr

Everything fits beautifully until I need to do stats on groups. I am hoping
to get this to work with Solr so I can stick to one database, but I am not
sure it's possible.

If I had it in SQL Server, I could do it like so:
SELECT
    site_id,
    keyword,
    SUM(visits) AS visits,
    CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) AS bounce_rate,
    SUM(pageviews) AS pageviews,
    CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) AS avg_pages_per_visit
FROM
    report_all_keywords_daily
WHERE
    site_id = 55 AND date_key >= '20150606' AND date_key <= '20150608'
GROUP BY
    site_id, keyword
ORDER BY visits DESC

Now I need to replicate this in Solr. The closest I could get to this is by
using the Stats component and then using field collapsing.
group=true&group.field=keyword&stats=true&stats.field=visits&stats.facet=keyword

And here are some results I get back: http://pastebin.com/raw.php?i=Fxhe2RA0

However, I need to be able to divide certain metrics. I tried including
functions in the stats.field such as div(sum(bounce_rate), sum(visits)), but
it doesn't recognize the functions. It also seems to be ignoring the paging for
the stats results and returns all groups regardless.

Ultimately I'd like something like this which is what I would get in SQL: 
 

Is this possible or do I have to give up on the prospect of using Solr? I
have to query this data dynamically so I can't pre-summarize all of it.

To clarify, I am having the following two problems:
- Paging is ignored for stats data
- I can't figure out how to divide two stats together to get a third stat.
Note: In some cases I would need to be able to sort on this combined stat



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402.html
Sent from the Solr - User mailing list archive at Nabble.com.


Custom Function for date reformatting

2015-06-12 Thread simon
Has anyone written a Solr function which will reformat Solr's ISO8601 Date
fields and could be used to generate pseudo-fields in search results ?

I am converting existing applications that have baked-in assumptions that
dates are in the format yyyy-mm-dd to use Solr, and tracking down every
place where a date format conversion is needed is proving painful indeed ;=(

My thought is to write a custom function of the form
datereformatter(, )  but I thought I'd
check if it's already been done or if someone can suggest a better approach.
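
For comparison, the conversion itself is small when done client-side. The Python sketch below is not a Solr function or transformer; it only shows the reformatting such a component would perform, assuming dates arrive in Solr's usual UTC form like 2015-06-12T15:20:00Z. The function names and the `created_ymd` pseudo-field are made up for the example.

```python
from datetime import datetime


def reformat_solr_date(value, out_format="%Y-%m-%d"):
    """Convert a Solr date such as '2015-06-12T15:20:00Z' to another format."""
    # Solr may also emit fractional seconds ('...:00.123Z'); try both layouts.
    for layout in ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%dT%H:%M:%S.%fZ"):
        try:
            return datetime.strptime(value, layout).strftime(out_format)
        except ValueError:
            continue
    raise ValueError("unrecognised Solr date: %r" % value)


def add_pseudo_field(doc, source, target, out_format="%Y-%m-%d"):
    """Return a copy of `doc` with a reformatted date stored under `target`."""
    out = dict(doc)
    out[target] = reformat_solr_date(doc[source], out_format)
    return out


print(reformat_solr_date("2015-06-12T15:20:00Z"))  # 2015-06-12
print(add_pseudo_field({"created": "2015-06-12T15:20:00.123Z"},
                       "created", "created_ymd"))
```

Applying this in one response-handling layer may already remove the need to touch every call site.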

regards

-Simon


Re: Solr Exact match boost Reduce the results

2015-06-12 Thread JACK
Hi, I have to search on the field product_name. To get exact matches
first, I made a copy field named dummy_name with the field definition
above, and at query time I boost the copy field. To get the exact
matches I need to put quotes around the search words, but when I do
this I get far fewer results than a search without quotes. I need the
same results as without quotes, with the exact matches coming first.





Re: Solr Exact match boost Reduce the results

2015-06-12 Thread JACK
The quoted search words will vary: they can be any single word or more
than one word. The ones in the query are just an example.





Re: Custom Function for date reformatting

2015-06-12 Thread Jack Krupansky
Sounds like a good addition to Solr.

A document transformer would be the better choice. See:
https://cwiki.apache.org/confluence/display/solr/Transforming+Result+Documents

The primary purpose of a function query is to affect relevance, although,
yes, they can sort of be (mis)used as document transformers as well.

Maybe you could generalize it to "format" and handle non-date fields as
well.


-- Jack Krupansky

On Fri, Jun 12, 2015 at 11:20 AM, simon  wrote:

> Has anyone written a Solr function which will reformat Solr's ISO8601 Date
> fields and could be used to generate pseudo-fields in search results ?
>
> I  am converting existing appplications that have baked-in assumptions that
> dates are in the format -mm-dd to use Solr, and tracking down every
> place where a date format conversion is needed is proving painful indeed
> ;=(
>
> My thought is to write a custom function of the form
> datereformatter(, )  but I thought I'd
> check if it's already been done or if someone can suggest a better
> approach.
>
> regards
>
> -Simon
>


Re: Division with Stats Component when Grouping in Solr

2015-06-12 Thread Joel Bernstein
https://issues.apache.org/jira/browse/SOLR-7560, will almost support this
in Solr 5.3. The compound function support won't be there yet though. But
it will be there in the near future.



Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Jun 12, 2015 at 9:30 AM, kingofhypocrites <
kingofhypocri...@gmail.com> wrote:

> I am migrating a database from SQL Server to Cassandra. Currently I have a
> setup as follows:
>
> - Log data in Cassandra
> - Summarize data in Spark and put into Cassandra summary tables
> - Query data in Solr
>
> Everything fits beautifully until I need to do stats on groups. I am hoping
> to get this to work with Solr so I can stick to one database, but I am not
> sure it's possible.
>
> If I had it in SQL Server, I could do it like so:
> SELECT
> site_id,
> keyword,
> SUM(visits) as visits,
> CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) as bounce_rate,
> SUM(pageviews) as pageviews,
> CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) as
> avg_pages_per_visit
> FROM
> report_all_keywords_daily
> WHERE
> site_id = 55 AND date_key >= '20150606' AND date_key <= '20150608'
> GROUP BY
> site_id, keyword
> ORDER BY visits DESC
>
> Now I need to replicate this in Solr. The closest I could get to this is by
> using the Stats component and then using field collapsing.
>
> group=true&group.field=keyword&stats=true&stats.field=visits&stats.facet=keyword
>
> And here are some results I get back:
> http://pastebin.com/raw.php?i=Fxhe2RA0
>
> However, I need to be able to divide certain metrics. I tried including
> functions in the stats.field such as div(sum(bounce_rate), (sum(visits))
> but it doesn't recognize the functions. Also, it seems to be ignoring the
> paging for the stats results and returns all groups regardless.
>
> Ultimately I'd like something like this which is what I would get in SQL:
> 
>
> Is this possible or do I have to give up on the prospect of using Solr? I
> have to query this data dynamically so I can't pre-summarize all of it.
>
> To clarify, I am having the following two problems:
> - Paging is ignored for stats data
> - I can't figure out how to divide two stats together to get a third stat.
> Note: In some cases I would need to be able to sort on this combined stat
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Division with Stats Component when Grouping in Solr

2015-06-12 Thread Joel Bernstein
If you are a Java programmer, you may want to look at plugging your own
custom Streams into the Streaming API. The SQL stuff is built on top of the
Streaming API.

http://joelsolr.blogspot.com/2015/04/the-streaming-api-solrjio-basics.html

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Jun 12, 2015 at 11:00 AM, Joel Bernstein  wrote:

> https://issues.apache.org/jira/browse/SOLR-7560, will almost support this
> in Solr 5.3. The compound function support won't be there yet though. But
> it will be there in the near future.
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Fri, Jun 12, 2015 at 9:30 AM, kingofhypocrites <
> kingofhypocri...@gmail.com> wrote:
>
>> I am migrating a database from SQL Server to Cassandra. Currently I have a
>> setup as follows:
>>
>> - Log data in Cassandra
>> - Summarize data in Spark and put into Cassandra summary tables
>> - Query data in Solr
>>
>> Everything fits beautifully until I need to do stats on groups. I am
>> hoping
>> to get this to work with Solr so I can stick to one database, but I am not
>> sure it's possible.
>>
>> If I had it in SQL Server, I could do it like so:
>> SELECT
>> site_id,
>> keyword,
>> SUM(visits) as visits,
>> CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) as bounce_rate,
>> SUM(pageviews) as pageviews,
>> CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) as
>> avg_pages_per_visit
>> FROM
>> report_all_keywords_daily
>> WHERE
>> site_id = 55 AND date_key >= '20150606' AND date_key <= '20150608'
>> GROUP BY
>> site_id, keyword
>> ORDER BY visits DESC
>>
>> Now I need to replicate this in Solr. The closest I could get to this is
>> by
>> using the Stats component and then using field collapsing.
>>
>> group=true&group.field=keyword&stats=true&stats.field=visits&stats.facet=keyword
>>
>> And here are some results I get back:
>> http://pastebin.com/raw.php?i=Fxhe2RA0
>>
>> However, I need to be able to divide certain metrics. I tried including
>> functions in the stats.field such as div(sum(bounce_rate), (sum(visits))
>> but it doesn't recognize the functions. Also, it seems to be ignoring the
>> paging for the stats results and returns all groups regardless.
>>
>> Ultimately I'd like something like this which is what I would get in SQL:
>> 
>>
>> Is this possible or do I have to give up on the prospect of using Solr? I
>> have to query this data dynamically so I can't pre-summarize all of it.
>>
>> To clarify, I am having the following two problems:
>> - Paging is ignored for stats data
>> - I can't figure out how to divide two stats together to get a third stat.
>> Note: In some cases I would need to be able to sort on this combined stat
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>
>


Re: Solr Exact match boost Reduce the results

2015-06-12 Thread Alessandro Benedetti
I did a simple test using out-of-the-box Edismax (without even configuring
specific params or the phrase field).
As expected, the exact match comes first.
This is because of the norms and the natural way Edismax boosts exact
matches.

Are you sure you are using a proper query parser?
I did nothing on the analysis side; I simply used a standard field.
I also find your double field analysis approach and the field boost a
little suspicious.
Are you sure it's not related to the synonyms?

Cheers


2015-06-12 16:27 GMT+01:00 JACK :

> The quoted search words will be different and it will be any word or more
> than one word. In the query it's just example
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-Exact-match-boost-Reduce-the-results-tp4211352p4211410.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Pretty Print segments_N

2015-06-12 Thread Mike Drob
I'm doing some debugging work on a solr core, and would find it useful to
be able to pretty print the contents of the segments_N file in the index.
Is there already good functionality for this, or will I need to write up my
own utility using SegmentInfos?

Thanks,
Mike


Re: Solr Exact match boost Reduce the results

2015-06-12 Thread JACK
As explained above, I actually have around 10 lakh (1,000,000) records, not
5 rows. It's not about synonyms. When I checked the FAQ page of the Solr
wiki, I found that to get exact-match results first you should use a copy
field with a different configuration. That's why I went this way.





Re: Division with Stats Component when Grouping in Solr

2015-06-12 Thread Chris Hostetter

: However, I need to be able to divide certain metrics. I tried including
: functions in the stats.field such as div(sum(bounce_rate), (sum(visits)) but
: it doesn't recognize the functions. Also it seems to be ignoring the paging for
: the stats results and returns all groups regardless.

i'm lost on what your goal is regarding grouping and what you mean by 
"ignoring the paging" but FWIW stats.field does support functions (or 
query scores) -- you just need to use local params to make it clear that 
you are passing in a function name and not a field name...

https://cwiki.apache.org/confluence/display/solr/The+Stats+Component

Example...

http://localhost:8983/solr/techproducts/select?q=*:*&stats=true&stats.field={!func}termfreq('text','memory')&stats.field=price&stats.field=popularity&rows=0&indent=true
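
The same request can be assembled programmatically, which avoids hand-escaping
the local params. A minimal Python sketch using only the standard library; the
host, core name and fields are the ones from the example URL above, and
`build_stats_url` is a hypothetical helper. Repeated stats.field parameters
are passed as a list of pairs so that none of them is overwritten:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_stats_url(base, stats_fields, q="*:*"):
    """Build a select URL with one stats.field parameter per entry (hypothetical helper)."""
    params = [("q", q), ("stats", "true"), ("rows", "0"), ("indent", "true")]
    # A list of pairs keeps every repeated stats.field; a dict would keep only one.
    params += [("stats.field", field) for field in stats_fields]
    return base + "?" + urlencode(params)


url = build_stats_url(
    "http://localhost:8983/solr/techproducts/select",
    ["{!func}termfreq('text','memory')", "price", "popularity"],
)

# Decode the query string again to confirm the function survived URL encoding.
print(parse_qs(urlsplit(url).query)["stats.field"])
```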

: Ultimately I'd like something like this which is what I would get in SQL: 
:  

at first glance, making some assumptions about your data, this looks like 
pivot faceting with some stats hanging 
off of it -- ie: 

facet.pivot={!stats=nest}site_id,keyword
stats.field={!tag=nest sum=true}visits
stats.field={!tag=nest sum=true}bounces
stats.field={!tag=nest sum=true}pageviews

https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-CombiningStatsComponentWithPivots

...that will give you the sum of each of the specified fields for each 
"top" keyword (by doc count) for each "top" site_id (by doc count).  
(Computing the bounce_rate and avg_pages_per_visit is simple client side 
division)
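
As a sketch of that client-side division: the Python below walks a pivot
response and derives the two ratios. The sample dictionary is hand-written for
illustration (it is not real output), and its layout, with per-bucket sums
under "stats" and then "stats_fields", is an assumption about the JSON
response shape that should be checked against your Solr version.
`keyword_report` is a hypothetical helper name.

```python
def keyword_report(pivot_buckets):
    """Flatten site_id,keyword pivot buckets into rows with derived ratios."""
    rows = []
    for site in pivot_buckets:
        for kw in site.get("pivot", []):
            # Pull the "sum" of every stats field attached to this keyword bucket.
            sums = {name: s["sum"] for name, s in kw["stats"]["stats_fields"].items()}
            visits = sums["visits"]
            rows.append({
                "site_id": site["value"],
                "keyword": kw["value"],
                "visits": visits,
                "pageviews": sums["pageviews"],
                # Guard against division by zero for keywords with no visits.
                "bounce_rate": round(sums["bounces"] / visits, 3) if visits else None,
                "avg_pages_per_visit": round(sums["pageviews"] / visits, 3) if visits else None,
            })
    # Mirrors ORDER BY visits DESC from the SQL version.
    rows.sort(key=lambda r: r["visits"], reverse=True)
    return rows


# Hand-written sample in the assumed response layout.
sample = [{
    "field": "site_id", "value": 55, "count": 3,
    "pivot": [
        {"field": "keyword", "value": "solr", "count": 2,
         "stats": {"stats_fields": {"visits": {"sum": 40.0},
                                    "bounces": {"sum": 10.0},
                                    "pageviews": {"sum": 100.0}}}},
        {"field": "keyword", "value": "lucene", "count": 1,
         "stats": {"stats_fields": {"visits": {"sum": 80.0},
                                    "bounces": {"sum": 20.0},
                                    "pageviews": {"sum": 120.0}}}},
    ],
}]

report = keyword_report(sample)
```

Because the rows are plain dictionaries, ordering by a derived value such as
bounce_rate is just another client-side sort, which is workable when the set
of groups is small enough to fetch whole.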

: - Paging is ignored for stats data

How/Why exactly do you want/expect paging to affect stats computation? 
stats are over entire result sets -- if you wanted stats just over a 
single page that's trivial to do in the client.

: - I can't figure out how to divide two stats together to get a third stat.
: Note: In some cases I would need to be able to sort on this combined stat

Yeah, unfortunately sorting pivot facet results currently only works by 
either the doc count or the term, not an arbitrary stat on the docs in the 
pivot subset (that's a really hard problem to solve for arbitrary 
functions in a distributed setup) ... the new JSON faceting stuff might do 
what you want, but i don't really know enough about it to say...

https://cwiki.apache.org/confluence/display/solr/JSON+Request+API


-Hoss
http://www.lucidworks.com/


RE: Solr Exact match boost Reduce the results

2015-06-12 Thread Andrew Chillrud
If I understand you correctly you want to boost the score of documents where 
the contents of the product_name field match exactly (other than case) the 
query string.

I think what you need is for the dummy_name field to be non-tokenized (indexed 
as a single string rather than parsed into individual words). The name of the 
field type you have configured the dummy_name field to use (string_ci) would 
seem to indicate this is your intent. However, the definition of string_ci 
doesn't match the name. It is configured to use the WhitespaceTokenizerFactory 
tokenizer, which will break the contents of the field up into multiple tokens 
wherever white space occurs.
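
A toy Python model (not Solr code) of why this matters: it mimics a
whitespace tokenizer and a single-token, keyword-style tokenizer, each
followed by lowercasing, and checks whether a query counts as a match. The
function names and the product names are hypothetical, and real scoring is
considerably more involved than this membership test.

```python
def whitespace_tokens(text):
    """Split on whitespace and lowercase each token."""
    return [t.lower() for t in text.split()]


def keyword_token(text):
    """Treat the whole value as one lowercased token."""
    return [text.lower()]


def matches(tokenizer, field_value, query):
    """True if every token of the query appears among the field's tokens."""
    field_tokens = set(tokenizer(field_value))
    return all(t in field_tokens for t in tokenizer(query))


name = "Apple iPhone 6 Case"

# Whitespace tokenization: a partial name still matches, so this field
# cannot tell an exact match apart from a partial one.
print(matches(whitespace_tokens, name, "apple iphone"))

# Single-token analysis: only the full name (ignoring case) matches.
print(matches(keyword_token, name, "apple iphone"))
print(matches(keyword_token, name, "apple iphone 6 case"))
```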

Try defining string_ci using the (somewhat cryptically named) 
KeywordTokenizerFactory, which will index the entire contents of the field as a 
single token. Something like:


  


  


- Andy -

-Original Message-
From: JACK [mailto:mfal...@gmail.com] 
Sent: Friday, June 12, 2015 12:54 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Exact match boost Reduce the results

As explained above, I actually have around 10 lakh (1,000,000) records, not 
5 rows. It's not about synonyms. When I checked the FAQ page of the Solr 
wiki, I found that to get exact-match results first you should use a copy 
field with a different configuration. That's why I went this way. 





facet name using {!key} doesn't work for interval facets

2015-06-12 Thread Phanindra R
Hi,

 According to
https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-IntervalFaceting,
a facet name could be assigned to interval facets, which then replaces the
field name as the facet name in the response.

The syntax I used: facet.interval={!key=myName}myField

But Solr 4.10.2 throws following exception:

"error": { "msg": "undefined field: \"{!key=myName}myField\"", "code": 400 }


What's not a valid attribute data in Solr's schema.xml and solrconfig.xml

2015-06-12 Thread Steven White
Hi,

I'm trying to sort out what's not valid in Solr's files.  For example, the
following request-handler will cause Solr to fail to load (notice the
missing "/" from "987" in the 'name'):

  
  

But having a name with a space, such as "/ 987" or "/ 1 2 3 " works.

This is one example, but my question is much broader and extends to other
attributes: where can I find what is not valid data in the attributes used by
the schema.xml and solrconfig.xml files?

Thanks in advance.

Steve


Re: What's not a valid attribute data in Solr's schema.xml and solrconfig.xml

2015-06-12 Thread Shawn Heisey
On 6/12/2015 12:24 PM, Steven White wrote:
> I'm trying to sort out what's not valid in Solr's files.  For example, the
> following request-handler will cause Solr to fail to load (notice the
> missing "/" from "987" in the 'name'):
>
>   
>   
>
> But having a name with a space, such as "/ 987" or "/ 1 2 3 " works.
>
> This is one example, but my question is much broader and extends to other
> attributes: where can I find what's not valid data in attributes used by
> both schema.xml and solrconfig.xml file?

I added that exact text to solrconfig.xml in a core created by
solr-5.1.0, and everything worked just fine.  I can see the handler
named "987" in Plugins/Stats -> QUERYHANDLER.

What errors did you get in your log, and what version of Solr are you
running?

Thanks,
Shawn




Re: facet name using {!key} doesn't work for interval facets

2015-06-12 Thread Upayavira
This is mentioned in the release notes for 5.2, i.e. the instructions
mentioned on that page will only work on 5.2+.

When looking at the cwiki ref guide, remember that that is the
development location for the Solr Reference Guide, so invariably refers
to an as yet to be released version of Solr. You are advised to download
a PDF for your specific version if you want to be sure to get directions
relevant to your version.
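
For versions where {!key} is not honoured on facet.interval, one workaround
is to rename the facet on the client after the response arrives. A minimal
Python sketch; the "facet_counts" / "facet_intervals" layout of the sample
response is an assumption to verify against your own output, the interval
labels and counts are made up, and `rename_interval_facets` is a hypothetical
helper:

```python
def rename_interval_facets(response, names):
    """Return interval facets keyed by display name instead of field name."""
    intervals = response.get("facet_counts", {}).get("facet_intervals", {})
    # Fields without an entry in `names` keep their original field name.
    return {names.get(field, field): counts for field, counts in intervals.items()}


# Hand-written sample in the assumed response layout.
sample_response = {
    "facet_counts": {
        "facet_intervals": {
            "myField": {"[0,10]": 3, "(10,*]": 7},
        }
    }
}

renamed = rename_interval_facets(sample_response, {"myField": "myName"})
```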

Upayavira

On Fri, Jun 12, 2015, at 07:22 PM, Phanindra R wrote:
> Hi,
> 
>  According to
> https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-IntervalFaceting,
> a facet name could be assigned to interval facets, which then replaces
> the
> field name as the facet name in the response.
> 
> The syntax I used: facet.interval={!key=myName}myField
> 
> But Solr 4.10.2 throws following exception:
> 
> "error": { "msg": "undefined field: \"{!key=myName}myField\"", "code":
> 400 }


Re: What's not a valid attribute data in Solr's schema.xml and solrconfig.xml

2015-06-12 Thread Steven White
You are right.  If I use "solr.SearchHandler" it works, but if I
use "solr.admin.AdminHandlers" like so:

  
  

Solr reports this error:

HTTP ERROR 500

Problem accessing /solr/db/config/requestHandler. Reason:

{msg=SolrCore 'db' is not available due to init failure: The
AdminHandler needs to be registered to a path.  Typically this is
'/admin',trace=org.apache.solr.common.SolrException: SolrCore 'db' is not
available due to init failure: The AdminHandler needs to be registered to a
path.  Typically this is '/admin'
at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:763)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:307)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)

Does this mean each request handler has its own level of error checking?

Steve

On Fri, Jun 12, 2015 at 2:43 PM, Shawn Heisey  wrote:

> On 6/12/2015 12:24 PM, Steven White wrote:
> > I'm trying to sort out what's not valid in Solr's files.  For example,
> the
> > following request-handler will cause Solr to fail to load (notice the
> > missing "/" from "987" in the 'name'):
> >
> >   
> >   
> >
> > But having a name with a space, such as "/ 987" or "/ 1 2 3 " works.
> >
> > This is one example, but my question is much broader and extends to other
> > attributes: where can I find what's not valid data in attributes used by
> > both schema.xml and solrconfig.xml file?
>
> I added that exact text to solrconfig.xml in a core created by
> solr-5.1.0, and everything worked just fine.  I can see the handler
> named "987" in Plugins/Stats -> QUERYHANDLER.
>
> What errors did you get in your log, and what version of Solr are you
> running?
>
> Thanks,
> Shawn
>
>
>


Re: What's not a valid attribute data in Solr's schema.xml and solrconfig.xml

2015-06-12 Thread Shawn Heisey
On 6/12/2015 1:02 PM, Steven White wrote:
> You are right.  If I use "solr.SearchHandler" it works, but if I
> use "solr.admin.AdminHandlers" like so:
>
>   
>   
>
> Solr reports this error:
>
> HTTP ERROR 500
>
> Problem accessing /solr/db/config/requestHandler. Reason:
>
> {msg=SolrCore 'db' is not available due to init failure: The
> AdminHandler needs to be registered to a path.  Typically this is

With an admin handler, it won't be possible to access that handler if
it's not a path that starts with a slash, so it can be incorporated into
the request URL.  Solr is making sure the config will work, and throwing
an error when it won't work.

With search handlers, if you set up the config right, you *CAN* access
that handler through a request parameter on the /select handler, it does
not need to be part of the URL path.  The default config found in
examples for 4.x and later will prevent that from working, but you can
change the config to allow it ... so search handlers must work even if
the name is not a path.  The config validation is not as strict as it is
for an admin handler.

Thanks,
Shawn



Re: What's not a valid attribute data in Solr's schema.xml and solrconfig.xml

2015-06-12 Thread Erik Hatcher
Do note that AdminHandler*s* (plural) is "A special Handler that registers 
all standard admin handlers", so keep that in mind if you're trying to do 
something tricky with admin handlers. Note that AdminHandlers is also 
deprecated, and these admin handlers are implicitly registered with 
ImplicitPlugins these days.

What kind of handler are you adding?  Or are you trying to change the prefix of 
all the admin handlers instead of /admin?

—
Erik Hatcher, Senior Solutions Architect
http://www.lucidworks.com




> On Jun 12, 2015, at 3:56 PM, Shawn Heisey  wrote:
> 
> On 6/12/2015 1:02 PM, Steven White wrote:
>> You are right.  If I use "solr.SearchHandler" it works, but if I
>> use "solr.admin.AdminHandlers" like so:
>> 
>>  
>>  
>> 
>> Solr reports this error:
>> 
>> HTTP ERROR 500
>> 
>> Problem accessing /solr/db/config/requestHandler. Reason:
>> 
>>{msg=SolrCore 'db' is not available due to init failure: The
>> AdminHandler needs to be registered to a path.  Typically this is
> 
> With an admin handler, it won't be possible to access that handler if
> it's not a path that starts with a slash, so it can be incorporated into
> the request URL.  Solr is making sure the config will work, and throwing
> an error when it won't work.
> 
> With search handlers, if you set up the config right, you *CAN* access
> that handler through a request parameter on the /select handler, it does
> not need to be part of the URL path.  The default config found in
> examples for 4.x and later will prevent that from working, but you can
> change the config to allow it ... so search handlers must work even if
> the name is not a path.  The config validation is not as strict as it is
> for an admin handler.
> 
> Thanks,
> Shawn
> 



Re: facet name using {!key} doesn't work for interval facets

2015-06-12 Thread Phanindra R
Thanks. I looked at the version-specific documentation from 4.10 onward, and
the {!key} feature for interval facets is implemented in 5.1.

On Fri, Jun 12, 2015 at 12:03 PM, Upayavira  wrote:

> This is mentioned in the release notes for 5.2, i.e. the instructions
> mentioned on that page will only work on 5.2+.
>
> When looking at the cwiki ref guide, remember that that is the
> development location for the Solr Reference Guide, so invariably refers
> to an as yet to be released version of Solr. You are advised to download
> a PDF for your specific version if you want to be sure to get directions
> relevant to your version.
>
> Upayavira
>
> On Fri, Jun 12, 2015, at 07:22 PM, Phanindra R wrote:
> > Hi,
> >
> >  According to
> >
> https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-IntervalFaceting
> ,
> > a facet name could be assigned to interval facets, which then replaces
> > the
> > field name as the facet name in the response.
> >
> > The syntax I used: facet.interval={!key=myName}myField
> >
> > But Solr 4.10.2 throws following exception:
> >
> > "error": { "msg": "undefined field: \"{!key=myName}myField\"", "code":
> > 400 }
>


Re: What's not a valid attribute data in Solr's schema.xml and solrconfig.xml

2015-06-12 Thread Steven White
Thank you Erik and Shawn for your support.

I'm using Solr's Schema API and Config API to manage and administer a Solr
deployment, applying customer-specific settings through my application. A
client application will be using my APIs and, as part of data validation,
I'm trying to figure out what to allow and what to reject as invalid
attribute data that I must not send to Solr.

For example, I wasn't sure whether a request-handler name can have spaces or
can be all numeric, etc.  What about fields and field types: is there a
restriction on the field names?

I know my question is broad, but if there is a starting point, I can use
that to help me write my application so that it is defensive against clients
who will use it to manage Solr.  If they use invalid data, I don't want to
send it to Solr and cause Solr to break.

Steve

On Fri, Jun 12, 2015 at 4:41 PM, Erik Hatcher 
wrote:

> Do note that AdminHandler*s* (plural) is "A special Handler that
> registers all standard admin handlers", so keep that in mind if you're
> trying to do something tricky with admin handlers. Note that AdminHandlers
> is also deprecated, and these admin handlers are implicitly registered with
> ImplicitPlugins these days.
>
> What kind of handler are you adding?  Or are you trying to change the
> prefix of all the admin handlers instead of /admin?
>
> —
> Erik Hatcher, Senior Solutions Architect
> http://www.lucidworks.com
>
>
>
>
> > On Jun 12, 2015, at 3:56 PM, Shawn Heisey  wrote:
> >
> > On 6/12/2015 1:02 PM, Steven White wrote:
> >> You are right.  If I use "solr.SearchHandler" it works, but if I
> >> use "solr.admin.AdminHandlers" like so:
> >>
> >>  
> >>  
> >>
> >> Solr reports this error:
> >>
> >> HTTP ERROR 500
> >>
> >> Problem accessing /solr/db/config/requestHandler. Reason:
> >>
> >>{msg=SolrCore 'db' is not available due to init failure: The
> >> AdminHandler needs to be registered to a path.  Typically this is
> >
> > With an admin handler, it won't be possible to access that handler if
> > it's not a path that starts with a slash, so it can be incorporated into
> > the request URL.  Solr is making sure the config will work, and throwing
> > an error when it won't work.
> >
> > With search handlers, if you set up the config right, you *CAN* access
> > that handler through a request parameter on the /select handler, it does
> > not need to be part of the URL path.  The default config found in
> > examples for 4.x and later will prevent that from working, but you can
> > change the config to allow it ... so search handlers must work even if
> > the name is not a path.  The config validation is not as strict as it is
> > for an admin handler.
> >
> > Thanks,
> > Shawn
> >
>
>


Re: What's not a valid attribute data in Solr's schema.xml and solrconfig.xml

2015-06-12 Thread Shawn Heisey
On 6/12/2015 3:30 PM, Steven White wrote:
> Thank you Erik and Shawn for your support.
>
> I'm using Solr's Schema API and Config API to manage and administer a Solr
> deployment based on customer specific setting that my application will need
> to do to a Solr deployment.  A client application will be using my APIs and
> as part of data validation, I'm trying to figure out what to allow and what
> not too as invalid attributes data that I cannot send to Solr.
>
> For example, I wasn't sure that a request-handler name can have spaces or
> can be all numeric, etc.  What about fields and field types, is there a
> restriction for the field names?
>
> I know my question is broad, but if there is a starting point, I can use
> that to help me write application so that it is defensive against clients
> who will use it to manage Solr.  If they use invalid data, I don't want to
> send it to Solr and cause Solr to break.

Even if things like spaces and punctuation are accepted, I wouldn't use
them.  You can't be sure that all parts of Solr will support strange
characters, much less third-party software.

For handler names, they should always start with a forward slash and
stick to letters, numbers, and the underscore, and also make sure you
stick to ASCII characters numbered below 127, even if Solr would allow
you to use other characters.  If you stick to that, you can be
reasonably sure everything will work with any software.

Field and type names should stick to letters, numbers, and the
underscore character, also within the standard ASCII character set.

I like to use only lowercase letters, but that's not a requirement. 
Note that if you do use mixed case, almost everything in Solr is case
sensitive, so you must use the same case everywhere, and you should not
use two names that differ only in the case of the letters, just in case
something is NOT case sensitive.

I also prefer to start identifiers with a letter, not a number, but I'm
pretty sure that is also not a requirement.

For best results, similar rules should be followed for other identifiers
in a Solr config/schema.
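
For the validation layer being described, these conventions could be encoded
roughly as follows. This is a Python sketch of a conservative policy, not of
what Solr itself enforces: it additionally requires field and type names to
start with a letter or underscore (a preference above, not a requirement),
and it allows nested handler paths such as /update/json, which is an
assumption. All function and constant names are hypothetical.

```python
import re

# Handler names: a leading slash, then ASCII letters, digits and underscores,
# optionally with further slash-separated segments.
HANDLER_NAME = re.compile(r"/[A-Za-z0-9_]+(?:/[A-Za-z0-9_]+)*")

# Field and type names: ASCII letters, digits and underscores, not starting
# with a digit (stricter than Solr requires).
FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_safe_handler_name(name):
    return HANDLER_NAME.fullmatch(name) is not None


def is_safe_field_name(name):
    return FIELD_NAME.fullmatch(name) is not None


def case_collisions(names):
    """Return groups of names that differ only in letter case."""
    groups = {}
    for n in names:
        groups.setdefault(n.lower(), set()).add(n)
    return [sorted(g) for g in groups.values() if len(g) > 1]


for candidate in ["/select", "987", "/ 987", "/update/json"]:
    print(candidate, is_safe_handler_name(candidate))
```

Rejecting a name here only means it falls outside the safe subset; Solr may
still accept it, as the "/ 987" example earlier in the thread shows.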

Thanks,
Shawn



Re: What's not a valid attribute data in Solr's schema.xml and solrconfig.xml

2015-06-12 Thread Steven White
Thanks Shawn.

Steve

On Fri, Jun 12, 2015 at 6:00 PM, Shawn Heisey  wrote:

> On 6/12/2015 3:30 PM, Steven White wrote:
> > Thank you Erik and Shawn for your support.
> >
> > I'm using Solr's Schema API and Config API to manage and administer a
> Solr
> > deployment based on customer specific setting that my application will
> need
> > to do to a Solr deployment.  A client application will be using my APIs
> and
> > as part of data validation, I'm trying to figure out what to allow and
> what
> > not too as invalid attributes data that I cannot send to Solr.
> >
> > For example, I wasn't sure that a request-handler name can have spaces or
> > can be all numeric, etc.  What about fields and field types, is there a
> > restriction for the field names?
> >
> > I know my question is broad, but if there is a starting point, I can use
> > that to help me write application so that it is defensive against clients
> > who will use it to manage Solr.  If they use invalid data, I don't want
> to
> > send it to Solr and cause Solr to break.
>
> Even if things like spaces and punctuation are accepted, I wouldn't use
> them.  You can't be sure that all parts of Solr will support strange
> characters, much less third-party software.
>
> For handler names, they should always start with a forward slash and
> stick to letters, numbers, and the underscore, and also make sure you
> stick to ASCII characters numbered below 127, even if Solr would allow
> you to use other characters.  If you stick to that, you can be
> reasonably sure everything will work with any software.
>
> Field and type names should stick to letters, numbers, and the
> underscore character, also within the standard ASCII character set.
>
> I like to use only lowercase letters, but that's not a requirement.
> Note that if you do use mixed case, almost everything in Solr is case
> sensitive, so you must use the same case everywhere, and you should not
> use two names that differ only in the case of the letters, just in case
> something is NOT case sensitive.
>
> I also prefer to start identifiers with a letter, not a number, but I'm
> pretty sure that is also not a requirement.
>
> For best results, similar rules should be followed for other identifiers
> in a Solr config/schema.
>
> Thanks,
> Shawn
>
>


How to split using multiple parameters

2015-06-12 Thread Sandeep Mellacheruvu
Hi,

I have a json document which has multiple json arrays and inner json
objects. From the documentation it seems like there is only one split
parameter. Following is the sample JSON that I have.

Can anyone help me in splitting this json ? Also I do not need some of the
fields like websites, so can I also ignore such fields altogether ?

{
"groups": [
{
"name": "Airlines"
},
{
"name": "Industrial Design"
},
{},
{}
],
"family_name": "Volante",
"locality": "Chile",
"industry": "Civil Engineering",
"num_connections": "500+",
"websites": [
{
"description": "Personal Website"
}
],
"summary": "Ingeniero Civil Industrial UC y Magister en Innovación UAI.
Mis áreas de interés son la Innovación Empresarial, las Tecnologías de
Información, la Gestión de Operaciones y Logística",
"headline": "Value Senior Advisor en SAP",
"given_name": "Martin",
"full_name": "Martin Volante",
"skills": [
"Business Intelligence",
"Team Leadership",
"Business Strategy",
"Project Management",
"Management",
"Business Process",
"Business Analysis",
"Change Management",
"SOA",
"Strategic Planning",
"Software Project...",
"Oracle",
"Pre-sales",
"Solution Architecture",
"Management Consulting",
"Project Planning",
"IT Strategy"
],
"experience": [
{
"end": "Present",
"title": "Industry Value Engineering",
"start": "September 2014",
"location": "Santiago, Chile",
"duration": "6 months",
"organization": [
{
"name": "SAP",
"profile_url": "http://www.linkedin.com/company/1115"
}
]
}
],
"education": [
{
"start": "2010",
"end": "2011",
"name": "Universidad Adolfo Ibáñez",
"degrees": [
"Master en Innovación"
]
}
]
}

Thanks,
Sandeep


Re: AngularJS

2015-06-12 Thread William Bell
1. With the angular index.html, when selecting a CORE, the right side of
the screen does not refresh and show info for the core I selected.

2. It looks like it just needs whitespace

Fetched
: 11,310
251/s

On Wed, Jun 10, 2015 at 3:28 AM, Upayavira  wrote:

>
>
> On Wed, Jun 10, 2015, at 05:52 AM, William Bell wrote:
> > Finding DIH issue with the new AngularJS DIH section, while indexing...
> >
> > 1,22613/s ?
> >
> > Last Update: 22:50:50
> > *Indexing since 0:1:38.204*
> > Requests: 1, Fetched: 1,22613/s, Skipped: 0, Processed: 1,22613/s
> > Started: 3 minutes ago
>
> Ahh, great - real feedback! :-)
>
> What does the old UI say at that point? Could you use "inspect element"
> in your browser, and paste a few nodes around this for both the old and
> the new UI?
>
> We can, and probably should, do this in a JIRA ticket. You willing to
> file one?
>
> Many thanks!
>
> Upayavira
>
>


-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Re: Pretty Print segments_N

2015-06-12 Thread Michał B . .
Hi,

Please look at the SegmentsInfoHandler and the visualisation in the Admin UI
added in Solr 5.

Maybe you could use some of the information already exposed by this API, or
simply find a good place to put your changes.

I'm pretty much responsible for this code, so if you have any questions
feel free to ask.

Regards
Michał
On 12 Jun 2015 at 18:53, "Mike Drob" wrote:

> I'm doing some debugging work on a solr core, and would find it useful to
> be able to pretty print the contents of the segments_N file in the index.
> Is there already good functionality for this, or will I need to write up my
> own utility using SegmentInfos?
>
> Thanks,
> Mike
>


How to use https://issues.apache.org/jira/browse/SOLR-7274

2015-06-12 Thread William Bell
How do you set this up?


-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076