Re: Bad fieldNorm when using morphologic synonyms

2013-12-05 Thread Isaac Hebsh
The field is our main textual field. In the standard case, length normalization does significant work together with tf-idf, and we don't want to lose it. Removing duplicates won't help here, because the terms are not duplicates: one term is stemmed, and the other is not. On Fri, Dec 6, 2013 at 9:48 AM, Ahme

Re: Bad fieldNorm when using morphologic synonyms

2013-12-05 Thread Ahmet Arslan
Hi Isaac, Did you consider omitting norms completely for that field? omitNorms="true" Are you using solr.RemoveDuplicatesTokenFilterFactory? On Thursday, December 5, 2013 8:55 PM, Isaac Hebsh wrote: Hi, we implemented a morphologic analyzer, which stems words on index time. For some reasons

Re: Xml Query Parser

2013-12-05 Thread Puneet Pawaia
Hi Gora, Had seen that before but took a look again. Since it is not yet resolved, I assumed it is still a work in progress. Should I try and patch the current 4.6 code with the patches? How would you suggest I proceed? I am new to Solr and Java and so do not have much experience with this. Regard

Re: Indexing on plain text data and base64 encode data in a single HTTP POST request

2013-12-05 Thread neerajp
Thank you all for responding to me. Due to some other activity, I was moved off this task, and now I am back on it. I tried to use ExtractingUpdateProcessorFactory, but it seems to me that it is not supported in Solr 4.5 (which I am using), nor in any other available Solr version. Please find the

Re: Prioritize search returns by URL path?

2013-12-05 Thread Alexandre Rafalovitch
Something like URLClassifyProcessor could be useful to work with URLs: http://lucene.apache.org/solr/4_5_1/solr-core/org/apache/solr/update/processor/URLClassifyProcessor.html Look for presentations/writing by Jan Hoydahl on the background for this, similar work. Regards, Alex. Personal websi

Re: Xml Query Parser

2013-12-05 Thread Gora Mohanty
On 6 December 2013 11:35, Puneet Pawaia wrote: > Hi, > > I am testing using Solr 4.6 and would like to know if there is some > implementation like XmlQueryParser of Lucene in solr. [...] Please take a look at this JIRA issue: https://issues.apache.org/jira/browse/SOLR-839 Regards, Gora

Xml Query Parser

2013-12-05 Thread Puneet Pawaia
Hi, I am testing using Solr 4.6 and would like to know if there is some implementation like XmlQueryParser of Lucene in solr. I need to be able to use SpanQueries. How would one go about implementing this if it is not already implemented in solr. TIA Puneet

Re: Error while using ExtractingUpdateProcessorFactory

2013-12-05 Thread neerajp
Hi guys, I am using solr4.5.0 version and getting the errors which I mentioned in previous post. Any help is highly appreciated

an "array" liked string is treated as multivalued when adding doc to solr

2013-12-05 Thread Liu Bo
Dear Solr users: I've hit this kind of error several times. When adding an array-like string such as [Get 20% Off Official Barça Kits, coupon] to a multiValued="false" field, Solr complains: org.apache.solr.common.SolrException: ERROR: [doc=7781396456243918692] multiple values encountered fo

Re: Difference between textfield and strfield

2013-12-05 Thread Erick Erickson
Use the field type Ahmet recommended. It'll look almost exactly like String with the exception it'll be case insensitive. Take a look at the admin/analysis page with this field type and you'll understand why we're recommending this. Best, Erick On Thu, Dec 5, 2013 at 8:33 PM, manju16832003 wrote
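As a rough illustration of what a case-insensitive, string-like field does, here is a minimal Python sketch. It models the behaviour only and is not Solr code: it assumes the recommended type keeps the whole field value as a single token and lowercases it at both index and query time. The `analyze` and `search` helpers are hypothetical names, and the sample values are the ones mentioned in this thread.

```python
def analyze(value: str) -> str:
    """Model of the assumed analysis chain: keep the whole value as one token,
    trim surrounding whitespace, and lowercase it."""
    return value.strip().lower()


# "Index" a few sample values: analyzed token -> original stored value.
stored_values = ["Toyota", "Honda", "Chery"]
index = {analyze(v): v for v in stored_values}


def search(term: str):
    """Analyze the query term the same way, then look up the exact token."""
    return index.get(analyze(term))


print(search("toyota"))   # matches regardless of case
print(search(" HONDA "))  # trimming mimics an optional trim filter
print(search("toy"))      # no partial match: the value is a single token
```

Because the value stays a single token, faceting on such a field would still return one bucket per value, although the bucket label would be the lowercased token rather than the original string.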

Re: Prioritize search returns by URL path?

2013-12-05 Thread manju16832003
Hi Jim Glynn, KAMACI is correct. How do you discriminate your documents? Jim and Kamaci, I have the same situation: I will be boosting documents on a regular basis and expect documents with a higher score to appear at the top and lower ones at the bottom. Here is my requirement. My entity name is LI

Re: Difference between textfield and strfield

2013-12-05 Thread manju16832003
Hi Iori, Thank you for replying, I really appreciate that. My concern is that I do not want to use *TextField*; I want to make use of the *string* field. The reason is that I have 7 fields to which I want to apply case insensitivity, and all these fields are *facetable* fields. It would not be feasible if I change data type fr

Re: Solr Performance Issue

2013-12-05 Thread Hien Luu
Thanks Furkan. Looking forward to seeing your test results.

No /clusterstate.json updates on Solrcloud 4.3.1 Cores API UNLOAD/CREATE

2013-12-05 Thread Tim Vaillancourt
Hey guys, I've been having an issue with 1 of my 4 replicas having an inconsistent replica, and have been trying to fix it. At the core of this issue, I've noticed /clusterstate.json doesn't seem to be receiving updates when cores get unhealthy, or even added/removed. Today I decided I would remo

Re: Inconsistent numFound in SC when querying core directly

2013-12-05 Thread Tim Vaillancourt
I spoke too soon, my plan for fixing this didn't quite work. I've moved this issue into a new thread/topic: "No /clusterstate.json updates on Solrcloud 4.3.1 Cores API UNLOAD/CREATE". Thanks all for the help on this one! Tim On 5 December 2013 11:37, Tim Vaillancourt wrote: > Very good point

Re: Solr Performance Issue

2013-12-05 Thread Furkan KAMACI
Hi Hien; Actually, a high index rate is a relative concept. I could index that kind of data within a few hours. I aim to index much more data within the same time soon. I can share my test results when I do. Thanks; Furkan KAMACI On Friday, December 6, 2013, Hien Luu wrote: >

Re: Solr Performance Issue

2013-12-05 Thread Shawn Heisey
On 12/5/2013 4:08 PM, Hien Luu wrote: Just curious what was the index rate that you were able to achieve? What I've usually seen based on my experience and what people have said here and on IRC is that the data source is usually the bottleneck - Solr typically indexes VERY fast, as long as yo

Re: Prioritize search returns by URL path?

2013-12-05 Thread Furkan KAMACI
Hi; How do you discriminate your content type, by their paths? If you don't want to do any regex operations or other complex things, you can send the content type as a field of your document too. Does all wiki content have higher priority than all blog content? If yes, you can facet on that new field (content t

Re: Solr Performance Issue

2013-12-05 Thread Hien Luu
Hi Furkan, Just curious what was the index rate that you were able to achieve?   Regards, Hien On Thursday, December 5, 2013 3:06 PM, Furkan KAMACI wrote: Hi; Erick and Shawn have explained that we need more information about your infrastructure. I should add that: I had test data at my

Re: Solr Performance Issue

2013-12-05 Thread Furkan KAMACI
Hi; Erick and Shawn have explained that we need more information about your infrastructure. I should add that I had nearly as much test data in my SolrCloud as you have, and I did not have any problems except when indexing at a huge index rate, and that can be solved with tuning. You should optimiz

Re: Solr cuts highlighted sentences

2013-12-05 Thread Furkan KAMACI
Hi; You can use a boundary scanner to achieve this. You can set up your boundary scanner for word, line, sentence, or character. See: https://cwiki.apache.org/confluence/display/solr/FastVector+Highlighter Thanks; Furkan KAMACI On Wednesday, December 4, 2013, user katoo
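A sentence-level boundary scanner is selected through highlighting request parameters. The Python sketch below only builds such a request URL; it does not contact a server. The host, core name (`collection1`), query, and field name (`content`) are hypothetical, and it assumes the field is indexed with term vectors, positions, and offsets, which the FastVector Highlighter needs.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical Solr endpoint; adjust host, port, and core name as needed.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"

params = {
    "q": "content:solr",                    # hypothetical query
    "hl": "true",                           # turn highlighting on
    "hl.fl": "content",                     # hypothetical field to highlight
    "hl.useFastVectorHighlighter": "true",  # boundary scanners belong to this highlighter
    "hl.boundaryScanner": "breakIterator",  # scanner that understands text units
    "hl.bs.type": "SENTENCE",               # or WORD, LINE, CHARACTER
}

url = SOLR_SELECT_URL + "?" + urlencode(params)

# Round-trip the query string to confirm the parameters survive encoding.
decoded = parse_qs(urlsplit(url).query)
print(url)
```

With `hl.bs.type` set to `SENTENCE`, snippets should start and end on sentence boundaries instead of being cut at a fixed character count.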

Question about external file fields

2013-12-05 Thread yriveiro
Hi, I read this post http://1opensourcelover.wordpress.com/ about external file fields (EFFs) and found it very interesting. Can someone give me more use cases for EFFs? /Yago

Re: starting up solr automatically

2013-12-05 Thread Eric Palmer
thanks Greg and Timothy. Very helpful On Thu, Dec 5, 2013 at 3:42 PM, Tim Potter wrote: > Apologies for chiming in late on this one ... just wanted to mention what > I've used with good success in the past is supervisord ( > http://supervisord.org/). It's easy to install and configure and has th

RE: starting up solr automatically

2013-12-05 Thread Tim Potter
Apologies for chiming in late on this one ... just wanted to mention what I've used with good success in the past is supervisord (http://supervisord.org/). It's easy to install and configure and has the benefit of restarting nodes if they crash (such as due to an OOM). I'll also mention that you

Re: starting up solr automatically

2013-12-05 Thread Greg Walters
Eric, Sorry about that, the entire OPTIONS= part can be dropped. That's there to support a war file that we deploy next to solr. Greg On Dec 5, 2013, at 1:51 PM, Eric Palmer wrote: > some progress but getting this error now > sudo service jetty start > Starting Jetty: -bash: line 1: cd: /var/

Re: starting up solr automatically

2013-12-05 Thread Eric Palmer
Okay, I changed the cd /var/lib/answers/atlascloud/solr45 to cd $JETTY_HOME and am getting the same error. If I run the runcmd from the command line, I get the same error. If I take the ,jsp off, it runs and Solr returns search results, so I modified the script and removed the ,jsp option, and it works. I'm

Re: starting up solr automatically

2013-12-05 Thread Eric Palmer
some progress but getting this error now sudo service jetty start Starting Jetty: -bash: line 1: cd: /var/lib/answers/atlascloud/solr45: No such file or directory STARTED Jetty Thu Dec 5 19:50:09 UTC 2013 [ec2-user@ip-10-50-203-92 ~]$ java.lang.IllegalArgumentException: No such OPTIONS: jsp at org

Re: Inconsistent numFound in SC when querying core directly

2013-12-05 Thread Tim Vaillancourt
Very good point. I've seen this issue occur once before when I was playing with 4.3.1 and don't remember it happening since 4.5.0+, so that is good news - we are just behind. For anyone that is curious, on my earlier mention that Zookeeper/clusterstate.json was not taking updates: this was NOT co

Re: starting up solr automatically

2013-12-05 Thread Greg Walters
Eric, If you're using the script from the gist I posted make sure you're sourcing the jetty file at line 140. Thanks, Greg On Dec 5, 2013, at 1:21 PM, Eric Palmer wrote: > Greg or anyone that can help, when I try to start jetty as a service > sudo service jetty start > > I get this error > *

Re: starting up solr automatically

2013-12-05 Thread Eric Palmer
Greg or anyone that can help, when I try to start jetty as a service sudo service jetty start I get this error ** ERROR: JETTY_HOME not set, you need to set it or install in a standard location same for sudo service jetty stop sudo service jetty check etc I have a file here and the permissions l

Global query parameters to facet query

2013-12-05 Thread Isaac Hebsh
Hi, It seems that a facet query does not use the global query parameters (for example, field aliasing for the edismax parser). We make intensive use of facet queries (in some cases, we have a lot of facet.query parameters for a single q), and using LocalParams for each facet.query is not convenient. D

Bad fieldNorm when using morphologic synonyms

2013-12-05 Thread Isaac Hebsh
Hi, we implemented a morphologic analyzer, which stems words at index time. For some reasons, we index both the original word and the stem (at the same position, of course). The stemming is done for a specific language, so other languages are not stemmed at all. Because of that, two documents with
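To make the effect concrete, here is a small Python sketch of length normalization under stated assumptions. It assumes the classic tf-idf formula of 1/sqrt(number of terms), ignores the lossy single-byte encoding that Lucene applies to stored norms, and uses a hypothetical model in which every word of the stemmed language emits two tokens. Whether tokens stacked at the same position are counted depends on the similarity configuration (Lucene's default similarity has a `discountOverlaps` setting), so both cases are shown.

```python
import math


def length_norm(num_terms: int) -> float:
    """Classic tf-idf length normalization: 1 / sqrt(number of terms)."""
    return 1.0 / math.sqrt(num_terms)


def field_length(num_words: int, stemmed_language: bool, count_overlaps: bool) -> int:
    """Hypothetical model: in the stemmed language each word yields the original
    token plus a stem at the same position. The extra tokens only add to the
    field length when overlapping tokens are counted."""
    extra = num_words if (stemmed_language and count_overlaps) else 0
    return num_words + extra


num_words = 100
plain = length_norm(field_length(num_words, stemmed_language=False, count_overlaps=True))
stemmed = length_norm(field_length(num_words, stemmed_language=True, count_overlaps=True))
discounted = length_norm(field_length(num_words, stemmed_language=True, count_overlaps=False))

print(f"unstemmed language:          {plain:.4f}")
print(f"stemmed, overlaps counted:   {stemmed:.4f}")
print(f"stemmed, overlaps discounted:{discounted:.4f}")
```

In this model two documents with the same number of words get different norms only when the stacked stem tokens are counted toward the field length.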

Re: Sorting on solr results

2013-12-05 Thread Erick Erickson
Isn't this just sort=Price asc, Position asc ? But I agree with Shawn, I have no clue what :&@QueryTerm=*&OnlineFlag=1&@Sort.Price=0,position=0 is all about. Possibly a front end to Solr that has its own query language? In which case you'd have to talk to whoever maintains that Best, Erick
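The two-level sort can be checked locally before sending it to Solr. This Python sketch uses the three sample products (A, B, C) from the original question, sorts them by price and then position, and builds the corresponding `sort` parameter. The `q` value is a placeholder and nothing here contacts a server.

```python
from urllib.parse import urlencode

# Sample rows from the question: ID, Price, Position.
rows = [
    {"id": "A", "Price": 10, "Position": 3},
    {"id": "B", "Price": 10, "Position": 2},
    {"id": "C", "Price": 20, "Position": 5},
]

# Sort by Price ascending, breaking ties by Position ascending.
expected_order = [r["id"] for r in sorted(rows, key=lambda r: (r["Price"], r["Position"]))]

# The same ordering expressed as a Solr sort parameter.
params = {"q": "*:*", "sort": "Price asc,Position asc"}
query_string = urlencode(params)

print(expected_order)
print(query_string)
```

B comes before A because both cost 10 and B has the lower position; C comes last because of its higher price.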

Error while using ExtractingUpdateProcessorFactory

2013-12-05 Thread neerajp
Hi, I am using ExtractingUpdateProcessorFactory in my application to extract binary data and make indexing for that using tika. I did following configuration in solrconfig.xml: attachmentchain binary_content

Re: starting up solr automatically

2013-12-05 Thread Greg Walters
Alan, Yes, that's intentional. There's two reasons for this: 1: We make schema changes frequently (more frequently than I like) 2: So far as I've noticed, it doesn't hurt anything and covers my butt when I've got to clear out all the solr related data from ZK while testing Thanks, Greg On Dec

Re: Importing/Indexing the DB2 XML FieldType in SOLR

2013-12-05 Thread Shawn Heisey
On 12/5/2013 8:20 AM, Shawn Heisey wrote: > one with a proper toString() method. If not, you'll need to write your > own indexing application or modify the dataimport handler source code to > handle the XML object and recompile it. I just noticed something on the IBM URL. http://publib.boulder.i

Re: Importing/Indexing the DB2 XML FieldType in SOLR

2013-12-05 Thread Shawn Heisey
On 12/5/2013 2:30 AM, ravi1984 wrote: > I'm using DB2 9.x and I have a column named DTL_XML of type XML. Following > are the snippets of data-config.xml & schema.xml within my SOLR instance. > > data-config.xml: > url="jdbc:db2://myIP:myPort/DBName" user="testUsr" password="testPwd" /> >

Re: SOLR 4 not utilizing multi CPU cores

2013-12-05 Thread Salman Akram
After debugging, it seems that the query parsing code in the Surround parser is causing an issue in queries with common words. Has anyone tried Surround and Common Grams with SOLR 4? On Thu, Dec 5, 2013 at 7:00 PM, Daniel Collins wrote: > Fair enough, I'm not familiar with Surround parser, but it does look

Re: SolrCloud FunctionQuery inconsistency

2013-12-05 Thread Shawn Heisey
On 12/5/2013 2:27 AM, sling wrote: > By the way, the "shards" param is running ok with the value > "localhost:7574/solr,localhost:8983/solr" or "shard2", > but it gets an exception with only one replica "localhost:7574/solr"; > > right: > shards=204.lead.index.com:9090/solr/doc/,66.index.com:8080

Re: Sorting on solr results

2013-12-05 Thread Shawn Heisey
On 12/4/2013 11:59 PM, anuragwalia wrote: > I need to sort products by webshop price and then by position. > > e.g. If we have three products (A, B, C), they need to be sorted by Price asc and Position > asc. > > ID Price Position > A 10 3 > B 10 2 > C 20 5 > > Result should be sor

Re: post filtering for boolean filter queries

2013-12-05 Thread Yonik Seeley
On Thu, Dec 5, 2013 at 7:39 AM, Dmitry Kan wrote: > Thanks Erick! > To be sure, we are using cost 101 and no cache. It seems to affect > searches as we expected. > > Basically, with the cache on we see more "fat" spikes around commit points, as > the cache is getting flushed (we don't rerun too many entr

Re: facet.method=fcs vs facet.method=fc on solr slaves

2013-12-05 Thread Patrick O'Lone
So does it make the most sense then to force, by default, facet.method=fcs on slave nodes that receive updates every 5 minutes but with large segments that don't change every update? Right now, everything I have configured uses facet.method=fc since we don't declare it at all. Randomly, after repl

Re: SOLR 4 not utilizing multi CPU cores

2013-12-05 Thread Daniel Collins
Fair enough, I'm not familiar with Surround parser, but it does look like some logic has changed there. On 5 December 2013 12:38, Salman Akram wrote: > Here is the response to your 2 questions: > > 1- Started from fresh Solr 4 config and modified custom stuff. > > 2- Index is same and optimized.

Re: post filtering for boolean filter queries

2013-12-05 Thread Erick Erickson
bq: To be sure we are using cost 101 and no cache The guy who wrote the code is really good, but I'm paranoid too so I use 101. Based on the number of off-by-one errors I've coded :)... How slow is "around commit points really slow"? You could at least lessen the pain here by committing less ofte

Re: post filtering for boolean filter queries

2013-12-05 Thread Dmitry Kan
Thanks Erick! To be sure, we are using cost 101 and no cache. It seems to affect searches as we expected. Basically, with the cache on we see more "fat" spikes around commit points, as the cache is getting flushed (we don't rerun too many entries from the old cache). But when the post-filtering is involved,

Re: SOLR 4 not utilizing multi CPU cores

2013-12-05 Thread Salman Akram
Here is the response to your 2 questions: 1- Started from fresh Solr 4 config and modified custom stuff. 2- Index is same and optimized. However, as I said in a previous mail the issue seems to be Surround Query Parser which is parsing the query in a different format. On Thu, Dec 5, 2013 at 2:

Re: starting up solr automatically

2013-12-05 Thread Alan Woodward
Hi Greg, It looks as though your script below will bootstrap a collection configuration every time Solr is restarted, which probably isn't what you want to do? You only need to upload the config once. Alan Woodward www.flax.co.uk On 4 Dec 2013, at 21:26, Greg Walters wrote: > I almost forgo

Re: Faceting Query in Solr

2013-12-05 Thread Ahmet Arslan
how about fq={!lucene q.op=OR}categoryId:(800 900) On Thursday, December 5, 2013 1:26 PM, kumar wrote: I used following url... http://localhost:8080/solr/Testing/select?q=*:*&wt=xml&indent=true&facet=true&facet.field=categoryId&fq=categoryId:(900 800) but it is not showing any results -
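The suggested filter contains a space and local-params braces, so it is safest to URL-encode it rather than paste it into the address bar. The Python sketch below builds the request from the URL used in this thread with the suggested `fq`; it only constructs the string and does not contact Solr. Note that separate `fq` parameters are intersected, which is why `fq=categoryId:800&fq=categoryId:900` matches nothing when each document has a single category.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the URL quoted in this thread.
base_url = "http://localhost:8080/solr/Testing/select"

params = [
    ("q", "*:*"),
    ("wt", "xml"),
    ("indent", "true"),
    ("facet", "true"),
    ("facet.field", "categoryId"),
    # One filter that ORs both categories, instead of two ANDed fq parameters.
    ("fq", "{!lucene q.op=OR}categoryId:(800 900)"),
]

url = base_url + "?" + urlencode(params)

# Decode again to confirm the filter survives the round trip unchanged.
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["fq"])
```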

Re: Faceting Query in Solr

2013-12-05 Thread kumar
It is working fine for fq=categoryId:(800 900)

Re: Difference between textfield and strfield

2013-12-05 Thread Ahmet Arslan
Hi Manju, You can use the following type (taken from example schema.xml). You may add TrimFilterFactory too. On Thursday, December 5, 2013 12:09 PM, manju16832003 wrote: Hi, If we can not analyse string field, is there any other way to apply

Re: Faceting Query in Solr

2013-12-05 Thread kumar
I used following url... http://localhost:8080/solr/Testing/select?q=*:*&wt=xml&indent=true&facet=true&facet.field=categoryId&fq=categoryId:(900 800) but it is not showing any results

Re: Programmatically upload configuration into ZooKeeper

2013-12-05 Thread Artem Karpenko
Thank you Shawn, it works! You were right about the CoreAdmin API. I've implemented individual core management by getting the list of live nodes from ZK and using HttpSolrServer instances for the different nodes. Regards, Artem. 04.12.2013 18:47, Shawn Heisey wrote: On 12/4/2013 9:23 AM, Artem Karpenko wrote:

Re: Difference between textfield and strfield

2013-12-05 Thread manju16832003
Hi, If we cannot analyse a string field, is there any other way to apply case insensitivity? Ex: I have a field called *make*; its values are Toyota, Honda, Chery etc. In Solr, the make data type is string and values are stored as they appear (Toyota, Honda, Chery). However, when a user tries to search f

Re: Faceting Query in Solr

2013-12-05 Thread Mikhail Khludnev
On Thu, Dec 5, 2013 at 1:34 PM, kumar wrote: > fq=categoryId:800&fq=categoryId:900 > what about fq=categoryId:(800 900) ? -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics

highlight feature is not working on "string" field type- Apache Solr

2013-12-05 Thread pyramesh
Hi all, I have recently built a small search application using Apache Solr. Now I am facing an issue: the text highlighting feature is not working on the "string" field type, but it is working on the "text" field type. When I search the content on the string field type, the results are displayed, but not ge

Re: Faceting Query in Solr

2013-12-05 Thread kumar
Hi, I am using the following two queries but I am not getting the output as expected: http://localhost:8080/solr/Testing/select?q=*%3A*&wt=xml&indent=true&facet=true&facet.field=categoryId For the above query it shows all the matched results as category1(Id-900)--10002 category2(Id-800)--5202 c

Re: SOLR 4 not utilizing multi CPU cores

2013-12-05 Thread Salman Akram
I am not using shards. I gave more info in a previous mail. I know it's a single index and what you are saying makes sense, but from what I could see, 1.4.1 was better at 'utilizing' the hardware resources available. I mean if the CPU is free then why not do multi threading (if possible of

Importing/Indexing the DB2 XML FieldType in SOLR

2013-12-05 Thread ravi1984
I'm using DB2 9.x and I have a column named DTL_XML of type XML. Following are the snippets of data-config.xml & schema.xml within my SOLR instance. data-config.xml: Snippets from schema.xml: cust_data When I query the SOLR, my requirement is to retrieve t

Re: SolrCloud FunctionQuery inconsistency

2013-12-05 Thread sling
By the way, the "shards" param is running ok with the value "localhost:7574/solr,localhost:8983/solr" or "shard2", but it gets an exception with only one replica "localhost:7574/solr"; right: shards=204.lead.index.com:9090/solr/doc/,66.index.com:8080/solr/doc/ wrong: shards=204.lead.index.c

Re: SOLR 4 not utilizing multi CPU cores

2013-12-05 Thread Salman Akram
So I think I found one issue that somewhat explains the time difference but not sure why this is happening. We are using Surround Query Parser. Below is a two words query, both of them are in Common Grams list. Query = "only be" Here is what debug shows. I have highlighted the red part which is d

Re: SOLR 4 not utilizing multi CPU cores

2013-12-05 Thread Daniel Collins
Not sure if you are really stating the problem here. If you don't use Solr sharding, (I also assume you aren't using SolrCloud), and I'm guessing you are a single core (but can you confirm). As I understand Solr's logic, for a single query on a single core, that will only use 1 thread (ignoring u

Re: SOLR 4 not utilizing multi CPU cores

2013-12-05 Thread Salman Akram
More info on Cpu consumption: We have a server with 32 physical cores. Same search when executed on SOLR 4.6 takes quite long and throughout only uses 3% cpu (1 core). Same search when executed on SOLR 1.4.1 takes much less time and on average uses around 40-50% cpu. On Thu, Dec 5, 2013 at 2:05

Re: SOLR 4 not utilizing multi CPU cores

2013-12-05 Thread Salman Akram
I missed one important piece of info. Due to the large size, we have indexed the data with Common Grams. All of the words in the slow search are in common grams, and when I debug it, the query is built properly with common grams. In debug, all of the time is shown in process query time. Let me know what other inf

Re: Using Payloads as a Coefficient For Score At a Custom QParser That extends ExtendedDismaxQParser

2013-12-05 Thread Furkan KAMACI
Hi Joel; Of course I will share my code with Solr as before. I will explain my use case in detail and my solution. I will file a Jira issue (Erik Hatcher created a similar Jira issue 4 years ago; I will link it too). I will write to the mailing list when I have applied my patch. Thanks; Furkan KAMACI 2013/