Tweaking boosts for more search results variety

2013-09-04 Thread Sai Gadde
Our index is aggregated content from various sites on the web. We want a good user experience by showing multiple sites in the search results. In our setup we are seeing most of the results from the same site at the top. Here is some information regarding queries and schema site - String

Re: unknown _stream_source_info while indexing rich doc in solr

2013-09-04 Thread Nutan
yes sir i did restart the tomcat. On Wed, Sep 4, 2013 at 6:27 PM, Jack Krupansky-2 [via Lucene] < ml-node+s472066n4088181...@n3.nabble.com> wrote: > Did you restart Solr after editing config and schema? > > -- Jack Krupansky > > -Original Message- > From: Nutan > Sent: Wednesday, Septemb

Re: Invalid Version when slave node pull replication from master node

2013-09-04 Thread YouPeng Yang
Hi all, I solved the problem by adding the coreName explicitly according to http://wiki.apache.org/solr/SolrReplication#Replicating_solrconfig.xml. But I want to confirm whether it is necessary to set the coreName explicitly. Is there any SolrJ API to pull the replication on the slave node f

Re: Invalid Version when slave node pull replication from master node

2013-09-04 Thread YouPeng Yang
Hi again I'm using Solr4.4. 2013/9/5 YouPeng Yang > HI solrusers > >I'm testing the replication within SolrCloud . >I just uncomment the replication section separately on the master and > slave node. >The replication section setting on the master node: > > commit

Invalid Version when slave node pull replication from master node

2013-09-04 Thread YouPeng Yang
Hi solrusers, I'm testing replication within SolrCloud. I just uncommented the replication section separately on the master and slave nodes. The replication section setting on the master node: commit startup schema.xml,stopwords.txt and on the sl

Re: SolrCloud 4.x hangs under high update volume

2013-09-04 Thread Tim Vaillancourt
Thanks so much for the explanation Mark, I owe you one (many)! We have this on our high TPS cluster and will run it through its paces tomorrow. I'll provide any feedback I can, more soon! :D Cheers, Tim

Re: Little XsltResponseWriter documentation bug (Attn: Wiki Admin)

2013-09-04 Thread Dmitri Popov
Upayavira, I could edit that page myself, but need to be confirmed human according to http://wiki.apache.org/solr/FrontPage#How_to_edit_this_Wiki My wiki account name is 'pin' just in case. On Wed, Sep 4, 2013 at 5:27 PM, Upayavira wrote: > It's a wiki. Can't you correct it? > > Upayavira > >

Re: Little XsltResponseWriter documentation bug (Attn: Wiki Admin)

2013-09-04 Thread Upayavira
It's a wiki. Can't you correct it? Upayavira On Wed, Sep 4, 2013, at 08:25 PM, Dmitri Popov wrote: > Hi, > > http://wiki.apache.org/solr/XsltResponseWriter (and reference manual PDF > too) become out of date: > > In configuration section > >name="xslt" > class="org.apache.solr.request.XS

RE: Solr highlighting fragment issue

2013-09-04 Thread Bryan Loofbourrow
>> I’m having some issues with Solr search results (using Solr 1.4 ) . I have enabled highlighting of searched text (hl=true) and set the fragment size as 500 (hl.fragsize=500) in the search query. Below is the (screen shot) results shown when I searched for the term ‘grandfather’ (2 results are

Little XsltResponseWriter documentation bug (Attn: Wiki Admin)

2013-09-04 Thread Dmitri Popov
Hi, http://wiki.apache.org/solr/XsltResponseWriter (and reference manual PDF too) has become out of date: In configuration section 5 class name org.apache.solr.request.XSLTResponseWriter should be replaced by org.apache.solr.response.XSLTResponseWriter Otherwise ClassNotFoundException happe

Re: Numeric fields and payload

2013-09-04 Thread PETER LENAHAN
Chris Hostetter writes: > > > : is it possible to store (text) payload to numeric fields (class > : solr.TrieDoubleField)? My goal is to store measure units to numeric > : features - e.g. '1.5 cm' - and to use faceted search with these fields. > : But the field type doesn't allow

RE: SolrCloud 4.x hangs under high update volume

2013-09-04 Thread Markus Jelsma
Hi Mark, Got an issue to watch? Thanks, Markus -Original message- > From:Mark Miller > Sent: Wednesday 4th September 2013 16:55 > To: solr-user@lucene.apache.org > Subject: Re: SolrCloud 4.x hangs under high update volume > > I'm going to try and fix the root cause for 4.5 - I've susp

subindex

2013-09-04 Thread Peyman Faratin
Hi Is there a way to build a new (smaller) index from an existing (larger) index where the smaller index contains a subset of the fields of the larger index? thank you

cleanup after OutOfMemoryError

2013-09-04 Thread Ryan McKinley
I have an application where I am calling DirectUpdateHandler2 directly with: update.addDoc(cmd); This will sometimes hit: java.lang.OutOfMemoryError: Java heap space at org.apache.lucene.util.UnicodeUtil.UTF16toUTF8(UnicodeUtil.java:248) at org.apache.lucene.store.DataOutput.writeString(DataOu

Re: SolrCloud 4.x hangs under high update volume

2013-09-04 Thread Mark Miller
The 'lock' or semaphore was added to cap the number of threads that would be used. Previously, the number of threads in use could spike to many, many thousands on heavy updates. A limit on the number of outstanding requests was put in place to keep this from happening. Something like 16 * the nu
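The capping mechanism described in this message can be illustrated with a small, generic sketch. This is not Solr's code: the class `BoundedSender`, the helpers `fake_request` and `run_demo`, and the limit of 4 are hypothetical names and values chosen only for illustration, and the actual limit used by Solr is not reproduced here. The sketch only shows how a semaphore keeps the number of in-flight requests under a fixed cap while additional callers wait for a free slot.

```python
import threading
import time


class BoundedSender:
    """Hypothetical helper: caps how many requests may be outstanding at once."""

    def __init__(self, max_outstanding):
        # Each acquired slot represents one outstanding request.
        self._slots = threading.BoundedSemaphore(max_outstanding)
        self._lock = threading.Lock()
        self._in_flight = 0
        self.peak_in_flight = 0

    def send(self, work):
        # Blocks here when all slots are taken, instead of spawning more work.
        with self._slots:
            with self._lock:
                self._in_flight += 1
                self.peak_in_flight = max(self.peak_in_flight, self._in_flight)
            try:
                return work()
            finally:
                with self._lock:
                    self._in_flight -= 1


def fake_request():
    # Stand-in for a real update request (assumption: takes about 10 ms).
    time.sleep(0.01)


def run_demo(num_requests=20, max_outstanding=4):
    """Fire num_requests concurrent callers and report the peak concurrency."""
    sender = BoundedSender(max_outstanding)
    threads = [
        threading.Thread(target=sender.send, args=(fake_request,))
        for _ in range(num_requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sender.peak_in_flight


if __name__ == "__main__":
    print("peak in-flight requests:", run_demo())
```

With such a cap in place, a burst of callers queues up behind the semaphore rather than creating thousands of threads; the trade-off is that callers block whenever slots are not released promptly.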

Re: cleanup after OutOfMemoryError

2013-09-04 Thread Mark Miller
I don't know that there is any 'safe' thing you can do other than restart - but if I were to try anything, I would use true for rollback. - Mark On Wed, Sep 4, 2013 at 9:44 AM, Ryan McKinley wrote: > I have an application where I am calling DirectUpdateHandler2 directly > with: > > update.ad

Re: SolrCloud 4.x hangs under high update volume

2013-09-04 Thread Tim Vaillancourt
Thanks guys! :) Mark: this patch is much appreciated, I will try to test this shortly, hopefully today. For my curiosity/understanding, could someone explain to me quickly what locks SolrCloud takes on updates? Was I on to something that more shards decrease the chance for locking? Secondly, I w

Questions about Replication Factor on solrcloud

2013-09-04 Thread Lisandro Montaño
Hi all, I’m currently working on deploying a SolrCloud distribution on CentOS machines and wanted to have more guidance about Replication Factor configuration. I have configured two servers with SolrCloud over Tomcat and a third server as ZooKeeper. I have configured successfully and have o

Re: Solr Cloud hangs when replicating updates

2013-09-04 Thread Mark Miller
It would be great if you could give this patch a try: http://pastebin.com/raw.php?i=aaRWwSGP - Mark On Wed, Sep 4, 2013 at 8:31 AM, Kevin Osborn wrote: > Thanks. If there is anything I can do to help you resolve this issue, let > me know. > > -Kevin > > > On Wed, Sep 4, 2013 at 7:51 AM, Mark M

Re: SolrCloud 4.x hangs under high update volume

2013-09-04 Thread Mark Miller
There is an issue if I remember right, but I can't find it right now. If anyone that has the problem could try this patch, that would be very helpful: http://pastebin.com/raw.php?i=aaRWwSGP - Mark On Wed, Sep 4, 2013 at 8:04 AM, Markus Jelsma wrote: > Hi Mark, > > Got an issue to watch? > > Th

Solr highlighting fragment issue

2013-09-04 Thread Sreehareesh Kaipravan Meethaleveetil
Hi, I'm having some issues with Solr search results (using Solr 1.4 ) . I have enabled highlighting of searched text (hl=true) and set the fragment size as 500 (hl.fragsize=500) in the search query. Below is the (screen shot) results shown when I searched for the term 'grandfather' (2 results a

How to config SOLR server for spell check functionality

2013-09-04 Thread sebastian.manolescu
I want to implement the spell check functionality offered by Solr using a MySQL database, but I don't understand how. Here is the basic flow of what I want to do. I have a simple inputText (in JSF) and if I type the word 'shwo' the response to OutputLabel should be 'show'. First of all I'm using the following t

Re: solr performance against oracle

2013-09-04 Thread Toke Eskildsen
On Wed, 2013-09-04 at 14:06 +0200, Sergio Stateri wrote: > I'm trying to change the data access in the company where I work from > Oracle to Solr. They work on different principles and fulfill different needs. Comparing them with a performance-oriented test is not likely to be a usable point for sele

Need help on Joining and sorting syntax and limitations between multiple documents in solr-4.4.0

2013-09-04 Thread Sukanta Dey
Hi Team, In my project I am going to use Apache solr-4.4.0 for searching. While doing that I need to join multiple Solr documents within the same core on one of the common fields across the documents. Though I successfully join the documents using solr-4.4.0 join syntax, it is re

Re: Solr Cloud hangs when replicating updates

2013-09-04 Thread Mark Miller
I'll look at fixing the root issue for 4.5. I've been putting it off for way too long. Mark Sent from my iPhone On Sep 3, 2013, at 2:15 PM, Kevin Osborn wrote: > I was having problems updating SolrCloud with a large batch of records. The > records are coming in bursts with lulls between updat

Re: Solr Cloud hangs when replicating updates

2013-09-04 Thread Kevin Osborn
Thanks. If there is anything I can do to help you resolve this issue, let me know. -Kevin On Wed, Sep 4, 2013 at 7:51 AM, Mark Miller wrote: > I'll look at fixing the root issue for 4.5. I've been putting it off for > way too long. > > Mark > > Sent from my iPhone > > On Sep 3, 2013, at 2:15 PM,

Re: SolrCloud 4.x hangs under high update volume

2013-09-04 Thread Kevin Osborn
I am having this issue as well. I did apply this patch. Unfortunately, it did not resolve the issue in my case. On Wed, Sep 4, 2013 at 7:01 AM, Greg Walters wrote: > Tim, > > Take a look at > http://lucene.472066.n3.nabble.com/updating-docs-in-solr-cloud-hangs-td4067388.html and > https://issues.

Re: Boost by numFounds

2013-09-04 Thread Flavio Pompermaier
I found that what can do the trick for page-rank like indexing is externalFileField! Is there an help to upload the external files to all solr servers (in solr 3 and solrCloud)? Or should I copy it to all solr instances data folder and then reload their cache? On Sat, Aug 24, 2013 at 12:36 AM, Fla

Re: SolrCloud 4.x hangs under high update volume

2013-09-04 Thread Mark Miller
I'm going to try and fix the root cause for 4.5 - I've suspected what it is since early this year, but it's never personally been an issue, so it's rolled along for a long time. Mark Sent from my iPhone On Sep 3, 2013, at 4:30 PM, Tim Vaillancourt wrote: > Hey guys, > > I am looking into a

Re: dataimporter tika doesn't extract certain div

2013-09-04 Thread Andreas Owen
or could i use a filter in schema.xml where i define a fieldtype and use some filter that understands xpath? On 4. Sep 2013, at 11:52 AM, Shalin Shekhar Mangar wrote: > No that wouldn't work. It seems that you probably need a custom > Transformer to extract the right div content. I do not know i

Re: Strange behaviour with single word and phrase

2013-09-04 Thread Alistair Young
Yep ignoring stop words. Thanks for the pointer. Alistair - mov eax,1 mov ebx,0 int 80 On 04/09/2013 13:43, "Jack Krupansky" wrote: >Do you have stop word filtering enabled? What does your field type look >like? > >If stop words are ignored, you will get exactly the behavior

RE: SolrCloud 4.x hangs under high update volume

2013-09-04 Thread Greg Walters
Tim, Take a look at http://lucene.472066.n3.nabble.com/updating-docs-in-solr-cloud-hangs-td4067388.html and https://issues.apache.org/jira/browse/SOLR-4816. I had the same issue that you're reporting for a while then I applied the patch from SOLR-4816 to my clients and the problems went away.

RE: Solr Cloud hangs when replicating updates

2013-09-04 Thread Greg Walters
Kevin, Take a look at http://lucene.472066.n3.nabble.com/updating-docs-in-solr-cloud-hangs-td4067388.html and https://issues.apache.org/jira/browse/SOLR-4816. I had the same issue that you're reporting for a while then I applied the patch from SOLR-4816 to my clients and the problems went away

Re: unknown _stream_source_info while indexing rich doc in solr

2013-09-04 Thread Jack Krupansky
Did you restart Solr after editing config and schema? -- Jack Krupansky -Original Message- From: Nutan Sent: Wednesday, September 04, 2013 3:07 AM To: solr-user@lucene.apache.org Subject: unknown _stream_source_info while indexing rich doc in solr i am using solr4.2 on windows7 my sch

Re: Strange behaviour with single word and phrase

2013-09-04 Thread Jack Krupansky
Do you have stop word filtering enabled? What does your field type look like? If stop words are ignored, you will get exactly the behavior you described. -- Jack Krupansky -Original Message- From: Alistair Young Sent: Wednesday, September 04, 2013 6:57 AM To: solr-user@lucene.apache.
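The effect described in this reply can be reproduced with a toy analyzer. The sketch below is not Lucene or Solr code: the stop word set, `analyze`, and `matches` are hypothetical and deliberately simplified (whitespace tokenization, and matching that ignores token positions). It only illustrates why a query consisting solely of a stop word finds nothing, and why a phrase containing a stop word effectively degrades to the remaining terms.

```python
# Hypothetical stop word list; a real field type would load its own list.
STOP_WORDS = frozenset({"the", "a", "an", "of"})


def analyze(text, stop_words=STOP_WORDS):
    """Lowercase, split on whitespace, and drop stop words.

    Returns (position, token) pairs so the gaps left by removed
    stop words stay visible.
    """
    tokens = text.lower().split()
    return [(pos, tok) for pos, tok in enumerate(tokens) if tok not in stop_words]


def matches(query, document):
    """Simplified match: every surviving query term must occur in the document.

    Token positions are ignored here, which is a simplification
    compared with a real phrase query.
    """
    query_terms = [tok for _, tok in analyze(query)]
    doc_terms = {tok for _, tok in analyze(document)}
    return bool(query_terms) and all(term in doc_terms for term in query_terms)


if __name__ == "__main__":
    print(analyze("the"))           # no terms survive analysis
    print(analyze("the toolkit"))   # only 'toolkit' survives
    print(matches("the", "the toolkit"))
    print(matches("the toolkit", "the very large toolkit"))
```

Under these assumptions, searching for 'the' produces an empty query and therefore no hits, while "the toolkit" behaves like a search for 'toolkit' alone.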

Re: solr performance against oracle

2013-09-04 Thread Andrea Gazzarini
You said nothing about your environments (e.g. operating systems, what kind of Oracle installation you have, what kind of SOLR installation, how much data in database, how many documents in index, RAM for SOLR, for Oracle, for OS, and in general hardware...and so on)... Anyway...a migration fro

solr performance against oracle

2013-09-04 Thread Sergio Stateri
Hi, I'm trying to change the data access in the company where I work from Oracle to Solr. So I ran some tests, like this: In Oracle: private void go() throws Exception { Class.forName("oracle.jdbc.driver.OracleDriver"); Connection conn = DriverManager.getConnection("XXX

Re: Indexing pdf files - question.

2013-09-04 Thread Nutan Shinde
My solrconfig.xml is: desc true attr_ true Schema.xml: doc_id I have created extract directory and copied all required .jar and solr-cell jar files into this extract directory and given its path in lib tag in solrconfig.xml When I try

Re: Starting Solr in Tomcat with specifying ZK host(s)

2013-09-04 Thread maephisto
Thanks Shawn! Indeed, setting the JAVA_OPTS and restarting Tomcat did the trick. Currently I'm exploring and experimenting with SolrCloud, thus I used only one ZK. For a production environment your suggestion would, of course, be mandatory.

Strange behaviour with single word and phrase

2013-09-04 Thread Alistair Young
I wonder if anyone could point me in the right direction please? If I search on the phrase "the toolkit" I get hits containing that phrase but also hits that have the word 'the' before the word 'toolkit', no matter how far apart they are. Also, if I search on the word 'the' there are no hits at

Re: Measuring SOLR performance

2013-09-04 Thread Dmitry Kan
Hi Roman, Ok, I will. Thanks! Cheers, Dmitry On Tue, Sep 3, 2013 at 4:46 PM, Roman Chyla wrote: > Hi Dmitry, > > Thanks for the feedback. Yes, it is indeed jmeter issue (or rather, the > issue of the plugin we use to generate charts). You may want to use the > github for whatever comes next >

Re: dataimporter tika doesn't extract certain div

2013-09-04 Thread Shalin Shekhar Mangar
No that wouldn't work. It seems that you probably need a custom Transformer to extract the right div content. I do not know if TikaEntityProcessor supports such a thing. On Wed, Sep 4, 2013 at 12:38 PM, Andreas Owen wrote: > so could i just nest it in a XPathEntityProcessor to filter the html or

Re: Change the score of a document based on the *value* of a multifield using dismax

2013-09-04 Thread danielitos85
Thanks a lot David. I will try it ;)

Re: dataimporter tika doesn't extract certain div

2013-09-04 Thread Andreas Owen
so could i just nest it in a XPathEntityProcessor to filter the html or is there something like xpath for tika? but now i dont know how to pass the text to tika, what do i put in url and datasou

Re: DIH + Solr Cloud

2013-09-04 Thread Tim Vaillancourt
Hey Alejandro, I guess it means what you call "more than one instance". The request handlers are at the core-level, and not the Solr instance/global level, and within each of those cores you could have one or more data import handlers. Most setups have 1 DIH per core at the handler location

unknown _stream_source_info while indexing rich doc in solr

2013-09-04 Thread Nutan
i am using solr4.2 on windows7 my schema is: solrconfig.xml : contents true ignored_ true when i execute: curl "http://localhost:8080/solr/update/extract?literal.id=1&commit=true" -F "myfile=@abc.txt" i get error: unknown field ignored_stream_source_info. i referred solr cookbook3
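The extract request in this message can also be assembled programmatically, which avoids quoting problems with `&` on the command line. The sketch below only builds the URL and sends nothing; the function name `build_extract_url` is hypothetical, and the host, port, and parameter values are simply the ones quoted in the message.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, commit=True):
    """Build the /update/extract URL with literal.id and commit parameters.

    base_url is the Solr root as it appears in the message,
    e.g. http://localhost:8080/solr (taken from the post, not verified).
    """
    params = {
        "literal.id": doc_id,
        "commit": "true" if commit else "false",
    }
    return f"{base_url}/update/extract?{urlencode(params)}"


if __name__ == "__main__":
    print(build_extract_url("http://localhost:8080/solr", "1"))
```

The file itself would still have to be posted as multipart form data, as the `-F "myfile=@abc.txt"` option does in the curl command.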