Changing existing index to use block-join

2014-01-18 Thread dev
Hello, I read about the possibility to have nested documents with solr block-join since version 4.5. I’m wondering if I can change an existing index to use this new opportunity. Right now I have an index which stores information about a journal and each of its articles. For example

Re: Changing existing index to use block-join

2014-01-20 Thread dev
Quote from Mikhail Khludnev: On Sat, Jan 18, 2014 at 11:25 PM, wrote: So, my question now: can I change my existing index by just adding a is_parent and a _root_ field and saving the journal id there like I did with j-id or do I have to reindex all my documents? Absolutely, to use block-j

Searching and scoring with block join

2014-01-22 Thread dev
Hello again, I'm using the solr block-join feature to index a journal and all of its articles. Here is a short example: 527fcbf8-c140-4ae6-8f51-68cd2efc1343 Sozialmagazin 8 2008 0340-8469 .

Re: Searching and scoring with block join

2014-01-22 Thread dev
Quote from Mikhail Khludnev: On Wed, Jan 22, 2014 at 10:17 PM, wrote: I know that I can't just make a query like this: {!parent which=is_parent:true}+Term, most likely I'll get this error: child query must only match non-parent docs, but parent docID= matched childScorer=class org.apache

Re: Searching and scoring with block join

2014-01-24 Thread dev
Quote from Mikhail Khludnev: nesting query parsers is shown at http://blog.griddynamics.com/2013/12/grandchildren-and-siblings-with-block.html try to start from the following: title:Test _query_:"{!parent which=is_parent:true}{!dismax qf=content_de}Test" mind about local params referencing eg
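
The nested-parser query suggested in this reply can be assembled on the client side. What follows is only a minimal Python sketch under stated assumptions: the field names (is_parent, content_de, title) are taken from the thread, while the helper name build_block_join_params and the request parameter name childq are hypothetical. It moves the child query into its own parameter and dereferences it with v=$childq, which is one way to read the "local params referencing" hint. The request is built but not sent.

```python
from urllib.parse import urlencode, parse_qs


def build_block_join_params(term, parent_filter="is_parent:true", child_field="content_de"):
    """Build Solr request parameters: match on the parent title OR via a block-join on children.

    build_block_join_params and the 'childq' parameter name are illustrative only.
    """
    return {
        # The parent parser picks up its child query from the separate 'childq'
        # parameter (v=$childq), so no quotes need escaping inside the nested query.
        "q": f'title:{term} _query_:"{{!parent which={parent_filter} v=$childq}}"',
        # The child query runs dismax against the child-level field.
        "childq": f"{{!dismax qf={child_field}}}{term}",
    }


params = build_block_join_params("Test")
query_string = urlencode(params)  # ready to append to a /select URL
```

Keeping the child query in a separate parameter also makes it easy to swap the child parser or field without touching the parent clause.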

Indexing and searching documents in different languages

2013-04-09 Thread dev
Hello, I'm trying to index a large number of documents in different languages. I don't know the language of the document, so I'm using TikaLanguageIdentifierUpdateProcessorFactory to identify it. So, this is my configuration in solrconfig.xml class="org.apache.solr.update.processor.Tik

Re: Indexing and searching documents in different languages

2013-04-10 Thread dev
Thx, I'll try this approach. Quote from Alexandre Rafalovitch: Have you looked at edismax and the 'qf' fields parameter? It allows you to define the fields to search. Also, you can define those parameters in solrconfig.xml and not have to send them down the wire. Finally, you can define severa

Very bad search performance with group=true

2013-06-11 Thread dev
Hi, I'm indexing pdf documents to use full text search with solr. To get the number of the page where the result was found, I save every page separately and group the results with a field called doc_id. (See this topic: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201303.mbox/%3c
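
The per-page grouping setup described in this post could be queried roughly as follows. This is a hedged Python sketch, not the poster's actual request: doc_id is the grouping field named in the post, group, group.field and group.limit are standard Solr result-grouping parameters, and the helper name build_grouped_query is hypothetical.

```python
from urllib.parse import urlencode, parse_qs


def build_grouped_query(term, group_field="doc_id", hits_per_group=1):
    """Build a query string that groups page-level hits by their owning document.

    build_grouped_query is an illustrative helper, not part of any Solr client.
    """
    return urlencode({
        "q": term,
        "group": "true",                # turn on result grouping
        "group.field": group_field,     # one group per PDF document
        "group.limit": hits_per_group,  # number of page hits returned per group
    })


query_string = build_grouped_query("solr")
```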

Get page number of searchresult of a pdf in solr

2013-02-28 Thread dev
Hello, I'm building a web application where users can search for pdf documents and view them with pdf.js. I would like to display the search results with a short snippet of the paragraph where the search term was found and a link to open the document at the right page. So what I need is

Re: Get page number of searchresult of a pdf in solr

2013-03-01 Thread dev
Is it possible to write a plugin that converts each page separately with Tika and saves all pages in one document (maybe in a dynamic field like "page_*")? I would like to have only one document stored in SOLR for each pdf (it fits better with the way my web application is managing the

Help me understand these newrelic graphs

2014-03-13 Thread Software Dev
Here are some screen shots of our Solr Cloud cluster via Newrelic http://postimg.org/gallery/2hyzyeyc/ We currently have a 5 node cluster and all indexing is done on separate machines and shipped over. Our machines are running on SSD's with 18G of ram (Index size is 8G). We only have 1 shard at t

Re: Help me understand these newrelic graphs

2014-03-13 Thread Software Dev
g the average response for an add > > operation, which generally returns very quickly and due to sheer number > are > > averaging out the response time of your queries. New Relic should break > > out requests based on which handler they're hitting but they don't s

Re: Help me understand these newrelic graphs

2014-03-14 Thread Software Dev
high number of > concurrent queries than sharding may not be of any help at all. > > Otis > -- > Performance Monitoring * Log Analytics * Search Analytics > Solr & Elasticsearch Support * http://sematext.com/ > > > On Thu, Mar 13, 2014 at 7:42 PM, Software Dev >wrote:

Re: Help me understand these newrelic graphs

2014-03-14 Thread Software Dev
Here is a screenshot of the host information: http://postimg.org/image/vub5ihxix/ As you can see we have 24 core CPU's and the load is only at 5-7.5. On Fri, Mar 14, 2014 at 10:02 AM, Software Dev wrote: > If that is the case, what would help? > > > On Thu, Mar 13, 2014

Re: Help me understand these newrelic graphs

2014-03-17 Thread Software Dev
ring * Log Analytics * Search Analytics > Solr & Elasticsearch Support * http://sematext.com/ > > > On Fri, Mar 14, 2014 at 1:07 PM, Software Dev >wrote: > > > Here is a screenshot of the host information: > > http://postimg.org/image/vub5ihxix/ > > &

Solr Cloud collection keep going down?

2014-03-22 Thread Software Dev
We have 2 collections with 1 shard each replicated over 5 servers in the cluster. We see a lot of flapping (down or recovering) on one of the collections. When this happens the other collection hosted on the same machine is still marked as active. When this happens it takes a fairly long time (~30

Re: Solr Cloud collection keep going down?

2014-03-22 Thread Software Dev
a:182) at org.eclipse.jetty.http.HttpGenerator.flushBuffer(HttpGenerator.java:838) ... 51 more ,code=500} On Sat, Mar 22, 2014 at 12:23 PM, Software Dev wrote: > We have 2 collections with 1 shard each replicated over 5 servers in the > cluster. We see a lot of flapping (down or recoverin

Re: Solr Cloud collection keep going down?

2014-03-24 Thread Software Dev
reporting as well as some relevant portions of our SolrConfig.xml. Any thoughts/comments would be greatly appreciated. http://postimg.org/gallery/4t73sdks/1fc10f9c/ Thanks On Sat, Mar 22, 2014 at 2:26 PM, Shawn Heisey wrote: > On 3/22/2014 1:23 PM, Software Dev wrote: >> We have 2 collect

Question on highlighting edgegrams

2014-03-24 Thread Software Dev
In 3.5.0 we have the following. If we searched for "c" with highlighting enabled we would get back results such as: cdat crocdile cool beans But in the latest Solr (4.7) we get the full words highlighted back. Di

Re: Question on highlighting edgegrams

2014-03-25 Thread Software Dev
Bump On Mon, Mar 24, 2014 at 3:00 PM, Software Dev wrote: > In 3.5.0 we have the following. > > positionIncrementGap="100"> > > > > maxGramSize="30"/> > > > > &

Replication (Solr Cloud)

2014-03-25 Thread Software Dev
I see that by default in SolrCloud that my collections are replicating. Should this be disabled in SolrCloud as this is already handled by it? From the documentation: "The Replication screen shows you the current replication state for the named core you have specified. In Solr, replication is fo

Re: Replication (Solr Cloud)

2014-03-25 Thread Software Dev
Thanks for the reply. I'll make sure NOT to disable it.

Re: Solr Cloud collection keep going down?

2014-03-25 Thread Software Dev
Can anyone else chime in? Thanks On Mon, Mar 24, 2014 at 10:10 AM, Software Dev wrote: > Shawn, > > Thanks for pointing me in the right direction. After consulting the > above document I *think* that the problem may be too large of a heap > and which may be affecting GC colle

Re: Replication (Solr Cloud)

2014-03-25 Thread Software Dev
One other question. If I optimize a collection on one node, does this get replicated to all others when finished? On Tue, Mar 25, 2014 at 10:13 AM, Software Dev wrote: > Thanks for the reply. I'll make sure NOT to disable it.

Re: Replication (Solr Cloud)

2014-03-25 Thread Software Dev
Ehh.. found out the hard way. I optimized the collection on 1 machine and when it was completed it replicated to the others and took my cluster down. Shitty On Tue, Mar 25, 2014 at 10:46 AM, Software Dev wrote: > One other question. If I optimize a collection on one node, does this &g

Re: Replication (Solr Cloud)

2014-03-25 Thread Software Dev
So it's generally a bad idea to optimize I gather? - In older versions it might have done them all at once, but I believe that newer versions only do one core at a time. On Tue, Mar 25, 2014 at 11:16 AM, Shawn Heisey wrote: > On 3/25/2014 11:59 AM, Software Dev wrote: >> >> Ehh.

Re: Replication (Solr Cloud)

2014-03-25 Thread Software Dev
"In older versions it might have done them all at once, but I believe that newer versions only do one core at a time." It looks like it did it all at once and I'm on the latest (4.7) On Tue, Mar 25, 2014 at 11:27 AM, Software Dev wrote: > So its generally a bad idea to

Re: Question on highlighting edgegrams

2014-03-25 Thread Software Dev
Same problem here: http://lucene.472066.n3.nabble.com/Solr-4-x-EdgeNGramFilterFactory-and-highlighting-td4114748.html On Tue, Mar 25, 2014 at 9:39 AM, Software Dev wrote: > Bump > > On Mon, Mar 24, 2014 at 3:00 PM, Software Dev > wrote: >> In 3.5.0 we

What contributes to disk IO?

2014-03-25 Thread Software Dev
What are the main contributing factors for Solr Cloud generating a lot of disk IO? A lot of reads? Writes? Insufficient RAM? I would think if there was enough disk cache available for the whole index there would be little to no disk IO.

Re: Question on highlighting edgegrams

2014-03-26 Thread Software Dev
Is this a known bug? On Tue, Mar 25, 2014 at 1:12 PM, Software Dev wrote: > Same problem here: > http://lucene.472066.n3.nabble.com/Solr-4-x-EdgeNGramFilterFactory-and-highlighting-td4114748.html > > On Tue, Mar 25, 2014 at 9:39 AM, Software Dev > wrote: >> Bump >> &

What are my options?

2014-03-27 Thread Software Dev
We have a collection named "items". These are simply products that we sell. A large part of our scoring involves boosting on certain metrics for each product (amount sold, total GMS, ratings, etc). Some of these metrics are actually split across multiple tables. We are currently re-indexing the co

Re: Question on highlighting edgegrams

2014-03-27 Thread Software Dev
Certainly I am not the only user experiencing this? On Wed, Mar 26, 2014 at 1:11 PM, Software Dev wrote: > Is this a known bug? > > On Tue, Mar 25, 2014 at 1:12 PM, Software Dev > wrote: >> Same problem here: >> http://lucene.472066.n3.nabble.com/Solr-4-x-Ed

Re: Question on highlighting edgegrams

2014-03-28 Thread Software Dev
, 2014 at 10:17 AM, Software Dev > wrote: >> Certainly I am not the only user experiencing this? >> >> On Wed, Mar 26, 2014 at 1:11 PM, Software Dev >> wrote: >>> Is this a known bug? >>> >>> On Tue, Mar 25, 2014 at 1:12 PM, Software Dev &g

Highlighting bug with edgegrams

2014-04-09 Thread Software Dev
In 3.5.0 we have the following. If we searched for "c" with highlighting enabled we would get back results such as: cdat crocdile cool beans But in the latest Solr (4.7.1) we get the full words highlighted back.

Re: Sharding and replicas (Solr Cloud)

2013-11-07 Thread Software Dev
Sorry about the confusion. I meant I created my config via the ZkCLI and then I wanted to create my core via the CollectionsAPI. I *think* I have it working but was wondering why there are a crazy amount of core names under the admin "Core Selector"? When I create X amount of shards via the bootst

Re: Sharding and replicas (Solr Cloud)

2013-11-07 Thread Software Dev
n Thu, Nov 7, 2013 at 3:15 PM, Shawn Heisey wrote: > On 11/7/2013 2:52 PM, Software Dev wrote: > >> Sorry about the confusion. I meant I created my config via the ZkCLI and >> then I wanted to create my core via the CollectionsAPI. I *think* I have >> it >> workin

Solr Cloud Bulk Indexing Questions

2014-01-20 Thread Software Dev
We are testing our shiny new Solr Cloud architecture but we are experiencing some issues when doing bulk indexing. We have 5 solr cloud machines running and 3 indexing machines (separate from the cloud servers). The indexing machines pull off ids from a queue then they index and ship over a docume

Re: Solr Cloud Bulk Indexing Questions

2014-01-20 Thread Software Dev
e culprit. > > Best, > Erick > > On Mon, Jan 20, 2014 at 4:00 PM, Software Dev > wrote: > > We are testing our shiny new Solr Cloud architecture but we are > > experiencing some issues when doing bulk indexing. > > > > We have 5 solr cloud machines running

Re: Solr Cloud Bulk Indexing Questions

2014-01-20 Thread Software Dev
We also noticed that disk IO shoots up to 100% on 1 of the nodes. Do all updates get sent to one machine or something? On Mon, Jan 20, 2014 at 2:42 PM, Software Dev wrote: > We have a soft commit every 5 seconds and hard commit every 30. As > far as docs/second I would guess arou

Re: Solr Cloud Bulk Indexing Questions

2014-01-20 Thread Software Dev
4.6.0 On Mon, Jan 20, 2014 at 2:47 PM, Mark Miller wrote: > What version are you running? > > - Mark > > On Jan 20, 2014, at 5:43 PM, Software Dev > wrote: > > > We also noticed that disk IO shoots up to 100% on 1 of the nodes. Do all > > updates get

Removing a node from Solr Cloud

2014-01-21 Thread Software Dev
What is the process for completely removing a node from Solr Cloud? We recently removed one but it's still showing up as "Gone" in the Cloud admin. Thanks

Setting leaderVoteWait for auto discovered cores

2014-01-21 Thread Software Dev
How is this accomplished? We currently have an empty solr.xml (auto-discovery) so I'm not sure where to put this value?

Re: Removing a node from Solr Cloud

2014-01-21 Thread Software Dev
solr/CoreAdmin#UNLOAD. > > > On Tue, Jan 21, 2014 at 10:22 AM, Software Dev >wrote: > > > What is the process for completely removing a node from Solr Cloud? We > > recently removed one but it's still showing up as "Gone" in the Cloud > > adm

Re: Solr Cloud Bulk Indexing Questions

2014-01-21 Thread Software Dev
Any other suggestions? On Mon, Jan 20, 2014 at 2:49 PM, Software Dev wrote: > 4.6.0 > > > On Mon, Jan 20, 2014 at 2:47 PM, Mark Miller wrote: > >> What version are you running? >> >> - Mark >> >> On Jan 20, 2014, at 5:43 PM, Software Dev >> wr

Re: Solr Cloud Bulk Indexing Questions

2014-01-22 Thread Software Dev
is a change. > How much system RAM ? JVM Heap ? Enough space in RAM for system disk cache > ? > What is the size of your documents ? A few KB, MB, ... ? > Ah, and what about network IO ? Could that be a limiting factor ? > > > André > > > On 2014-01-21 23:40, Software De

Re: Solr Cloud Bulk Indexing Questions

2014-01-23 Thread Software Dev
t consequences: > > > http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ > > I suspect (but, of course, can't prove) that you're over-committing > and hitting segment > merges without meaning to... > > FWIW, > E

Re: Solr Cloud Bulk Indexing Questions

2014-01-23 Thread Software Dev
Also, any suggestions on debugging? What should I look for and how? Thanks On Thu, Jan 23, 2014 at 10:01 AM, Software Dev wrote: > Thanks for suggestions. After reading that document I feel even more > confused though because I always thought that hard commits should be less > freq

Re: Solr Cloud Bulk Indexing Questions

2014-01-23 Thread Software Dev
/search-lucene.com/?q=maxWriteMBPerSec&fc_project=Solr > > Otis > -- > Performance Monitoring * Log Analytics * Search Analytics > Solr & Elasticsearch Support * http://sematext.com/ > > > On Mon, Jan 20, 2014 at 4:00 PM, Software Dev >wrote: > > >

SolrCloudServer questions

2014-01-31 Thread Software Dev
Can someone clarify what the following options are: - updatesToLeaders - shutdownLBHttpSolrServer - parallelUpdates Also, I remember in older versions of Solr there was an efficient format that was used between SolrJ and Solr that is more compact. Does this still exist in the latest version of Solr

Disabling Commit/Auto-Commit (SolrCloud)

2014-01-31 Thread Software Dev
Is there a way to disable commit/hard-commit at runtime? For example, we usually have our hard commit and soft-commit set really low but when we do bulk indexing we would like to disable this to increase performance. If there isn't an easy way of doing this would simply pushing a new solrconfig t

Re: SolrCloudServer questions

2014-01-31 Thread Software Dev
shards in > parallel rather than with a single thread. Can really increase update > speed. Still not as powerful as using CloudSolrServer from multiple > threads, but a nice improvement nonetheless. > > > - Mark > > http://about.me/markrmiller > > > > > I&#

Re: SolrCloudServer questions

2014-02-01 Thread Software Dev
ds, or if > you need more fine grained responses, use the single add from multiple > threads (though bulk add can also be done via multiple threads if you > really want to try and push the max). > > - Mark > > http://about.me/markrmiller > > On Jan 31, 2014, at 3:50 PM, S

Re: SolrCloudServer questions

2014-02-01 Thread Software Dev
Also, if we are seeing a huge cpu spike on the leader when doing a bulk index, would changing any of the options help? On Sat, Feb 1, 2014 at 2:59 PM, Software Dev wrote: > Our use case is we have 3 indexing machines pulling off a kafka queue and > they are all sending individual u

How does Solr parse schema.xml?

2014-02-26 Thread Software Dev
Can anyone point me in the right direction? I'm trying to duplicate the functionality of the analysis request handler so we can wrap a service around it to return the terms given a string of text. We would like to read the same schema.xml file to configure the analyzer, tokenizer, etc but I can't se

Re: Does Solr flush to disk even before ramBufferSizeMB is hit?

2011-08-30 Thread roz dev
Thanks Shawn. If Solr writes this info to Disk as soon as possible (which is what I am seeing) then ramBuffer setting seems to be misleading. Anyone else has any thoughts on this? -Saroj On Mon, Aug 29, 2011 at 6:14 AM, Shawn Heisey wrote: > On 8/28/2011 11:18 PM, roz dev wrote: >

Re: DataImportHandler using new connection on each query

2011-09-02 Thread eks dev
I am not sure if current version has this, but DIH used to reload connections after some idle time if (currTime - connLastUsed > CONN_TIME_OUT) { synchronized (this) { Connection tmpConn = factory.call(); clos
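
The idle-timeout check quoted in this message is cut off, so the following is only a Python sketch of the general pattern it describes (replace a connection that has sat idle too long, reuse it otherwise), not the DataImportHandler code itself. The class name IdleRefreshingConnection, the injectable clock, and the 10-second default (taken from the follow-up messages in this thread) are assumptions for illustration.

```python
import threading
import time


class IdleRefreshingConnection:
    """Hands out one shared connection, replacing it after it has been idle too long.

    Illustrative only: 'factory' is any zero-argument callable returning a connection.
    """

    def __init__(self, factory, idle_timeout=10.0, clock=time.monotonic):
        self._factory = factory
        self._idle_timeout = idle_timeout
        self._clock = clock
        self._lock = threading.Lock()
        self._conn = factory()
        self._last_used = clock()

    def get(self):
        with self._lock:
            now = self._clock()
            # Only an *idle* connection is replaced; one that is used
            # continuously keeps refreshing _last_used and is never dropped.
            if now - self._last_used > self._idle_timeout:
                old = self._conn
                self._conn = self._factory()
                close = getattr(old, "close", None)
                if callable(close):
                    close()
            self._last_used = now
            return self._conn
```

This matches the distinction drawn later in the thread: a query that runs for hours keeps its connection, whereas a connection left idle for longer than the timeout is discarded on the next request.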

Re: DataImportHandler using new connection on each query

2011-09-02 Thread eks dev
take care, "running 10 hours" != "idling 10 seconds" and trying again. Those are different cases. It is not dropping *used* connections (good to know it works that good, thanks for reporting!), just not reusing connections more than 10 seconds idle On Fri, Sep 2, 2011 at 10:26 PM, Gora Mohanty

Re: DataImportHandler using new connection on each query

2011-09-02 Thread eks dev
watch out, "running 10 hours" != "idling 10 seconds" and trying again. Those are different cases. It is not dropping *used* connections (good to know it works that good, thanks for reporting!), just not reusing connections more than 10 seconds idle On Fri, Sep 2, 2011 at 10:26 PM, Gora Mohanty

Which Solr / Lucene directory for ramfs?

2011-09-16 Thread eks dev
probably stupid question, Which Directory implementation should be the best suited for index mounted on ramfs/tmpfs? I guess plain old FSDirectory, (or mmap/nio?)

what is the default value of omitNorms and termVectors in solr schema

2011-09-18 Thread roz dev
Hi As per this document, http://wiki.apache.org/solr/FieldOptionsByUseCase, omitNorms and termVectors have to be "explicitly" specified in some cases. I am wondering what is the default value of these settings if solr schema definition does not state them. *Example:* In above case, will Solr

cache invalidation in slaves

2011-09-20 Thread roz dev
Hi All Solr has different types of caches such as filterCache, queryResultCache and document Cache . I know that if a commit is done then a new searcher is opened and new caches are built. And, this makes sense. What happens when commits are happening on master and slaves are pulling all the delt

q and fq in solr 1.4.1

2011-09-20 Thread roz dev
Hi All I am sure that q vs fq question has been answered several times. But, I still have a question which I would like to know the answers for: if we have a solr query like this q=*&fq=field_1:XYZ&fq=field_2:ABC&sortBy=field_3+asc How does SolrIndexSearcher fire query in 1.4.1 Will it fire q
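
For reference, a request of the shape shown in this question can be built as in the Python sketch below. It rests on two assumptions that differ from the snippet as posted: the standard Solr sort parameter is named sort (the post writes sortBy), and the match-all query for the default Lucene parser is written *:*. The helper name build_filtered_query is hypothetical, and the sketch says nothing about how SolrIndexSearcher executes the filters internally.

```python
from urllib.parse import urlencode, parse_qs


def build_filtered_query(filters, sort_field, main_query="*:*"):
    """Build a query string with one main query, several fq filters and an ascending sort.

    build_filtered_query is an illustrative helper, not part of SolrJ or Solr.
    """
    params = [("q", main_query)]
    # Each filter becomes its own fq parameter, so Solr can cache each one separately.
    params.extend(("fq", f"{field}:{value}") for field, value in filters)
    params.append(("sort", f"{sort_field} asc"))
    return urlencode(params)


query_string = build_filtered_query([("field_1", "XYZ"), ("field_2", "ABC")], "field_3")
```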

Production Issue: SolrJ client throwing this error even though field type is not defined in schema

2011-09-21 Thread roz dev
Hi All We are getting this error in our Production Solr Setup. Message: Element type "t_sort" must be followed by either attribute specifications, ">" or "/>". Solr version is 1.4.1 Stack trace indicates that solr is returning malformed document. Caused by: org.apache.solr.client.solrj.SolrSer

Re: Production Issue: SolrJ client throwing - Element type must be followed by either attribute specifications, ">" or "/>".

2011-09-22 Thread roz dev
dev wrote: > Hi All > > We are getting this error in our Production Solr Setup. > > Message: Element type "t_sort" must be followed by either attribute > specifications, ">" or "/>". > Solr version is 1.4.1 > > Stack trace indic

Update ingest rate drops suddenly

2011-09-24 Thread eks dev
just looking for hints where to look for... We were testing single threaded ingest rate on solr, trunk version on atypical collection (a lot of small documents), and we noticed something we are not able to explain. Setup: We use defaults for index settings, windows 64 bit, jdk 7 U2. on SSD, machi

Re: Update ingest rate drops suddenly

2011-09-25 Thread eks dev
g locally > > Out of curiosity, how big is your ramBufferSizeMB and your -Xmx? > And on that 8-core box you have ~8 indexing threads going? > > Otis > > Sematext is Hiring -- http://sematext.com/about/jobs.html > > > > >> &g

Re: Update ingest rate drops suddenly

2011-09-26 Thread eks dev
Just to bring closure on this one, we were slurping data from the wrong DB (hardly desktop class machine)... Solr did not cough on 41Mio records @34k updates / sec., single threaded. Great! On Sat, Sep 24, 2011 at 9:18 PM, eks dev wrote: > just looking for hints where to look for... >

Re: Production Issue: SolrJ client throwing this error even though field type is not defined in schema

2011-09-30 Thread roz dev
t; > > > http://wiki.apache.org/solr/UsingMailingLists > > > > There's really not much to go on here. > > > > Best > > Erick > > > > On Wed, Sep 21, 2011 at 12:13 PM, roz dev wrote: > >> Hi All > >> > >> We are getting

Re: capacity planning

2011-10-11 Thread eks dev
Re. "I have little experience with VM servers for search." We had huge performance penalty on VMs, CPU was bottleneck. We couldn't freely run measurements to figure out what the problem really was (hosting was contracted by customer...), but it was something pretty scary, kind of 8-10 times slowe

Index format difference between 4.0 and 3.4

2011-11-14 Thread roz dev
Hi All, We are using Solr 1.4.1 in production and are considering an upgrade to newer version. It seems that Solr 3.x requires a complete rebuild of index as the format seems to have changed. Is Solr 4.0 index file format compatible with Solr 3.x format? Please advise. Thanks Saroj

codec="Pulsing" per field broken?

2011-12-11 Thread eks dev
on the latest trunk, my schema.xml with field type declaration containing //codec="Pulsing"// does not work any more (throws exception from FieldType). It used to work with approx. a month old trunk version. I didn't dig deeper, can be that the old schema.xml was broken and worked by accident. --

Re: codec="Pulsing" per field broken?

2011-12-11 Thread eks dev
Thanks Robert, I've missed LUCENE-3490... Awesome! On Sun, Dec 11, 2011 at 6:37 PM, Robert Muir wrote: > On Sun, Dec 11, 2011 at 11:34 AM, eks dev wrote: >> on the latest trunk, my schema.xml with field type declaration >> containing //codec="Pulsing"//

hot deploy of newer version of solr schema in production

2012-01-23 Thread roz dev
Hi All, I need community's feedback about deploying newer versions of solr schema into production while existing (older) schema is in use by applications. How do people perform these things? What has been the learning of people about this. Any thoughts are welcome. Thanks Saroj

Re: filter query from external list of Solr unique IDs

2010-10-16 Thread eks dev
if your index is read-only in production, can you add mapping unique_id-Lucene docId in your kv store and build filters externally? That would make unique Key obsolete in your production index, as you would work at lucene doc id level. That way, you offline the problem to update/optimize phase

Re: can we configure spellcheck to be invoked after request processing?

2013-03-04 Thread roz dev
gram Content Group > (615) 213-4311 > > > -Original Message- > From: roz dev [mailto:rozde...@gmail.com] > Sent: Thursday, February 28, 2013 6:33 PM > To: solr-user@lucene.apache.org > Subject: can we configure spellcheck to be invoked after request > processing? >

Can we manipulate termfreq to count as 1 for multiple matches?

2013-03-13 Thread roz dev
Hi All I am wondering if there is a way to alter term frequency of a certain field as 1, even if there are multiple matches in that document? Use Case is: Let's say that I have a document with 2 fields - Name and - Description And, there is a document with data like this Document_1 Name = Blu

Re: hot deploy of newer version of solr schema in production

2012-01-31 Thread roz dev
s huge and the reason for Solr upgrade or schema > change is to fix a bug, not to use new functionality. > > -- > Jan Høydahl, search solution architect > Cominvent AS - www.cominvent.com > Solr Training - www.solrtraining.com > > On 24. jan. 2012, at 01:51, roz dev wrote:

reader/searcher refresh after replication (commit)

2012-02-21 Thread eks dev
Hi all, I am a bit confused with IndexSearcher refresh lifecycles... In a master slave setup, I override postCommit listener on slave (solr trunk version) to read some user information stored in userCommitData on master -- @Override public final void postCommit() { // This returns "stale"

Re: reader/searcher refresh after replication (commit)

2012-02-21 Thread eks dev
licates can appear) are there any "IndexWriter" listeners around? Thanks again, eks. On Tue, Feb 21, 2012 at 8:03 PM, Mark Miller wrote: > Post commit calls are made before a new searcher is opened. > > Might be easier to try to hook in with a new searcher listener? >

Re: reader/searcher refresh after replication (commit)

2012-02-21 Thread eks dev
And drinks on me to those who decoupled implicit commit from close... this was a tricky trap On Tue, Feb 21, 2012 at 9:10 PM, eks dev wrote: > Thanks Mark, > Hmm, I would like to have this information asap, not to wait until the > first search gets executed (depends on user) . Is solr

Re: reader/searcher refresh after replication (commit)

2012-02-22 Thread eks dev
with the master. > > What are you expecting a BeforeCommitListener could do for you, if one > would exist? > > Kind regards, > Em > > On 21.02.2012 21:10, eks dev wrote: >> Thanks Mark, >> Hmm, I would like to have this information asap, not to wait until the

SnapPull failed :org.apache.solr.common.SolrException: Error opening new searcher

2012-02-22 Thread eks dev
We started observing strange failures from ReplicationHandler when we commit on master trunk version 4-5 days old. It works sometimes, and sometimes not; we didn't dig deeper yet. Looks like the real culprit hides behind: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is clos

Re: Unusually long data import time?

2012-02-22 Thread eks dev
Davon, you ought to try to update from many threads, (I do not know if DIH can do it, check it), but lucene does great job if fed from many update threads... depends where your time gets lost, but it is usually a) analysis chain or b) database if it is a) and your server has spare cpu-cores, you

dih and solr cloud

2012-02-22 Thread eks dev
out of curiosity, trying to see if new cloud features can replace what I use now... how is this (batch) update forwarding solved at cloud level? imagine simple one shard and one replica case, if I fire up DIH update, is this going to be replicated to replica shard? If yes, - is it going to be sen

Re: SnapPull failed :org.apache.solr.common.SolrException: Error opening new searcher

2012-02-22 Thread eks dev
te our commit point to the right dir >       solrCore.getUpdateHandler().commit(new CommitUpdateCommand(req, false)); > > That should allow the searcher that the following commit command prompts to > see the *new* IndexWriter. > > On Feb 22, 2012, at 10:56 AM, eks dev wrote: > >> W

Re: SnapPull failed :org.apache.solr.common.SolrException: Error opening new searcher

2012-02-23 Thread eks dev
thin. On Thu, Feb 23, 2012 at 8:47 AM, eks dev wrote: > thanks Mark, I will give it a go and report back... > > On Thu, Feb 23, 2012 at 1:31 AM, Mark Miller wrote: >> Looks like an issue around replication IndexWriter reboot, soft commits and >> hard commits. >> >>

Solr Cloud, Commits and Master/Slave configuration

2012-02-27 Thread roz dev
Hi All, I am trying to understand features of Solr Cloud, regarding commits and scaling. - If I am using Solr Cloud then do I need to explicitly call commit (hard-commit)? Or, a soft commit is okay and Solr Cloud will do the job of writing to disk? - Do We still need to use Master

Re: Solr Cloud, Commits and Master/Slave configuration

2012-02-28 Thread eks dev
used to. > > There aren't really masters/slaves in the old sense any more, so > you have to get out of that thought-mode (it's hard, I know). > > The code is under pretty active development, so any feedback is > valuable > > Best > Erick > > On M

Re: Solr Cloud, Commits and Master/Slave configuration

2012-03-01 Thread eks dev
that gets > flushed depending on the requests coming through and the buffer size. > > - Mark Miller > lucidimagination.com > > On Feb 28, 2012, at 3:38 AM, eks dev wrote: > >> SolrCloud is going to be great, NRT feature is really huge step >> forward, as well as centra

Re: Solr Design question on spatial search

2012-03-02 Thread Venu Dev
ation you have that restricts how far to expand the > search? > > Best > Erick > > On Thu, Mar 1, 2012 at 4:57 PM, Venu Gmail Dev > wrote: >> I don't think Spatial search will fully fit into this. I have 2 approaches >> in mind but I am not satisfied with ei

Re: [SoldCloud] Slow indexing

2012-03-04 Thread eks dev
hmm, looks like you are facing exactly the phenomenon I asked about. See my question here: http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/61326 On Sun, Mar 4, 2012 at 9:24 PM, Markus Jelsma wrote: > Hi, > > With auto-committing disabled we can now index many millions of documents in

Re: Solr 4.0 and production environments

2012-03-07 Thread eks dev
I am here on lucene as a user since the project started, even before solr came to life, many many years. And I was always using trunk version for pretty big customers, and *never* experienced some serious problems. The worst thing that can happen is to notice bug somewhere, and if you have some rea

Is there any performance cost of using lots of OR in the solr query

2012-04-04 Thread roz dev
Hi All, I am working on an application which makes a few solr calls to get the data. On the high level, We have a requirement like this - Make first call to Solr, to get the list of products which are children of a given category - Make 2nd solr call to get product documents based on a l

Re: How to do custom sorting in Solr?

2012-06-10 Thread roz dev
Hi All > > I have an index which contains a Catalog of Products and Categories, with > Solr 4.0 from trunk > > Data is organized like this: > > Category: Books > > Sub Category: Programming > > Products: > > Product # 1, Price: Regular Sort Order:1 > Product # 2, Price: Markdown, Sort Order:2 >

Re: How to do custom sorting in Solr?

2012-06-10 Thread roz dev
uot;products which are on markdown, are at > the bottom of the documents list" > > But in your examples, products on "markdown" are intermingled > > Best > Erick > > On Sun, Jun 10, 2012 at 3:36 AM, roz dev wrote: > > Hi All > > > >> &

Re: How to do custom sorting in Solr?

2012-06-10 Thread roz dev
ubt very much given your > problem description. > > So with a corpus that size, I'd "just try it'. > > Best > Erick > > On Sun, Jun 10, 2012 at 7:12 PM, roz dev wrote: > > Thanks Erik for your quick feedback > > > > When Products are assigned to

Re: Issue with field collapsing in solr 4 while performing distributed search

2012-06-11 Thread roz dev
I think that there is no way around doing custom logic in this case. If indexing process knows that documents have to be grouped then they better be together. -Saroj On Mon, Jun 11, 2012 at 6:37 AM, Nitesh Nandy wrote: > Martijn, > > How do we add a custom algorithm for distributing documents

SolrJ Question about Bad Request Root cause error

2011-01-11 Thread roz dev
Hi All We are using SolrJ client (v 1.4.1) to integrate with our solr search server. We notice that whenever SolrJ request does not match with Solr schema, we get Bad Request exception which makes sense. org.apache.solr.common.SolrException: Bad Request But, SolrJ Client does not provide any clu

Question about http://wiki.apache.org/solr/Deduplication

2011-03-24 Thread eks dev
Hi, Use case I am trying to figure out is about preserving IDs without re-indexing on duplicate, rather adding this new ID under list of document id "aliases". Example: Input collection: "id":1, "text":"dummy text 1", "signature":"A" "id":2, "text":"dummy text 1", "signature":"A" I add the first
