What exactly does proximity search mean?
A proximity search means searching for terms within a specified distance of each other.
E.g. search for documents in which "java" occurs within 3 words of "network":
field:"java network"~3
So the above query will match any document in which the positions of "java"
and "network" are within a distance of 3 of each other.
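As a rough illustration of the slop semantics (a plain-Python simplification that ignores term order and analysis — not Solr's actual implementation):

```python
def proximity_match(text, term_a, term_b, slop):
    """Simplified check: do term_a and term_b occur with at most `slop`
    intervening words between them? (Lucene's real slop is an edit
    distance over term positions and is order-sensitive.)"""
    tokens = text.lower().split()
    pos_a = [i for i, t in enumerate(tokens) if t == term_a]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b]
    # Match if any pair of occurrences is close enough.
    return any(abs(a - b) - 1 <= slop for a in pos_a for b in pos_b)

# "java" and "network" separated by two words -> matches with slop 3
print(proximity_match("java sockets and network code", "java", "network", 3))  # True
```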
Hi,
Kindly provide your inputs on the issue.
Thanks,
Modassar
On Mon, Feb 1, 2016 at 12:40 PM, Modassar Ather
wrote:
> Hi,
>
> Got following error during optimize of index on 2 nodes of 12 node
> cluster. Please let me know if the index can be recovered and how and what
> could be the reason?
Hello Ted.
We have a similar requirement to deploy Solr across 2 DCs.
In our case, the DCs are connected via fibre optic.
We managed to deploy a single SolrCloud cluster across multiple DCs without
any major issue (see links below).
The whole set-up is described in the following articles:
-
htt
Hello Harry ,
sorry for the delayed reply. I took another approach, giving the user a
different workflow, since I did not have a solution for this. But your option
looks great; I will try it out.
I agree. If the system updates synchronously, then you are in two-phase commit
land. If you have a persistent store that each index can track, then things are
good.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 9, 2016, at 7:37 PM, Shawn Heis
On 2/9/2016 5:48 PM, Walter Underwood wrote:
> Updating two systems in parallel gets into two-phase commit, instantly. So
> you need a persistent pool of updates that both clusters pull from.
My indexing system does exactly what I have suggested for tedsolr -- it
updates multiple copies of my ind
This issue might be similar to what Apple presented at the closing
keynote at Solr Revolution 2014. I believe they used a queue at each
of the sites feeding into Solr. The presentation should be online.
Regards,
Alex.
Newsletter and resources for Solr beginners and intermediates:
http://www
My impulse would be to _not_ run Tika in its own JVM, just catch any
exceptions in my code and "do the right thing". I'm not sure I see any
real benefit in yet another JVM.
FWIW,
Erick
On Tue, Feb 9, 2016 at 6:22 PM, Allison, Timothy B. wrote:
> I have one answer here [0], but I'd be interested
I have one answer here [0], but I'd be interested to hear what Solr
users/devs/integrators have experienced on this topic.
[0]
http://mail-archives.apache.org/mod_mbox/tika-user/201602.mbox/%3CCY1PR09MB0795EAED947B53965BC86874C7D70%40CY1PR09MB0795.namprd09.prod.outlook.com%3E
-Original Me
Thanks for your replies and suggestions!
Why do I store all events related to a session under one doc?
Each session can have about 500 total entries (events) corresponding to it.
So when I try to retrieve a session's info it can come back with around 500
records. If it is this compounded one doc per sessi
1 million documents is not considered big for Solr. How much RAM does your
machine have?
Regards,
Edwin
On 8 February 2016 at 23:45, Susheel Kumar wrote:
> 1 million document shouldn't have any issues at all. Something else is
> wrong with your hw/system configuration.
>
> Thanks,
> Susheel
>
>
Updating two systems in parallel gets into two-phase commit, instantly. So you
need a persistent pool of updates that both clusters pull from.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 9, 2016, at 4:15 PM, Shawn Heisey wrote:
>
> On 2/9/2
On 2/9/2016 1:43 PM, tedsolr wrote:
> I expect that rsync can be used initially to copy the collection data
> folders and the zookeeper data and transaction log folders. So after
> verifying Solr/ZK is functional after the install, shut it down and perform
> the copy. This may sound slow but my pro
Thank you Erick and Alex.
My main question is about a long-running process using Tika in the same JVM
as my application. I'm running my file-system-crawler in its own JVM (not
Solr's). On the Tika mailing list, it is suggested to run Tika's code in its
own JVM and invoke it from my file-system-crawl
Making two indexing calls, one to each, works until one system is not
available. Then they are out of sync.
You might want to put the updates into a persistent message queue, then have
both systems index from that queue.
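A toy sketch of that pattern (file-backed Python, purely illustrative — in practice you would use something like Kafka or ActiveMQ, and the offsets would need to be persisted too): each cluster keeps its own read offset into a durable log of updates, so a cluster that was down simply catches up when it returns.

```python
import json

class UpdateLog:
    """Append-only update log; each consumer (cluster) tracks its own
    offset, so a cluster that goes down can catch up later."""
    def __init__(self, path):
        self.path = path
        self.offsets = {}          # consumer name -> next index to read
        open(path, "a").close()    # ensure the log file exists

    def publish(self, doc):
        with open(self.path, "a") as f:
            f.write(json.dumps(doc) + "\n")

    def poll(self, consumer):
        """Return all updates this consumer has not yet seen."""
        start = self.offsets.get(consumer, 0)
        with open(self.path) as f:
            lines = f.readlines()
        self.offsets[consumer] = len(lines)
        return [json.loads(line) for line in lines[start:]]
```

Both clusters poll independently; neither one's outage blocks the other, which is exactly what avoids the two-phase-commit problem.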
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood
So as I understand your use case, it's effectively logging actions within a
user session. Why do you have to do the update in NRT? Why not just log
all the user session events (with some unique key, and ensuring the session
id is in the document somewhere), then when you want to do the query, you
j
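For example, with one small doc per event (field names here are hypothetical), fetching a whole session at query time is just a filter and a sort:

```
q=*:*&fq=session_id:abc123&sort=timestamp asc&rows=500
```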
Here's a writeup that should help
https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
On Tue, Feb 9, 2016 at 2:49 PM, Alexandre Rafalovitch
wrote:
> Solr uses Tika directly. And not in the most efficient way. It is
> there mostly for convenience rather than performance.
>
> So, for p
Solr uses Tika directly. And not in the most efficient way. It is
there mostly for convenience rather than performance.
So, for performance, Solr's recommendation is also to run Tika
separately and only send Solr the processed documents.
Regards,
Alex.
Newsletter and resources for Solr beg
Hi folks,
I'm writing a file-system-crawler that will index files. The file system
is going to be very busy and I anticipate on average 10 new updates per
min. My application checks for new or updated files once every 1 min. I
use Tika to extract the raw text off those files and send them over t
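A skeleton of that polling loop (Python sketch; `extract` and `post` are placeholder callables — e.g. a call out to a standalone Tika process and an HTTP/SolrJ post — not real APIs from this thread):

```python
import os

def find_updates(root, since):
    """Return files under `root` modified after timestamp `since`."""
    updated = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > since:
                updated.append(path)
    return updated

def crawl_once(root, since, extract, post):
    """One polling pass: extract text from each changed file and post it.
    `extract` and `post` are injected so extraction can run out of
    process (e.g. a standalone Tika server), per the advice in this thread."""
    for path in find_updates(root, since):
        try:
            post({"id": path, "content": extract(path)})
        except Exception as exc:
            # A pathological file (or a Tika crash) must not kill the crawler.
            print(f"skipping {path}: {exc}")
```

Isolating extraction behind `extract` is what lets a Tika hang or crash be contained without taking the crawler (or Solr's JVM) down with it.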
There is a Cross Datacenter replication feature in the works - not sure
of its status.
In lieu of that, I'd simply have two copies of your indexing code and
index everything simultaneously into both clusters.
There is, of course, a risk that both get out of sync, so you might want
to find some ways t
I have a Solr Cloud cluster (v5.2.1) using a Zookeeper ensemble in my primary
data center. I am now trying to plan for disaster recovery with an available
warm site. I have read (many times) the disaster recovery section in the
Apache ref guide. I suppose I don't fully understand it.
What I'd like
Shahzad - As Shawn mentioned, you can get a lot of input from the folks who
are using joins in SolrCloud if you start a new thread, and I would suggest
taking a look at Solr Streaming expressions and the Parallel SQL Interface,
which cover joining use cases as well.
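As a flavour of what those expressions look like — the collection and field names below are made up for illustration — a streaming join has roughly this shape:

```
innerJoin(
  search(employees, q="*:*", fl="emp_id,name,dept_id", sort="dept_id asc"),
  search(departments, q="*:*", fl="dept_id,dept_name", sort="dept_id asc"),
  on="dept_id"
)
```

Both streams must be sorted on the join key, and the join is performed by streaming rather than in a single core, which is what makes it workable in a distributed setup.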
Thanks,
Susheel
On Tue, Feb 9, 2016
Bear in mind that Lucene is optimised towards high read, low write.
That is, it puts in a lot of effort at write time to make reading
efficient. It sounds like you are going to be doing far more writing
than reading, and I wonder whether you are necessarily choosing the
right tool for the job.
Ho
This has been a long-standing issue; Hoss is doing some current work on it, see:
https://issues.apache.org/jira/browse/SOLR-445
But the short form is "no, not yet".
Best,
Erick
On Tue, Feb 9, 2016 at 8:19 AM, Debraj Manna wrote:
> Hi,
>
>
>
> I have a Document Centric Versioning Constraints adde
Hi,
I have Document Centric Versioning Constraints added in the Solr schema:
<processor class="solr.DocBasedVersionConstraintsProcessorFactory">
  <bool name="ignoreOldUpdates">false</bool>
  <str name="versionField">doc_version</str>
</processor>
I am adding multiple documents to Solr in a single call using SolrJ 5.2.
The code fragment looks something like this:
try {
UpdateResponse resp = solrClient.add(docs.getDocCollectio
On 2/9/2016 7:01 AM, Daniel Pool wrote:
> Did you ever get to the bottom of this issue? I'm encountering exactly the
> same problem with haproxy 1.6.2; health checks throwing occasional errors and
> the connection being closed by haproxy.
Your message did not include any quotes from the original
Hi,
Thanks for all your suggestions. I took some time to get the details to be
more accurate. Please find what I have gathered:-
My data being indexed is something like this.
I am basically capturing all data related to a user session.
Inside a session I have categorized my actions like actionA, a
On 2/8/2016 1:09 PM, Kelly, Frank wrote:
> We are running a small SolrCloud instance on AWS
>
> Solr : Version 5.3.1
> ZooKeeper: Version 3.4.6
>
> 3 x ZooKeeper nodes (with higher limits and timeouts due to being on AWS)
> 3 x Solr Nodes (8 GB of memory each – 2 collections with 3 shards for
> eac
On Tue, Feb 9, 2016 at 10:02 AM, Markus Jelsma
wrote:
> Nice! Are the aggregations also going to be pluggable? Reading the ticket, i
> would assume it is going to be pluggable.
Yep.
-Yonik
> Thanks,
> Markus
>
> -Original message-
>> From:Yonik Seeley
>> Sent: Tuesday 9th February 20
Nice! Are the aggregations also going to be pluggable? Reading the ticket, I
would assume it is going to be pluggable.
Thanks,
Markus
-Original message-
> From:Yonik Seeley
> Sent: Tuesday 9th February 2016 15:25
> To: solr-user@lucene.apache.org
> Subject: Re: Custom JSON facet funct
Nathan,
Did you ever get to the bottom of this issue? I'm encountering exactly the same
problem with haproxy 1.6.2; health checks throwing occasional errors and the
connection being closed by haproxy.
Daniel Pool
On Tue, Feb 9, 2016 at 7:10 AM, Markus Jelsma
wrote:
> Hi - i must have missing something but is it possible to declare custom JSON
> facet functions in solrconfig.xml? Just like we would do with request
> handlers or search components?
Yes, but it will probably change:
https://issues.apache.o
On 2/8/2016 10:10 PM, Shahzad Masud wrote:
> Due to distributed search feature, I might not be able to run
> SolrCloud. I would appreciate, if you please share that way of setting
> solr home for a specific context in Jetty-Solr. Its good to seek more
> information for comparison purposes. Do you t
Susheel, thank you for asking. I am using joins across cores (employee,
department, servicetickets), which aren't supported by SolrCloud, last time I
checked. Not sure if this (advanced distributed request option) was present
in 4.10. Do you think I am missing something here?
Shahzad
On Tue, Feb 9,
Shahzad - I am curious which features of distributed search stop you from
running SolrCloud. Using DS, you would be able to search across cores or
collections.
https://cwiki.apache.org/confluence/display/solr/Advanced+Distributed+Request+Options
Thanks,
Susheel
On Tue, Feb 9, 2016 at 12:10 AM, Shahzad
Hi - I must have missed something, but is it possible to declare custom JSON
facet functions in solrconfig.xml? Just like we would do with request handlers
or search components?
Thanks,
Markus
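For reference, the facet functions themselves are normally supplied per request rather than declared up front; a typical json.facet parameter (field name hypothetical) looks like:

```
json.facet={ "avg_price" : "avg(price)" }
```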
amazing, thanks!
--
*John Blythe*
Product Manager & Lead Developer
251.605.3071 | j...@curvolabs.com
www.curvolabs.com
58 Adams Ave
Evansville, IN 47713
On Tue, Feb 9, 2016 at 6:04 AM, Vincenzo D'Amore wrote:
> Hi,
>
> I did a chrome extension:
>
>
> https://chrome.google.com/webstore/detail
Hi,
I did a chrome extension:
https://chrome.google.com/webstore/detail/solr-query-debugger/gmpkeiamnmccifccnbfljffkcnacmmdl
Hope this helps,
Vincenzo
On Tue, Feb 9, 2016 at 11:39 AM, John Blythe wrote:
> that's it!
>
> and doug is the one from back in the day :)
>
> thanks guys
>
> --
> *J
that's it!
and doug is the one from back in the day :)
thanks guys
--
*John Blythe*
Product Manager & Lead Developer
251.605.3071 | j...@curvolabs.com
www.curvolabs.com
58 Adams Ave
Evansville, IN 47713
On Mon, Feb 8, 2016 at 3:08 PM, Toke Eskildsen
wrote:
> John Blythe wrote:
> > last ye
Hi Joel,
I saw your response this morning, and have created an issue, SOLR-8664,
and linked it to SOLR-8125. As context, I included my original question
and your answer, as a comment.
Cheers
Akiel
From: Joel Bernstein
To: solr-user@lucene.apache.org
Date: 29/01/2016 13:46
Subject: