Anyone?
On Mon, Mar 17, 2014 at 12:03 PM, Salman Akram <
salman.ak...@northbaysolutions.net> wrote:
> Below is one of the sample slow query that takes mins!
>
> ((stock or share*) w/10 (sale or sell* or sold or bought or buy* or
> purchase* or repurchase*)) w/10 (executive or director)
>
> If a
We do have a couple of commodity SSDs already and they perform well. However,
our user queries are very complex and quite a few of them take over a minute,
so we really had to do something about it.
Comparing this beast with putting the whole index in RAM, the beast still
seemed the better option. Also we are
Thanks for the info. The articles were really useful, but it still seems I
have to do my own testing to find the right page size. I thought that for
large indexes there would already be some tests done in the Solr community.
Side note: We are heavily using Microsoft technology (.NET etc.) for
development, so by
Sorry guys, I spoke too fast. I looked at the code again. No, it doesn't
correlate with commits at all. I was mistaken.
On Wed, Mar 19, 2014 at 10:06 AM, Chris W wrote:
> Thanks, Shawn and Shalin
>
> How does the frequency of commit affect zookeeper?
>
>
> Thanks
>
>
> On Tue, Mar 18, 2014 at 9:12 PM, Shalin Shekhar Mangar <
> shalinman...@gmail.com> wrote:
Shalin, "correlated with how frequently you call commit" is it soft commit
or hard commit? , I guess it should be later one.
just curious what data it update to zookeeper during commit
On Tue, Mar 18, 2014 at 9:12 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> SolrCloud will update Zookeeper on state changes (node goes to
> recovery, comes back up etc) or for leader election and during
> collection API commands. It doesn't correlate directly with indexing
> but is correlated with how frequently you call commit.
Thanks, Shawn and Shalin
How does the frequency of commit affect zookeeper?
Thanks
On Tue, Mar 18, 2014 at 9:12 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> SolrCloud will update Zookeeper on state changes (node goes to
> recovery, comes back up etc) or for leader election and during
> collection API commands. It doesn't correlate directly with indexing
> but is correlated with how frequently you call commit.
SolrCloud will update Zookeeper on state changes (node goes to
recovery, comes back up etc) or for leader election and during
collection API commands. It doesn't correlate directly with indexing
but is correlated with how frequently you call commit.
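For example, a minimal SolrJ sketch (not from this thread; the URL and id
are made up) of using commitWithin so adds get folded into periodic commits
instead of one explicit commit per batch:

    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class CommitWithinSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            // Fold this add into a commit within 60 seconds rather than
            // committing per batch; fewer commits, less cluster churn.
            server.add(doc, 60000);
            server.shutdown();
        }
    }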
On Wed, Mar 19, 2014 at 5:46 AM, Shawn Heisey wrote:
Hi,
I think you probably want to split giant documents because you / your users
probably want to be able to find smaller sections of those big docs that
are best matches to their queries. Imagine querying War and Peace. Almost
any regular word you query for will produce a match. Yes, you may w
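A minimal SolrJ sketch of that splitting approach (file name, chunk size,
and field names are all assumptions; splitting on page or chapter boundaries
would beat fixed character offsets):

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class SplitBigDocSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            String text = new String(Files.readAllBytes(Paths.get("war_and_peace.txt")), "UTF-8");
            int chunk = 100000; // characters per section-level document
            List<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
            for (int i = 0, part = 0; i < text.length(); i += chunk, part++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "book-42_part" + part);  // one Solr doc per section
                doc.addField("book_id_s", "book-42");       // ties sections to the source file
                doc.addField("body_txt", text.substring(i, Math.min(i + chunk, text.length())));
                docs.add(doc);
            }
            server.add(docs);
            server.commit();
            server.shutdown();
        }
    }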
Hi Solr Users,
I'm looking for advice on best practices when indexing large documents
(hundreds of MB or even 1-2 GB text files). I've been hunting around on
Google and the mailing list, and have found some suggestions of splitting
the logical document up into multiple Solr documents. However, I h
Hi Lajos,
Could this be due to the heavy query-time processing chain associated with
the TextField? You might also check out AnalyzingInfixLookupFactory if the
suggestion entries are a bit long (this suggester will give matches even
if the query matches a term in the middle of a suggestion entry).
I
Hi Lajos,
Can you elaborate on the "get the overflow when using a text field" part?
The new SuggestComponent should work just as well for DocumentDictionary.
Thanks
Areek
On Mon, Mar 17, 2014 at 6:05 PM, Lajos wrote:
> Hi Steve,
>
> I've posted previously about a nice Stackoverflow exception
On 3/18/2014 5:46 PM, Chris W wrote:
I am running a 3-node ZooKeeper 3.4.5 quorum. I am running into issues
with ZooKeeper transaction logs:
[myid:2] - ERROR [main:QuorumPeer@453] - Unable to load database on disk
java.io.IOException: Unreasonable length = 1048587
at
org.apache.jute.BinaryInpu
I am running a 3-node ZooKeeper 3.4.5 quorum. I am running into issues
with ZooKeeper transaction logs:
[myid:2] - ERROR [main:QuorumPeer@453] - Unable to load database on disk
java.io.IOException: Unreasonable length = 1048587
at
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.j
Hi all,
I have some questions re shards.tolerant=true and timeAllowed=xxx.
I have seen situations where shards.tolerant=true works: if one of the
shards specified in a query is dead, shards.tolerant seems to work and I get
results from the remaining shards.
However, if one of the shards goes down du
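For what it's worth, a minimal SolrJ sketch (URL illustrative) showing where
the two knobs sit on a query:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class TolerantQuerySketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrQuery q = new SolrQuery("*:*");
            q.set("shards.tolerant", true);  // accept partial results if a shard is dead
            q.setTimeAllowed(5000);          // give up searching after 5s; results may be partial
            QueryResponse rsp = server.query(q);
            System.out.println("numFound: " + rsp.getResults().getNumFound());
            server.shutdown();
        }
    }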
Did you change the schema at all?
No
Did you upgrade Solr from a previous version with the same index?
No
This was fresh install from the website.
Ran "ant run-example"
Killed that instance
Copied Example to Node1
Copied Example to Node2
Switched into Node1
java -Dbootstrap_confdir=solr/collect
Hi Steve,
This feature makes sense for us because we don't have write access in production.
Anyway, I'll write a script to push config file updates directly to ZooKeeper and
reload the collection. But it's always simpler when it's already integrated
in an admin tool.
Thank you for your time.
On 3/18/2014 3:51 PM, AJ Lemke wrote:
I have a strange issue with my local SOLR install.
I have a search that sorts on a boolean field. This search is pulling the following
error: "java.lang.String cannot be cast to org.apache.lucene.util.BytesRef".
The search is over the dummy data that is inc
Joel,
I had a discussion with you earlier about inconsistent ngroups numbers,
where you suggested using the composite id to make sure that identical
(ADSKDedup) fields end up in the same shard.
Here's the thread -->
http://lucene.472066.n3.nabble.com/SolrCloud-Result-Grouping-vs-Collaps
Hello all!
I have a strange issue with my local SOLR install.
I have a search that sorts on a boolean field. This search is pulling the
following error: "java.lang.String cannot be cast to
org.apache.lucene.util.BytesRef".
The search is over the dummy data that is included in the exampledocs.
Hi Francois,
The config file editing functionality was pulled out of Solr before the 4.7
release; what remains is a read-only config directory browser/file viewer.
May I ask why you thought the config file editing functionality was in 4.7?
Steve
On Mar 18, 2014, at 4:39 PM, Francois Perron wrote:
Salman Akram [salman.ak...@northbaysolutions.net] wrote:
[Hundreds of GB index]
> http://www.storagereview.com/micron_p420m_enterprise_pcie_ssd_review
May I ask why you have chosen a drive with such a high speed and matching cost?
We have some years of experience with using SSDs for search at w
Hi,
I installed the latest version of Solr (4.7.0) and I want to try the new
functionality to edit config files in the Admin UI. But when I click on a
file, no edit box appears!
This is the version info:
* solr-spec: 4.7.0
* solr-impl: 4.7.0 1570806 - simon - 2014-02-22 08:36:23
* luce
Hi Shamik,
I see that you are using distributed search. With the
CollapsingQParserPlugin you need to have all the documents that are in the
same group on the same shard.
Is that the way you have the documents indexed?
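For reference, a minimal sketch (ids are made up, and it assumes the
collection uses the default compositeId router) of routing a group to one
shard via the id prefix:

    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class CompositeIdRoutingSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrInputDocument doc = new SolrInputDocument();
            // Everything sharing the "dedupKey123!" prefix hashes to the same
            // shard, so a whole collapse group stays co-located.
            doc.addField("id", "dedupKey123!doc-1");
            doc.addField("ADSKDedup", "dedupKey123");
            server.add(doc);
            server.commit();
            server.shutdown();
        }
    }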
Joel
Joel Bernstein
Search Engineer at Heliosearch
On Mon, Mar 17, 2014 at
Varun,
You could use a function query involving “min” with a comma-separated list
of geodist clauses.
See https://cwiki.apache.org/confluence/display/solr/Spatial+Search
“Boost Nearest Results”. You’d replace the geodist() in there with
min(geodist(45.15,-93.85),geodist(50.2,22.3),…) (etc.)
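Sketching that in SolrJ (just one way to wire it in, here as an additive
edismax bf; the sfield name "store" and the recip constants are assumptions):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;

    public class NearestOfManyPointsSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrQuery q = new SolrQuery("*:*");
            q.set("defType", "edismax");
            q.set("sfield", "store");  // location field used by geodist()
            // Distance to the nearest of the two points drives the boost.
            q.set("bf", "recip(min(geodist(45.15,-93.85),geodist(50.2,22.3)),1,1000,1000)");
            server.query(q);
            server.shutdown();
        }
    }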
~
I wonder whether this is a known bug. In previous SolrCloud versions, 4.4
or maybe 4.5, an explicit optimize() without any parameters usually took
2 minutes for a 32-core cluster.
However, in 4.6.1, the same call took about 1 hour. Checking the index
modification time for each core shows 2 m
Thanks Joel - I decided on another route. Since I was almost always grouping,
I am trying another model where we store the data with fewer rows and a few
multivalued fields.
Thanks,
Indeed, the subject line was misleading.
Then I will file a new improvement request for "block atomic update
support".
On Tue, Mar 18, 2014 at 2:08 PM, Jack Krupansky wrote:
> That's a reasonable request and worth a Jira, but different from what you
> have specified in your subject lin
I disabled softCommit and tried to run another indexing process.
Now I see no jetty EofException and no latency peaks.
I also noticed that when I had softCommit every 10 minutes, I also saw
spikes in the major GC (I use CMS) to around 9-10k.
Any idea?
Shawn Heisey-4 wrote
> On 3/17/2014 7:07
On 3/18/2014 3:26 PM, Martin de Vries wrote:
Martin, I’ve committed the SOLR-5875 fix, including to the
lucene_solr_4_7 branch.
Any chance you could test the fix?
Hi Steve,
I'm very happy you found the bug. We are running the version from SVN on
one server and it's already running fine for 5
On 3/18/2014 8:37 AM, Erick Erickson wrote:
> It sounds like you already understand mmap. Even so you might be
> interested in this excellent writeup of MMapDirectory and Lucene by
> Uwe: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
There is some actual bad memory report
Avishai:
It sounds like you already understand mmap. Even so you might be
interested in this excellent writeup of MMapDirectory and Lucene by
Uwe: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
Best,
Erick
On Tue, Mar 18, 2014 at 7:23 AM, Avishai Ish-Shalom
wrote:
> aha
Martin, I’ve committed the SOLR-5875 fix, including to the
lucene_solr_4_7 branch.
Any chance you could test the fix?
Hi Steve,
I'm very happy you found the bug. We are running the version from SVN
on one server and it's already running fine for 5 hours. If it's still
stable tomorrow then w
Done, thanks!
On Tue, Mar 18, 2014 at 3:54 AM, Anders Gustafsson
wrote:
> Yes, please. My Wiki ID is Anders Gustafsson
>
>> But yes, please, add the howto to Wiki. You will need to get your
>> account whitelisted first (due to spammers), so send a separate email
>> with your Apache wiki id and so
Hello,
What is the difference between setting parameters via SolrQuery vs
ModifiableSolrParams? If there is a difference, is there a preferred
choice? I'm using Solr 4.6.1.
SolrQuery query = new SolrQuery();
query.setParam("wt", "json");
ModifiableSolrParams params = new ModifiableSolrParams()
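For what it's worth, a minimal sketch of how the two relate: SolrQuery
extends ModifiableSolrParams, so the difference is only the typed
convenience setters.

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class ParamsVsQuerySketch {
        public static void main(String[] args) {
            // Plain params: every setting is a raw key/value pair.
            ModifiableSolrParams params = new ModifiableSolrParams();
            params.set("q", "*:*");
            params.set("wt", "json");

            // SolrQuery is-a ModifiableSolrParams with convenience setters.
            SolrQuery query = new SolrQuery();
            query.setQuery("*:*");         // equivalent to set("q", "*:*")
            query.setParam("wt", "json");

            ModifiableSolrParams upcast = query;  // compiles: same type underneath
            System.out.println(upcast.get("q"));  // prints *:*
        }
    }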
Aha! mmap explains it. Thank you.
On Tue, Mar 18, 2014 at 3:11 PM, Shawn Heisey wrote:
> On 3/18/2014 5:30 AM, Avishai Ish-Shalom wrote:
> > My Solr instances are configured with 10GB heap (Xmx) but Linux shows
> > a resident size of 16-20GB. Even with thread stack and permgen taken into
> > acco
On 3/18/2014 7:39 AM, Salman Akram wrote:
> This SSD default size seems to be 4K not 16K (as can be seen below).
>
> Bytes Per Sector : 512
> Bytes Per Physical Sector : 4096
> Bytes Per Cluster : 4096
> Bytes Per FileRecord Segment: 1024
The *sector* size o
Yes, I did.
I have also tried disabling security and removing the '#' from the masterURL.
... but still get, "Master at: http://:9081/solr/collection1 is not
available. Index fetch failed."
Thanks
-Original Message-
From: Doug Turnbull [mailto:dturnb...@opensourceconnections.com]
Sent
On 3/18/2014 7:48 AM, Cynthia Park wrote:
> What is the difference between setting parameters via SolrQuery vs
> ModifiableSolrParams? If there is a difference, is there a preferred
> choice? I'm using Solr 4.6.1.
>
> SolrQuery query = new SolrQuery();
> query.setParam("wt", "json");
>
> Modifi
On 3/18/2014 7:18 AM, david.dav...@correo.aeat.es wrote:
> Yes, but if I use enableLazyFieldLoading=true and my queries only request
> very small fields like ID, the DocumentCache shouldn't grow, although my
> stored fields are very big. Am I wrong?
Since Solr 4.1, stored fields are compressed.
Hi all,
I have a field that contains dates (it has date type) and I would like
to make a hierarchical (pivot) facet based on that field.
So I would like to have something like this:
date_of_creation:
|__ 2014
|   |__ January
|   |   |__ 01
|   |   |__ 02
|   |   |__ 14
|
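One common workaround (a sketch under assumptions; the *_i component fields
are made-up names you add yourself) is to denormalize the date into one
field per level at index time and pivot over them, since facet.pivot works
on discrete field values:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class DatePivotSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

            // Index time: store each level of the hierarchy separately.
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "1");
            doc.addField("date_of_creation", "2014-01-14T00:00:00Z");
            doc.addField("created_year_i", 2014);
            doc.addField("created_month_i", 1);
            doc.addField("created_day_i", 14);
            server.add(doc);
            server.commit();

            // Query time: pivot year -> month -> day to get the tree above.
            SolrQuery q = new SolrQuery("*:*");
            q.setFacet(true);
            q.set("facet.pivot", "created_year_i,created_month_i,created_day_i");
            server.query(q);
            server.shutdown();
        }
    }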
This SSD default size seems to be 4K not 16K (as can be seen below).
Bytes Per Sector : 512
Bytes Per Physical Sector : 4096
Bytes Per Cluster : 4096
Bytes Per FileRecord Segment: 1024
I will go through the articles you sent. Thanks
On Tue, Mar 18, 2014 at
On 3/18/2014 7:12 AM, Salman Akram wrote:
> Is there a rule of thumb for ideal block size for SSDs for large indexes
> (in hundreds of GBs)? Read performance is of top importance for us and we
> can sacrifice the space a little...
>
> This is the one we just got and wanted to see if there are any
Hi Miguel,
Yes, but if I use enableLazyFieldLoading=true and my queries only request
very small fields like ID, the DocumentCache shouldn't grow, although my
stored fields are very big. Am I wrong?
Best regards,
David Dávila Atienza
AEAT - Departamento de Informática Tributaria
Subdirección de
All,
Is there a rule of thumb for ideal block size for SSDs for large indexes
(in hundreds of GBs)? Read performance is of top importance for us and we
can sacrifice the space a little...
This is the one we just got and wanted to see if there are any test results
out there
http://www.storagerevie
Hi David
If you use lazy field loading (enableLazyFieldLoading=true), the
documentCache functionality is somewhat limited. This means that the
document stored in the documentCache will contain only those fields
that were passed to the fl parameter.
documentCache requires memory, the
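To make the fl interaction concrete, a minimal SolrJ sketch (URL
illustrative) of a query that only ever asks for id, so the large stored
fields are never loaded:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;

    public class LazyFieldLoadingSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrQuery q = new SolrQuery("*:*");
            // With enableLazyFieldLoading=true, fields outside fl stay on
            // disk; the cached document holds only what was requested.
            q.setFields("id");
            server.query(q);
            server.shutdown();
        }
    }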
On 3/18/2014 5:30 AM, Avishai Ish-Shalom wrote:
> My Solr instances are configured with 10GB heap (Xmx) but Linux shows
> a resident size of 16-20GB. Even with thread stack and permgen taken into
> account I'm still far off from these numbers. Could it be that JVM IO
> buffers take so much space? doe
That's a reasonable request and worth a Jira, but different from what you
have specified in your subject line: "re-indexing a single document" - the
entire block needs to be re-indexed.
I suppose people might want a "block atomic update" - where multiple child
documents as well as the parent d
How large is your index on disk? Solr memory maps the index into
memory. Thus the virtual memory used will often be quite large. Your
numbers don't sound inconceivable.
A good reference point is Grant Ingersoll's blog post on searchhub:
http://searchhub.org/2011/09/14/estimating-memory-and-storage
Hi,
My Solr instances are configured with 10GB heap (Xmx) but Linux shows a
resident size of 16-20GB. Even with thread stack and permgen taken into
account I'm still far off from these numbers. Could it be that JVM IO
buffers take so much space? Does Lucene use JNI/JNA memory allocations?
Yes, please. My Wiki ID is Anders Gustafsson
> But yes, please, add the howto to Wiki. You will need to get your
> account whitelisted first (due to spammers), so send a separate email
> with your Apache wiki id and somebody will unlock you for editing.
--
Anders Gustafsson
Engineer, CNI, CNE6,
The metadata fields can be all sorts of strange, including spaces and
other strange characters. So, often, there is some issue with mapping.
But yes, please, add the howto to Wiki. You will need to get your
account whitelisted first (due to spammers), so send a separate email
with your Apache wiki i
Thanks again. I already had the Tika jars, but not the command-line one,
so I downloaded 1.5 and ran it against the docx and found:
So the name is prefixed. Does that mean that I should add it prefixed
in the conf files as well? I.e.:
Yes. Did that and now it works. Guess I should take the time
You can just download Tika from the Apache site; it's a separate product
and has a command-line interface.
Or use the Solr extract handler: go through the Solr tutorial, it explains
it. https://lucene.apache.org/solr/4_7_0/tutorial.html
Specifically, http://wiki.apache.org/solr/ExtractingRequestHandler and
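If it helps, a minimal SolrJ sketch of sending one file to the extract
handler (file name and mappings are illustrative; uprefix is what parks
unmapped Tika metadata in the ignored_* dynamic field):

    import java.io.File;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

    public class ExtractHandlerSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
            req.addFile(new File("sample.docx"),
                "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
            req.setParam("literal.id", "doc-1");
            req.setParam("uprefix", "ignored_");   // catch-all for unmapped metadata
            req.setParam("fmap.content", "text");  // map the extracted body to "text"
            req.setParam("commit", "true");
            server.request(req);
            server.shutdown();
        }
    }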
Thanks for the quick reply. I am a bit of a newb when it comes to Solr, Lux and
Tika, so I would appreciate it if you could give me some quick pointers on how
to use/call Tika directly and/or how to send one file directly and store the
dynamic field.
--
Anders Gustafsson
Engineer, CNI, CNE6, ASE
Have you tried just using Tika directly and seeing what gets output?
Maybe it is all prefixed somehow. Or sending one file as a sample
directly to the extract handler and temporarily storing the ignored_*
dynamicField to see what actually happens?
Basically, check what is there before trying to fi
solr-spec 4.6.1
lucene-spec 4.6.0
lux-appserver 1.1.0
tika 1.4
poi 3.9
Hi!
I set it up, pretty much following the instructions at
http://www.codewrecks.com/blog/index.php/2013/05/25/import-folder-of-documents-with-apache-solr-4-0-and-tika/
Problem is that I cannot seem to import custom propertie
Thanks Jack,
I understand that updating a single document in a block is currently not
supported.
But atomic updates to a single document do not have to be in conflict
with block joins.
If I got it right from the documentation:
currently, if a document is atomically updated, SOLR finds the stor
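For context, a minimal SolrJ sketch (ids made up) of how a block is indexed
today, which is why any change currently means re-sending the whole tree:

    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class BlockIndexSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

            SolrInputDocument parent = new SolrInputDocument();
            parent.addField("id", "parent-1");
            SolrInputDocument child = new SolrInputDocument();
            child.addField("id", "parent-1-child-1");
            parent.addChildDocument(child);  // parent + children written as one block

            // There is no per-child atomic update: to change anything in the
            // block, rebuild and re-send the entire tree like this.
            server.add(parent);
            server.commit();
            server.shutdown();
        }
    }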
Hello,
we have a SolrCloud 4.7 cluster, but this question also relates to other
versions, because we have tested this in several installations.
We have a very big index (more than 400K docs) with big documents, but
in our queries we don't fetch the large fields in the fl parameter. But we
have s
Hi David,
Thanks for the quick reply.
As I haven't migrated to 4.7 (I am still using 4.6), I tested using an OR
clause with multiple geofilt-based query phrases and it seems to be working
great. But I have one more question: how do I boost the score of the
matching documents based on geodist? How wi