Thank you
I've tried to upgrade from Solr 4.10.3 to 5.4.1. The Solr shards are placed on
different disks and symlinks (ln -s) are created in SOLR_HOME (SOLR_HOME
itself is set as an absolute path and works fine).
When Solr starts, it loads only the shards placed in the home directory, but not
the symlinked ones.
If I copy s
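For reference, a minimal sketch of the layout described above (the paths and core name are made up):

# index data lives on another disk and is symlinked into SOLR_HOME
mkdir -p /disk2/solr/collection1_shard2_replica1
ln -s /disk2/solr/collection1_shard2_replica1 /var/solr/data/collection1_shard2_replica1
ls -l /var/solr/data   # the symlinked core directory shows up next to the local ones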
I'm looking for a way to delete term vectors from an existing index. The schema has
been changed to 'termVectors="false"' and an optimize was performed after that,
but the index size remains the same (I'm totally sure that the optimization was
successful).
I've also tried to add some new documents to the existing index
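For reference, a minimal sketch of the change and the optimize call described above (the field name and core name are made up):

<!-- schema.xml: term vectors switched off for the field -->
<field name="text" type="text_general" indexed="true" stored="true" termVectors="false"/>

# optimize (force merge) after the schema change, as described above
curl 'http://localhost:8983/solr/collection1/update?optimize=true'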
I'm sorry for being so unclear.
The problem is speed - while a node holds only several cats, it can answer
with "numFound=0" if those cats are missing from the query.
It looks like:
node 1 - cats 1,2,3
node 2 - cats 3,4,5
node 3 - cats 50,70
...
Query "q=cat:(1 4)"
QTime per node now is like
node1 -
SOLR version - 4.10.3
We have a SOLR Cloud cluster; each node has documents only for several
categories.
Queries look like "...fq=cat:(1 3 89 ...)&..."
So only some nodes need to do real processing; the others can answer with zero as soon as
they check "cat".
The problem is how to keep a separate cache for the "cat" values
In case someone faces the same problem: the thing was that the SOLR caches were
turned off, and I had underestimated the importance of caches in my desire to save as
much RAM as possible.
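For reference, a minimal sketch of re-enabling the caches in solrconfig.xml (the sizes are illustrative, not the values we actually use):

<!-- solrconfig.xml, inside <query>: re-enable the filter and query result caches -->
<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>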
Thanks for your reply.
Yes, 100% CPU is used by SOLR (by 100% I mean 1 core, not all cores), I'm
totally sure.
I have more than 80 GB RAM on the test machine; about 50 GB is used as disk
cache, SOLR uses about 8 GB, and Xmx=40G.
I use G1 GC, but it can't be the problem, because memory usage is much lower
than
The number of documents in the collection is about 100M.
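For context, the JVM side of that setup amounts to roughly the following start command for Solr 4.x under Jetty (illustrative only; the exact flags depend on the installation):

# 40 GB heap with the G1 collector, as mentioned above
java -Xms40g -Xmx40g -XX:+UseG1GC -jar start.jar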
We have a simple search over a 50 GB index.
And it's slow.
I can't even understand why: the whole index is in RAM (and a lot of free space is
available) and the CPU is the bottleneck (100% load).
The query is simple (except for tvrh):
q=(text:(word1+word2)++title:(word1+word2))&tv=true&isShard=true&qt=/tvrh&fq=cat:(10
If you need only 200 results grouped, you can easily do it with some external
code; it will be much faster anyway.
Also, it's widely suggested to use docValues="true" for the fields you group
by; it really helps (I can only quote numbers in terms of RAM
usage, but speed improves as well).
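A minimal sketch of what that looks like in schema.xml (the field name here is made up):

<field name="group_field" type="string" indexed="true" stored="true" docValues="true"/>

Existing documents need to be reindexed for docValues to take effect.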
Thank you for the reply.
Our schema is:
1) Index in real time (on a separate machine).
2) The NRT index becomes large.
3) Copy the NRT index to another machine.
4) Merge the NRT-made indexes into the large ("all-the-time") index.
5) Remove the NRT index (until now it was available for searching).
At the end we have a big, opt
Is there a ready-to-use tool to merge existing indexes in MapReduce?
We have real-time search and want to merge (and optimize) its indexes into
one, so we don't need to build the index in MapReduce, only merge it.
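Not MapReduce, but for reference, plain Solr can merge an existing index directory into a core via the CoreAdmin MERGEINDEXES action; a sketch with a made-up core name and path:

http://localhost:8983/solr/admin/cores?action=mergeindexes&core=big_core&indexDir=/path/to/nrt_index

Several indexDir (or srcCore) parameters can be given to merge more than one index in a single call.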
You can do the same simply with something like this:
http://localhost:8983/solr/admin/cores?action=CREATE&collection=wikingram&name=ANY_NAME_HERE&shard=shard1
The main part is "shard=shard1": when you create a core with an existing shard
(the core name doesn't matter, we use "collection_shard1_replica2", but
Nice, thanks!
If you'd like, I'll write up our results with that amazing util.
Thank you for your answer.
We've found out that the problem was in our SOLR build (Heliosearch 0.08).
There are no crashes after changing to 4.10.3 (although there are lots of
OOMs while handling queries; that's not really strange for 1.1 billion documents).
Now we are going to try the latest Heliosearch.
We are trying to run SOLR with a big index, using as little RAM as possible.
Simple search works nicely for our cases, but field collapsing (group=true)
queries fail with OOM.
Our setup is several shards per SOLR instance, each shard on its own HDD.
We've tried the same queries, but against one specific shard,
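For anyone reproducing this, a sketch of querying a single shard without distributing the request (host, core, and field names are made up):

http://localhost:8983/solr/collection1_shard1_replica1/select?q=*:*&group=true&group.field=group_field&distrib=false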
Can you get the raw SOLR response?
For me grouping works exactly the way you expect it to.
Try a direct query in the browser to be sure the problem is not in your code:
http://192.168.0.1:8983/solr/collection1/select?q=*:*&group=true&group.field=tenant_pool
Please tell a bit more about how you run your SOLR instances.
When we try to run SOLR with 5 shards, 50 GB per shard, we often get
OutOfMemory errors (especially for group queries). And while indexing, SOLR often
dies (without exceptions - some JVM issue).
We are using Heliosearch.
I'm trying to implement refining queries (search within previous results) - we have some results and need to
search over them.
But the query I constructed returns some strange results:
q=(text:(specific) OR title:(specific)) AND (text:(first long query) OR
title:(first long query))
This query returns something which contains "
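One common way to express "search within previous results" in Solr is to keep the original query in q and move the refinement into a filter query; a sketch with the same fields (whether this matches the intended ranking semantics is an assumption on my part):

q=text:(first long query) OR title:(first long query)
fq=(text:specific OR title:specific)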
The problem is that hard commit is on, with max uncommitted docs = 500,000.
And the tlog size is only about 200 MB per shard - that doesn't seem too big to me.
The reason for my panic is the fact that one shard in my old collection is
down forever, without any unusual entries in the logs. I tried different magic
(
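For reference, the hard-commit setting described above corresponds to something like this in solrconfig.xml (openSearcher=false is a common companion setting, not something stated in this thread):

<autoCommit>
  <maxDocs>500000</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>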
I'm using SOLR 4.10.1 in cloud mode with 3 instances, 5 shards per instance,
without replication.
I restarted one SOLR instance and now all shards from that instance are down, but
there are no errors in the logs.
All I see is:
09.12.2014, 11:13:40 WARN UpdateLog Starting log replay
tlog{file=/opt/dat
Thanks, I'll learn about facets.
Actually, we want to use Mahout, but it needs term vectors - so we faced the
problem of getting a term vector for an author from a set of documents.
Anyway, the main reason for my question was the desire to learn whether I'm
missing some simple solution or not.
So, thank you
I'm working with social media data.
We have blog posts in our index - text + authors_id.
Now we need to cluster authors by their texts. We need to get a term vector
not per document, but one vector per author (over all of an author's
documents).
We can't fetch all the documents and then unite them because it
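One rough way to approximate a per-author term vector (in the spirit of the facets suggestion above) is to facet on the text field while filtering to a single author; a sketch with a made-up author id, keeping in mind that faceting on a large tokenized field can be very memory-hungry:

http://localhost:8983/solr/collection1/select?q=*:*&fq=authors_id:12345&rows=0&facet=true&facet.field=text&facet.limit=200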
Yes, it's late, but I've faced the same problem and this question is the only
relevant one in the Google results, so I hope it'll help someone.
For me, adding these two lines to solrconfig solved the problem:
${solr.data.dir:hdfs://192.168.22.11:9001/solr}
true
In the docs it's said that there is
Already tried that, with the same result (the message changed accordingly).
Ok, new problem while creating a collection or shard:
Caused by: no segments* file found in
NRTCachingDirectory(HdfsDirectory@3a19dc74
lockFactory=org.apache.solr.store.hdfs.HdfsLockFactory@43507d1b;
maxCacheMB=192.0 maxMergeSizeMB=16.0): files: [HdfsDirectory@3a19dc74
lockFactory=org.apache.solr.s
I'm trying to run SOLR with HDFS.
In solrconfig.xml I've written:

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs:///solr</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">1</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
  <bool name="solr.hdfs.blockcache.write.enabled">true</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
  <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
</directoryFactory>

But when I try to create a collection, I get
"Caused by: com.google.pr
It worked, thanks! You helped me save nearly a week of reindexing and a lot
of nerves.
When I try to simply copy an index from native SOLR to Heliosearch, I get an
exception:
Caused by: java.lang.IllegalArgumentException: A SPI class of type
org.apache.lucene.codecs.Codec with name 'Lucene410' does not exist. You need
to add the corresponding JAR file supporting this SPI to your classpath.
Thanks again.
I'd answered before properly reading your post, my apologies.
I can't say for sure, because the filter caches are outside the JVM heap (that's
Heliosearch), but top shows 5 GB cached and no free RAM.
The only question for me now is how to balance the disk cache and the filter cache.
Do I need to worry about that,
Thanks for your reply.
The collection contains about a billion documents.
I mostly use simple queries with date and other filters (5 filters per query).
Yup, disks are the cheapest and simplest option.
In the end, I want to reach several seconds per search query (for a non-cached
query =) ), so, please,
I have a CLOUD with 3 nodes and 16 GB RAM on each.
My index is about 1 TB and search speed is awfully bad.
I've read that one needs at least 50% of the index size in RAM, but I surely
can't afford that.
Please tell me, is there any way to improve performance with severely limited
resources?
Yes, I can try t
I'm using SOLR-hs_0.06, based on SOLR 4.10.
I have SolrCloud with external ZooKeepers.
I manually indexed with DIH from MySQL on each instance - we have lots of
DBs, so it's one DB per SOLR instance.
All was just fine - I could search and so on.
Then I sent update queries (a lot of them, about 1 or 100