Re: Solr 5: not loading shards from symlinked directories

2016-02-05 Thread Norgorn
Thank you -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-5-not-loading-shards-from-symlinked-directories-tp4255403p4255431.html Sent from the Solr - User mailing list archive at Nabble.com.

Solr 5: not loading shards from symlinked directories

2016-02-04 Thread Norgorn
I've tried to upgrade from Solr 4.10.3 to 5.4.1. Solr shards are placed on different disks, and symlinks (ln -s) to them are created inside SOLR_HOME (SOLR_HOME itself is set as an absolute path and works fine). When Solr starts, it loads only the shards placed in the home directory, not the symlinked ones. If I copy s
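
A minimal reproduction of the layout described, with hypothetical temp paths (Solr 5 core discovery walks SOLR_HOME for core.properties files and reportedly does not follow the symlinks that 4.x happily loaded):

```shell
# Sketch of the layout from the post; all paths are hypothetical.
SOLR_HOME=$(mktemp -d)
DATA_DISK=$(mktemp -d)

# A core directory living on a separate disk, with the discovery marker
# file that Solr 5 looks for:
mkdir -p "$DATA_DISK/shard1/conf"
touch "$DATA_DISK/shard1/core.properties"

# The symlink that 4.10.3 followed but 5.4.1 reportedly does not:
ln -s "$DATA_DISK/shard1" "$SOLR_HOME/shard1"

ls -l "$SOLR_HOME"
```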

Delete term vectors from existing index

2015-10-04 Thread Norgorn
I'm looking for a way to delete term vectors from an existing index. The schema has been changed to 'termVectors="false"', and an optimization was performed after that, but the index size remains the same (I'm totally sure the optimization was successful). I've also tried to add some new documents to the existing index
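
Worth noting: changing the schema alone does not rewrite existing segments; the documents have to be reindexed first, after which an optimize rewrites everything to disk without the vectors. A sketch with a hypothetical host and core name:

```shell
# Hypothetical host and core name. Setting termVectors="false" only
# affects newly indexed documents; existing segments keep their term
# vectors until the docs are reindexed and the segments are rewritten.
SOLR_CORE_URL="http://localhost:8983/solr/mycore"

# After reindexing, force every segment to be rewritten:
OPTIMIZE_CMD="curl '${SOLR_CORE_URL}/update?optimize=true&maxSegments=1&waitSearcher=true'"
echo "$OPTIMIZE_CMD"
```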

Re: Solr cache for specific field

2015-08-18 Thread Norgorn
I'm sorry for being so unclear. The problem is speed: while a node holds only several cats, it can answer with "numFound=0" if these cats are missing from the query. It looks like: node 1 - cats 1,2,3; node 2 - cats 3,4,5; node 3 - cats 50,70 ... Query "q=cat:(1 4)". QTime per node now is like node1 -

Solr cache for specific field

2015-08-18 Thread Norgorn
SOLR version: 4.10.3. We have a SOLR Cloud cluster; each node has documents only for several categories. Queries look like "...fq=cat:(1 3 89 ...)&..." So only some nodes need to do real work; the others can answer with zero as soon as they check "cat". The problem is how to keep a separate cache for "cat" values
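
One way to keep nodes without the requested categories out of the request path entirely is to list the relevant shards explicitly with the shards parameter of a distributed query; hosts and core names below are hypothetical:

```shell
# Hypothetical hosts/cores: only the shards named in "shards=" are asked
# to execute the query, so nodes holding none of the requested cats are
# skipped instead of answering numFound=0.
SHARDS="node1:8983/solr/col_shard1,node2:8983/solr/col_shard2"
QUERY_URL="http://node1:8983/solr/col/select?q=*:*&fq=cat:(1 3 89)&shards=${SHARDS}"
echo "$QUERY_URL"
```

The caller has to know the cat-to-shard mapping, so this fits best behind a thin routing layer in application code.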

Re: Simple search low speed

2015-04-29 Thread Norgorn
In case someone faces the same problem: the thing was that SOLR caches were turned off, and I had underestimated the value of caches in my desire to save as much RAM as possible.

Re: Simple search low speed

2015-04-24 Thread Norgorn
Thanks for your reply. Yes, 100% CPU is used by SOLR (by 100% I mean 1 core, not all cores), I'm totally sure. I have more than 80 GB RAM on the test machine; about 50 is used as disk cache, SOLR uses about 8, Xmx=40G. I use G1 GC, but it can't be the problem, because memory usage is much lower than

Re: Simple search low speed

2015-04-24 Thread Norgorn
The number of documents in the collection is about 100m.

Simple search low speed

2015-04-24 Thread Norgorn
We have a simple search over a 50 GB index, and it's slow. I can't figure out why: the whole index is in RAM (and a lot of free space is available), and the CPU is the bottleneck (100% load). The query is simple (except for tvrh): q=(text:(word1+word2)++title:(word1+word2))&tv=true&isShard=true&qt=/tvrh&fq=cat:(10

Re: Grouping Performance Optimization

2015-04-24 Thread Norgorn
If you need only 200 grouped results, you can easily do it with some external code; it will be much faster anyway. Also, it's widely suggested to set docValues="true" on the fields you group by; it really helps (I can only give numbers in terms of RAM usage, but speed increases as well).
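
The docValues suggestion above would look like this in schema.xml (the field name is hypothetical):

```xml
<!-- schema.xml: enable docValues on the field used for grouping;
     "group_field" is a hypothetical name -->
<field name="group_field" type="string" indexed="true" stored="true"
       docValues="true"/>
```

Note that docValues only apply to documents indexed after the change, so a reindex is needed for them to take effect.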

Re: Merge indexes in MapReduce

2015-04-17 Thread Norgorn
Thank you for the reply. Our scheme is: 1) Index in real time (on a separate machine). 2) The NRT index becomes large. 3) Copy the NRT index to another machine. 4) Merge the NRT-made indexes with the large ("all-the-time") index. 5) Remove the NRT index (until now it was available for searching). At the end we have a big, opt
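
For the merge step, Lucene ships a stand-alone merger in the lucene-misc jar that can be run while the target index is offline; jar versions and paths here are hypothetical:

```shell
# Hypothetical jar versions and index paths. IndexMergeTool lives in
# lucene-misc; the destination directory must be new/empty, and no Solr
# instance may have the source or destination indexes open during the merge.
MERGE_CMD="java -cp lucene-core-4.10.3.jar:lucene-misc-4.10.3.jar \
  org.apache.lucene.misc.IndexMergeTool /data/merged /data/big-index /data/nrt-index"
echo "$MERGE_CMD"
```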

Merge indexes in MapReduce

2015-04-16 Thread Norgorn
Is there a ready-to-use tool to merge existing indexes in MapReduce? We have real-time search and want to merge (and optimize) its indexes into one, so we don't need to build the index in MapReduce, only merge it.

Re: Add replica on shards

2015-03-18 Thread Norgorn
You can do the same simply with something like this: http://localhost:8983/solr/admin/cores?action=CREATE&collection=wikingram&name=ANY_NAME_HERE&shard=shard1 The main part is "shard=shard1": when you create a core with an existing shard (the core name doesn't matter; we use "collection_shard1_replica2", but
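
On Solr 4.8+ the Collections API offers the same thing more directly via ADDREPLICA; the collection and shard names below mirror the example above, and the host is hypothetical:

```shell
# ADDREPLICA creates a replica core and wires it into the named shard;
# the core name is generated for you. Host is hypothetical.
ADDREPLICA_URL="http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=wikingram&shard=shard1"
echo "$ADDREPLICA_URL"
```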

RE: Field collapsing memory usage

2015-01-22 Thread Norgorn
Nice, thanks! If you'd like, I'll post our results with that amazing util.

RE: Field collapsing memory usage

2015-01-22 Thread Norgorn
Thank you for your answer. We've found out that the problem was in our SOLR variant (Heliosearch 0.08). There are no crashes after changing to 4.10.3 (although there are a lot of OOMs while handling queries; that's not really strange for 1.1 billion documents). Now we are going to try the latest Heliosearch.

Field collapsing memory usage

2015-01-21 Thread Norgorn
We are trying to run SOLR with a big index, using as little RAM as possible. Simple search works nicely for our cases, but field collapsing (group=true) queries fail with OOM. Our setup is several shards per SOLR instance, each shard on its own HDD. We've tried the same queries against one specific shard,

Re: Solr grouping problem - need help

2015-01-14 Thread Norgorn
Can you get the raw SOLR response? For me, grouping works exactly the way you expect it to. Try a direct query in the browser to be sure the problem is not in your code: http://192.168.0.1:8983/solr/collection1/select?q=*:*&group=true&group.field=tenant_pool

Re: How large is your solr index?

2014-12-30 Thread Norgorn
Please tell us a bit more about how you run your SOLR instances. When we try to run SOLR with 5 shards, 50 GB per shard, we often get OutOfMemory (especially for group queries). And while indexing, SOLR often falls over (without exceptions; some JVM issue). We are using Heliosearch.

SOLR complex queries misunderstanding

2014-12-19 Thread Norgorn
I'm trying to implement refining queries: we have some results and need to search over them. But the query I constructed returns some strange results. q=(text:(specific) OR title:(specific)) AND (text:(first long query) OR title:(first long query)) This query returns something which contains "
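
A likely cause of the "strange results": without quotes, (first long query) matches documents containing any of the three terms. Quoting the words turns them into a phrase match; the field names follow the original query:

```shell
# Unquoted multi-word groups are OR'd term matches; double quotes make
# them a phrase. Field names are taken from the query in the post.
PHRASE_Q='q=(text:(specific) OR title:(specific)) AND (text:"first long query" OR title:"first long query")'
echo "$PHRASE_Q"
```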

Re: SOLR shards stay down forever

2014-12-09 Thread Norgorn
The problem is that hard commit is on, with max uncommitted docs = 500,000. And the tlog size is just about 200 MB per shard, which doesn't seem too big to me. The reason for my panic is that one shard in my old collection is down forever, without any unusual entries in the logs. I tried different magic (
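
The commit settings described map to an autoCommit block like this in solrconfig.xml (the maxTime value is a suggested addition, not from the post); bounding commits by time as well keeps transaction logs small, which shortens log replay on startup:

```xml
<!-- solrconfig.xml: maxDocs matches the post; maxTime is a suggestion -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>500000</maxDocs>
    <maxTime>60000</maxTime>        <!-- also commit at least once a minute -->
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>
```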

SOLR shards stay down forever

2014-12-08 Thread Norgorn
I'm using SOLR 4.10.1 in cloud mode with 3 instances, 5 shards per instance, without replication. I restarted one SOLR instance and now all shards from it are down, but there are no errors in the logs. All I see is: 09.12.2014, 11:13:40 WARN UpdateLog Starting log replay tlog{file=/opt/dat

Re: Terms vector for multiple documents

2014-11-27 Thread Norgorn
Thanks, I'll read up on facets. Actually, we want to use Mahout, but it needs term vectors, so we faced the problem of obtaining a term vector per author from a set of documents. Anyway, the main reason for my question was the desire to learn whether I'm missing some simple solution or not. So, thank you

Terms vector for multiple documents

2014-11-27 Thread Norgorn
I'm working with social media data. We have blog posts in our index: text + authors_id. Now we need to cluster authors by their texts. We need to get a term vector not per document, but one vector per author (over all the author's documents). We can't fetch all the documents and then unite them ourselves, because it

Re: Solr: IndexNotFoundException: no segments* file HdfsDirectoryFactory

2014-11-13 Thread Norgorn
Yes, it's late, but I've faced the same problem and this question is the only one relevant to it in the Google results, so I hope it'll help someone. For me, adding these two values to solrconfig solved the problem: ${solr.data.dir:hdfs://192.168.22.11:9001/solr} and true. In the docs it's said that there is
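
The archive has stripped the XML tags around those two values, so the exact elements are a guess; one plausible reconstruction, based on the standard HdfsDirectoryFactory settings, is:

```xml
<!-- solrconfig.xml: a reconstructed guess; only the two values
     (the hdfs data dir and "true") are from the post -->
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">${solr.data.dir:hdfs://192.168.22.11:9001/solr}</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
</directoryFactory>
```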

Re: Solr + HDFS settings

2014-10-27 Thread Norgorn
Already tried that, with the same result (the error message changed accordingly).

Re: Solr + HDFS settings

2014-10-25 Thread Norgorn
OK, new problem while creating a collection or shard: Caused by: no segments* file found in NRTCachingDirectory(HdfsDirectory@3a19dc74 lockFactory=org.apache.solr.store.hdfs.HdfsLockFactory@43507d1b; maxCacheMB=192.0 maxMergeSizeMB=16.0): files: [HdfsDirectory@3a19dc74 lockFactory=org.apache.solr.s

Solr + HDFS settings

2014-10-25 Thread Norgorn
I'm trying to run SOLR with HDFS. In solrconfig.xml I've written: hdfs:///solr true 1 true 16384 true true true 16 192 But when I try to create a collection, I get "Caused by: com.google.pr
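
The archive has flattened the XML, but the values match the stock HdfsDirectoryFactory example from the Solr Reference Guide one for one, so the configuration was most likely:

```xml
<!-- solrconfig.xml: reconstructed from the flattened values above -->
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs:///solr</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">1</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
  <bool name="solr.hdfs.blockcache.write.enabled">true</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
  <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
</directoryFactory>
```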

Re: Solr Index to Helio Search

2014-10-13 Thread Norgorn
*totalprovidencevideo It worked, thanks; you helped me save nearly a week of reindexing and a lot of nerves.

Solr Index to Helio Search

2014-10-08 Thread Norgorn
When I simply try to copy an index from native SOLR to Heliosearch, I get an exception: Caused by: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene410' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpa
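
The 'Lucene410' codec is registered via Java SPI from the lucene-core jar, so the JVM opening the index needs a lucene-core of at least the version that wrote the index (4.10.x here). A sketch, with a hypothetical webapp lib path:

```shell
# Hypothetical jar version and destination path; the point is only that a
# 4.10.x lucene-core jar must be on the classpath of the JVM opening the index.
JAR="lucene-core-4.10.3.jar"
DEST="/opt/heliosearch/server/solr-webapp/webapp/WEB-INF/lib/"
echo "cp $JAR $DEST"
```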

RE: SolrCloud RAM requirements

2014-09-24 Thread Norgorn
Thanks again. I'd answered before properly reading your post, my apologies. I can't say for sure, because the filter caches are outside the JVM (that's Heliosearch), but top shows 5 GB cached and no free RAM. The only question for me now is how to balance the disk cache and the filter cache. Do I need to worry about that,

RE: SolrCloud RAM requirements

2014-09-24 Thread Norgorn
Thanks for your reply. The collection contains about a billion documents. I mostly run simple queries with date and other filters (5 filters per query). Yup, disks are the cheapest and simplest option. In the end, I want to reach several seconds per search query (for non-cached queries =) ), so, please,

SolrCloud RAM requirements

2014-09-23 Thread Norgorn
I have a CLOUD with 3 nodes and 16 GB RAM on each. My index is about 1 TB and search speed is awfully bad. I've read that one needs at least 50% of the index size in RAM, but I surely can't afford that. Please tell me, is there any way to improve performance with severely limited resources? Yes, I can try t

SolrCloud deleted all existing indexes after update query

2014-09-17 Thread Norgorn
I'm using SOLR-hs_0.06, based on SOLR 4.10. I have SolrCloud with external ZooKeepers. I manually indexed with DIH from MySQL on each instance; we have a lot of DBs, so it's one DB per Solr instance. All was just fine: I could search and so on. Then I sent update queries (a lot, about 1 or 100