We store marketing messages in Solr. One type is the voucher message, which stores a map: the key is an accountId, the value is that user's voucher. We also search on some other fields.
*My current approach is to add a field, accountIds, index it, and search against it.* Another field, details, is not indexed and holds JSON data: the accountId-to-voucher map. A transformer on the Solr side then keeps only the requesting user's voucher and removes all other users' vouchers. This works: with 2 GB of memory and 3 nodes on the same machine, it can handle at least 1 million accountIds (a 35 MB file containing 1 million accountIds and voucher codes).

Today I am considering another approach: *create a dynamic field, voucher_*. For example, the field name voucher_account123 holds account123's voucher.* The query would be simpler and should perform better, since there is no need to load the whole details field into memory and filter out other users' vouchers.

This works for small data, but it fails when I try to index 200,000 vouchers (and thus 200,000 dynamic fields). The document is created, but not all dynamic fields are. Server-side error:

WARN  - 2016-08-30 06:37:49.529; [message shard1 core_node1 message_shard1_replica3] org.eclipse.jetty.servlet.ServletHandler; Error for /solr/message_shard1_replica3/update
java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.util.packed.Direct16.<init>(Direct16.java:37)
        at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:993)
        at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:976)
        at org.apache.lucene.codecs.compressing.LZ4$HashTable.reset(LZ4.java:193)
        at org.apache.lucene.codecs.compressing.LZ4.compress(LZ4.java:217)
        at org.apache.lucene.codecs.compressing.CompressionMode$LZ4FastCompressor.compress(CompressionMode.java:164)
        at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.flush(CompressingStoredFieldsWriter.java:235)
        at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.finishDocument(CompressingStoredFieldsWriter.java:165)
        at org.apache.lucene.index.DefaultIndexingChain.finishStoredFields(DefaultIndexingChain.java:270)
        at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:311)
        at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:232)
        at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:458)
        at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1363)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:163)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
        at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
        at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
        at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:101)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:179)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:135)
        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:241)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:206)
        at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:126)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:186)
        at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:111)
        at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
        at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /configs/messagen/params.json

On the SolrJ client (megaphone):

2016-08-29 23:22:14 INFO 777756 [http-nio-0.0.0.0-8080-exec-6-SendThread(localhost:9983)] o.a.z.ClientCnxn - Client session timed out, have not heard from server in 6249ms for sessionid 0x156da0d9e0b0003, closing socket connection and attempting reconnect

I also tried changing voucher_* to stored-only (indexed=false). With that, I am able to import a message with 1 million accountIds and vouchers, i.e. 1 million stored-only dynamic fields. But this takes too much memory: it needs 1.5 GB to serve this message. When I change the heap to 512 MB, Solr can't be started at all; with 500m:

ERROR - 2016-08-30 07:04:19.866; [   message_shard1_replica3] org.apache.solr.core.CoreContainer; Error creating core [message_shard1_replica3]: Java heap space
java.lang.OutOfMemoryError: Java heap space
        at java.util.HashMap.resize(HashMap.java:703)
        at java.util.HashMap.putVal(HashMap.java:662)
        at java.util.HashMap.put(HashMap.java:611)
        at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:65)
        at org.apache.lucene.codecs.lucene50.Lucene50FieldInfosFormat.read(Lucene50FieldInfosFormat.java:166)
        at org.apache.lucene.index.IndexWriter.readFieldInfos(IndexWriter.java:912)
        at org.apache.lucene.index.IndexWriter.getFieldNumberMap(IndexWriter.java:924)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:854)
        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:78)
        at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:65)
        at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:273)
        at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:116)
        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1626)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1769)
        at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:911)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:788)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:658)

I know Solr's data structures are much more complex than Cassandra's, but is it possible to support 1 million indexable dynamic fields in Solr, or at least 1 million stored-only dynamic fields? This would make Solr more useful in some cases. Thanks for any input.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Wide-Column-not-work-when-with-200k-dynamic-fields-tp4293911.html
Sent from the Solr - User mailing list archive at Nabble.com.
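For reference, here is a sketch of how the two designs above might look in schema.xml. This is my reconstruction from the description in the post, not the actual schema: the field types and attribute values are assumptions, and the two voucher_* definitions are alternatives (a real schema would contain only one of them).

```xml
<!-- Approach 1: an indexed multi-valued field for searching by account,
     plus a stored-only field holding the accountId -> voucher map as JSON.
     A transformer strips other users' entries from "details" at query time. -->
<field name="accountIds" type="string" indexed="true" stored="false" multiValued="true"/>
<field name="details"    type="string" indexed="false" stored="true"/>

<!-- Approach 2: one dynamic field per account (voucher_account123, ...).
     This is the variant that hit OutOfMemoryError at ~200,000 fields. -->
<dynamicField name="voucher_*" type="string" indexed="true" stored="true"/>

<!-- Stored-only variant of approach 2: imports 1M fields, but serving the
     document still needed ~1.5 GB of heap. -->
<dynamicField name="voucher_*" type="string" indexed="false" stored="true"/>
```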
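To illustrate why the dynamic-field query is simple: each account's voucher lives in its own field, so the client only requests that one field via fl instead of fetching the whole map. A minimal sketch in plain Java, assuming the voucher_<accountId> naming convention above; the collection name "message" and the document id are illustrative, and no SolrJ calls are made:

```java
// Sketch: per-account lookup under the dynamic-field design.
public class VoucherFieldNaming {

    /** Name of the field that holds the given account's voucher. */
    static String voucherField(String accountId) {
        return "voucher_" + accountId;
    }

    /** Select URL that returns only this account's voucher for one message. */
    static String selectUrl(String collection, String messageId, String accountId) {
        return "/solr/" + collection + "/select"
                + "?q=id:" + messageId
                + "&fl=" + voucherField(accountId);
    }

    public static void main(String[] args) {
        // Prints: /solr/message/select?q=id:msg1&fl=voucher_account123
        System.out.println(selectUrl("message", "msg1", "account123"));
    }
}
```

The response then contains at most one small field per hit, so nothing needs to be filtered server-side, unlike the details-field design where the full 35 MB map must be loaded and trimmed by a transformer.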