Perhaps it would make sense to explore a nested structure, with each key/value pair being a child document. Then you match on the child document and load the parent document. Roughly, something like the sketch below.
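(A rough, untested sketch only - the field names, ids, and collection name here are made up, and it assumes SolrJ 5.x with the standard block-join support:)

import java.util.Map;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.common.SolrInputDocument;

public class NestedVoucherSketch {
  // Index one parent message with one child document per accountId/voucher pair.
  static void index(SolrClient client, Map<String, String> vouchers) throws Exception {
    SolrInputDocument parent = new SolrInputDocument();
    parent.addField("id", "msg1");
    parent.addField("doc_type", "voucherMessage");
    for (Map.Entry<String, String> e : vouchers.entrySet()) {
      SolrInputDocument child = new SolrInputDocument();
      child.addField("id", "msg1_" + e.getKey());
      child.addField("accountId", e.getKey());
      child.addField("voucher", e.getValue());
      parent.addChildDocument(child);
    }
    client.add("message", parent);
    client.commit("message");
  }

  // Match on the child, load the parent - plus just the matching child via [child].
  static SolrQuery query(String accountId) {
    SolrQuery q = new SolrQuery(
        "{!parent which='doc_type:voucherMessage'}accountId:" + accountId);
    q.setFields("*",
        "[child parentFilter=doc_type:voucherMessage childFilter='accountId:" + accountId + "']");
    return q;
  }
}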
I don't think either the first or the second solution is a very resilient Solr architecture.

Regards,
   Alex.
----
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 30 August 2016 at 14:13, Jeffery Yuan <yuanyun...@gmail.com> wrote:
> We store marketing messages in Solr. One type is a voucher message, which
> stores a map: the key is an accountId, the value is that user's voucher.
> We also search on some other fields.
>
> *My current approach is to add a field, accountIds, index it, and search
> against it.*
> Another field, details, is not indexed: JSON data that stores the
> accountId-to-voucher map.
>
> We then have a transformer on the Solr side that keeps only the requesting
> user's voucher and removes all other users' vouchers.
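> A simplified sketch of what the transformer does - the exact DocTransformer
> signature varies across Solr versions; this assumes Solr 5.x plus Noggit's
> JSON parser, which ships with Solr, and omits the TransformerFactory
> registration in solrconfig.xml:
>
> import java.io.IOException;
> import java.util.Map;
> import org.apache.solr.common.SolrDocument;
> import org.apache.solr.response.transform.DocTransformer;
> import org.noggit.ObjectBuilder;
>
> public class VoucherDocTransformer extends DocTransformer {
>   private final String accountId; // the requesting user's account id
>
>   public VoucherDocTransformer(String accountId) {
>     this.accountId = accountId;
>   }
>
>   @Override
>   public String getName() { return "voucher"; }
>
>   @Override
>   @SuppressWarnings("unchecked")
>   public void transform(SolrDocument doc, int docid) throws IOException {
>     Object details = doc.getFirstValue("details");
>     if (details == null) return;
>     // details is stored JSON: a map of accountId -> voucher.
>     // Keep only this user's entry; drop everyone else's.
>     Map<String, Object> all =
>         (Map<String, Object>) ObjectBuilder.fromJSON(details.toString());
>     doc.setField("details", all.get(accountId));
>   }
> }
>
> It is invoked with something like q=accountIds:account123 and
> fl=*,[voucher accountId=account123].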
> This works: with 2 GB of memory and 3 nodes on the same machine, it can
> handle at least 1 million accountIds - a 35 MB file containing 1 million
> accountIds and voucher codes.
>
> But today I am considering another approach:
> *Create a dynamic field, voucher_*; the field name is voucher_account123
> and its value is account123's voucher.*
>
> The query would be simple, and its performance would be better, as there
> is no need to load the whole details field into memory and filter out
> other users' vouchers.
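> As a sketch (same made-up names as above; this shows the indexed variant -
> for the stored-only variant the voucher_* field would just be returned via
> fl, not queried):
>
> import org.apache.solr.client.solrj.SolrClient;
> import org.apache.solr.client.solrj.SolrQuery;
> import org.apache.solr.common.SolrInputDocument;
>
> public class WideColumnSketch {
>   static void index(SolrClient client) throws Exception {
>     SolrInputDocument doc = new SolrInputDocument();
>     doc.addField("id", "msg1");
>     // One dynamic field per account - with 1 million accounts this
>     // becomes 1 million distinct voucher_* fields in the index.
>     doc.addField("voucher_account123", "SAVE20");
>     doc.addField("voucher_account124", "SAVE30");
>     client.add("message", doc);
>     client.commit("message");
>   }
>
>   static SolrQuery query(String accountId) {
>     // Existence query on the single per-account field; only that field
>     // is returned, so no server-side filtering is needed.
>     SolrQuery q = new SolrQuery("voucher_" + accountId + ":[* TO *]");
>     q.setFields("id", "voucher_" + accountId);
>     return q;
>   }
> }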
> This works for small data, but it failed when I tried to index 200,000
> vouchers (thus 200,000 dynamic fields).
>
> The document is created, but not all dynamic fields are created.
>
> Server-side error:
> WARN - 2016-08-30 06:37:49.529; [message shard1 core_node1 message_shard1_replica3] org.eclipse.jetty.servlet.ServletHandler; Error for /solr/message_shard1_replica3/update
> java.lang.OutOfMemoryError: Java heap space
>     at org.apache.lucene.util.packed.Direct16.<init>(Direct16.java:37)
>     at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:993)
>     at org.apache.lucene.util.packed.PackedInts.getMutable(PackedInts.java:976)
>     at org.apache.lucene.codecs.compressing.LZ4$HashTable.reset(LZ4.java:193)
>     at org.apache.lucene.codecs.compressing.LZ4.compress(LZ4.java:217)
>     at org.apache.lucene.codecs.compressing.CompressionMode$LZ4FastCompressor.compress(CompressionMode.java:164)
>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.flush(CompressingStoredFieldsWriter.java:235)
>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.finishDocument(CompressingStoredFieldsWriter.java:165)
>     at org.apache.lucene.index.DefaultIndexingChain.finishStoredFields(DefaultIndexingChain.java:270)
>     at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:311)
>     at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:232)
>     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:458)
>     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1363)
>     at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
>     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:163)
>     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>     at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>     at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
>     at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
>     at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
>     at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
>     at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:101)
>     at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:179)
>     at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:135)
>     at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:241)
>     at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
>     at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:206)
>     at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:126)
>     at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:186)
>     at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:111)
>     at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
>     at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
>
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /configs/messagen/params.json
>
> SolrJ client:
> megaphone 2016-08-29 23:22:14 INFO 777756 [http-nio-0.0.0.0-8080-exec-6-SendThread(localhost:9983)] o.a.z.ClientCnxn - Client session timed out, have not heard from server in 6249ms for sessionid 0x156da0d9e0b0003, closing socket connection and attempting reconnect
>
> I also tried changing voucher_* to stored-only (indexed=false). With that
> I am able to import a message with 1 million accountIds and vouchers -
> that is, 1 million stored-only dynamic fields.
> But this takes too much memory: it needs 1.5 GB to serve this message.
> When I changed the heap to 512 MB, Solr could not start at all.
>
> When changed to 500m:
> ERROR - 2016-08-30 07:04:19.866; [ message_shard1_replica3] org.apache.solr.core.CoreContainer; Error creating core [message_shard1_replica3]: Java heap space
> java.lang.OutOfMemoryError: Java heap space
>     at java.util.HashMap.resize(HashMap.java:703)
>     at java.util.HashMap.putVal(HashMap.java:662)
>     at java.util.HashMap.put(HashMap.java:611)
>     at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:65)
>     at org.apache.lucene.codecs.lucene50.Lucene50FieldInfosFormat.read(Lucene50FieldInfosFormat.java:166)
>     at org.apache.lucene.index.IndexWriter.readFieldInfos(IndexWriter.java:912)
>     at org.apache.lucene.index.IndexWriter.getFieldNumberMap(IndexWriter.java:924)
>     at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:854)
>     at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:78)
>     at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:65)
>     at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:273)
>     at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:116)
>     at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1626)
>     at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1769)
>     at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:911)
>     at org.apache.solr.core.SolrCore.<init>(SolrCore.java:788)
>     at org.apache.solr.core.SolrCore.<init>(SolrCore.java:658)
>
> I know Solr's data structures are much more complex than Cassandra's, but
> is it possible to support 1 million indexable dynamic fields in Solr - at
> least for stored-only dynamic fields?
>
> This would make Solr more useful in some cases.
>
> Thanks for any input.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-Wide-Column-not-work-when-with-200k-dynamic-fields-tp4293911.html
> Sent from the Solr - User mailing list archive at Nabble.com.