For every field you sort on, Lucene creates an array with one entry per document. If you sort on a thousand fields, Lucene will create 1000 different arrays of 500K ints in each core. I assume there is some sort of cache of these arrays, so the memory stays allocated between queries. In Solr, it is also possible to sort using a function as the relevance value. This is rather slow, and caches no data between queries.
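To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes 4 bytes per int entry, ignores JVM array overhead, and assumes every one of the 1000 fields actually gets sorted on in each of the 5 cores; the function name is just something I made up for the illustration.

```python
# Rough estimate of the memory held by per-field sort arrays.
# Assumption: one 32-bit int (4 bytes) per document per sorted field,
# with no allowance for JVM object or array overhead.
BYTES_PER_INT = 4


def sort_array_bytes(num_fields, docs_per_core, num_cores=1,
                     bytes_per_value=BYTES_PER_INT):
    """Return the total bytes needed for all sort arrays (hypothetical helper)."""
    return num_fields * docs_per_core * bytes_per_value * num_cores


per_core = sort_array_bytes(1000, 500_000)
all_cores = sort_array_bytes(1000, 500_000, num_cores=5)

print(f"per core: {per_core / 1e9:.1f} GB")   # per core: 2.0 GB
print(f"5 cores:  {all_cores / 1e9:.1f} GB")  # 5 cores:  10.0 GB
```

So the worst case is on the order of 2 GB per core and 10 GB across all five, before counting anything else the index needs. If only a handful of fields are ever sorted on in practice, the real figure will be much smaller.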
You may want to do sorting in your front-end applications, or get database ids from Solr and do sorting in the database query.

On Mon, Nov 30, 2009 at 7:14 AM, Alex Wang <aw...@crossview.com> wrote:
> Thanks Otis for the reply. Yes this will be pretty memory intensive.
> The size of the index is 5 cores with a maximum of 500K documents each
> core. I did search the archives before but did not find any definite
> answer. Thanks again!
>
> Alex
>
>
> On Nov 27, 2009, at 11:09 PM, Otis Gospodnetic wrote:
>
>> Hi Alex,
>>
>> There is no build-in limit. The limit is going to be dictated by
>> your hardware resources. In particular, this sounds like a memory
>> intensive app because of sorting on lots of different fields. You
>> didn't mention the size of your index, but that's a factor, too.
>> Once in a while people on the list mention cases with lots and lots
>> of fields, so I'd check ML archives.
>>
>> Otis
>> --
>> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>>
>>
>> ----- Original Message ----
>>> From: Alex Wang <aw...@crossview.com>
>>> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
>>> Sent: Thu, November 26, 2009 12:47:36 PM
>>> Subject: Maximum number of fields allowed in a Solr document
>>>
>>> Hi,
>>>
>>> We are in the process of designing a Solr app where we might have
>>> millions of documents and within each of the document, we might have
>>> thousands of dynamic fields. These fields are small and only contain
>>> an integer, which needs to be retrievable and sortable.
>>>
>>> My questions is:
>>>
>>> 1. Is there a limit on the number of fields allowed per document?
>>> 2. What is the performance impact for such design?
>>> 3. Has anyone done this before and is it a wise thing to do?
>>>
>>> Thanks,
>>>
>>> Alex
>>
>

-- 
Lance Norskog
goks...@gmail.com