Fetch your 70,000 results in 70 chunks of 1000 results. Parse each chunk
and add it to your internal list.
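
A rough sketch of that loop, assuming the stock /select handler and the
wt=python response writer (the Solr URL and query are placeholders):

    import ast
    import urllib.request

    SOLR = "http://localhost:8983/solr/select"   # placeholder URL
    PAGE = 1000
    TOTAL = 70000

    docs = []
    for start in range(0, TOTAL, PAGE):
        url = "%s?q=*:*&wt=python&start=%d&rows=%d" % (SOLR, start, PAGE)
        with urllib.request.urlopen(url) as resp:
            # wt=python returns a Python literal; parse it safely with
            # ast.literal_eval rather than eval().
            rsp = ast.literal_eval(resp.read().decode("utf-8"))
        docs.extend(rsp["response"]["docs"])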

If you are allowed to parse Python results, why can't you use a different
XML parser?

What sort of "more work" are you doing? I've implemented lots of stuff
on top of a paged model, including customizing the relevance formula
and re-ranking.

wunder

On 12/12/07 12:31 PM, "Owens, Martin" <[EMAIL PROTECTED]> wrote:
>  
>>> I think your biggest problem is requesting 70,000 records from Solr.
>>> That is not going to be fast.
> 
> I know it, but the development constraints don't lend themselves to putting
> all of the fields into Lucene so that a proper search can be conducted. We
> need to return them all because more work is done on the results on the
> webserver side (much to my chagrin), so paging is out of the question.
> 
>>> 2. Since you are running out of memory parsing XML, I'm guessing
>>> that you're using a DOM-style parser. Don't do that. You do not
>>> need to create elaborate structures, strip-mine the data, then
>>> throw those structures away. Instead, use a streaming parser, like StAX.
> 
> Oh, I know there are better ways of doing it; I just can't do any of them.
> Constraints and all that.
> 
> I was looking at the PythonResponseWriter. I'm trying to find a how-to,
> since a response writer would be responsible for writing the response after
> a search, right?
> 
> Best regards, Martin Owens
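
For what it's worth, a minimal sketch of the streaming parse suggested in
the quoted thread, using Python's iterparse as the StAX-style equivalent
(the file name is a placeholder, and it assumes Solr's standard XML
response format):

    import xml.etree.ElementTree as ET

    docs = []
    # Handle each <doc> as soon as it closes, then discard it, so memory
    # use stays flat instead of holding the whole tree.
    for event, elem in ET.iterparse("solr_response.xml", events=("end",)):
        if elem.tag == "doc":
            docs.append({child.get("name"): child.text for child in elem})
            elem.clear()  # free the subtree we just consumed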
