The DispatchFilter could probably be modified to optionally use the ServletOutputStream instead of the Writer. It would take some work to maintain compatibility, but I think it can be done. Maybe we could have a /binary path or something along those lines, and SolrJ could use that. QueryResponseWriter could be extended with a write method that takes an OutputStream.

Caveat: I haven't fully investigated this, but I do believe it makes sense for SolrJ to use a binary format by default. The other thing it should do, when sending or receiving XML, is make sure the XML is as "tight" as possible, i.e. minimal whitespace, etc.
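
A rough sketch of the OutputStream idea above, to make the shape concrete. Everything here is hypothetical: TextResponseWriter stands in for the current Writer-only contract, BinaryResponseWriter is an assumed extension, and the "/binary" path check only mimics what the filter might do. None of it is Solr's actual API.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical stand-in for the existing text-only contract.
interface TextResponseWriter {
    void write(Writer out, List<String> values) throws IOException;
}

// Hypothetical extension: the same data, written to a raw byte stream.
interface BinaryResponseWriter extends TextResponseWriter {
    void write(OutputStream out, List<String> values) throws IOException;
}

class SketchWriter implements BinaryResponseWriter {
    @Override
    public void write(Writer out, List<String> values) throws IOException {
        out.write("<arr>");
        for (String v : values) {
            out.write("<str>" + v + "</str>"); // no XML escaping; illustration only
        }
        out.write("</arr>");
    }

    @Override
    public void write(OutputStream out, List<String> values) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(values.size());          // element count
        for (String v : values) {
            data.writeUTF(v);                  // 2-byte length + modified UTF-8
        }
        data.flush();
    }
}

class DispatchSketch {
    // Picks the byte stream for a "/binary" path when the writer supports it,
    // and falls back to the Writer otherwise, so old writers keep working.
    static byte[] respond(String path, TextResponseWriter writer, List<String> values)
            throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        if (path.startsWith("/binary") && writer instanceof BinaryResponseWriter) {
            ((BinaryResponseWriter) writer).write(bytes, values);
        } else {
            StringWriter text = new StringWriter();
            writer.write(text, values);
            bytes.write(text.toString().getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        List<String> values = List.of("solr", "lucene");
        System.out.println("text bytes:   " + respond("/select", new SketchWriter(), values).length);
        System.out.println("binary bytes: " + respond("/binary", new SketchWriter(), values).length);
    }
}
```

The instanceof check is the compatibility piece: a writer that only knows about Writer is never handed a stream.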

Just thinking out loud,
Grant

On Feb 22, 2008, at 8:29 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote:

The API forbids use of any non-text format.

The QueryResponseWriter's write() method can take only a Writer. So we
cannot write any binary stream into that.

--Noble

On Fri, Feb 22, 2008 at 12:30 AM, Walter Underwood
<[EMAIL PROTECTED]> wrote:
Python marshal format is worth a try. It is binary and can represent
the same data as JSON. It should be a good fit to Solr.

We benchmarked that against XML several years ago and it was 2X faster.
Of course, XML parsers are a lot faster now.

wunder



On 2/21/08 10:50 AM, "Grant Ingersoll" <[EMAIL PROTECTED]> wrote:

XML can be a problem when it is really lengthy (lots of results, large
results), so a binary format could be useful in certain cases
where we control both ends of the pipe (i.e. SolrJ). I've seen apps
that deal with really large files wrapped in XML where the XML parsing
takes a significant amount of time compared to a more compact
binary format.

I think it at least warrants profiling/testing.
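
As a starting point for that kind of test, here is a minimal sketch that compares only the size on the wire of the same data in two encodings. The XML element names and the binary layout (a count followed by 4-byte ints) are made up for illustration, and it says nothing about parse time, which would need a real profiler run.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class WireSizeSketch {
    // Encodes ids 0..n-1 as a made-up XML result list with no extra whitespace.
    static byte[] asXml(int n) {
        StringBuilder sb = new StringBuilder("<result>");
        for (int i = 0; i < n; i++) {
            sb.append("<doc><int name=\"id\">").append(i).append("</int></doc>");
        }
        sb.append("</result>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Encodes the same ids as a count followed by fixed-width 4-byte ints.
    static byte[] asBinary(int n) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(n);
        for (int i = 0; i < n; i++) {
            out.writeInt(i);
        }
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        int n = 10_000;
        System.out.println("xml bytes:    " + asXml(n).length);
        System.out.println("binary bytes: " + asBinary(n).length);
    }
}
```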

-Grant

On Feb 21, 2008, at 12:07 PM, Noble Paul നോബിള്‍ नोब्ळ् wrote:

Hi,
The format over the wire is not of great significance because it gets
unmarshalled into the corresponding language object as soon as it
comes off the wire. I would say XML/JSON should meet 99% of the
requirements because all the platforms come with an unmarshaller for
both of these.

But if it can offer a good performance improvement, it is worth trying.
--Noble

On Thu, Feb 21, 2008 at 3:41 AM, alexander lind <[EMAIL PROTECTED]>
wrote:

On Feb 20, 2008, at 9:31 AM, Doug Steigerwald wrote:

A few months back I wrote a YAML update request handler to see if we
could post documents faster than with XML.  We did see some small
speed improvements (didn't write down the numbers), but the
hacked-together code was probably making it slower as well.  Not sure if
there are faster YAML libraries out there either.

We're not actually using it, since it was just a small proof of
concept type of project, but is this anything people might be
interested in?


Out of simple preference I would love to see a YAML request handler
just because I like the YAML format. If it's also faster than XML,
then all the better.

Cheers
Alec




--
--Noble Paul

--------------------------
Grant Ingersoll
http://www.lucenebootcamp.com
Next Training: April 7, 2008 at ApacheCon Europe in Amsterdam

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ









