Niels Thykier wrote (05 Jul 2016 20:21:03 GMT) :
> Re: the memory usage; it may make sense to do the report as multiple
> "documents" (e.g. one per source package or something).

> It would allow both generator and consumers to process it more
> efficiently by processing a single source at the time.

I'm open to discussing this option, and have just spent some time
thinking about it. I have a few worries about it, as far as this first
iteration is concerned.

First of all, for my use case, retrieving all the data in a single
HTTP request is simpler, and I'm ready to take the performance hit
since it also makes my consumer code much simpler to write, review
and maintain.

Also, once published on https://lintian.d.o/, these per-package files
will look very much like endpoints for a web API, that consumers might
start using in the wild, and then:

1. The exact URIs matter a lot, as they become the API endpoints; I'm
   not interested in designing that API at the moment personally,
   especially in a way that provides any kind of stability (and
   anyway, the API should be versioned so all URIs should start with
   a "/0.1" component or something).

2. I wonder if YAML is optimal for consumers that want to use
   a smaller subset of the data. E.g. for web tools it's easier to
   use JSON.

So right now, I'm leaning towards keeping the one-big-YAML-file design
since it matches my current needs very well, and leaving it to those
who want finer-grained machine-readable access to design how the data
could be made available (endpoints, format, etc.).

What do you think?

As a side note: I'd be totally fine with advertising the
one-big-YAML-file format as subject to change, and to adjust my
consumer code in the future if needed, e.g. if/when a better data
format and endpoint layout are designed.

Cheers,
-- 
intrigeri
