Package: qa.debian.org Severity: wishlist User: qa.debian....@packages.debian.org Usertags: udd
Hi. I already discussed this feature with Lucas Nussbaum at Debconf11. It would be nice to have a gatherer for patch-tracker.d.o data in UDD. I already wrote a patch for patch-tracker.d.o to export the data we could be interested into in a JSON file. The patch is still waiting for review and I'll ping Sean in the next days. I also wrote patches for UDD, and I'm attaching them here. They're still incomplete, need to be adjusted for the actual setup of UDD (which I do not know enough), still have some FIXME for potential SQL injection. Moreover, SQL schemas have to be fixed and the whole infrastructure for getting JSON data from patch-tracker.d.o must be set up. But, in the meantime, people can have a look at the code to understand what's going on. I'm also attaching a draft of the SQL schema (which probably still lacks things, most notably sane indexes). A gzipped test data file (based on a snapshot of a few days ago) can be found here: http://people.debian.org/~gio/patch_tracker.json.gz Thanks, Gio. -- Giovanni Mascellani <mascell...@poisson.phc.unipi.it> Pisa, Italy Web: http://poisson.phc.unipi.it/~mascellani Jabber: g.mascell...@jabber.org / giova...@elabor.homelinux.org
From ef97570af8a56e7018b971574152cdcbd29c1b8b Mon Sep 17 00:00:00 2001 From: Giovanni Mascellani <mascell...@poisson.phc.unipi.it> Date: Mon, 1 Aug 2011 11:51:39 +0200 Subject: [PATCH] Added sketch of patch_tracker gatherer. --- udd/config-org.yaml | 7 +++++ udd/doc/sources/patch-tracker | 24 +++++++++++++++++ udd/udd/patch_tracker_gatherer.py | 52 +++++++++++++++++++++++++++++++++++++ 3 files changed, 83 insertions(+), 0 deletions(-) create mode 100644 udd/doc/sources/patch-tracker create mode 100644 udd/udd/patch_tracker_gatherer.py diff --git a/udd/config-org.yaml b/udd/config-org.yaml index 0a0745b..368425c 100644 --- a/udd/config-org.yaml +++ b/udd/config-org.yaml @@ -13,6 +13,7 @@ general: orphaned-packages: module udd.orphaned_packages_gatherer bugs: exec DEBBUGS_CONFIG_FILE=/org/bugs.debian.org/etc/config perl /org/udd.debian.org/udd/udd/bugs_gatherer.pl carnivore: module udd.carnivore_gatherer + patch-tracker: module udd.patch_tracker_gatherer lintian: module udd.lintian_gatherer debtags: module udd.debtags_gatherer ubuntu-bugs: module udd.ubuntu_bugs_gatherer @@ -546,6 +547,12 @@ carnivore: login-table: carnivore_login schema: carnivore +patch-tracker: + type: patch-tracker + path: /home/giovanni/packages/qa/collab-qa/udd/mirrors/patch_tracker.json + patch-tracker-table: patch_tracker + patches-table: patches + lintian: type: lintian update-command: rm -f /org/udd.debian.org/mirrors/lintian.log && wget -q http://lintian.debian.org/lintian.log -O /org/udd.debian.org/mirrors/lintian.log diff --git a/udd/doc/sources/patch-tracker b/udd/doc/sources/patch-tracker new file mode 100644 index 0000000..20027f9 --- /dev/null +++ b/udd/doc/sources/patch-tracker @@ -0,0 +1,24 @@ +DESCRIPTION + patch-tracker.d.o is a web service that shows the patches that get + applied to the Debian packages at build time. A summary of the shown + data is stored and published as JSON file, so it can be included + in UDD. + + We import such information assigning a unique integer ID to each + pair (package, release). Two tables are then used to store the + data: + * patch-tracker-table: keeps general information about the + package (name, version, release) and about the patching method + it uses (the series type - i.e., whether it uses quilt, dpatch + or other systems, including the 'native' and 'no_series' values; + and the number of lines added and removed outside the debian/ + directory by the .diff.gz or .debian.tar.gz patch); + * patches-table: keeps information about the specific patches + (the package to which belongs, its name and the number of lines + added and removed); it will probably extended in the future to + store also DEP 3 metadata. + +CONFIGURATION: + path: The path of the JSON statistics from patch-tracker.d.o. + patch-tracker-table, patches-table: The names of the database + tables used to describe the packages and patches. diff --git a/udd/udd/patch_tracker_gatherer.py b/udd/udd/patch_tracker_gatherer.py new file mode 100644 index 0000000..ed56727 --- /dev/null +++ b/udd/udd/patch_tracker_gatherer.py @@ -0,0 +1,52 @@ +#!/usr/bin/env python + +import sys +from gatherer import gatherer +import json + +def get_gatherer(connection, config, source): + return patch_tracker_gatherer(connection, config, source) + +class patch_tracker_gatherer(gatherer): + + def __init__(self, connection, config, source): + gatherer.__init__(self, connection, config, source) + self.assert_my_config('path', 'patch-tracker-table', 'patches-table') + + def run(self): + my_config = self.my_config + + cur = self.cursor() + + cur.execute("DELETE FROM %s" % my_config["patches-table"]) + cur.execute("DELETE FROM %s" % my_config["patch-tracker-table"]) + + # FIXME SQL Injection exposures here + cur.execute("""PREPARE patch_tracker_insert + AS INSERT INTO %s (id, package, version, release, series_type, nondebian_added, nondebian_removed) + VALUES ($1, $2, $3, $4, $5, $6, $7)""" % (my_config['patch-tracker-table'])) + cur.execute("""PREPARE patches_insert + AS INSERT INTO %s (id, package_id, lines_added, lines_removed, name) + VALUES ($1, $2, $3, $4, $5)""" % (my_config['patches-table'])) + + # TODO We don't need to keep all the decoded JSON file in memory; there are + # libraries to access it in a stream (DOM-like) fashion, but Python + # bindings don't appear to be available in Debian (libyajl) + patch_tracker_data = json.load(open(my_config['path'])) + package_number = 0 + patch_number = 0 + for line in patch_tracker_data: + package, suite, version, series_type, nondebian, patches = \ + (line['package'], line['suite'], line['version'], + line['series_type'], line['nondebian'], line['patches']) + cur.execute("EXECUTE patch_tracker_insert (%d, '%s', '%s', '%s', '%s', %d, %d);" % + (package_number, package, version, suite, series_type, nondebian[0], nondebian[1])) + for patch_name, patch_data in patches: + cur.execute("EXECUTE patches_insert (%d, %d, %d, %d, '%s');" % + (patch_number, package_number, patch_data[0], patch_data[1], patch_name)) + patch_number += 1 + package_number += 1 + + cur.execute("ANALYZE %s" % my_config["patch-tracker-table"]) + cur.execute("ANALYZE %s" % my_config["patches-table"]) + -- 1.7.5.4
-- -- Name: patch_tracker; Type: TABLE; Schema: public; Owner: giovanni; Tablespace: -- CREATE TABLE patch_tracker ( id integer NOT NULL, package text NOT NULL, version debversion NOT NULL, release text NOT NULL, series_type text NOT NULL, nondebian_added integer NOT NULL, nondebian_removed integer NOT NULL ); -- -- Name: patch_tracker_pkey; Type: CONSTRAINT; Schema: public; Owner: giovanni; Tablespace: -- ALTER TABLE ONLY patch_tracker ADD CONSTRAINT patch_tracker_pkey PRIMARY KEY (id); -- -- Name: package_version_idx; Type: INDEX; Schema: public; Owner: giovanni; Tablespace: -- CREATE INDEX package_version_idx ON patch_tracker USING btree (package, version); -- -- Name: patches; Type: TABLE; Schema: public; Owner: giovanni; Tablespace: -- CREATE TABLE patches ( id integer NOT NULL, package_id integer NOT NULL, lines_added integer NOT NULL, lines_removed integer NOT NULL, name text NOT NULL ); -- -- Name: patches_pkey; Type: CONSTRAINT; Schema: public; Owner: giovanni; Tablespace: -- ALTER TABLE ONLY patches ADD CONSTRAINT patches_pkey PRIMARY KEY (id); -- -- Name: package_id_idx; Type: INDEX; Schema: public; Owner: giovanni; Tablespace: -- CREATE INDEX package_id_idx ON patches USING btree (package_id); -- Name: patches_package_id_fkey; Type: FK CONSTRAINT; Schema: public; Owner: giovanni -- ALTER TABLE ONLY patches ADD CONSTRAINT patches_package_id_fkey FOREIGN KEY (package_id) REFERENCES patch_tracker(id);
signature.asc
Description: OpenPGP digital signature