Bug#970428: ITP: golang-gopkg-cheggaaa-pb.v3 -- Console progress bar for Golang

2020-09-16 Thread Andreas Henriksson
Package: wnpp
Severity: wishlist
Owner: Andreas Henriksson 

* Package name: golang-gopkg-cheggaaa-pb.v3
  Version : 1.0.29-1
  Upstream Author : Sergey Cherepanov
* URL : https://github.com/cheggaaa/pb
* License : BSD-3-clause
  Programming Lang: Go
  Description : Console progress bar for Golang

 Terminal progress bar for Go.  The v1 and v2 APIs are already packaged
 in Debian, but projects are now starting to port to v3, so it's needed
 as a dependency for other things; I'm thus considering packaging it.
 If someone else wants to take it over, please go for it!
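
A minimal usage sketch of the v3 API (untested; based on my reading of
the upstream README -- the count and delay are placeholders):

  package main

  import (
      "time"

      "github.com/cheggaaa/pb/v3"
  )

  func main() {
      count := 1000

      // Create and start a bar sized to the total number of steps.
      bar := pb.StartNew(count)

      for i := 0; i < count; i++ {
          // Advance the bar; pb redraws the terminal line itself.
          bar.Increment()
          time.Sleep(time.Millisecond)
      }

      // Stop the bar and print the final state.
      bar.Finish()
  }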



Re: How much data load is acceptable in debian/ dir and upstream (Was: edtsurf_0.2009-7_amd64.changes REJECTED)

2020-09-16 Thread Steven Robbins
On Tuesday, September 15, 2020 11:18:28 P.M. CDT Andreas Tille wrote:
> Hi Paul,
> 
> On Tue, Sep 15, 2020 at 10:00:45PM +0200, Paul Gevers wrote:
> > On 14-09-2020 21:04, Andreas Tille wrote:
> > > In the case of larger data sets it seems natural to provide the
> > > data in a separate binary Architecture: all package, to not bloat
> > > the machines of users who do not want this and also to save
> > > bandwidth on our mirror network.  New binary packages require NEW
> > > processing, and my question here is about a set of rejection mails
> > > we received.
> > 
> > I assume you realized, but just in case you didn't: the data doesn't
> > need to go into any binary package for autopkgtests to find it. While
> > running autopkgtests, the SOURCE is unpacked and available. (You
> > mentioned other reasons why you want it, though.)
> 
> Yes, that fact is perfectly known.  However, in the current discussion
> this would only "help" us since without an extra binary package we would
> "avoid" the ftpmaster review of the source package.  My intention is
> not to avoid the review but to clarify the situation.
> 
> If I understood ftpmaster correctly the amount of data in the source
> package is the problem.  It would be great to hear other developers'
> opinions about the size of data needed for proper testing and where
> to put it.

Since you're soliciting opinions, here's mine.  In the absence of a
documented consensus, ftpmaster should respect the packager's judgement
rather than reject based on their own personal opinion.

From your original set of questions:

> On Sun, Sep 13, 2020 at 12:00:08PM +, Thorsten Alteholz wrote:[1]
> 
> > your debian tar file is much too large.
> > Please put all data in a separate source package and don't forget to
> > add the copyright information.
> 
> I admit the debian/ dir (2.7MB) exceeds the real code (300kB) by far.
> However, can we please specify somewhere in our packaging documentation
> what size of debian/ dir is acceptable and what is not?

Thorsten's observation ("... is much too large") is completely arbitrary.  
Also, why does size matter?  If the files are necessary, they will show up 
somewhere.  Why do we care which tarball they are part of?


> On Sun Sep 13 13:00:09 BST 2020, Thorsten Alteholz wrote:[2]
> 
> > please explain why you need such a huge amount of test data in this
> > package.

This is, to me, also a completely arbitrary opinion ("huge amount").  
Ftpmaster should give the packager the benefit of the doubt.  They have 
presumably also noticed the amount of data and deemed it acceptable.  This 
should not be a barrier to acceptance.  

Demanding an explanation up front is also an arbitrary request.  Allow the 
package and have a conversation afterwards.


> On  Sun Sep 13 18:00:08 BST 2020, Thorsten Alteholz wrote:[3]
> 
> > please don't hide data under debian/*.

There shouldn't be a need for language ("hide data") that suggests possible 
malfeasance on the part of the packager.  If the file placement is against 
documented consensus, then simply point to the relevant policy section.  
Otherwise, accept the package without editorializing.

-Steve




Bug#970447: ITP: pinfish -- Collection of tools to annotate genomes using long read transcriptomics data

2020-09-16 Thread Nilesh Patra
Package: wnpp
Severity: wishlist
Owner: Nilesh Patra 
X-Debbugs-CC: debian-devel@lists.debian.org

* Package name: pinfish
  Version : 0.1.0+ds-1
  Upstream Author : Oxford Nanopore Technologies Ltd.
* URL : https://github.com/nanoporetech/pinfish
* License : MPL-2.0
  Programming Lang: Go
  Description : Collection of tools to annotate genomes using long
 read transcriptomics data
 The toolchain is composed of the following tools:
 1. spliced_bam2gff - a tool for converting sorted BAM files
 containing spliced alignments into GFF2 format. Each read will
 be represented as a distinct transcript. This tool comes in
 handy when visualizing spliced reads at particular loci and
 for providing input to the rest of the toolchain.
 .
 2. cluster_gff - this tool takes a sorted GFF2 file as
 input and clusters together reads having similar
 exon/intron structure and creates a rough consensus
 of the clusters by taking the median of exon
 boundaries from all transcripts in the cluster.
 .
 3. polish_clusters - this tool takes the cluster
 definitions generated by cluster_gff and for each
 cluster creates an error-corrected read by mapping
 all reads onto the read with the median length
 and polishing it using racon. The polished reads
 can be mapped to the genome using minimap2 or GMAP.
 .
 4. collapse_partials - this tool takes GFFs generated
 by either cluster_gff or polish_clusters and filters
 out transcripts which are likely based on RNA
 degradation products from the 5' end. The tool clusters
 the input transcripts into "loci" by their 3' ends and
 discards transcripts for which the locus contains a
 compatible transcript with more exons.
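 .
 For orientation, the tools chain together roughly as follows (flag
 spellings are recalled from the upstream README and should be treated
 as assumptions; file names are placeholders):
 .
  # one transcript per spliced read
  spliced_bam2gff -M alignments_sorted.bam > raw_transcripts.gff
  # cluster reads with similar exon/intron structure; -a records
  # the cluster memberships in a TSV
  cluster_gff -a memberships.tsv raw_transcripts.gff > clustered.gff
  # error-correct one representative read per cluster using racon
  polish_clusters -a memberships.tsv -o polished.fas alignments_sorted.bam
  # drop likely 5' degradation products
  collapse_partials clustered.gff > collapsed.gff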

I shall maintain this package.


Re: How much data load is acceptable in debian/ dir and upstream

2020-09-16 Thread Andreas Tille
Hi Russ

On Mon, Sep 14, 2020 at 12:21:10PM -0700, Russ Allbery wrote:
> > I think we should try to document somehow when there is a need for a
> > separate source package.  I would agree if the code is some kind of
> > moving target while the data would not change, or if there is some
> > kind of versioned downloadable tarball, or the data can be shared
> > between different software packages.  But here none of these
> > conditions is fulfilled.
> 
> Is there any overlap of the test data required by different packages?

In the packages that were rejected this is not the case.  We are
actually considering creating some kind of universal data set to be
used in several packages.  However, this is a tough task since there
are several different data formats and sometimes software is dedicated
to very specific data.

> I'm
> wondering if it would make sense to create a new native Debian package
> called debian-med-test-data or something like that, and put all of the
> data used for package test suites in that package.  The tests can then
> depend on it.

We definitely keep this in mind.
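
(As a sketch of that idea, with the hypothetical debian-med-test-data
package Russ describes, a test declaration could be as small as:

  # debian/tests/control
  Tests: run-testsuite
  Depends: @, debian-med-test-data

where "@" pulls in the package's own binaries and the run-testsuite
script reads its input from wherever the data package installs it,
e.g. under /usr/share/.)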

> That may be a little inefficient for autopkgtest because it will need to
> download more data than is necessary to test a specific package, but a bit
> cleaner for the archive since it collects a class of data in one place and
> provides a natural place for supporting documentation, copyright
> information, and so forth.  It also provides a logical place to put
> supporting scripts to, say, refresh the download or restructure the data
> if required for different packages.  It feels a bit more self-documenting
> and obvious what's going on.

The problem is not only with autopkgtest.  As I said, we try to enable
users to run the test suite on their local machines, as a kind of
example, or simply to verify that their machine reproduces the tested
behaviour.  In that case a big data package would also need to be
installed on users' machines, which is in most cases not really needed.
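
(To make Paul's earlier point concrete: for autopkgtest itself no data
package is needed at all, because each test is started from the root of
the unpacked source tree.  A sketch, with hypothetical file and tool
names:

  # debian/tests/control
  Tests: smoke
  Depends: @

  # debian/tests/smoke
  #!/bin/sh
  set -e
  # The working directory is the unpacked source tree, so data shipped
  # under debian/tests/data/ is directly available.
  some-tool --input debian/tests/data/example.dat \
            --output "$AUTOPKGTEST_TMP/result"

But, as said, that does not help users who want to run the tests on an
installed system.)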

> That also avoids the hassle of having to maintain a bunch of separate test
> data packages (although one could of course also do that) by collecting
> the packaging in one place.

I agree that this would avoid that hassle in the current cases (except
the one where upstream had provided the data inside the tarball
(graphbin)).  However, I think it is just a temporary solution for the
question my mail might boil down to:

  Provided that the license and copyright of the data in question are
  OK, is there any size limit for data stored under debian/?

I think we should answer this question and write the answer down in
our documents.

Kind regards

  Andreas.

-- 
http://fam-tille.de



Bug#970461: ITP: ckermit -- serial and network communications package

2020-09-16 Thread Sébastien Villemot
Package: wnpp
Severity: wishlist
Owner: Sébastien Villemot 

* Package name: ckermit
  Version : 9.0.305~Alpha.01
  Upstream Author : Frank da Cruz 
* URL : http://www.kermitproject.org/ckdaily.html
* License : BSD-3-clause
  Programming Lang: C
  Description : serial and network communications package

C-Kermit is a combined serial and network communication software package
offering a consistent, medium-independent, secure cross-platform approach
to connection establishment, terminal sessions, file transfer,
character-set translation, and automation of communication tasks.

This is actually a package reintroduction. ckermit was removed from sid in
2019.