Bug#1038326: ITP: transformers -- State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow (it ships LLMs)

2023-06-16 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org, debian...@lists.debian.org

* Package name: transformers
  Upstream Contact: HuggingFace
* URL : https://github.com/huggingface/transformers
* License : Apache-2.0
  Description : State-of-the-art Machine Learning for JAX, PyTorch and 
TensorFlow

I've been using this for a while.

This package provides a convenient way for people to download and run an LLM
locally. Basically, if you want to run an instruct fine-tuned large language
model with 7B parameters, you will need at least 16GB of CUDA memory for
inference in half/bfloat16 precision. I have not tried to run any LLM with
more than 3B parameters on CPU ... that can be slow.
LLaMa.cpp is a good choice for running an LLM on CPU, but that library
supports fewer models than this one. Meanwhile, the cpp library only
supports inference.
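For reference, a minimal sketch of what that looks like through the
transformers API (the model name is an illustrative placeholder, and
device_map="auto" additionally assumes the accelerate package):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "some-org/some-7b-instruct"  # hypothetical model id
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, device_map="auto")

inputs = tok("Hello, Debian!", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))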

I don't know how many dependencies are still missing, but there should not
be too many. JAX and TensorFlow are optional dependencies, so they can be
absent from our archive. Anyway, I think running a large language model
locally with Debian packages will be interesting. The CUDA version of
PyTorch is already in the NEW queue.

That said, this is actually a very comprehensive library, which provides far 
more functionalities
than running LLMs.

Thank you for using reportbug



Re: opencl-icd virtual package(s)?

2023-06-17 Thread M. Zhou
On Sun, 2023-06-18 at 10:37 +0800, Paul Wise wrote:
> [BCCed to OpenCL ICD implementation package maintainers]
> 
> I noticed that some packages have a dep on specific OpenCL ICD
> packages, but don't dep on the opencl-icd virtual package(s).
> Presumably any of the OpenCL ICDs work for most packages?

Theoretically any of them is expected to work; that's the point of ICD.
But, while I'm not an OpenCL user, I have heard that different OpenCL
implementations have their own quirks... (forgotten source)

> $ grep-aptavail --no-field-names --show-field Package --field
> Depends,Recommends,Suggests --whole-pkg '(' --pattern .*opencl-icd.* --
> and --not --pattern '^opencl-icd(-1\.[12]-1)?$' ')'
> 
> In addition, I noticed that hashcat-nvidia (which presumably doesn't
> need to depend on the opencl-icd virtual package) only depends on two
> of the nvidia OpenCL ICD packages, while there are lots of other nvidia
> OpenCL ICD packages that presumably work too.

It won't surprise me if .*-nvidia fails to work with a non-nvidia OpenCL
implementation.

> I have attached a package list and dd-list for these issues.
> 
> Perhaps there should be a default-opencl-icd virtual package?

A usable OpenCL implementation is very device-specific. We cannot be
sure which OpenCL implementation will always be available on user devices,
or even on our buildds.

If there must be a default, pocl-opencl-icd is the solution. It supports
running OpenCL kernels on the CPU, so it should work on most hosts.
Just don't expect any high performance from CPU-based OpenCL.

FYI: to verify what OpenCL is usable on your host, you may just
$ sudo apt install clinfo; clinfo
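Or programmatically, a minimal sketch with python3-pyopencl, which
enumerates whatever ICDs happen to be installed:

import pyopencl as cl

for platform in cl.get_platforms():
    print(platform.name)
    for device in platform.get_devices():
        print("  ", device.name)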


> Perhaps lintian should flag situations where the dep isn't just
> default-opencl-icd | opencl-icd? or is missing opencl-icd?
> 
> Thoughts?

I think the current status of some typical packages, like python3-pyopencl,
is already correct.

$ apt show python3-pyopencl
Depends: ..., pocl-opencl-icd | mesa-opencl-icd | opencl-icd, ...

You see pocl there as the first choice. For any other packages
that depend on opencl, I think maintainers might have an idea
on what opencl implementation is preferred, either inclusively
or exclusively.

> PS: I noticed this because beignet-opencl-icd is RC-buggy. This is the
> only OpenCL ICD implementation package I can see that supports Intel
> Ivy Bridge, but it is hard to tell which other packages support this,
> because some descriptions don't mention which hardware is supported.

It looks like intel-opencl-icd does not support very old CPUs
(as listed here: https://github.com/intel/compute-runtime ),
but I think most major users of OpenCL depend on dedicated GPUs.
The performance of integrated graphics seems barely better than nothing.

I think all OpenCL ICD providers can be found via $ apt search opencl-icd .
The AMD OpenCL implementation is missing. It is a part of ROCm
(https://github.com/RadeonOpenCompute/ROCm-OpenCL-Runtime), and indeed
something for the ROCm team to work on in the future.



Bug#1051520: ITP: python-expecttest -- expect test for python

2023-09-08 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org

* Package name: python-expecttest
* URL : https://github.com/ezyang/expecttest/
* License : MIT
  Programming Lang: Python
  Description : expect test for python

A unit testing package for PyTorch packages.
It will be maintained by the Debian Deep Learning Team.
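For context, a minimal sketch of the expect-test style this library
provides, following the upstream README; running the suite with
EXPECTTEST_ACCEPT=1 lets the framework rewrite the inline expected
string in the source file:

import unittest
import expecttest

class TestSplit(expecttest.TestCase):
    def test_split(self):
        # the triple-quoted string is (re)generated by the framework
        self.assertExpectedInline(str("a,b".split(",")), """['a', 'b']""")

if __name__ == "__main__":
    unittest.main()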

Thank you for using reportbug



Re: Default font: Transition from DejaVu to Noto

2023-09-09 Thread M. Zhou
On Sun, 2023-09-10 at 11:23 +0800, Paul Wise wrote:
> On Sat, 2023-09-09 at 23:08 +0200, Gunnar Hjalmarsson wrote:
> 
> > My personal view is that it is a change in the right direction, and
> > I
> > have taken a couple of follow-up steps in Debian. There are still
> > loose 
> > ends and more work to be done to achieve a consistent configuration
> > in 
> > this respect. However, before taking further steps, I feel there is
> > a
> > need to reach out to a broader audience about the change. Hence
> > this 
> > message. Basically I'm asking if this move towards Noto is
> > desirable 
> > and, if so, I plea for relevant input for the completion of the
> > transition.
> 
> Personally, I found Noto Mono to be very ugly in comparison to the
> DejaVu fonts that I was used to, so my knee-jerk reaction was to
> override the fontconfig settings to avoid all of the Noto fonts.
> I haven't yet evaluated the non-monospace Noto fonts though.
> 
> Related discussion on the various IRC channels suggested that: 
> 
> Noto is meant to be a fallback font (something like unifont but
> not using bitmaps) not the default font for all or any languages.
> 
> Noto makes the font selector useless because each alphabet shows up.
> Apparently fixing this requires changing the OpenType file format.
> 
> Apparently the Noto meta package installs 700+ MB of font data,
> which seems a bit excessive to some folks.
> 
> Some folks are setting explicit fonts in applications they use to
> avoid
> relying on system defaults and also seeing changes to those defaults.
> 

I have similar personal feelings. Aesthetic preferences differ from person
to person. As fonts can be seen as a kind of artwork, can we run a poll
for the default font, just like we do for our wallpaper?

Noto is usually annoying to me. It provides too many "Noto .*" fonts
in whatever software shows you a font selection menu.
At least 1/3 of the whole font menu will be overwhelmed by
Noto .* fonts that I'll never use, even if I only keep
fonts-noto-core and fonts-noto-cjk.

The number of fonts is overwhelming, especially since I frequently
change the font in software like libreoffice, inkscape, etc.
I usually end up deleting the ttf files that I will never need.

Just a negative vote on Noto. Anything else is better.



Re: bookworm+½ needed (Arc GPUs, ...)

2023-09-12 Thread M. Zhou
Intel has also been slow in upstreaming their SYCL implementation to
upstream LLVM, so there is still a long way to go towards a PyTorch
variant that can use Intel Arc GPUs.


On Mon, 2023-09-11 at 12:47 +0200, Adam Borowski wrote:
> So...
> If you've watched our Dear Leader's talk, a prominent problem listed
> was problems with new graphics cards.
> 
> While he didn't elaborate, I assume it was about Intel Arc -- ie, new
> DG2
> discrete GPUs.  And the problem is, proper support didn't hit the
> kernel
> until after 6.1.  You can kinda-sorta run on 6.1 by whitelisting it
> by PCIe
> ID on the kernel cmdline, and it even works (6.0 couldn't cope with
> my
> setup, 6.1 was ok), but such an intentional block doesn't suggest
> it's
> something wise for a normal user to do.
> 
> I'm not sure if firmware shipped with Bookworm is good enough, either
> (having grabbed a copy of the files much earlier, would need to
> test).
> 
> Of course, this wasn't Debian's fault.  The group at Intel
> responsible
> for upstreaming kernel/etc bits was too slow, not providing drivers
> until
> after the hardware has already been shipping to regular non-NDAed
> customers.
> 
> But then, hardware makers do this all the time.  Intel Arc just gives
> a
> more prominent reason to have an installer with backported
> kernel+stuff.
> 
> Before we go and bother the relevant folks (or maybe even do part of
> the
> work ourselves...), could someone name other pieces of hardware that
> would
> be wanted for Bookworm+½?
> 
> 
> Meow?



Re: [idea]: Switch default compression from "xz" to "zstd" for .deb packages

2023-09-16 Thread M. Zhou
Just one comment.

Be careful that it may bloat up our mirrors. Is there any estimate of
the extra space cost for a full Debian mirror?

If we trade disk space for decompression speed, note that zstd -19 is
not necessarily fast at compressing. I did not benchmark it, but it is slow.
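For anyone who wants numbers, a rough micro-benchmark sketch using
python3-zstandard and the stdlib lzma (results vary wildly with the
input payload, so measure before deciding):

import lzma
import time
import zstandard

data = open("/usr/bin/perl", "rb").read()  # any sample payload

t0 = time.perf_counter()
xz = lzma.compress(data, preset=9)
t1 = time.perf_counter()
zst = zstandard.ZstdCompressor(level=19).compress(data)
t2 = time.perf_counter()

print(f"xz -9   : {len(xz)} bytes in {t1 - t0:.2f}s")
print(f"zstd -19: {len(zst)} bytes in {t2 - t1:.2f}s")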


On Sat, 2023-09-16 at 10:31 +0530, Hideki Yamane wrote:
> Hi,
> 
>  Today I want to propose you to change default compression format in
> .deb,
>  {data,control}.tar."xz" to ."zst".
> 
>  I want to hear your thought about this.



Re: [idea]: Switch default compression from "xz" to "zstd" for .deb packages

2023-09-17 Thread M. Zhou
On Sun, 2023-09-17 at 22:16 +0200, Joerg Jaspert wrote:
> 
> I do not think wasting space is any good idea.
> 
> > ## More bandwidth
> >  According to https://www.speedtest.net/global-index, broadband 
> >  bandwidth
> >  in Nicaragua becomes almost 10x
> 
> And elsewhere it may have gone up a different factor.
> Still, there are MANY places where its still bad.

I fully agree. I forgot to mention this point in my other response
in the thread.

The thing is, while the average bandwidth does increase as time goes
by, the bottom-1% will likely not increase like that.

In many corners of the world, people are still using poor and
expensive networks. Those networks might even be metered.
Significantly bloating up the mirror size will directly increase
the metered network bill for those people. I was one of them.

It will also increase the pressure on mirror hosting.
Some universities host Linux mirrors on their own. Debian is always
the most bulky repository to mirror. They are not always commercially
supported -- sometimes supported only by volunteers' funds.
A significantly bloated Debian mirror can make their life more
difficult. Things can be worse if their upload bandwidth is
limited or even metered. I was one of such hosts.

I know, this is a difficult trade-off between Debian's accessibility
and software performance.

> >  And, xz cannot use multi core CPUs for decompression but zstd can.
> >  It means that if we use xz, we just get 2x CPU power, but change
> > to 
> >  zst,
> >  we can get more than 10x CPU power than 2012.
> 
> In ideal conditions, maybe, but still, thats the first (to me) good 
> reason. IMO not good enough to actually do it.
> 
> >   - Not sure how repository size will be, need an experiment
> 
> And keep in mind the repository is mirrored a trillion times, stored
> in
> snapshots, burned into images here and there, so any "small" increase
> in
> size actually means a *huge* waste in the end.
> 
> If we switch formats, going to something that's at least as good as
> the
> one we currently have should be the goal. (And I do not mean
> something
> like "its code/algorithm is bad", really, that argument is dead
> before
> it begins even).
> 
> Now, thats for .debs. There might be a better argument when talking
> about the index files in dists/, they are read so much more often, it
> *may* make more sense there.
> 



Bug#1055677: ITP: monaspace -- An innovative superfamily of fonts for code

2023-11-09 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org

* Package name: monaspace
* URL : https://github.com/githubnext/monaspace/tree/main
* License : OFL-1.1
  Programming Lang: N/A
  Description : An innovative superfamily of fonts for code

I'll maintain this under the fonts team. It is not easy to find a
good coding font. This one looks great.

The Monaspace type system: a monospaced type superfamily with some
modern tricks up its sleeves. It comprises five variable-axis
typefaces. Each one has a distinct voice, but they are all metrics-
compatible with one another, allowing you to mix and match them for a
more expressive typographical palette.

Thank you for using reportbug



Re: RFC: advise against using Proton Mail for Debian work?

2023-11-14 Thread M. Zhou
On Tue, 2023-11-14 at 18:40 -0500, Nicholas D Steeves wrote:
> 
> I see three outcomes:
> 
> A) Continue to explain this to new contributors on a one-by-one
> basis.
> B) Advise against using Proton Mail for Debian work (where?  our
> wiki?)
> C) Proton Mail begins to do something differently on their end, such
> as
> offering some features to Debian contributors that currently require
> a
> subscription.

Instead, is it possible to make the BTS reject encrypted (unreadable)
mails? That way no particular service name has to be mentioned,
and Debian members will be able to figure out that their mail service
has an issue.

Discouraging the use of a certain service publicly just
because of some technical issues (instead of ethical/moral/legal ones)
is very likely problematic for us as an organization.
Plus, the 5th clause of the DFSG is "no discrimination against
persons or groups". People outside the community will mock us
if we ban it like this.

I just felt some sort of resemblance to "XXX operating system
is evil, do not use it". Whether or not people agree with it
personally, it is inappropriate to make such an opinion official.



DebGPT: how LLM can help debian development? demo available.

2024-01-02 Thread M. Zhou
Hi folks,

Following what has been discussed in d-project in an offtopic
subthread, I prepared some demos of imagined use cases that
leverage LLMs to help Debian development.
https://salsa.debian.org/deeplearning-team/debgpt

To run the demo, the minimum requirement is a CUDA GPU with
more than 6GB of memory. You can run it on a CPU of course, but
that will require more than 64GB of RAM, and it may take more than
20 minutes to give you a reply (I tested this using a Xeon Gold 6140).
A longer context will require more memory.

If you cannot run the demo, I have also provided a couple of example
sessions. You can use `replay.py` to replay my LLM sessions
to figure out how it works.

Installation and setup guide can be found in docs/.

First start the LLM inference backend:
$ debgpt backend --device cuda --precision 4bit

Then you can launch the frontend to interact with it.
The complete list of potential use cases is listed
in demo.sh . I have recorded my session as an example for
every single command inside.

The following are some selected example use cases
(the results are not always perfect; you can ask the LLM to retry, though):

1. let LLM read policy section 4.9.1 and implement "nocheck"
   support in pytorch/debian/rules

   command: debgpt x -f examples/pytorch/debian/rules --policy 4.9.1 free -i
   replay: python3 replay.py examples/84d5a49c-8436-4970-9955-d14592ef1de1.json

2. let LLM add armhf, and delete kfreebsd-amd64 from archlist
   in pytorch/debian/control

   command: debgpt x -f examples/pytorch/debian/control free -i
   replay: python3 replay.py examples/e98f8167-be4d-4c27-bc49-ac4b5411258f.json

3. I always forget which distribution I should target when
   uploading to stable. Is it bookworm? bookworm-pu? bookworm-updates?
   bookworm-proposed-updates? We let the LLM read devref section 5.1 and
   let it answer the question.

   command: debgpt devref -s 5.5 free -i
   replay: python3 replay.py examples/6bc35248-ffe7-4bc3-93a2-0298cf45dbae.json

4. Let LLM explain the difference among proposals in
   vote/2023/vote_002 .

   command: debgpt vote -s 2023/vote_002 diff
   replay: python3 replay.py examples/bab71c6f-1102-41ed-831b-897c80e3acfb.json

   Note, this might be sensitive. I added a big red warning to the program
   if you ask the LLM about vote questions. Do not let the LLM affect your vote.

5. Mimic licensecheck. The licensecheck perl implementation is based on regex.
   It has a small knowledge base, and does not work when the text is very noisy.

   command: debgpt file -f debgpt/llm.py licensecheck -i
   replay: python3 replay.py examples/c7e40063-003e-4b04-b481-27943d1ad93f.json

6. My email is too long and you don't want to read it. The LLM can summarize it.

   command: debgpt ml -u 
'https://lists.debian.org/debian-project/2023/12/msg00029.html' summary -i
   replay: python3 replay.py examples/95e9759b-1b67-49d4-854a-2dedfff07640.json


7. General chat with llm without any additional information.

   command: debgpt none -i
   replay: python3 replay.py examples/da737d4c-2e93-4962-a685-2a0396d7affb.json

The core idea of all those sub-functionalities is the same:
gather some task-specific information, and send it all
together to the LLM.
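As a minimal sketch of that idea (not DebGPT's actual code; the
OpenAI-compatible client and the model name are illustrative stand-ins):

from openai import OpenAI

context = open("debian/rules").read()   # some task-specific information
question = "Explain what this debian/rules file does."

client = OpenAI()                       # reads OPENAI_API_KEY from env
reply = client.chat.completions.create(
    model="gpt-4",                      # illustrative
    messages=[{"role": "user", "content": f"{context}\n\n{question}"}])
print(reply.choices[0].message.content)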

I feel the state-of-the-art LLMs are much better than
they were a few months ago. I'll leave it to the community to
evaluate how LLMs can help Debian development, as well as how
useful and how reliable they are.

You can also tell me more ideas on how we can interact with LLMs
for Debian-specific tasks. It is generally not difficult to
implement. The difficulty stems from the hardware capacity, and
hence the context length. Thus, the client program has to fetch
the most relevant information for the task.

What do you think?



Bug#1060113: ITP: debgpt -- Chatting LLM with Debian-Specific Knowledge

2024-01-05 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org

* Package name: debgpt
  Version : ? (CLI not yet stabilized)
  Upstream Contact: me
* URL : https://salsa.debian.org/deeplearning-team/debgpt
* License : MIT/Expat
  Programming Lang: python
  Description : Chatting LLM with Debian-Specific Knowledge

This tool is still under development. I may not upload it very soon,
but an ITP number helps me silence lintian. I will not upload until
I finish the CLI re-design and the documentation.

There are some interesting findings while experimenting. For instance,
I find this rather convenient:

$ debgpt -HQ --cmd 'git diff --staged' -A 'Briefly describe the change as a git 
commit message.'

So I further wrapped the git commit command into

$ debgpt git commit

which automatically generates a description for the staged changes and
commits them for you.
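A minimal sketch of what such a wrapper might do internally (not the
actual debgpt implementation; ask_llm is a hypothetical helper for
whatever backend is configured):

import subprocess

def git_commit_with_llm(ask_llm):
    # collect the staged diff, ask the LLM for a message, then commit
    diff = subprocess.run(["git", "diff", "--staged"],
                          capture_output=True, text=True, check=True).stdout
    msg = ask_llm("Briefly describe the change as a git commit message:\n"
                  + diff)
    subprocess.run(["git", "commit", "-m", msg], check=True)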

Currently, some of the code of debgpt is written by debgpt, and some of
the git commit messages are written by `debgpt git commit`. I will
try to explore more possibilities and add them in future releases.


The only missing dependency before uploading this is src:python-openai,
which awaits in NEW at the time of writing.


The following is the full package description:

Large language models (LLMs) are newly emerged tools, which are capable of
handling tasks that traditional software could never achieve, such as writing
code based on the specification provided by the user. In this tool, we
attempt to experiment and explore the possibility of leveraging LLMs to aid
Debian development, to any extent.

Essentially, the idea of this tool is to gather some pieces of
Debian-specific knowledge, combine them together in a prompt, and then send
them all to the LLM. This tool provides convenient functionality for
automatically retrieving information from BTS, buildd, Debian Policy, system
manual pages, tldr manuals, Debian Developer References, etc. It also provides
convenient wrappers for external tools such as git, where debgpt can
automatically generate the git commit message and commit the changes for you.

This tool supports multiple frontends, including OpenAI and ZMQ.
The ZMQ frontend/backend are provided in this tool to make it self-contained.



Thank you for using reportbug



Disabling automatic upgrades on Sid by default?

2020-12-26 Thread M. Zhou
Hi folks,

I don't quite understand the point of automatic upgrades on a rolling
system such as Debian/Sid. In my own experience, such
automatic upgrades can be dangerous.

Recently the package ppp was pending an upgrade, but it could not
co-exist with my installed network-manager. Today, when I was shutting
down my machine, GNOME automatically checked the "install updates ..."
box before I realized it existed. As a result, the system rebooted and
installed ppp by force, removing network-manager and breaking my system
for daily use, as I need network-manager for wifi access.

I've been a daily Sid user for at least 4 years. Automatic upgrades are
to blame for nearly all my system troubles. And I feel very
disappointed every time linux behaves like M$ windows.

So, do we have a consensus on whether automatic upgrades should be
enabled by default?



Re: Tools to better colorize the command-line experience?

2021-01-07 Thread M. Zhou
Hi Otto,

I rely on such helper utilities. Here I can list some recommendations.

On Thu, 2021-01-07 at 23:02 +0200, Otto Kekäläinen wrote:
> Do you have any tips about generic command-line coloring programs?

1. [diff-highlight] highlights .diff/.patch files

   located in the package
   git: /usr/share/doc/git/contrib/diff-highlight
   (there was once a discussion about it on -devel)

2. [grc] generic colouriser for everything

   can colorize the output of many commands such as
   ip, dig, ifconfig, findmnt, mount, ps, etc.

3. [python3-pygments] highlights source code
   (see the sketch after this list)

4. for highlighting the shell command line itself, there is
   [fish] for out-of-the-box shell syntax highlighting, and
   the alternative [zsh-syntax-highlighting] for zsh. I don't
   know whether there is such a thing for bash.
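As an aside on item 3, pygments can also be driven programmatically; a
minimal sketch that prints ANSI-colorized Python source to the terminal:

from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import TerminalFormatter

code = 'print("hello, world")\n'
print(highlight(code, PythonLexer(), TerminalFormatter()))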



Re: ZFS 2.0 update - buster

2021-02-23 Thread M. Zhou
Hi Daniel,

On Tue, 2021-02-23 at 12:23 +0100, Daniel Garcia Sanchez wrote:
> Yesterday the backports ZFS package was updated to 2.0. I have a
> machine using ZFS as the root filesystem. After the update the
> machine was not able to boot. I think it was the ZFS update that
> caused the problem.

Thanks for the bug report. I've opened a bug against ZFS, and
further discussions can be redirected to 
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983398

Could you please briefly describe the configuration of your
ZFS so the others can try to reproduce the problem?

And is there any existing bug similar to the one you
have encountered? See:
https://bugs.debian.org/cgi-bin/pkgreport.cgi?src=zfs-linux

(Please drop debian-devel in the follow-ups).



Re: Recalling Key Points of the Previous Attempt (was: Re: Regarding the new "Debian User Repository"

2021-07-05 Thread M. Zhou
On Mon, 2021-07-05 at 02:09 +, Paul Wise wrote:
> On Sun, Jul 4, 2021 at 11:20 AM Mo Zhou wrote:
> > (2) use the "hardware capabilities" feature of ld.so(8)
> ...
> > Solution (2) will result in very bulky binary packages;
> 
> Solution (2) seems like the only option that can be done entirely
> within Debian, how bulky are the packages that use this?


For example, the latest tensorflow packaging (in NEW queue)
results in these binary debs:
(thanks to Wookey, Michael, and Andreas for their work!)

~/Debian/tensorflow $ ls -lh *.deb
-rw-r--r-- 1 root root  36M Jul  5 15:25 libtensorflow-cc2_2.3.1-1_amd64.deb
-rw-r--r-- 1 root root 1.7G Jul  5 15:25 libtensorflow-cc2-dbgsym_2.3.1-1_amd64.deb
-rw-r--r-- 1 root root 629K Jul  5 15:25 libtensorflow-dev_2.3.1-1_amd64.deb
-rw-r--r-- 1 root root 5.6M Jul  5 15:25 libtensorflow-framework2_2.3.1-1_amd64.deb
-rw-r--r-- 1 root root 128M Jul  5 15:25 libtensorflow-framework2-dbgsym_2.3.1-1_amd64.deb

Supporting multiple ISA variants based on ld.so means multiplying
the current package size.



Re: ROCm installation

2022-01-12 Thread M. Zhou
Hi,

Thanks for the updates.

On Wed, 2022-01-12 at 18:14 +0100, Maxime Chambonnet wrote:
"Native" Debian packages are starting to cover a significant portion of
the
stack [2], and it would be great to figure out the installation topic

The word "native" is ambiguous to a portion of developers as it may
also refer a native (debian/source/format) package.
For other readers: it's "offician debian package" in contrast to
"third-party debian packages by upstream.


> on how to install ROCm today.

After skimming through the mail, I realize what you actually meant
is the "ROCm file installation layout", right?

> The installation options and paths generally looked for by CMake
> Lists/configs are currently:
> - various cmake project-specific flags for the install paths of the
>   components: HIP_CLANG_PATH, HIP_DEVICE_LIB_PATH, HIP_PATH,
>   ROCM_PATH, ... see [5]

Headers and libraries should be installed under the standard paths,
so that the compiler and linker can find them without additional
flags. Just installing everything to /usr should be enough.

> - /opt/rocm as a default backup

There is no way to use `/opt` for an official Debian package. If any
component breaks without a specific file under /opt, then that is a
bug to fix.


> I see at least three choices, and sub-decisions to be made:
> - Multi-arch or not
>   nvidia toolkit supports aarch64 and a few others.
>   Cross-compiling ROCm from Debian could be interesting in a
>   near-future.

The ROCm libraries and binary executables are architecture-dependent.
Most of them should have Architecture: any in d/control (a hypothetical
stanza is sketched below).

Cross-compiling ROCm is not something worth looking at, IMHO.
ROCm targets high performance computing. A hardware architecture
really capable of "high performance computing" can't be too weak
to compile ROCm itself.

That said, making the installation layout Multi-Arch aware is a
good practice. Most of the packages may have Multi-Arch: same
as long as they contain architecture-dependent files.
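For instance, such a stanza (the package name is purely illustrative)
could look like:

Package: librocblas0
Architecture: any
Multi-Arch: same
Depends: ${misc:Depends}, ${shlibs:Depends}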

> - Nested or not
>   Other stacks and relatively important projects, such as postgresql
>   or llvm, go nested (there is a central /usr/lib/{llvm-13,
>   postgresql} directory, often with a sub ./bin, ...)

I did not understand this question. Do you mean something like
/usr/lib/rocm-{4.5.2,5.0.0},
or
/usr/lib/rocm-4.5.2/llvm ?

> - Where to install machine-readable GPU code
>   There are at least 3 types of device-side (aka GPU) binary files:
>   .bc for bitcode,
>   .hsaco for HSA code object, and
>   .co for code object.

How are these files read by ROCm? Is there anything like
"PYTHONPATH" for the GPU code files? We should choose a
supported path compatible with Debian Policy.

BTW, are these files architecture-independent? Namely,
can arm64 and amd64 produce exactly the same (e.g.
md5sum-identical) output?

>   Bitcode files are the machine-readable form of the LLVM intermediate
>   representation. HSA (Heterogeneous System Architecture) and other
>   code object files are AMD containers for GPU machine code.
>   PostgreSQL does use llvm bitcode files: since the install path is
>   nested, they are in /usr/lib/postgresql/14/lib/bitcode.
>   Since it is arch-independent in the sense of the CPU architecture,
>   it has been proposed to me that such code should reside in
>   /usr/share.

The nested layout for llvm and postgresql is intended to allow multiple
versions of the software to co-exist on the same system. For example,
llvm-{11,12,13} may be installed simultaneously on Debian.

We, the Debian ROCm team, do not have enough contributors to support
multiple versions. Let's just do it in the simplest way we can.

The officially repacked nvidia-cuda-toolkit does not use such a nested
layout either.

> What I tried to keep in mind is that:
> - shared libraries should be easily discoverable in paths looked up
>   by /etc/ld.so.conf
> - there are only so many paths that cmake find_package in config mode
>   looks for [8].

Shared objects from Multi-Arch aware library packages should be
put in /usr/lib/<multiarch-triplet>/ as long as they are intended
for public usage.

Don't be misled by complicated setups such as llvm, postgresql or
the upstream's non-standard installation path. In the standard setup,
everything is likely to become simpler. If you have started to think
about ld.so.conf for a regular official Debian shlib package, I
suspect something has already gone wrong.

Gentoo has basically finished their ROCm packaging. Feel free to
borrow from them as their license permits.

> I attached as an image a direct comparison between some arbitrary
> combinations of these decisions. The directories are bundled in the
> attached archive too.
> - install_layout_proposal_v1 goes
>   multi-arch, flattened, and with GPU code in /usr/share
> - install_layout_proposal_v2 goes
>   "ante-multi-arch", nested, and with GPU code in /usr/lib

1. headers

The installation path of architecture-dependent headers should contain
the multi-arch triplet (e.g. x86_64-linux-gnu). In this case:
Architecture: any, Multi-Arch: same

If the headers are identical across all architectures, the mu

Re: ROCm installation

2022-01-12 Thread M. Zhou
On Wed, 2022-01-12 at 21:06 +0100, Maxime Chambonnet wrote:
> 
> > Headers and libraries should installed under the standard path,
> > so that the compiler and linker should be able to find them without
> > additional flags. Just install all stuff to /usr should be enough.
> Currently for example rocm-hipamd installs to /usr/hip, and
> lintian yells a lot. All to /usr is quite not clear enough.

Then it sounds like the upstream CMake installation targets
are primarily written for somewhere like /opt instead of /usr.
I looked into one Gentoo ebuild for ROCm and the problem is
rather evident.

https://github.com/gentoo/gentoo/blob/2ed748a3b6412f99bc249e089e9221e38417a8f8/dev-util/hip/hip-4.1.0.ebuild

If shlibs are installed to somewhere like /usr/lib/rocm/lib/,
we are still able to tamper with ld.so.conf.
If binary executables are installed to /usr/lib/rocm/bin/,
then we are screwing up the default shell PATH.
This is a dead end, because we are not going to patch all
POSIX and non-POSIX shell configs. Nor will we introduce weird
scripts for the user to source.

Standardizing the upstream install target is inevitable
to some extent.
A flag can be introduced to the upstream cmake files along
with some code, which by default installs things to /usr/local
like most other existing software.

> 
> > 
> > I did not understand this question. Do you mean something like
> > /usr/lib/rocm-{4.5.2,5.0.0},
> > or
> > /usr/lib/rocm-4.5.2/llvm ?
> Rather the first, not sure I see a difference, in all cases, it looks
> nested under "rocm-something" to me. And we further down agree
> that nested is probably not the way.

Yes. We should just stay away from nesting things.

> > How are these files read by ROCm? Is there anything like
> > "PYTHONPATH" for the gpu code files? We should choose a
> > supported path compatible to debian policy.
> There is a cmake flag / environment variable for now,
> HIP_DEVICE_LIB_PATH :<
> The current preferred layout is /usr/amdgcn/*.bc

Anything like
/usr/share/amdgcn/ (in case they are arch-indep)
or
[/usr/lib/amdgcn, /var/lib/amdgcn, /var/cache/amdgcn]
(in case they are arch-dep) could be better.

> > BTW, are these files architecture-independent? Namely,
> > can arm64 and amd64 produce the exactly the same (e.g.
> > md5sum-identical) output?
> I don't know, we discussed it last jitsi meeting and
> I believe that no one tried yet :)

Then we regard them as architecture-dependent for initial
debian packaging.

I looked around in the Gentoo ebuild repository,
https://github.com/gentoo/gentoo/search?q=hip&type=commits
https://github.com/gentoo/gentoo/search?q=rocm&type=commits
from which we can borrow a lot. Namely, starting from
scratch by ourselves is not necessary.
> 
> 



Lottery NEW queue (Re: Are libraries with bumped SONAME subject of inspection of ftpmaster or not

2022-01-21 Thread M. Zhou
Hi Andreas,

Thank you for mentioning this. Your post inspired me to came up a
new choice.

On Fri, 2022-01-21 at 11:33 +0100, Andreas Tille wrote:
> 
> This recently happed for me in the case of onetbb (which was not
> uploaded by myself - so I'm not even asking for myself while other
> packages of mine (new ones and ones with just renames) are waiting in
> the queue.)  There are lots of other packages (namely numba and lots
> of
> other packages depending from numba that FTBFS) depending from onetbb
> -
> thus I pinged on #debian-ftp more than once (which I really hate).

When I was once an ftp-trainee (I'm no longer in the ftp-team), there was
no popcon query when doing dak review. Whether a package is of high
priority depends on one's experience to some extent. Even when I knew
some packages must have a high popcon, their bulky size made the
review process (checking files one by one) not quite easy.

> Due to this I'd like to have a clear statement here (which would
> prove myself in pinging either right or wrong and thus I'm asking):
> 
>   A. Packages with new binary package names should not undergo
>  the copyright inspection by ftpmaster and can be
>  auto-approved (either technically or manually)
> 
>   B. Any package in the new queue needs to be inspected by
>  ftpmaster for copyright issues and it takes as long as
>  it takes no matter whether it is a new package or a
>  package with changed binary name.
> 

I'd rather propose choice C, because I to some extent understand
both sides who support either A or B. I maintain bulky C++ packages,
and I also have a little experience reviewing packages on behalf of
the ftp-team.

A -- Some (e.g. C++) packages may frequently enter the NEW queue,
with an OLD src and NEW bins (e.g. a SOVERSION bump). A portion of devs
feel this frequency is not necessary, because it drastically
slows down the maintainer's work. In the worst case, when the
package finally passes the NEW queue, the maintainer may have
to go through the NEW queue again upon the next upload. (This is very
likely to happen for tensorflow, etc.)

B -- Uploads with OLD src and OLD bins are not required to go through
the NEW queue, even if a decade has passed, as long as the src names and
bin names are kept unchanged. One of the arguments for B is that
the d/copyright file may silently rot (get outdated), as uploads
without an updated d/copyright won't be rejected. Checking packages
when they bump SOVERSION is, to some extent, a "periodical" check.
This works very well for packages with a stable ABI. But for packages
without a stable ABI, especially bulky (C++) packages, this is
painful for both uploaders and ftp checkers.

Given the understanding of both options, I propose choice C:

C. Lottery NEW queue:

if (src is new):
    # completely new package
    require manual-review
elif (src is old) and (bin is new):
    if not-checked-since-last-stable-release:
        # approximate advantage of choice B.
        require manual-review
    elif src.version already exists in archive:
        # choice A wants to avoid this case.
        auto-accept
    else:
        if (lottery := random()) < threshold:
            require manual-review
        else:
            # I expect a faster pace of debian development.
            auto-accept
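For concreteness, the same logic as a runnable sketch, with the package
state modeled as plain booleans (all names and the threshold value are
illustrative):

import random

def new_queue_decision(src_is_new, bin_is_new, checked_since_last_stable,
                       src_version_in_archive, threshold=0.5):
    if src_is_new:
        return "manual-review"      # completely new package
    if bin_is_new:
        if not checked_since_last_stable:
            return "manual-review"  # approximate advantage of choice B
        if src_version_in_archive:
            return "auto-accept"    # the case choice A wants to avoid
        if random.random() < threshold:
            return "manual-review"  # the lottery
    return "auto-accept"            # old src, old bin, or lottery passed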

In this way, the concerns of people supporting both A and B can be partly
addressed. The old-src-new-bin case has a large chance to pass NEW
as long as it has been reviewed once after the last stable release.
The burden on the ftp-team can be reduced. And the pace of maintainers
can be faster, with less chance of being blocked and unable to do
anything but wait.

> I would love to have this clearly documented since in case B. I would
> stop wasting my and ftpmaster time with nagging which is not
> rectified
> than.
> 
> I personally clearly prefer A. and I wish we could clarify this
> situation.

Me too. I prefer A personally as well. Debian might be the only major
distribution which checks licenses so thoroughly. Unconditionally
allowing an old-src-new-bin upload to pass is basically impossible, I
speculate. Choice C might be more practical and feasible.

There must be many outdated d/copyright files in our archive. Letting
eligible uploads pass automatically with a high probability is not
likely to cause problems, even if yet another outdated d/copyright
sneaks in.

> Kind regards
> 
>  Andreas.
> 
> [1] https://lists.debian.org/debian-devel/2021/07/msg00231.html
> [2] https://ftp-master.debian.org/new/onetbb_2021.4.0-1~exp1.html
> 




Re: Legal advice regarding the NEW queue

2022-02-02 Thread M. Zhou
On Wed, 2022-02-02 at 13:44 +0100, Andreas Tille wrote:
> Hi Wookey,
> 
> Am Tue, Feb 01, 2022 at 02:07:21PM + schrieb Wookey:
> > Has anyone on the actual FTP team responded to this thread yet?
> > (sorry, I can't remember who that is currently)
> > 
> > Either on Andreas's original simple question: 'Do we still _have_
> > to keep the binary-NEW thing?'
> > Or this more complex question: Is NEW really giving us a pain:risk
> > ratio that is appropriate?
> > 
> > Andreas tried hard to get someone to just stick to the first matter
> > and answer that. I don't recall seeing an answer from FTP-master
> > yet?
> 
> Me neither.  In my eyes its a problem that it is hard to comminicate
> with ftpmaster team.  I tried on IRC as well but I prefer mailing
> list
> since this is recorded online.
> 

Without an answer from the FTP team, it's hard to get anywhere
constructive with respect to this problem. They have the most accurate
understanding of which parts need to be improved or revised.

Of course I can propose ideas based on my shallow experience and
speculation, but ideas in this thread are largely going to sink
unless there is an effective way to make real progress on them.



Re: Rakudo has a transition tracker and then what ?

2022-02-03 Thread M. Zhou
@dod: It looks like we have to change the Architecture: of the raku-*
packages to any (instead of "all"), because there are no binNMUs
for Arch:all packages. Then each time we bump the rakudo API
version, we just need to file a regular transition bug with the
release team and trigger the rebuilds.

On Thu, 2022-02-03 at 19:13 +0100, Jonas Smedegaard wrote:
> Quoting Paul Gevers (2022-02-03 19:08:34)
> > On 03-02-2022 18:53, Dominique Dumont wrote:
> > > Hoping to automate this process, I've setup a transition tracker
> > > for Rakudo
> > > [1].
> > 
> > See
> > https://lists.debian.org/debian-release/2022/02/msg00029.html and 
> > follow-up messages.
> 
> As I understand it, librust-* packages are released as arch:any (not 
> arch:all) for this exact reason (I seem to recall some discussion
> with 
> the ftpmaster and/or release team about that - evidently leading to
> that 
> praxis being tolerated, except I am totally wrong and the cause for
> Rust 
> is a different one).
> 
> 
>  - Jonas
> 




Re: Rakudo has a transition tracker and then what ?

2022-02-04 Thread M. Zhou
Hi Sebastian,

Building the files upon installation is exactly the original
behavior. The problem is that the compilation speed is too slow.
Three raku packages could take more than 2 minutes to recompile
every time raku is upgraded to any version.

On Fri, 2022-02-04 at 13:55 +0100, Sebastian Ramacher wrote:
> On 2022-02-03 17:58:24, M. Zhou wrote:
> > @dod: It looks that we have to change the Architecture: of raku-*
> > packages into any (instead of "all") because there is no binnmu
> > for Arch:all package. Then upon each time we bump the rakudo API
> > version, we just need to file a regular transition bug to the
> > release team and trigger the rebuild.
> 
> If the pre-compiled files are like pyc files for Python, is there are
> a
> reason to not follow the same approach? That is, build the pre-
> compiled
> files on install.
> 
> Cheers
> 
> > 
> > On Thu, 2022-02-03 at 19:13 +0100, Jonas Smedegaard wrote:
> > > Quoting Paul Gevers (2022-02-03 19:08:34)
> > > > On 03-02-2022 18:53, Dominique Dumont wrote:
> > > > > Hoping to automate this process, I've setup a transition
> > > > > tracker
> > > > > for Rakudo
> > > > > [1].
> > > > 
> > > > See
> > > > https://lists.debian.org/debian-release/2022/02/msg00029.html a
> > > > nd 
> > > > follow-up messages.
> > > 
> > > As I understand it, librust-* packages are released as arch:any
> > > (not 
> > > arch:all) for this exact reason (I seem to recall some discussion
> > > with 
> > > the ftpmaster and/or release team about that - evidently leading
> > > to
> > > that 
> > > praxis being tolerated, except I am totally wrong and the cause
> > > for
> > > Rust 
> > > is a different one).
> > > 
> > > 
> > >  - Jonas
> > > 
> > 
> > 
> 




Re: Bug#1006885: ITP: lumin -- pattern match highlighter

2022-03-08 Thread M. Zhou
Meh... an interesting package name.

On Mon, 2022-03-07 at 20:29 +0100, Vincent Bernat wrote:
>  ❦  7 March 2022 18:33 +01, Adam Borowski:
> 
> > > lumin highlights matches to a specified pattern (string or
> > > regular
> > > expression) in files, using color. This is similar to grep with
> > > colorized output, but it outputs all lines in the given files,
> > > not
> > > only matching lines.
> > 
> > .--[ ~/bin/hl ]
> > #!/bin/sh
> > sed 's/'"$*"'/\c[[1;33m&\c[[0m/g'
> > `
> 
> grep --color -C4000 pattern
> 
> There are other suggestions here:
> https://stackoverflow.com/questions/981601/colorized-grep-viewing-the-entire-file-with-highlighted-matches




Re: isa-support -- exit strategy?

2022-03-25 Thread M. Zhou
Hi Adam,

I think the problems that apt/dpkg
are trying to deal with is already complicated enough, and
the architecture specific code are still not significant
enough to introduce change there.

Indeed supporting number crunching programs on ancient
hardware is not meaningful, but the demand on Debian's
support for number crunching is not that strong according
to my years of observation.

For popular applications that can take advantage of above-baseline
instruction sets, they will eventually write the dynamic code
dispatcher and add the fallback.
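Purely illustrative, the runtime-dispatch pattern sketched in Python
(real dispatchers do this in C/C++ at library load time; dot_avx2 here
is just a stand-in for a vectorized kernel):

def _cpu_flags():
    # parse the "flags" line of /proc/cpuinfo (Linux-specific)
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

def dot_baseline(a, b):
    return sum(x * y for x, y in zip(a, b))

def dot_avx2(a, b):
    return dot_baseline(a, b)  # stand-in for an AVX2 kernel

# pick the best implementation once, at load time
dot = dot_avx2 if "avx2" in _cpu_flags() else dot_baseline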

For applications that seriously need performance, they will
leave the CPU and go to GPUs or other hardware. If the user writes
the code correctly and fully leverages the GPU, the non-optimal CPU
code won't necessarily be a bottleneck.

For applications that seriously need CPU performance, upstream is
likely to tell the users how to tweak compilation parameters and how
to compile locally.

Eventually, my thoughts about above-baseline support still point to
either source-based package distribution like portage, or a
small deb repository built with a customized dpkg-dev, as I
mentioned in the past.

On Fri, 2022-03-25 at 23:34 +0100, Adam Borowski wrote:
> Hi!
> While packages are allowed to not support entire architectures
> outright, there's a problem when some code requires a feature that is
> not present in the arch's baseline.  Effectively, this punishes an
> arch
> for keeping compatibility.  The package's maintainers are then
> required
> to conform to the baseline even when this requires a significant work
> and/or is a pointless exercise (eg.  scientific number-crunching code
> makes no sense to run on a 2002 box).
> 
> With that in mind, in 2017 I added "isa-support" which implements
> install-time checks via a dependency.  Alas, this doesn't work as
> well
> as it should:
> 
> * new installs fail quite late into installation process, leaving you
>   with a bunch of packages unpacked but unconfigured; some apt
>   frontends don't take this situation gracefully.
> 
> * upgrades when an existing package drops support for old hardware
> are
>   even worse.
> 
> * while a hard Depends: works for leafy packages, on a library it
>   disallows having alternate implementations that don't need the
>   library in question.  Eg, libvectorscan5 blocks a program that
>   uses it from just checking the regexes one by one.
> 
> Suggestions?
> 
> 
> Meow!




Re: isa-support -- exit strategy?

2022-03-26 Thread M. Zhou
On Sat, 2022-03-26 at 11:42 +0100, Stephan Lachnit wrote:
> On Sat, Mar 26, 2022 at 2:36 AM M. Zhou  wrote:
> > 
> > Indeed supporting number crunching programs on ancient
> > hardware is not meaningful, but the demand on Debian's
> > support for number crunching is not that strong according
> > to my years of observation.
> > 
> > For popular applications that can take advantage of above-baseline
> > instruction sets, they will eventually write the dynamic code
> > dispatcher and add the fallback.
> > 
> > For applications seriously need performance, they will
> > leave CPU and go to GPU or other hardware. If the user correctly
> > write the code and fully leverage GPU, the non-optimal CPU
> > code won't necessarily be a bottleneck.
> > 
> > For applications seriously need CPU performance, they are
> > possibly going to tell the users how to tweak compiling
> > parameters and how to compile locally.
> 
> I have to disagree on this one. Yes, runtime detection and GPU
> acceleration is great and all, but not every scientific library does
> it and I think it's unrealistic for us to patch them all up.

Please note I wrote "they (i.e. the upstream)" will implement
the runtime detection or GPU acceleration, instead of us (Debian).

> Also I don't like the point "since there is low demand for number
> crunching on Debian, so let's just continue to ignore this problem".

If it were 6 years ago, I would have disagreed with what I've said in
the original post. Whether you like it or not, what I said reflects my
changed mind after closely working on numerics-related libraries
for 6 years in Debian. And to be clear, I hold a negative opinion
on how much we in Debian could actually change besides the upstream.

If the upstream does not write runtime detection or GPU acceleration,
they are either not facing a wide range of audience, or the problem
does not matter, or the software simply isn't appropriate for
Debian packaging.

I have mentioned countless times that the eigen3 library, which
implements the core numerical computation part of TensorFlow, does not
support runtime detection -- because CPU acceleration does not matter
for most of the users. Sane users who really need CPU performance are
able to recompile tensorflow themselves.

> At least I know a decent amount of people that use Debian (or
> downstream distros) for scientific number crunching. Compiling
> optimized for large workloads will always be a thing no matter the
> baseline, but when getting started distro packages are just one less
> thing to care about.

I humbly believe over 1/3 of the packages I (co-)maintain for Debian
are for number crunching. And I INSIST on my NEGATIVE opinion after
trying some experiments over the years. The number of people
who really care about the ISA baseline of Debian-distributed packages
is very likely smaller than you expect.

I appreciate people who speak for the ISA baseline, and appreciate any
actual effort in this regard. But the lack of care eventually
changed my mind and made me hold a negative opinion.

If you think I was simply unsuccessful in promoting any solution for
the topics in this discussion, please go ahead and I will support
you in a visible way.

> On Sat, Mar 26, 2022 at 7:25 AM Andrey Rahmatullin  wrote:
> > 
> > A partial arch (whatever that is, yeah) with the x86-64-v3 baseline, and
> > optionally raise the main amd64 baseline to x86-64-v2?
> 
> +1

So again, that's possibly something like a partial Debian archive with
the dpkg fork I mentioned.
That's probably the same idea as the ancient SIMDebian proposal.
See the example patch for dpkg:
https://github.com/SIMDebian/dpkg/commit/13b062567ac58dd1fe5395fb003d6230fd99e7c1
With it, a partial archive with selected source packages can be
rebuilt automatically with a bumped ISA baseline.

To be clear, the fact that tensorflow does not support runtime detection
while the baseline code sucks in performance is the direct reason
why I proposed SIMDebian. The project is abandoned, and the patch is
only for reference.



Re: Debian doesn't have a "core team", should it? can it?

2022-04-10 Thread M. Zhou
Hi,

"Core team" is already ambiguous enough. I'd suggest leave it alone and do not
try to define it. Attempts to define it are likely lead to nowhere other than
a definition hell.  Unless there is such need in Debian constitution, I think
Debian should not try to do that.

The intention of that post is simply transferring some packages to more
suitable maintainers. As long as the new maintainer (rpm team as you mentioned)
feels suitable to take over, I don't see any problem here.

I'm even ok with non-debian member being maintainers for critical packages
as long as the work goes through some kind of peer review.

On Sun, 2022-04-10 at 21:23 +0100, Peter Michael Green wrote:
> Recently andreas-tille sent the following message about libzstd to 
> debian-devel
>  
> > I'd like to repeat that I'm really convinced that libzstd should *not*
> > be maintained in the Debian Med team but rather some core team in
> > Debian.  It is here for historic reasons but should have moved somewhere
> > more appropriately since it became its general importance.
>  
>  It ended up being transferred to the rpm team, which got it out of the med 
> team's
>  hair but I'm not convinced the rpm team satisfies "some core team" any better
>  than the med team does.
>  
>  As far as I can tell Debian has broadly 3 types of teams.
>  
>  1. Teams focussed on an application area, for example the med team, the 
> science team, the games team.
>  2. Teams focussed on a programming language, for example the python team, 
> the perl team, the rust team. There is
> however no such team for software written in C, C++ or shell script.
>  3. Teams focussed on a particular piece of software
>  
>  As far as I can tell this means that there are a bunch of packages that 
> "fall between the gaps", packages
>  that are of high importance to Debian as a whole but are not a great fit for 
> any team. They either end up not
> associated with a team at all or sometimes associated with a team who 
> happened to be the first to
>  use the library.
>  
>  I decided to get a sample of packages that could be considered "core", 
> obviously different people have different
> ideas of what should be considered core but I decided to do the following to 
> get a list of packages that hopefully
> most people would consider core.
>  
>  debootstrapped a new sid chroot
>  ran tasksel install standard (a bit less spartan than just the base system)
>  ran apt-get install build-essential (we are an opensource project, 
> development tools are essential to us)
>  ran apt-get install dctrl-tools (arguablly not core, but I needed it to run 
> the test commands and it's only one
> package)
>  
>  There were 355 packages installed, built from 223 source packages. I got a 
> list of source packages with
>  the command
>  
>  grep-dctrl installed /var/lib/dpkg/status -n -ssource:package | cut -d ' ' -f 1 | sort | uniq > sourcepks.txt
>  
>  I then extracted the source stanzas with.
>  
>  grep-dctrl -e -F Package `sed "s/.*/^&$/" sourcepks.txt | paste -s -d '|'` /var/lib/apt/lists/deb.debian.org_debian_dists_sid_main_source_Sources > sourcestanzas.txt
>  
>  Then wrote a little python script to extract teams from those stanzas.
>  
>  #!/usr/bin/python3
>  from debian import deb822
>  import collections
>  import sys
>  f = open(sys.argv[1])
>  counts = collections.defaultdict(int)
>  for source in deb822.Sources.iter_paragraphs(f):
>      maintainers = [source['maintainer']]
>      if 'uploaders' in source:
>          maintainers += source['uploaders'].split(',')
>      maintainers = [s.strip() for s in maintainers if s.strip() != '']
>      teams = [s for s in maintainers if ('team' in s.lower()) or ('lists' in s.lower()) or ('maintainers' in s.lower()) or ('group' in s.lower())]
>      teams.sort()
>      counts[tuple(teams)] += 1
>      #print(repr(maintainers))
>      #print(repr(teams))
>  
>  for teams, count in sorted(counts.items(), key=lambda x: x[1]):
>      if len(teams) == 0:
>          teamtext = 'no team'
>      else:
>          teamtext = ', '.join(teams)
>      print(str(count) + ' ' + teamtext)
>  
>  This confirms my suspicions: of the 223 source packages responsible
>  for the packages installed in my "reasonably but not super minimal"
>  environment, more than half were not associated with a team at all.
>  
>  I also saw a couple of packages in there maintained by the science team
>  and the med team. Two source packages, telnet and apt-listchanges,
>  were orphaned.
>  
>  I do not know what the solution is, whether a "core team" is a good idea
>  or even whether one is possible at all, but I feel this is an elephant
>  that should have some light shone on it.
>  



Is there room for improvement in our collaboration model?

2022-04-14 Thread M. Zhou
Hi,

I just noticed this problem recently. Our model for team collaboration
(specifically for package maintenance) is somewhat primitive.

We are volunteers. Nobody can continuously maintain a package for decades
like a machine. Currently our practice for accepting other people's help
involves:
(1) low-threshold NMU. This is not quite easy to look up (it only shows
on tracker.d.o, etc.)
(2) VAC note in the debian-private channel. Who remembers that you said
the others can help you upload a package? And when does that temporary
permission expire? What tracks that?
(3) salsa permission. Yes, after joining the salsa team, others can
commit code as they like. However, when it needs to be uploaded, the
others still have to write mail to the maintainer for an ack. Whether
multiple people should commit at the same time is coordinated through
emails in order to avoid duplicated work.
(4) last-minute NMU/delayed. When the others cannot bear an RC bug
anymore, they may want to do an NMU upload to the delayed queue.
(5) intent to salvage. When the others have not heard from me for a very
long time, this is the only formal way to take over maintenance (afaik).
The problems are:
(1) whether multiple people should work on the same package at the same
time is based on human communication. Namely, acquiring and releasing the
lock on a package is done through human communication. This is clearly
something that could be improved. Some program should acquire and release
the lock.
(2) different packages are clearly regarded differently by people.
I'm actually very open to other people hijacking some of my selected
packages and fixing them as they like. Namely, I think there should be a
system where we can optionally tag our packages as:

 A. The other DDs can do whatever they like to this package and upload
directly, in a hijacking way, without asking me.
 B. May git commit, but should ask before upload.
 C. Must ask before any action.
 D. ...

You know that in parallel programming, optimizing IPC (in this context it
would be inter-DD communication) and optimizing the locking mechanism
could help.

My motivation for pointing these out stems from some uncomfortable
feelings when it's hard to get a response from busy maintainers. If I
understand correctly, DDs technically have enough permission to hijack
any package and do the upload. People are polite and conservative enough
not to abuse that power. But ... in order to improve the contributor
experience in terms of collaboration ... maybe we can use that tagging
mechanism to formally allow a little bit of upload permission abuse.

I think this will also improve newcomers' contributing experience.
This proposal is also filed at
https://salsa.debian.org/debian/grow-your-ideas/-/issues/34



Re: Is there room for improvement in our collaboration model?

2022-04-15 Thread M. Zhou
On Fri, 2022-04-15 at 14:24 +0200, Luca Boccassi wrote:
> > 
> > I think this will also improve newcomer's contributing experience.
> > This proposal is also filed at
> > https://salsa.debian.org/debian/grow-your-ideas/-/issues/34
> 
> What about doing something even simpler - rather than having additional
> generic/core tags/teams/groups, for packages where one wants to say
> "please help yourself/ourselves", simply set the Maintainer field to
> debian-devel@lists.debian.org, have the Salsa repository in the debian/
> namespace, and that's it? From thereon, anyone can do team-uploads by
> pushing to salsa and uploading directly, no need for
> acks/delays/permissions/requests.
> 

A simpler solution sounds good to me, except that this change would be
somewhat "permanent" in stating the original maintainer's preference.
I forgot to mention in my original post that the tags could optionally
expire.

So, things like `tag all my packages as "feel free to NMU" within
the next two weeks` would be trackable.
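Purely illustrative, such an expiring tag could be modeled like this
(all names here are hypothetical):

from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum

class Policy(Enum):
    FREE_UPLOAD = "A"   # upload directly without asking
    ASK_UPLOAD = "B"    # may git commit, ask before upload
    ASK_FIRST = "C"     # must ask before any action

@dataclass
class PackageTag:
    package: str
    policy: Policy
    expires: datetime = None    # None means no expiry

    def active(self, now=None):
        now = now or datetime.utcnow()
        return self.expires is None or now < self.expires

# "tag all my packages as 'feel free to NMU' within the next two weeks"
tags = [PackageTag(pkg, Policy.FREE_UPLOAD,
                   datetime.utcnow() + timedelta(weeks=2))
        for pkg in ["pkg-a", "pkg-b"]]  # hypothetical package names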

BTW, setting debian-devel@ as the maintainer will result in a mail flood.
An alternative should be considered.



needs suggestion on LuaJit's IBM architecture dilemma

2022-05-11 Thread M. Zhou
Hi folks,

I learned, to my disappointment after becoming a LuaJit uploader, that
the LuaJit upstream behaves uncooperatively, especially towards IBM
architectures [1]. IIUC, upstream has no intention of caring
about IBM architectures (ppc64el, s390x).

The current ppc64el support in stable is done through a cherry-picked
out-of-tree patch. And I learned that the patch is no longer
functional[2] for newer snapshots if we step away from that
ancient 2.1.0~beta3 release.

However, architectures like amd64 need a relatively newer version[3],
while the IBM architectures still have demand for luajit[4] (only the
ancient version will possibly work on IBM archs).

I'm looking for suggestions on what to do next:

option 1:
  drop the IBM architectures that upstream cannot support
  from src:luajit, provide archs like amd64 with relatively
  newer snapshot versions[5],
  and package reliable alternatives (if any) for IBM archs.
option 2:
  use the latest source for the amd64 architecture, and roll back the
  source specifically for IBM architectures to keep it
  functional.
option 3:
  roll back to the ancient stable release and screw it
option 4:
  ...

Thanks.

[1] https://github.com/LuaJIT/LuaJIT/pull/140
[2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1004511
[3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=981808
[4] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1008858
[5] Yes ... upstream does not release anymore.



Bug#1011320: ITP: luajit2 -- OpenResty's Branch of LuaJIT 2

2022-05-19 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org

* Package name: luajit2
* URL : https://github.com/openresty/luajit2
* License : MIT/X
  Description : OpenResty's Branch of LuaJIT 2

I'm going to remove ppc64el support from src:luajit;
let's see, based on suggestions, whether this derivative works
for the IBM architectures.



Re: needs suggestion on LuaJit's IBM architecture dilemma

2022-05-19 Thread M. Zhou
Hi Dipak,

I filed an ITP bug for luajit2 and will look into it.
Thank you!

On Mon, 2022-05-16 at 16:22 +, Dipak Zope1 wrote:
> Hello all,
> It'd be better to switch to luajit2 if it is possible. We can see
> right now the main issue with luajit project is no response from
> upstream of LuaJIT to previous merge request attempts. And luajit2
> already contains almost everything needed for s390x support.
>  
> Thanks,
> -Dipak
>  



Re: needs suggestion on LuaJit's IBM architecture dilemma

2022-05-19 Thread M. Zhou
On Thu, 2022-05-19 at 16:30 +0200, Frédéric Bonnard wrote:
> Hi,
> 
> I've followed luajit closely since 2015 on ppc64el as a porter
> without enough knowledge to port it, but trying to ease on the
> packaging/Debian side (being both IBMer/DD).
> That port has been a mixed effort between a code bounty and an IBM
> effort (some devs) .
> It didn't started well (
> https://www.freelists.org/post/luajit/PPC64le-port-status,1 )
> and it has never grown and be really part of the upstream project
> sadly.
> 
> With the years, I'm even less optimistic as no IBM nor external
> developer seem to be working on that. Mike Pall seems to be around
> though as you said there's no release (not necessarily a bad sign).
> I can ping inside IBM but I'm not sure there will be any positive
> feedback.
> 
> So I'd say we have no choice, i.e. let's drop IBM arches .
> What I did a few times for packages depending on libluajit was to use
> liblua instead :
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892765
> 
> Thanks,
> F.

Nobody wants to spend time on a bottomless hole ...
I'll simply remove ppc64el architecture support from src:luajit,
and give src:luajit2 (openresty) a try.



Bug#1011460: ITP: gnome-shell-extension-flypie -- an innovative marking menu written as a GNOME Shell extension

2022-05-23 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org

* Package name: gnome-shell-extension-flypie
  Upstream Author : Simon Schneegans
* URL : https://github.com/Schneegans/Fly-Pie
* License : MIT/X
  Programming Lang: Javascript
  Description : an innovative marking menu written as a GNOME Shell 
extension

Generally I don't like packaging fancy stuff for Debian. But this one
literally surprised me, and brings back a very familiar feeling of
game menus designed for quick operation. If you are a gamer who
likes recent action games a lot, I think you may like this extension as well.

Basically, the marking menu pops up as a circle-shaped menu,
where the selection is made by the mouse position relative to
the circle center upon key release.

And it has got an achievement system like games... LOL



questionable massive auto-removal: buggy deps nvidia-graphics-drivers-tesla-470

2022-05-24 Thread M. Zhou
I wonder why an irrelevant package suddenly triggered the autoremoval
of a very large portion of packages from testing.

https://udd.debian.org/cgi-bin/autoremovals.cgi
I searched for the keyword nvidia-graphics-drivers-tesla-470, and got
68866 entries. There must be something wrong.
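(For reproducibility, the count above can be approximated from the command
line. This is a sketch and assumes the CGI emits plain greppable text, which
it appeared to do at the time:)

```
curl -s https://udd.debian.org/cgi-bin/autoremovals.cgi \
  | grep -c nvidia-graphics-drivers-tesla-470
```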

https://bugs.debian.org/cgi-bin/pkgreport.cgi?src=nvidia-graphics-drivers-tesla-470
Looking at the bug reports against the package itself, it does not
look like the glibc package is made broken by it.

Any idea?



Bug#1011667: ITP: mujoco -- A general purpose physics simulator.

2022-05-25 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org

* Package name: mujoco
  Version : 2.2.0
  Upstream Author : DeepMind
* URL : https://mujoco.org/
* License : Apache-2.0
  Programming Lang: C
  Description : A general purpose physics simulator.

I plan to maintain this under Debian Deep Learning Team.



Re: how to convey package porting details?

2022-06-05 Thread M. Zhou
I like this idea. I would personally recommend adding negative priorities
as well. You know, it is completely meaningless to port some high-performance
scientific computing software to archs like armel...

Meanwhile, different packages vary in porting difficulty as well.
Software that heavily leverages architecture-specific intrinsics is not
likely to be easy to port. Packages that do not require hardware performance,
and simply miss several arch-specific C macros, should be easy to port.

Since package maintainers should have skimmed the whole codebase at some point,
they could provide an accurate hint about this, as long as such hints
are useful for porters.
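For the sake of discussion, such hints could be expressed as source-package
metadata. The field names below are purely hypothetical -- nothing parses
them today:

```
# hypothetical fields in a debian/control source stanza
Porting-Required: yes
Porting-Priority: riscv64=5, armel=-1
Porting-Notes: the JIT backend needs an arch-specific port; the
 interpreter fallback only needs a few arch-specific C macros.
```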

On Mon, 2022-06-06 at 10:02 +0800, Paul Wise wrote:
> 
> There are lots of packages that need porting to every new architecture
> that comes along. There are others that don't require porting but
> benefit in some way from porting to some aspect of the architecture.
> 
> In order to help porters to prioritise those packages and quickly add
> support for their architecture, it would be nice to have a standard
> mechanism for source packages to indicate that they potentially require
> porting for each new architecture, the nature of the porting required
> and where in the source package the changes are required.
> 
> For example packages like autotools-dev, linux, gcc or linuxinfo could
> state that they require porting to all new arches and have a pointer to
> Debian and or upstream porting info. Packages containing a JIT but with
> interpreter fallback could list themselves as having optional porting. 
> 
> Once we have a mechanism for this, the documentation for creating new
> Debian ports could provide a way to find the list of packages that need
> porter attention; via codesearch.d.n, apt, apt-file, grep-available etc.
> 
> https://wiki.debian.org/PortsDocs/New
> 



Re: How to handle packages which build themselves with bazel?

2022-06-08 Thread M. Zhou
Hi David,

Debian has a group of people working on bazel packaging.
https://lists.debian.org/debian-bazel/2022/06/threads.html
And bazel itself has been a years-long pain for tensorflow packaging.

I'm not following the updates for bazel packaging, but you
may browse the packaging work of the corresponding team
to see whether there is anything you are interested in:
https://salsa.debian.org/bazel-team/bazel

On Wed, 2022-06-08 at 17:18 +0200, David Given wrote:
> I'm looking into converting some of my upstream packages to use Google's 
> bazel build system, because it makes life
> much easier as a developer.
> 
> Unfortunately, with my other hat on, it makes life much harder as a package 
> maintainer: bazel is very keen on
> downloading source packages and then building them locally, resulting in a 
> mostly-statically-linked executable.
> protobuf is the most obvious culprit here, because if you do anything with 
> Google's ecosystem you inevitably end up
> using protobufs, and as soon as you refer to a cc_proto_library rule in bazel 
> you get a statically linked libprotobuf.
> 
> Are there any known best practices yet in Debian on how to persuade bazel not 
> to do this, and to use the system one
> instead?
> 



Re: needs suggestion on LuaJit's IBM architecture dilemma

2022-06-09 Thread M. Zhou
From the build logs / test logs / local tests (ppc64el qemu), it seems that
there is no improvement at all for ppc64el. Simple scripts can still
encounter segmentation faults (e.g., the autopkgtest for src:lua-moses).
s390x is newly enabled; I have not yet seen enough test logs to draw
any preliminary conclusion.


On Thu, 2022-06-09 at 16:19 +0200, Frédéric Bonnard wrote:
> Hi Mo, Paul,
> did you see any improvement with luajit2 ?
> I was looking at luakit, which still fails "silently" on ppc64el, a lua
> script generating a .h with no symbols with luajit2, where it does work
> with lua.
> Also I see that the autopkgtest of knot-resolver still fails on
> ppc64el.
> 
> F.
> 
> On Thu, 19 May 2022 22:14:01 -0400 "M. Zhou"  wrote:
> > On Thu, 2022-05-19 at 16:30 +0200, Frédéric Bonnard wrote:
> > > Hi,
> > > 
> > > I've followed luajit closely since 2015 on ppc64el as a porter
> > > without enough knowledge to port it, but trying to ease on the
> > > packaging/Debian side (being both IBMer/DD).
> > > That port has been a mixed effort between a code bounty and an IBM
> > > effort (some devs) .
> > > It didn't started well (
> > > https://www.freelists.org/post/luajit/PPC64le-port-status,1 )
> > > and it has never grown and be really part of the upstream project
> > > sadly.
> > > 
> > > With the years, I'm even less optimistic as no IBM nor external
> > > developer seem to be working on that. Mike Pall seems to be around
> > > though as you said there's no release (not necessarily a bad sign).
> > > I can ping inside IBM but I'm not sure there will be any positive
> > > feedback.
> > > 
> > > So I'd say we have no choice, i.e. let's drop IBM arches .
> > > What I did a few times for packages depending on libluajit was to use
> > > liblua instead :
> > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892765
> > > 
> > > Thanks,
> > > F.
> > 
> > Nobody want to spend time on an bottomless hole ...
> > I'll simply remove ppc64el architecture support from src:luajit,
> > and give src:luajit2 (openresty) a try.
> > 



Bug#1014661: ITP: lodepng -- LodePNG is a PNG image decoder and encoder

2022-07-09 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org

* Package name: lodepng
  Version : git master
  Upstream Author : Lode Vandevenne
* URL : https://lodev.org/lodepng/
* License : Zlib
  Programming Lang: C
  Description : LodePNG is a PNG image decoder and encoder

More than one package of my interest has this dependency,
including the mujoco physics simulator.

Thank you for using reportbug



Re: Automatic trimming of changelogs in binary packages

2022-08-18 Thread M. Zhou
On Thu, 2022-08-18 at 21:18 +0200, Gioele Barabucci wrote:
> * The `--no-trim` option allows package maintainers that want to ship 
> the whole changelog a way to do so.
> 
> * The full changelogs are preserved in the source packages and thus 
> available via `apt changelog` and similar mechanisms.
> 
> Does anybody have objective objections against activating automatic 
> changelog trimming in binary packages?

Thank you for working on this.

My original concern about automatically trimming changelogs was the convenience
of debugging -- I sometimes need to search the full history to see
what happened to the package of interest in the past, or to quickly figure
out what change was made at what time.

Since `apt changelog` can still retrieve the full history, my concerns
are gone.



Comments on proposing NEW queue improvement (Re: Current NEW review process saps developer motivation

2022-08-26 Thread M. Zhou
To be honest, in terms of volunteer reviewing work, waiting
for several months is not something new. In academia, it may
take several months to years to get a response on a journal paper.

I have tried to think of possible ways to improve the process, but
several observations eventually changed my mind, and I'm willing
to accept the status quo.

* there is a trade-off between rigorousness and efficiency.
  Any change in the process may introduce disadvantages, and
  the most difficult thing is to reach an agreement.
* we will add more work for the ftp team if we get them involved in the
  discussion of possible (but uncertain) ways to improve NEW.

My ultimate opinion on NEW processing is neutral, and my only
hope for the ftp team is that it increases the pace of recruiting new members.
To be concrete, it is much harder to write a concrete proposal
for debian-vote@l.d.o than to discuss possibilities.

I understand we may have the enthusiasm to sprint on something.
However, in terms of the long-term endeavor of Debian development,
a negligible popcon number won't be less disappointing than
a long wait for the NEW queue to clear.

If one's enthusiasm for working on some package wears out
after a break, then try to think about the following questions:

  Is it really necessary to introduce XXX to Debian?
  Must I do this to have fun?

Strong motivations such as "I use this package, seriously" are not
likely to wear out easily over time. Packages maintained
with a strong motivation are among the better-cared-for packages in our
archive.

Why not calm down, and try to do something else as interesting
as Debian development while waiting for the NEW queue?
Or simply think of the NEW queue as a Debian holiday application.

I have realized these things over the years, and they are only my personal
opinion.


On Fri, 2022-08-26 at 09:18 +0200, Gard Spreemann wrote:

Jonas Smedegaard  writes:

> Quoting Gard Spreemann (2022-08-26 08:49:21)
> > On August 25, 2022 10:52:56 AM GMT+02:00, "Sebastian Dröge"
> >  wrote:
> > > PS: To preempt any questions as for why, the background for my
> > > decision
> > > to stop maintaining any packages is this thread, but it's really
> > > just
> > > the straw that broke the camel's back
> > >  
> > > https://alioth-lists.debian.net/pipermail/pkg-rust-maintainers/2022-August/022938.html
> > > 
> > 
> > A bit off-topic, but I think we really ought to discuss (address?)
> > this elephant in the room once more. I don't have the answers, but
> > Sebastian's email yet again clearly illustrates how the status quo
> > is hurting the project. This clear example comes in addition to
> > worries raised before about what the status quo does to recruitment
> > of new developers.
> > 
> > PS: I do not imply that the elephant in the room is the
> > ftpmasters. I'm thinking of the *process*. The people involved put
> > in admirable work in carrying out said process.
> 
> The way I see it, the process is clear: provide *source* to build
> from.
> 
> If there is "source" built from another source, then that other
> source
> is the true source.
> 
> If ftpmasters sometimes approve intermediary works as source, then
> that
> is not a reason to complain that they are inconsistent - it is a
> reason
> to acknowledge that ftpmasters try their best just as the rest of us,
> and that the true source is the true source regardless of misssing it
> sometimes.
> 
> Yes, this is painful.  Yes, upstreams sometimes consider us stupid to
> care about this.  Nothing new there, and not a reason to stop do it.
> 
> If you disagree, then please *elaborate* on what you find sensible -
> don't assume we all agree and you can only state that the process is
> an
> elephant.

Apologies, I should have been a lot clearer. I did not mean the exact
issue of what is the "true source" of something in a package. Rather, I
was referring to the process itself (looking in particular to the last
three paragraphs and the PS in Sebastian's linked email [1]). Whatever
the correct answer to what a "true source" is, in the current process,
a
developer has to make an attempt at doing the right thing, and then
wait
*weeks or possibly months* to know for sure whether it was OK. And if
it's deemed not OK, the reasoning may be entirely specific to the exact
package and situation at hand, and therefore extremely hard to
generalize and to learn from. (Do not construe the above as "ftpmasters
should work faster and give more lengthy reasoning!" – adding *more*
work to the process would make things even worse in my opinion.)

Although I maintain a very small number of packages, and ones that also
very rarely have to re-clear NEW, even I feel sapped of motivation from
the process. And I read Sebastian's email partly as an expression of
the
same thing (apologies if I misconstrue your views, Sebastian). I do
believe similar points of view have been aired on the list before by
others too.

As to your last point, elaborating on what I find sensible: I sadly
don't have a go

Re: Comments on proposing NEW queue improvement (Re: Current NEW review process saps developer motivation

2022-08-27 Thread M. Zhou
On Sat, 2022-08-27 at 09:50 +0200, Gard Spreemann wrote:
> 
> contributing work. In some sense, contributing to Debian becomes
> mostly
> about waiting. (Sure, there is something to be said about extremely
> short, fragmented attention spans being unhealthy – but some
> contributions are naturally short and easy, and we certainly don't
> want
> to drive those away.)

That's why I still hope the ftp team will recruit more people. This is
a very direct and constructive way to speed everything up.
More volunteers = higher bandwidth.
Recruiting more people does not seem to have any serious disadvantage.

In my fuzzy memory, the last discussion on NEW queue improvement
involved the disadvantages of allowing SOVERSION bumps to directly
pass the NEW queue. I'm not going to trace it back, because I know
this will not be implemented unless someone proposes a GR.

> > If one's enthusiasm on working on some package is eventually
> > worn out after a break, then try to think of the following
> > question:
> > 
> >   Is it really necessary to introduce XXX to Debian?
> 
> I hope we won't try to define what "necessary" means, or have it
> become
> a criterion for inclusion :-)
> 
> >   Must I do this to have fun?
> 
> I don't think Debian contribution has ever been a necessary condition
> for fun. That's an incredibly high bar. If we were only to attract
> people whose only idea of fun was contributing to Debian, I think
> we'd
> become a very unhealthy project (and one severely lacking in
> contributors).

For newcomers, a long wait wears out their interest, of course. I'm
not sure what would be the reason for a potential newcomer to reach
us if they do not find contributing to this project "fun/interesting",
or "worthwhile/useful".

For people who chose to stay in this community, there must be a
reason behind that choice, because I believe nobody can contribute
to a volunteer project without payment / fame / enjoyment.
Without such a high bar, the member list would be much more volatile.

> > Strong motivations such as "I use this package, seriously" are not
> > likely to wear out very easily through time. Packages maintained
> > with a strong motivation are better cared among all packages in our
> > archive.
> 
> I humbly disagree. Even from my own point of view, I may well be very
> motivated to package something I use seriously all the time,
> seriously. But then I see its dependency chain of 10 unpackaged
> items,
> start thinking about the probability that they'll *all* clear the NEW
> queue, and how long that would take, and I give up. And then there's
> the
> problem of attracting smaller contributions, as mentioned above: I
> really believe that people get put off from putting in 30 minutes of
> work for a nice MR on Salsa if they can't expect their work to hit
> the
> archives for months and months (suppose for example they contributed
> to
> a package whose SONAME is being bumped).

I agree with your disagreement but I keep my opinion. My track record
involves maintaining loads of reverse-dependency libraries.
I've already gone
through all kinds of pain from the NEW queue and eventually learned
to take a break immediately after uploading something to NEW.

That said, if someone presents a GR proposal I'll join. In Debian,
it is not that easy to push something forward unless it hurts everyone.
Our NEW queue mechanism has been there for decades, and people are
already accustomed to it (including me). From multiple rounds of
discussion in the past, I don't see the NEW queue problem hurting
enough people. If nothing gets changed in the NEW queue mechanism,
people will gradually get used to it, following the "if it ain't broke,
don't fix it" rule. The voices will gradually vanish.

This is surely unhealthy. But as an individual developer I don't see
many feasible ways to push things forward unless someone figures out
a reason that makes as many people as possible feel hurt by it.

> > Why not calm down, and try to do something else as interesting
> > as Debian development when waiting for the NEW queue?
> 
> Sure. That's what I do. My list of joyful and less joyful things to
> fill
> my days with is enormous. **BUT: I worry for the project if our
> solution
> to the problem at hand is "maybe just contribute less to Debian".**
> Is
> that really what we want?
> 

I forecast this thread will eventually end up with the
"calm down and take a break" solution again.



Re: Populating non-free firmware?

2022-12-24 Thread M. Zhou
On Sat, 2022-12-24 at 11:44 +0200, Jonathan Carter wrote:
> 
> The non-free-firmware [component] has been created, but so far it
> only 
> contains the rasbpi-firmware package.

Please make sure to include the packages for wifi cards, especially
iwlwifi, since I don't use a desktop PC.

One of the most painful ways to install Debian is to realize during
the netinstall process that iwlwifi is missing, while no RJ45 cable
is available. In that case, one may download the package with a
cellphone and find a way to transfer the file to the laptop.
If it's an iPhone, the game is sadly over.
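(For reference, the manual workaround looks roughly like this -- a sketch,
assuming a second Debian machine with network access is at hand:)

```
# on a networked Debian machine (non-free/non-free-firmware enabled):
apt download firmware-iwlwifi          # fetch the .deb without installing
# copy it to the offline laptop (USB stick, etc.), then install there:
sudo dpkg -i firmware-iwlwifi_*.deb
```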

In the past, such frustration once irritated one of the new
users to whom I had recommended Debian. The user's anger
finally converted into personal/emotional attacks, blaming
me, a Debian developer, for being incompetent to make the wifi
card work. As a result, I, as a Debian developer, would never
recommend Debian to any new user, nor discuss Linux with
any new user since then.

iwlwifi is the very reason that forced me to never use the
default ISO again.

That said, my word only counts as one vote for the wifi card packages.
I just wanted to mention that iwlwifi may hurt people.



Bug#1031565: ITP: nvidia-nccl -- Optimized primitives for collective multi-GPU communication

2023-02-18 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org, 
pkg-nvidia-de...@lists.alioth.debian.org

* Package name: nvidia-nccl
* URL : https://github.com/NVIDIA/nccl
* License : BSD-3-Clause but has to enter non-free.
  Programming Lang: C/C++
  Description : Optimized primitives for collective multi-GPU communication

This is needed for some cuda applications like pytorch-cuda.
The package will be maintained by
Debian NVIDIA Maintainers 



Bug#1031972: ITP: nvidia-cudnn-frontend -- c++ wrapper for the cudnn backend API

2023-02-25 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org, 
pkg-nvidia-de...@lists.alioth.debian.org

* Package name: nvidia-cudnn-frontend
* URL : https://github.com/NVIDIA/cudnn-frontend
* License : MIT (but will enter contrib due to non-free deps)
  Programming Lang: C++
  Description : c++ wrapper for the cudnn backend API

This is needed for the cuda version of pytorch.
The package will be maintained by
Debian NVIDIA Maintainers 



Bug#1031973: ITP: nvidia-cutlass -- CUDA Templates for Linear Algebra Subroutines

2023-02-25 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org, 
pkg-nvidia-de...@lists.alioth.debian.org

* Package name: nvidia-cutlass
* URL : https://github.com/NVIDIA/cutlass
* License : BSD-3-Clause (has to enter contrib due to non-free deps)
  Programming Lang: C++
  Description : CUDA Templates for Linear Algebra Subroutines

This is needed for the cuda version of pytorch.
The package will be maintained by
Debian NVIDIA Maintainers 



Bug#1033345: ITP: nvitop -- An interactive NVIDIA-GPU process viewer and beyond

2023-03-22 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org

* Package name: nvitop
* URL : https://github.com/XuehaiPan/nvitop
* License : Apache-2.0 / GPL-3.0 dual license
  Programming Lang: Python
  Description : An interactive NVIDIA-GPU process viewer and beyond

We have a couple of nvidia GPU utility monitors. Nvidia's nvidia-smi
is standard but far from readable enough for heavy GPU users like me.
I packaged gpustat -- it is good, but it does not show the standard
top information, and as a result I have to open another tmux window
for glances or htop, in order to make sure the neural network does
not blow up the system memory.

This nvitop combines both GPU monitoring and CPU/RAM monitoring.
I have used it for a while on GPU servers. It cannot be better.

This package will be maintained under the umbrella of the nvidia packaging
team. I suppose the package has to enter contrib because it depends on the
non-free nvidia driver.
Thank you for using reportbug



Bug#1095237: ITP: vllm -- A high-throughput and memory-efficient inference and serving engine for LLMs

2025-02-05 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org, debian...@lists.debian.org

* Package name: vllm
  Version : 0.7.1
  Upstream Contact: 
* URL : 
* License : Apache-2.0
  Programming Lang: Python
  Description : A high-throughput and memory-efficient inference and 
serving engine for LLMs

I think this is one of the most important applications in the reverse
dependency tree of the pytorch package. vllm has a very large tree of 
dependencies
and many of them are missing. I'm just setting vllm as a long-term goal.

Alternatives are ollama and llama.cpp.

Everything, including vllm's necessary dependencies, will be maintained by
the Debian deep learning team.

Thank you for using reportbug



[draft] need your help on the AI-DFSG general resolution preparation

2025-02-01 Thread M. Zhou
Hi all,

I heard that people were looking for me during FOSDEM.

I spent a couple of hours and finally got something draft-ish
for the previously mentioned general resolution on the interpretation
of software freedom with respect to AI software.

https://salsa.debian.org/lumin/gr-ai-dfsg
(I turned the issues on. Feel free to open issues there)

This is an early draft. Before really posting to -vote, I need your
help on the following aspects:

(1) do you know any important but missing reference materials?

(2) are the options clear enough to vote on? Lots of the readers may
not be familiar with how AI is created. I tried to explain it, as well as
the implications if some components are missing.

(3) is there anything unclear or ambiguous in the background text and the 
options?

(4) is there anything else that should be added to the text?

(5) what is the actionable outcome of this general resolution?

(6) is a neutral tone necessary for a proposal? I have a clear
leaning throughout the text.

(7) I have not yet asked the ftp-masters for their opinion.


According to https://www.debian.org/vote/howto_proposal ,
there is a template at https://www.debian.org/vote/sample_vote.template
but I don't understand this XML dialect. How is this XML file used?



Project-wide LLM budget for helping people (was: Re: Complete and unified documentation for new maintainers

2025-01-11 Thread M. Zhou
On Sat, 2025-01-11 at 13:49 +0100, Fabio Fantoni wrote:
> 
> Today trying to see how a new person who wants to start maintaining new 
> packages would do and trying to do research thinking from his point of 
> view and from simple searches on the internet I found unfortunately that 
> these parts are fragmented and do not help at all to aim for something 
> unified but not even simple and fast enough.

And those fragments also change as time goes by -- such as the sbuild
schroot -> unshare switch. They are not necessarily well documented in
every introduction material for newcomers.

Even if somebody in the Debian community had enough time to overhaul everything
and create a new set of documentation, we would end up in the situation 
described
by the XKCD "standards" meme (xkcd.com/927/) -- we would just get yet another
document fragment as time goes by.

LLMs are good companions as long as the strong ones are used. In order to
help newcomers learn, it is better for Debian to allocate some LLM API
credits to them, instead of hoping for someone to work on the documentation
and falling into the XKCD-927 infinite loop.

Considering the price, the LLM API calls for helping all DDs + newcomers
will, I believe, be cheaper than hiring a real person to overhaul those
documents and keep them up to date. This is a feasible way to partly
solve the issue without endlessly waiting for the HERO to appear.

Debian should consider allocating some budget, like several hundred USD
per month, for LLM API calls for all members' and newcomers' usage.

DebGPT can be hooked somewhere within the Debian development process,
such as sbuild/ratt for build log analysis, etc. It is cheap enough,
and people will eventually figure out the useful aspects of it.
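For instance, a build-log analysis hook could be as simple as the following
sketch (assuming `-f` accepts a plain file path in addition to the special
readers shown later in this thread):

```
# after a failed build, hand the log to DebGPT for a first-pass diagnosis
debgpt -f ./mypkg_1.0-1_amd64.build \
       -a 'summarize the root cause of this build failure' \
       -o ./build-failure-analysis.txt
```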

Opinions against this post will include something about hallucination.
In the case where an LLM writes something that does not compile at all, or
writes against some non-existent API, a human is intelligent enough to easily
notice the build failure or lintian error and tell whether it is hallucination
or not. I personally believe LLMs, at the current stage, are useful
as long as they are used and interpreted properly.


BTW, I was in the middle of evaluating LLMs against the nm-templates. I
procrastinated a lot on finishing the evaluation, but the first
several questions were answered perfectly.
https://salsa.debian.org/lumin/ai-noises/-/tree/main/nm-templates?ref_type=heads
If anybody is interested in seeing the LLM evaluation against the nm-templates,
please let me know; your message will be significantly useful for me
to conquer my procrastination on it.



Re: Project-wide LLM budget for helping people (was: Re: Complete and unified documentation for new maintainers

2025-01-12 Thread M. Zhou
On Sun, 2025-01-12 at 16:56 +, Colin Watson wrote:
> 
> (I have less fixed views on locally-trained models, but I see no very
> compelling need to find more things to spend energy on even if the costs
> are lower.)

Locally-trained models are not practical at the current stage. State-of-the-art
models can only be trained by the best-capitalized players who have GPU 
clusters. Training
and deploying smaller models, like 1-billion-parameter ones, can lead to very 
wrong impressions
of and conclusions about those models.

Based on the comments, what I see is that using LLMs as an organization is too
radical for Debian. In that sense, leaving this new technology to individuals' 
personal
evaluation and usage is more reasonable.

So what I was talking about is simply a choice between the two:
 1. A contributor who needs help can leverage an LLM for its immediate response 
and
help, even if it is only correct, say, 30% of the time. It requires the 
contributor
to have the knowledge and skill to properly use this new technology.
 2. A contributor who needs help has to wait for a real human for an indefinite 
time
period, but the correctness is above 99%.

The existing voices chose the second one. I want to mention that "waiting for a 
real
human for help on XXX for an indefinite time" was a bad experience when I was a 
newcomer.
The community not agreeing to use this new technology to ease such a pain point 
seems understandable to me.



Bug#1092897: ITP: python-onnxscript -- naturally author ONNX functions and models using a subset of Python

2025-01-12 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org, debian...@lists.debian.org

* Package name: python-onnxscript
  Version : git head, since there is no versioned release
  Upstream Contact: Microsoft
* URL : https://github.com/microsoft/onnxscript
* License : MIT
  Programming Lang: Python
  Description : naturally author ONNX functions and models using a subset 
of Python

This is needed for exporting transformer models from pytorch.
Pytorch's old torch.onnx.export() simply does not work on transformers.
The dynamo_export needs onnxscript, which is missing from the archive.

Thank you for using reportbug



Re: Project-wide LLM budget for helping people

2025-01-12 Thread M. Zhou
On Sun, 2025-01-12 at 22:36 +0100, Philipp Kern wrote:
> 
> No-one is stopped from using any of the free offers. I don't think we
> need our own chat bot. Of course that means, in turn, that we give up on
> feeding it domain-specific knowledge and our own prompt. But that's...
> probably fine?

One long-term goal of the Debian deep learning team is to host an LLM on
the team's AMD GPUs and expose it to the members. That said, the necessary
packages to run that kind of service are still missing from our archive.
It is a good way to use the existing GPUs anyway.

Even if we get no commercial sponsorship of API calls, we will eventually
experiment with and evaluate one on the team's infrastructure. We are still
working towards that.

> If those LLMs support that, one could still produce a guide on how to
> feed more interesting data into it - or provide a LoRA. It's not like
> inference requires a GPU.

First, DebGPT is designed to conveniently put any particular information,
whether or not Debian-specific, into the context of an LLM. I have also 
implemented
a map-reduce algorithm to let the LLM deal with extremely long
contexts, such as a whole ratt build-log directory.

LoRA is only sound when you have a clear definition of the task you want
the LLM to deal with. If we do not know what the user wants, then forget
about LoRA and just carefully provide the context to the LLM. DebGPT is
technically on the right track in terms of feasibility and efficiency.

RAG may help. I have already implemented the vector database and the
retrieval modules in DebGPT, but the frontend part for RAG is still
under development.

> But then again saying things like "oh, look, I could easily answer the
> NM templates with this" is the context you want to put this work in.

My intention is always to explore possible and potential ways to make LLMs
useful to whatever extent possible. To support my idea, I wrote DebGPT, and
I tend to only claim things that are *already implemented* and *reproducible*
in DebGPT.

For instance, I've added automatic answering of the nm-templates to
DebGPT, and the following script can quickly give all the answers.
The answers are pretty good at first glance. I'll postpone the full
evaluation until I have written the code for all nm-templates.

I simply dislike stating things that cannot be implemented in DebGPT.
But please do not limit your imagination to my readily available
demo examples or the use cases I have claimed.

(you need to use the latest git version of DebGPT)
```
# nm_assigned.txt
debgpt -f nm:nm_assigned -a 'pretend to be lu...@debian.org and answer the question. Give concrete examples, and links as evidence supporting them are preferred.' -o nm-assigned-selfintro.txt

# nm_pp1.txt
for Q in PH0 PH1 PH2 PH3 PH4 PH5 PH6 PH7 PHa; do
debgpt -HQf nm:pp1.${Q} -a 'Be concise and answer in just several sentences.' -o nm-pp1-${Q}-brief.txt;
debgpt -HQf nm:pp1.${Q} -a 'Be precise and answer with details explained.' -o nm-pp1-${Q}-detail.txt;
done

# nm_pp1_extras.txt
for Q in PH0 PH8 PH9 PHb; do
debgpt -HQf nm:pp1e.${Q} -a 'Be concise and answer in just several sentences.' -o nm-pp1e-${Q}-brief.txt;
debgpt -HQf nm:pp1e.${Q} -a 'Be precise and answer with details explained.' -o nm-pp1e-${Q}-detail.txt;
done
```

[1] DebGPT: https://salsa.debian.org/deeplearning-team/debgpt



Debian/Linux's NPU support?

2025-01-14 Thread M. Zhou
Hi folks,

It seems that "AI PC" was everywhere at CES 2025, which basically indicates
the presence of an NPU device. Both AMD and Intel have integrated NPU devices
into their new CPUs -- in that sense I guess the NPU will be more popular
than discrete GPUs in the future, in terms of availability on a random user's 
computer.

For instance, Intel's U9 285K has an NPU, based on its official Ark page:
  
https://www.intel.com/content/www/us/en/products/sku/241060/intel-core-ultra-9-processor-285k-36m-cache-up-to-5-70-ghz/specifications.html
AMD's AI Max 395 also has an NPU (called AMD Ryzen AI):
  
https://www.amd.com/en/products/processors/laptop/ryzen/ai-300-series/amd-ryzen-ai-max-plus-395.html

The NPU devices are still very new to me. I did a little bit of research, and 
they
seem to need some new drivers and libraries:
  * Intel: https://github.com/intel/linux-npu-driver
           https://github.com/intel/intel-npu-acceleration-library
  * AMD: https://github.com/amd/xdna-driver
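(As a side note for anyone who wants to poke at the hardware: both drivers
build on the kernel's DRM accel subsystem, so a rough presence check -- a
sketch only, since device naming may vary across kernels -- could look like:)

```
# NPU devices exposed via the accel subsystem appear as /dev/accel/accelN
ls -l /dev/accel/ 2>/dev/null || echo "no accel device found"
# check which kernel driver (e.g. intel_vpu, amdxdna) is bound, if any
lspci -k | grep -i -A3 'processing accelerator'
```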
Since they are still very new hardware, I think we have plenty of time to
prepare support for the trixie+1 release. I just hope they are not as annoying
as discrete GPUs from the green company.

If anybody is interested in improving Debian's support for such new hardware,
I'd suggest directing communications to the Debian Deep Learning Team, and
working with the team:
https://lists.debian.org/debian-ai/
The team's mailing list has targeted machine learning and hardware acceleration
from the very beginning, and NPUs grew out of AI workloads.

I'd like to learn whether anybody here has experience working with those
devices. Do they run without non-free blobs? If they work well with just
DFSG-compliant packages, that would be great.



Bug#1093076: ITP: openvino -- open-source toolkit for optimizing and deploying AI inference

2025-01-14 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org, debian...@lists.debian.org

* Package name: openvino
  Version : 2024.6.0
  Upstream Contact: Intel
* URL : https://github.com/openvinotoolkit/openvino
* License : Apache-2.0
  Programming Lang: Python, C++
  Description : open-source toolkit for optimizing and deploying AI 
inference

Seems to be a popular backend choice for AI inference on CPUs.
It also has NPU support. NPUs are still new to me; let's see whether it works
on Linux.

OpenVINO also has pytorch and onnxruntime integrations, which might be useful
as well.

Thank you for using reportbug



Bug#1096008: ITP: simdutf -- Unicode validation and transcoding at billions of characters per second

2025-02-14 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org

* Package name: simdutf
  Version : 6.2.0
* URL : https://github.com/simdutf/simdutf
* License : Apache-2.0 OR MIT
  Programming Lang: C++
  Description : Unicode validation and transcoding at billions of 
characters per second

Similar to simdjson. SIMD acceleration is cool.
I note this library has been embedded in many existing packages:
https://codesearch.debian.net/search?q=simdutf.h

Thank you for using reportbug



Bug#1101005: ITP: huggingface-hub -- official Python client for the Huggingface Hub

2025-03-21 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org, debian...@lists.debian.org

* Package name: huggingface-hub
  Version : 0.29.3
  Upstream Contact: Huggingface
* URL : https://github.com/huggingface/huggingface_hub
* License : Apache-2.0
  Programming Lang: Python
  Description : official Python client for the Huggingface Hub

This package is a part of a GSoC 2025 project, as an easy start on
the vLLM dependency tree. It will be maintained by the Debian Deep
Learning Team.

LLM inference is one of the most promising use cases for packages
maintained by the deep learning team. Eventually we want to make
vLLM/SGLang usable out of the box.

https://wiki.debian.org/SummerOfCode2025/Projects#SummerOfCode2025.2FApprovedProjects.2FPackageLLMInferenceLibraries.Package_LLM_Inference_Libraries



Re: dh-shell-completions: simple debhelper addon to install shell completions

2025-03-16 Thread M. Zhou
On Sun, 2025-03-16 at 18:47 +, Blair Noctis wrote:
> 
> Kind of a shameless plug, but enough people said it's useful so I thought 
> might as well let more know.
> 
> I'll not go into great details, because there isn't any. Just check its man 
> page and source code:
> https://manpages.debian.org/unstable/dh-shell-completions/dh_shell_completions.1.en.html
> https://salsa.debian.org/debian/dh-shell-completions

Thank you for this dh module! After browsing the examples, I came up with
an old problem with these completion scripts -- whether the completions
should be loaded by default, and whether that behavior should be synced
across shells.

Do you think it would be useful to define some flags for
dh_shell_completions, like --enable-by-default and --disable-by-default,
to decide whether a completion file should be enabled by default?
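(To make the idea concrete, here is a sketch of the hypothetical
invocations -- neither flag exists in dh_shell_completions today:)

```
# install completions and wire them up to load automatically
dh_shell_completions --enable-by-default
# install completions only; the user opts in per shell
dh_shell_completions --disable-by-default
```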

I'm raising this question because different shells handle completions and
keybindings in very different ways, which I find somewhat confusing.
For instance, fzf:
https://salsa.debian.org/go-team/packages/fzf/-/blob/master/debian/README.Debian

As you can see, key-binding scripts are similar to completion scripts.
Maybe they could be handled by the same dh module as well. Generally,
key-binding scripts should not be enabled by default due to potential
conflicts.

A quick search suggests that neither policy nor devref covers
the shell integration topic.

An example that resembles what I'm proposing is dh_installsystemd.
It has --no-enable, --restart-after-upgrade, --no-restart-after-upgrade,
etc., flags to control the systemd unit behavior upon dpkg actions.



Bug#1102498: ITP: llm.nvim -- LLM powered development for Neovim

2025-04-10 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org, debian...@lists.debian.org

* Package name: llm.nvim
* URL : https://github.com/huggingface/llm.nvim
* License : Apache-2.0
  Programming Lang: Lua
  Description : LLM powered development for Neovim

A Copilot.vim alternative that lets you use a different backend,
such as a self-hosted LLM inference server like vLLM, etc.

Thank you for using reportbug



Bug#1102497: ITP: llama.vim -- Local LLM-assisted text completion.

2025-04-09 Thread M. Zhou
Package: wnpp
Severity: wishlist
Owner: Mo Zhou 
X-Debbugs-Cc: debian-devel@lists.debian.org, debian...@lists.debian.org, 
c...@debian.org

* Package name: llama.vim
* URL : https://github.com/ggml-org/llama.vim
* License : MIT/Expat
  Programming Lang: Vim
  Description : Local LLM-assisted text completion.

A Copilot.vim alternative that allows you to do code completion in
vim with a locally-hosted LLM.

The dependency llama.cpp is prepared by @ckk and is pending in the NEW
queue.

Thank you for using reportbug



discussion extension (was: Re: General Resolution: Interpretation of DFSG on Artificial Intelligence (AI) Models

2025-05-05 Thread M. Zhou
Hi Andreas,

According to constitution A.1.6, would you mind helping us extend
the discussion period by a week?
https://www.debian.org/devel/constitution

From the feedback I've heard, there are a couple of problems we are
currently facing.

* Myself being confident in proposal A is one thing. But the audience
  of this GR needs more information in order to cast a well-thought-out
  vote, while I have been suffering from low bandwidth in recent weeks.

* People holding different opinions have a really short time to prepare
  a formal, polished proposal.

Maybe those factors indicate this is not the best time to vote.
I'm considering whether I should withdraw the proposal, collect more
information to fill in the overlooked aspects, further polish
proposal A, and come back when it is ready. Doing so also allows
enough time for proposals B, C, D, ... since people now know that I'm
really pushing this forward.


On Mon, 2025-05-05 at 16:09 -0400, Jeremy Bícha wrote:
> On Mon, May 5, 2025 at 2:49 PM Mo Zhou  wrote:
> > On 5/5/25 11:44, Andrey Rakhmatullin wrote:
> > > > It is too rush to start to vote for this within 3 weeks
> > > 
> > > Does this maybe sound like the GR call was premature?
> > > The project consensus, especially after
> > > https://www.debian.org/vote/2021/vote_003, seems to say that we don't
> > > want multi-month GR discussions.
> > > 
> > 
> > Not quite. Proposal A is mature and I'm confident in it.
> > Potential proposal B,C,D,... are premature.
> > I have no intention to let people holding different opinions
> > to have a very short time to prepare a formal proposal B,C,D.
> > 
> > But, the whole series of discussions started 7 years ago.
> > And I have already mailed everywhere about my intention
> > to submit the GR. If there is no proposal B, that could mean
> > it is really difficult to formally write a proposal B.
> > 
> > I lean towards going ahead.
> 
> I don't believe we have enough information to do the GR now (or one
> week from today, the longest we can delay). I am unclear on whether
> existing packages in Debian are affected. Your proposal does not
> indicate whether the GR would be effective immediately.
> 
> My suggestion is for you to ask the DPL to extend the discussion
> period by a week (for constitutional reasons) followed by an immediate
> withdrawal of the GR. Withdrawing the GR allows you to resubmit later
> and wait 2-3 weeks from that point.
> 
> Thank you,
> Jeremy Bícha