Hi Roland,

On Thu, Apr 10, 2025 at 07:41:44PM +0200, Roland Rosenfeld wrote:
> On Thu, 10 Apr 2025, Daniel Gröber wrote:
> > I'm trying to use lbdb for Debian work, but I found some of my
> > queries are not returning results.
> 
> Could you please be more concrete in what you executed and what your
> results are?

Ah! Ofc. I forgot to mention the one thing that would help you debug this
instead of having to flail around on my account! :-)

On the mutt side I have

    query_command="lbdbq "

I was testing with both quoted and unquoted fullnames and found both to be
broken ("no matches") Eg.:

    $ lbdbq Roland Rosenfeld
    lbdbq: no matches

    $ lbdbq 'Roland Rosenfeld'
    lbdbq: no matches

With my patch at least the unquoted version returns some results --
arguably too many but an overapproximation is fine for me until we find a
proper fix.

> To become more reproducible with your setup, I copied your ldap.rc
> enabled METHODS="m_ldap" and LDAP_NICKS="debian" (m_gpg heavily
> depends on the installed pubring).

This shouldn't depend on my keyring I have hardly any people in there (yet)
:-).

> Hmmm, from my point of view there is a bug in m_ldap, which should be
> possible to find at least one "Roland Rosenfeld" (maybe by internally
> fetching all "roland" and then grep for "Roland Rosenfeld" in the
> result).

Seems plausible.

> On the other hand I don't think that the behavior gets better with
> your patch, since searching for Roland Rosenfeld (without quotes)
> returns 8 Rolands, 7 of them without "Rosenfeld".
> Searching with quotes has the same bug as my version.

Yeah something is still off. This was just my quick hack to get myself a
fix and point prominently at the problem.

> > Instead I submit to you the idea to
> > just-not-use-eval-at-all-ever-as-it's-rarely-really-actually-needed^TM:
> 
> To say the truth, I never reviewed this eval before.  This was in the
> original lbdb package by Thomas Roessler and I always kept it without
> reviewing this.  As you mentioned, the behavior was last changed 25
> years ago...

Ah the good times before shellcheck reigned in our creativity ;-P. You can
always tell how old a shell codebase is by looking for (needless) evals.

> I'll have to check whether changing this will fix or break something,
> especially with different implementations of /bin/sh (I use dash
> myself, but tested your change with bash and dash without noting a
> behavior change).

I'm sure it will work in bash, dash and busybox sh (just tested). I'm not
aware of any other /bin/sh implementations in common use and this is just
how (POSIX'ish) shells work. Expansion happens before command execution.

See also "Simple Commands, Order of Processing" in the Shell Command Language:
https://pubs.opengroup.org/onlinepubs/9799919799/utilities/V3_chap02.html#tag_19_09_01_01

Specifically note how the "command name" is selected here.

> > This seems to fix my problem at least, but I'm not sure passing the
> > $@ args individually is the behaviour you intended?
> 
> If I enter
>  To: Roland Rosenfeld
> in mutt and press Ctrl-T, I expect that only "Roland Rosenfeld" is
> found in the results.
> For this
>  $ lbdbq Roland Rosenfeld
> should work (otherwise it may be necessary to add some quotes around
> %s in muttrc
>  set query_command="lbdbq %s"

Alright, that's what I thought, but returning an overapproximation (OR) of
results didn't seem too unreasonable either so I wasn't sure what the
intended behaviour is.


On Sun, Apr 13, 2025 at 02:36:24PM +0200, Roland Rosenfeld wrote:
> In the meantime I digged a little deeper and found the root cause for
> this problem: The Debian LDAP (db.debian.org) uses the attribute cn
> in an unusual way:
> 
>               cn: Roland
>              uid: roland
>               sn: Rosenfeld
>            gecos: Roland Rosenfeld,,,,
> 
> The usual way is to have the first name (Roland) in the givenName
> attribute and use the cn (CommonName) attribute for the complete name
> (Roland Rosenfeld).  But this one is only available in the gecos field
> here (with the commas (unused subfields) as a suffix).

That seems unfortunate. Wonder if DSA could be convinced to change this,
but I'd guess the convention is too entrenched all over our systems by
now...

I'm not super familiar with how LDAP is usually configured. I'm wondering
if it's common to have this kind of subtle difference between deployments
or if ours is just broken here?

> So if I want to search in db.debian.org for "Roland Rosenfeld", I have
> to search in the gecos field using wildcards (which can be enabled in
> mutt_ldap_query/m_ldap with $ignorant=1).
> 
> This can be archived with the following ldap.rc:
> 
> %ldap_server_db = (
>     'debian' => [
>         'ldaps://db.debian.org',
>         'ou=users,dc=debian,dc=org',
>         'uid cn sn ircnick gecos',
>         'uid cn sn ircnick',
>         '${uid}@debian.org',
>         '${cn} ${sn}', '${ircnick}',
>         1, # wildcard search for matching gecos
>     ],
> );

Right. That seems to do something reasonable :-)

> After some more testing I found out, that mutt_ldap_query by default
> does an "or" join, if called with two parameters.  This also seems to
> be the behavior you intended with your patch, which works at
> least with m_ldap:
> 
> $ lbdbq-test Gröber Rosenfeld
> lbdbq: 2 matches
> d...@debian.org Daniel Gröber   dxld
> rol...@debian.org       Roland Rosenfeld        RoRo
> 
> The problem here is, that most other modules (like m_inmail or m_gpg)
> do not expect multiple parameters this way and fail with your changed
> lbdbq.

Alright. If a single arguemnt is what you want that's also easy to do properly:

-  eval ${method}_query \""$@"\" >> "$collection" || true
+  ${method}_query "$*" >> "$collection" || true

($* instead of $@ for space separated arguments)

> Maybe I should extend the documentation to more explicitly mention
> that lbdbq always takes all parameters as a long (virtually quoted)
> string and searches for this string in this order (instead of
> searching for multiple space separated strings as "or"-connected).

Seems like a good idea to mention this yeah.

> Alternatively I could modify all modules to join multiple space
> separated parameters with "or" (except manually quoted in the query
> string), but this would change the default behavior for most current
> users which doesn't sound like a very good idea to me...

So the way I plan to use lbdbq is this: I often fail to remmeber full names
or nicks. I'll remember one of the components (approximately) either the
nick, lastname, firstname. As soon as I have a list of candidates I can
throw my recognition engine (brain) works just fine, but pulling the exact
name out of my head on the spot is neigh impossible.

So for me the OR makes complete sense and additional fuzzy matching would
be even more useful, but others will have more reliable recall and might
just want Full Name to email@d.o competion.

If the prevailing convention among modules is the single argument it
shouldn't be too hard to introduce a config option in lbdbq for the OR
beahviour and just implicitly ask modules for it by starting to pass
multiple arguments. After we fix the modules to support this ofc.

However, I'm not sure it's worth it. Let me use this new ldap.rc for a bit
and maybe a patch or two will fall out if I get irritated enough :-).

One thing that I might still add is Debian Maintainer/Uploaders support. So
"lbdb maintainer" Ctrl-T would do you'd think :-).

Thanks!
--Daniel

Attachment: signature.asc
Description: PGP signature

Reply via email to