On Thu, Dec 07, 2023 at 08:36:12PM +0100, Lucas Nussbaum wrote:
> On 07/12/23 at 20:24 +0100, Andreas Tille wrote:
> > On Thu, Dec 07, 2023 at 07:59:38PM +0100, Lucas Nussbaum wrote:
> > > On 07/12/23 at 09:58 +0100, Andreas Tille wrote:
> > > >
> > > > udd=> select '"' || u.name || '"' as name_with_spaces, uploader from
> > > > uploaders u where name like '% ' or name like ' %' ;
> > > >      name_with_spaces     |                 uploader
> > > > --------------------------+-------------------------------------------
> > > >  " Mehdi Dogguy"          | Mehdi Dogguy <[email protected]>
> > > >  " David Paleino"         | David Paleino <[email protected]>
> > > >  " Stéphane Glondu"       | Stéphane Glondu <[email protected]>
> > > >  " Stefano Zacchiroli"    | Stefano Zacchiroli <[email protected]>
> > > >  " Stefano Zacchiroli"    | Stefano Zacchiroli <[email protected]>
> > > >  " Stefano Zacchiroli"    | Stefano Zacchiroli <[email protected]>
> > > >  " Stefano Zacchiroli"    | Stefano Zacchiroli <[email protected]>
> > > >  " Stefano Zacchiroli"    | Stefano Zacchiroli <[email protected]>
> > > >  "Andreas Tille "         | Andreas Tille <[email protected]>
> > > >  " LI Daobing"            | LI Daobing <[email protected]>
> > > >  " David Paleino"         | David Paleino <[email protected]>
> > > >  " Stefano Zacchiroli"    | Stefano Zacchiroli <[email protected]>
> > > >  " Nikita V. Youshchenko" | Nikita V. Youshchenko <[email protected]>
> > > >  " Nikita V. Youshchenko" | Nikita V. Youshchenko <[email protected]>
> > > >  " Nikita V. Youshchenko" | Nikita V. Youshchenko <[email protected]>
> > > >  " Nikita V. Youshchenko" | Nikita V. Youshchenko <[email protected]>
> > > >  " Nikita V. Youshchenko" | Nikita V. Youshchenko <[email protected]>
> > > >  "Colin Tuckley "         | Colin Tuckley <[email protected]>
> > > >  "Colin Tuckley "         | Colin Tuckley <[email protected]>
> > > >  "Colin Tuckley "         | Colin Tuckley <[email protected]>
> > > > (20 rows)
> > > > ...
> > > > UPDATE uploaders SET name = trim(name), uploader = trim(name) || ' '
> > > > || email WHERE name like ' %' or name like '% ' ;
> > > >
> > > >
> > BTW: I found
> >
> > udd=> SELECT count(*), name FROM (SELECT CASE WHEN changed_by_name = ''
> > THEN maintainer_name ELSE changed_by_name END AS name FROM upload_history)
> > uh WHERE name ilike '%tille%' group by name;
> >  count |     name
> > -------+---------------
> >  16524 | Andreas Tille
> > (1 row)
> >
> > So why do I have 8707 uploads per uploaders but 16524 per
> > upload_history? ???
> > Is my assumption wrong that both values should match (modulo some wrongly
> > spelled names)

Could you please comment on these different results?

> If you look at the uploaders table, there are three columns:
> - 'uploader', that contains the raw data
> - 'name' and 'email' that contain the parsed (and trimmed) data
>
> udd=> select uploader, name, email, count(*) from uploaders where uploader
> ilike '%tille%' group by 1,2,3;
>               uploader              |      name       |      email       | count
> ------------------------------------+-----------------+------------------+-------
>  Andreas Tille <[email protected]>  | Andreas Tille   | [email protected] |  8785
>  Andreas Tille <[email protected]>  | Andreas Tille   | [email protected] |     1
>  Andreas Tille <[email protected]>  | Andreas Tille   | [email protected] |     1
>
> So, just use name and/or email?

Well, I am not looking for a solution to this (non-)problem. I simply
think it is wrong not to strip surrounding spaces from values before
injecting them into UDD. I just stumbled upon this when I ran the query
above.
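
To illustrate the kind of stripping I mean, here is a minimal sketch. It
assumes the values could be cleaned in Python before they are inserted;
the function names strip_name and rebuild_uploader are made up for this
mail and are not taken from the UDD code. Note that it also collapses
runs of spaces inside a name, which goes a bit further than trim() does.

```python
def strip_name(raw: str) -> str:
    """Drop leading/trailing whitespace and collapse internal runs of it."""
    # str.split() without arguments splits on any whitespace and discards
    # empty fields, so joining with a single space normalises the value.
    return " ".join(raw.split())


def rebuild_uploader(name: str, email: str) -> str:
    """Rebuild a 'Name <email>' string from cleaned parts.

    Assumption: the raw uploader field has the usual 'Name <email>' form,
    as in the psql output quoted above.
    """
    return f"{strip_name(name)} <{email.strip()}>"


if __name__ == "__main__":
    # The address below is a placeholder, not a real one.
    print(rebuild_uploader("Andreas Tille ", "someone@example.org"))
```

With something like this in place, a query for names with a leading or
trailing space should return no rows for newly imported data.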

I stumbled upon another issue which might be even worse:

select distinct done, done_name, done_email, owner, owner_name, owner_email
  from archived_bugs
 where done_name like '%"%' or owner_name like '%"%'
 order by done_name;
                        done                       | done_name |    done_email     |                    owner                    |       owner_name        |    owner_email
---------------------------------------------------+-----------+-------------------+---------------------------------------------+-------------------------+-------------------
 <[email protected]>                               |           | [email protected] | "[email protected]" <[email protected]>     | "[email protected]"     | [email protected]
 <[email protected]>                               |           | [email protected] | "Varun Hiremath" <[email protected]>        | "Varun Hiremath"        | [email protected]
 [email protected] (Alexander L. Belikoff)         |           | [email protected] | "Alexander L. Belikoff" <[email protected]> | "Alexander L. Belikoff" | [email protected]
 [email protected] (Andreas B. Mundt)              |           | [email protected] | "Andreas B. Mundt" <[email protected]>      | "Andreas B. Mundt"      | [email protected]
 [email protected] (Antoine R. Dumont (@ardumont)) |           | [email protected] | "Antoine R. Dumont" <[email protected]>     | "Antoine R. Dumont"     | [email protected]
 [email protected] (Antoine R. Dumont)             |           | [email protected] | "Antoine R. Dumont" <[email protected]>     | "Antoine R. Dumont"     | [email protected]
 [email protected] (Artur R. Czechowski)           |           | [email protected] | "Artur R. Czechowski" <[email protected]>   | "Artur R. Czechowski"   | [email protected]
...

We have lots of names, probably in more tables than just archived_bugs,
which have not been stripped of '"'. You always find the very same names
without the quotes inside the same table.
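
Again only as a sketch, and under the same assumption that the value can
be cleaned in Python before it reaches the database: the helper below
(unquote_name is a hypothetical name, not an existing UDD function)
removes one matching pair of surrounding double quotes and leaves
everything else alone.

```python
def unquote_name(raw: str) -> str:
    """Strip whitespace and one surrounding pair of double quotes."""
    name = raw.strip()
    # Only touch the value if it both starts and ends with a quote, so a
    # stray single '"' inside or at one end of a name is left untouched.
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        name = name[1:-1].strip()
    return name


if __name__ == "__main__":
    for raw in ['"Varun Hiremath"', 'Varun Hiremath']:
        # Both spellings end up as the same name.
        print(repr(unquote_name(raw)))
```

This deliberately does not try to parse full address headers; it only
covers the quoted owner_name values shown above.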

I think these unstripped quotes are similarly wrong, and even more
annoying than the spaces.

I wonder where we could sensibly discuss these issues, which I consider
bugs in UDD. Would it make sense to add a udd category to
`reportbug other`?

Kind regards
    Andreas.

-- 
http://fam-tille.de

