On czw, 2017-06-01 at 10:55 +0200, Alexis Ballier wrote: > On Wed, 31 May 2017 21:02:24 +0200 > Michał Górny <mgo...@gentoo.org> wrote: > > > On śro, 2017-05-31 at 19:39 +0200, Alexis Ballier wrote: > > > > > Again, you *need* to process the constraints in order. '!a? > > > > > ( b ) !b? ( a )' is not deterministic when none of a and b are > > > > > enabled otherwise. > > > > > > > > You can't rely on any particular order of constraints, especially > > > > when eclass stacking comes into play. You could try defining some > > > > constraint- sorting algorithm but it's going to be complex and > > > > it's usefulness will be limited anyway due to various kinds of > > > > grouping. > > > > > > > > > Better have some order: If half of the above constraint comes from > > > ebuild and the other half comes from eclass, then PM might toggle > > > 'a' or 'b' depending on the phase of the moon which is specifically > > > what we're trying to avoid. > > > > No, it can't. That's the whole point. The algorithm must be defined so > > that it is always predictable independently of order (maybe except > > the ordering inside ^^, ||, ??) and independently of how it's nested > > (i.e. 'a? ( b? ( c ) )' must give the same result as 'b? ( a? ( c ) > > )'). > > This is a lot of handwaving without real description of how it would > work. This starts to look a lot like confluence in rewriting systems > and you want developers to write only confluent rules and repoman to > decide that. Good luck with that.
I'm sorry, what I said was quite stupid. I did some thinking and testing today, and the algorithm can be actually quite trivial. The real issue is getting a correct input, that is REQUIRED_USE constraints that have only one solution. That is the whole problem I was signaling, and which I seem to have lost sight of: solving is trivial, ensuring that the constraint can have only one solution isn't. Hopefully, most of the simple constraints in use will 'just work'. The biggest issue for me right now is to find a way to determine whether a particular REQUIRED_USE constraint can have only one solution, independently of the order of non-n-ary constraints. However, some of it can be easily eyeball-checked using a graph. > Let's look at real world examples: > gtk+: > > REQUIRED_USE=" > || ( aqua wayland X ) > xinerama? ( X ) > " $ ./solve.py '|| ( aqua wayland X ) xinerama? ( X )' [!wayland? => [!X? => [aqua]], xinerama? => [X]] x x w i w i a n a n y e y e a l r a l r q a a q a a u n m u n m X a d a | X a d a 0 0 0 0 | 0 1 0 0 0 0 0 1 | 1 1 0 1 0 0 1 0 | 0 0 1 0 (==) 0 0 1 1 | 1 0 1 1 0 1 0 0 | 0 1 0 0 (==) 0 1 0 1 | 1 1 0 1 0 1 1 0 | 0 1 1 0 (==) 0 1 1 1 | 1 1 1 1 1 0 0 0 | 1 0 0 0 (==) 1 0 0 1 | 1 0 0 1 (==) 1 0 1 0 | 1 0 1 0 (==) 1 0 1 1 | 1 0 1 1 (==) 1 1 0 0 | 1 1 0 0 (==) 1 1 0 1 | 1 1 0 1 (==) 1 1 1 0 | 1 1 1 0 (==) 1 1 1 1 | 1 1 1 1 (==) > > Let's assume aqua is masked, which reduces to: > REQUIRED_USE=" > || ( wayland X ) > xinerama? ( X ) > " $ ./solve.py '|| ( aqua wayland X ) xinerama? ( X )' '!aqua' [!X? => [!aqua? => [wayland]], xinerama? => [X]] x x w i w i a n a n y e y e a l r a l r q a a q a a u n m u n m X a d a | X a d a 0 0 0 0 | 0 0 1 0 0 0 0 1 | 1 0 1 1 0 0 1 0 | 0 0 1 0 (==) 0 0 1 1 | 1 0 1 1 1 0 0 0 | 1 0 0 0 (==) 1 0 0 1 | 1 0 0 1 (==) 1 0 1 0 | 1 0 1 0 (==) 1 0 1 1 | 1 0 1 1 (==) (to avoid having to explicitly play with empty clauses and such, it doesn't remove masked flags, just shifts them to the end) > > With USE='-* xinerama' this one is invalid: applying the first > constraint first enables wayland, then the 2nd enables X, giving a > result of USE="xinerama wayland X". Applying the 2nd first enables X, > then the 1st does nothing, giving a result of USE="xinerama X". Indeed. To regain order-indepedence, you could do: xinerama? ( X ) !xinerama? ( || ( aqua wayland X ) ) While it's non-obvious, it's the only reasonable way to ensure that things don't start falling apart randomly. > Now the funny one: > $ portageq metadata / ebuild dev-lang/php-7.0.18 REQUIRED_USE > cli? ( ^^ ( readline libedit ) ) truetype? ( gd ) webp? ( gd ) cjk? > ( gd ) exif? ( gd ) xpm? ( gd ) gd? ( zlib ) simplexml? ( xml ) soap? > ( xml ) wddx? ( xml ) xmlrpc? ( || ( xml iconv ) ) xmlreader? ( xml ) > xslt? ( xml ) ldap-sasl? ( ldap ) mhash? ( hash ) phar? ( hash ) qdbm? > ( !gdbm ) readline? ( !libedit ) recode? ( !imap !mysqli ) sharedmem? > ( !threads ) mysql? ( || ( mysqli pdo ) ) || ( cli cgi fpm apache2 > embed phpdbg ) It isn't as scary as it looks. If you graph it, then you can notice it's just got a few separate sub-graphs: 1. [!cgi, !fpm, !phpdbg, !apache2, !embed] -> cli -> (readline <=> !libedit) (the last bit having duplicate 'readline? ( !libedit )' which is wrong) 2. [truetype, webp, cjk, exif, xpm] -> gd -> zlib 3. [simplexml, soap, wddx, xmlrpc, !iconv, xmlreader, xslt] -> xml 4. ldap-sasl -> ldap 5. [mhash, phar] -> hash 6. qdbm -> !gdbm 7. recode -> [!imap, !mysqli] [mysql, !pdo] -> mysqli (note that there are both implications for mysqli and !mysqli -- for 4 cases they cause convergence error) 8. sharedmem -> !threads All but 1 and 7 are trivial. > There are probably dozens of ways to make that non deterministic. > Here's one: USE='-*'. Apply '|| ( cli cgi fpm apache2 embed phpdbg )' > last; this enables 'cli'. Since it's the last one, REQUIRED_USE is not > satisfied because of 'cli? ( ^^ ( readline libedit ) )'. > If you process it first, then you enable cli and readline and are good. You just take a second iteration, and things fall into place. That's not a problem. > > > Also, what happens if we applied all the constraints and obtained > > > some useflags setting that still fails REQUIRED_USE check ? > > > > It can't happen. If you can apply all the constraints, then implicitly > > REQUIRED_USE is satisfied. If you can't apply all the constraints, > > then it just fails. Of course, we want to ultimately avoid that case. > > See the php example above. How do you ensure this does not happen? > > Note that your assertion 'If you can apply all the constraints, then > implicitly REQUIRED_USE is satisfied.' is false: You're changing the > values of useflags, thus might violate some previously checked > constraint. So you reiterate. Either you reach a solution (preferably there's only a single valid solution you can reach) or you hit a previous state which indicates a loop and you fail. > > > > The point would be: by definition, the solver should be able to > > > > touch *any* USE flag. If it can't, it mostly implies that you > > > > can't use the automatic solver, and so the case is irrelevant for > > > > checking. Attempting to go beyond that is going to give a lot of > > > > complexity and most likely kill the whole idea. > > > > > > > > One thing we need to worry about are masks. We need to think about > > > > them. > > > > > > It makes zero difference for any solver if you replace variables > > > (useflags) by 'true' or 'false' constants (masked/forced/user-forced > > > useflags). It's even simpler actually. Whether the formula is not > > > constantly 'true' (ie REQUIRED_USE is useless) or constantly > > > 'false' (ie there's no way to solve it) is equivalent to solving > > > SAT. We likely don't want that for repoman running on php. > > > > > > > Well, probably yes. We just need to make sure to apply them correctly > > in different contexts, to avoid accidentally skipping some > > constraints. I think it would be reasonably to assume that: > > [...] > > Yes, you can eliminate constants. It is probably even confluent, i.e. > the order in which you eliminate them does not matter and it always > converges to the same result. > As mentioned above, I've chosen an even simpler path -- I just push true parts up front, and false to back. This handles all the cases without extra checking logic -- if immutables satisfy the constraint, it's satisfied early; if they make it impossible to satisfy it, they just make it fail at trying to touch an immutable flag. $ ./solve.py "^^ ( c b a )" 'a b' [!a? => [!c? => [b]], b? => [!a, !c], a? => [!c]] a b c | a b c 1 1 0 | [unsolvable: immutable a mismatched] 1 1 1 | [unsolvable: immutable a mismatched] My current code is on github [1]. It's ugly, slow and incomplete. It's merely a proof-of-concept and testing toy but still could give some clues. [1]:https://github.com/mgorny/required-use -- Best regards, Michał Górny
signature.asc
Description: This is a digitally signed message part