On Sun, 27 Mar 2016 22:29:57 -0400 Drew DeVault <[email protected]> said:
> On 2016-03-28 8:55 AM, Carsten Haitzler wrote:
> > i can tell you that screen capture is a security sensitive thing and
> > likely won't get a regular wayland protocol. it definitely won't from e.
> > if you can capture screen, you can screenscrape. some untrusted game you
> > downloaded for free can start watching your internet banking and see how
> > much money you have in which accounts where...
>
> Right, but there are legitimate use cases for this feature as well. It's
> also true that if you have access to /dev/sda you can read all of the
> user's files, but we still have tools like mkfs. We just put them behind
> some extra security, i.e. you have to be root to use mkfs.

yes, but that permission is handled at the kernel level on a specific file.
not so here. the compositor runs as a specific user, so you can't do that.
you'd have to do in-compositor security client-by-client.

> > the simple solution is to build it into the wm/desktop itself as an
> > explicit user action (keypress, menu option etc.) and now it can't be
> > exploited as it's not programmatically available. :)
> >
> > i would imagine the desktops themselves would in the end provide video
> > capture like they would stills.
>
> I'd argue that this solution is far from simple. Instead, it moves *all*
> of the responsibilities of your entire desktop into one place, and one
> codebase. And consider the staggering amount of work that went into
> making ffmpeg, which has well over 4x the git commits as enlightenment.

you wouldn't recreate ffmpeg. ffmpeg produces libraries like avcodec. like
reasonable developers we'd just use their libraries to do the encoding - we'd
capture frames and then hand off to avcodec (ffmpeg) library routines to do
the rest. ffmpeg doesn't need to know how to capture - just to do what 99% of
its code is devoted to doing - encode/decode. :) that's rather simple.
already we have decoding wrapped - we sit on top of either gstreamer, vlc or
xine as the codec engine and just glue in output and control api's and
events. encoding is just the same but in reverse. :) the encapsulation is
simple.

> > > - Output configuration
> >
> > why? currently pretty much every desktop provides its OWN output
> > configuration tool that is part of the desktop environment. why do you
> > want to re-invent randr here, allowing any client to mess with screen
> > config. after YEARS of games using xvidtune and whatnot to mess up
> > screen setups this would be a horrible idea. if you want to make a
> > presentation tool that uses 1 screen for output and another for
> > "controls" then that's a matter of providing info that multiple displays
> > exist and what type they may be (internal, external), and clients can
> > tag surfaces with "intents" - eg this is a control surface, this is an
> > output/display surface. the compositor will then assign them
> > appropriately.
>
> There's more than desktop environments alone out there. Not everyone
> wants to go entirely GTK or Qt or EFL. I bet everyone on this ML has
> software on their computer that uses something other than the toolkit of
> their choice. Some people like piecing their system together and keeping
> things lightweight, and choosing the best tool for the job. Some people
> might want to use the KDE screengrab tool on e, or perhaps some other
> tool that's more focused on doing just that job and doing it well. Or
> perhaps there's existing tools like ImageMagick that are already written
> into scripts and provide a TON of options to the user, which could be
> much more easily patched with support for some standard screengrab
> protocol than to implement all of its features in 5 different desktops.

the expectation is there won't be generic tools but desktop-specific ones.
the CURRENT ecosystem of tools exists because that is the way x was designed
to work.
thus the state of the software matches its design. wayland is different, so
the tools and ecosystem will adapt.

as for output config - why would the desktops that already have their own
tools then want to support OTHER tools too? their tools integrate with their
settings panels, look and feel right, and support THEIR policies. let me give
you an example:

http://devs.enlightenment.org/~raster/ssetup.png

bottom-right - i can assign special scale factors and different toolkit
profiles per screen. eg one screen can be a desktop, one a media center
style, one a mobile "touch centric ui" etc. etc. - this is part of the screen
setup tool. a generic tool will miss features that make the desktop nice and
functional for its purposes.

do you want to go create some kind of uber protocol that every de has to
support, with every other de's feature set in it, and limit de's to modifying
the protocol because they now have to go through a shared protocol in
libwayland that they can't just add features to as they please? ok - so these
features will be added ad hoc in extra protocols, so now you have a bit of a
messy protocol with 1 protocol referring to another... and the "kde tool"
messes up on e, or the e tool messes up in gnome, because all these extra
features are either not even supported by the tool, or existing features
don't work because the de doesn't support those extensions. just "i want to
use the kde screen config tool" is not reason enough for there to be a
public/shared/common protocol. it will fall apart quickly like above and
simply mean work for most people to go support it rather than actual value.

> We all have to implement output configuration, so why not do it the same
> way and share our API? I don't think we need to let any client

no - we don't have to implement it as a protocol. enlightenment needs zero
protocol. it's done by the compositor. the compositor's own tool is simply a
settings dialog inside the compositor itself. no protocol. not even a tool.
it's the same as edit/tools -> preferences in most gui apps. it's just a
dialog the app shows to configure itself. chances are gnome will likely do
this via dbus (they love dbus :)). kde - i don't know. but not everyone is
implementing a wayland protocol at all, so assuming they are and saying "do
it the same way" is not necessarily saving any work.

> manipulate the output configuration. We need to implement a security
> model for this like all other elevated permissions.

like above. if gnome uses dbus - they will use polkit etc. etc. to decide
that. enlightenment doesn't even need to, because there isn't even a protocol
nor an external tool - it's built directly in.

> Using some kind of intents system to communicate things like Impress
> wanting to use one output for presentation and another for notes is
> going to get out of hand quickly. There are just so many different
> "intents" that are solved by just letting applications configure outputs

even impress doesn't configure outputs. thank god for that.

> when it makes sense for them to. The code to handle this in the
> compositor is going to become an incredibly complicated mess that rivals
> even xorg in complexity. We need to avoid making the same mistakes
> again. If we don't focus on making it simple, then in 15 years we're
> going to be writing a new protocol and making a new set of mistakes. X
> does a lot of things wrong, but the tools around it have a respect for
> the Unix philosophy that we'd be wise to consider.

how would it be complex? a compositor, if decent, is already going to handle
multiple outputs. it's either going to auto-configure new ones to
extend/clone or maybe pop up a settings dialog. e already does this, for
example, and remembers the config for that screen (edid+output) - so plug it
in a 2nd time and it automatically uses the last stored config. so the screen
will "work" basically as a by-product of making a compositor that can do
multiple outputs.
then intents are only a way of deciding where a surface is to be displayed,
rather than on the current desktop/screen. so simply mark a surface as "for
presentation" and the compositor will put it on the non-internal display
(chosen maybe by physical size reported in edid as the larger one, or by
elimination - it's on the screen OTHER than the internal one... maybe the
user simply marks/checkboxes that screen as "use this screen for presenting"
and all apps that want to present get their content there, etc.)

so what you are saying is it's better to duplicate all this logic of screen
configuration inside every app that wants to present things (media players -
play movie on presentation screen; ppt/impress/whatever - show presentation
there; etc. etc.), and how to configure the screen etc. etc., rather than
have a simple tag/intent and let your de/wm/compositor "deal with it"
universally for all such apps in a consistent way?

> > > - More detailed surface roles (should it be floating, is it a modal,
> > > does it want to draw its own decorations, etc)
> >
> > that seems sensible and over time i can imagine this will expand.
>
> Cool. Suggestions for what sort of capability this protocol should
> have, what kind of surface roles we will be looking at? We should
> consider a few things. Normal windows, of course, which on compositors
> like Sway would be tiled. Then there's floating windows, like

ummm what's the difference between floating and normal? apps like gnome
calculator just open ... normal windows.

> gnome-calculator, that are better off being tiled. Modals would be
> something that pops up and prevents the parent window from being
> interacted with, like some sort of alert (though preventing this
> interactivity might not be the compositor's job). Then we have some

yeah - good old "transient for" :)

> roles like dmenu would use, where the tool would like to arrange itself
> (perhaps this would demand another permission?)
> Surfaces that want to be fullscreen could be another. We should also
> consider additional settings a surface might want, like negotiating for
> who draws the decorations or whether or not it should appear in a
> taskbar sort of interface.

xdg shell should be handling these already - except dmenu. dmenu is almost a
special desktop component, like a shelf/panel/bar thing.

> > > - Input device configuration
> >
> > as above. i see no reason clients should be doing this. surface
> > intents/roles/whatever can deal with this. the compositor may alter how
> > an input device works for that surface based on this.
>
> I don't feel very strongly about input device configuration as a
> protocol here, but it's something that many of Sway's users are asking
> for. People are trying out various compositors and may switch back and
> forth depending on their needs and they want to configure all of their
> input devices the same way.

they are going to have to deal with this then. already gnome, kde and e will
all configure mouse accel/left/right mouse on their own based on settings.
yes - i can RUN xset and set it back later, but it's FIGHTING with your de.
wayland is the same. use the desktop tools for this :) yes - it'll change
between compositors. :) at least in wayland you can't fight with the
compositor here. for sway - you are going to have to write this yourself. eg
- write tools that talk to sway, or sway reads a cfg file you edit, or
whatever. :)

> However, beyond detailed input device configuration, there are some
> other things that we should consider. Some applications (games, vnc,
> etc) will want to capture the mouse and there should be a protocol for
> them to indicate this with (perhaps again associated with special
> permissions). Some applications (like Krita) may want to do things like
> take control of your entire drawing tablet.

as i said. can of worms. :)

> > [snip] screen capture is a nasty one and for now - no.
> > no access [snip]
>
> Wayland has been in the making for 4 years. Fedora is thinking about
> shipping it by default. We need to quit with this "not for now" stuff
> and start thinking about legitimate use-cases that we're killing off
> here. The problems are not insurmountable and they are going to kill
> Wayland adoption. We should not force Wayland upon our users, we should
> make it something that they *want* to switch to. I personally have
> gathered a lot of interest in Sway and Wayland in general by
> livestreaming development of it from time to time, which has led to more
> contributors getting in on the code and more people advocating for us to
> get Wayland out there.

you have no idea how many non-security-sensitive things need fixing first
before addressing the can-of-worms problems. hell, nvidia just released
drivers that require compositors to re-do how they talk to egl/kms/drm -
drivers that are not compatible with existing drm dmabuf buffers etc. etc.

there's lots of things to solve, like window "intents/tags/etc.", that are
not security sensitive. even clients and decorations. tiling wm's will not
want clients to add decorations with shadows etc. - currently clients do csd
because csd is what weston chose, and gnome has followed, and enlightenment
too. kde do not want to do csd. i think that's wrong - it adds complexity to
wayland just to "not follow the convention". but for tiling i see the point
of at least removing the shadows. clients may choose to slap a title bar
there still because it's useful for displaying state. but advertising this
info from the compositor is not standardized. what do you advertise to
clients? where/when? at connect time? at surface creation time? what
negotiation is it? it easily could be that 1 screen or desktop is tiled and
another is not, and you don't know what to tell the client until it has
created a surface and you know where that surface would go.
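a purely hypothetical sketch of what such an advertisement could look like as
a wayland protocol extension (all interface/enum names invented - no such
protocol exists; when the event would be sent is exactly the open question):

```xml
<!-- Hypothetical sketch only; nothing here is a real protocol. -->
<protocol name="decoration_hint_unstable_v1">
  <interface name="zwp_decoration_hint_v1" version="1">
    <enum name="mode">
      <entry name="client_full" value="0"
             summary="client draws titlebar and shadow (csd)"/>
      <entry name="client_no_shadow" value="1"
             summary="client may draw a titlebar but no shadow (tiled)"/>
      <entry name="server" value="2"
             summary="compositor draws all decorations"/>
    </enum>
    <!-- Open question from the thread: is this sent at connect time, or
         only after a surface exists and the compositor knows which
         (possibly tiled) screen/desktop it will land on? -->
    <event name="preferred_mode">
      <arg name="surface" type="object" interface="wl_surface"/>
      <arg name="mode" type="uint" enum="mode"/>
    </event>
  </interface>
</protocol>
```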
perhaps this might be part of a larger set of negotiation like "i am a mobile
app so please stick me on the mobile screen" or "i'm a desktop app - desktop
please", with the compositor then saying where it decided to allocate you (no
mobile screen available - you are on desktop) and the app expected to
adapt... these are not security can-of-worms things. most de's are still
getting to the point of "usable" atm without worrying about all of these
extras yet.

there's SIMPLE stuff like - what happens when the compositor crashes? how do
we handle this? do you really want to lose all your apps when the compositor
crashes? what should clients do? how do we ensure clients are restored to the
same place and state? crash recovery is important because it is what allows
updates/upgrades without losing everything. THIS stuff is still unsolved. i'm
totally not concerned about screen casting or vnc etc. etc. until all of
these other nigglies are well solved first.

> > for the common case the DE can do it. for screen sharing kind of
> > things... you also need input control (take over mouse and be able to
> > control from app - or create a 2nd mouse pointer and control that...
> > keyboard - same, etc. etc. etc.). [snip]
>
> Screen sharing for VOIP applications is only one of many, many use-cases
> for being able to get the pixels from your screen. VNC servers,
> recording video to provide better bug reports or to demonstrate
> something, and so on. We aren't opening pandora's box here, just
> supporting video capture doesn't mean you need to support all of these
> complicated and dangerous things as well.

apps can show their own content for their own bug reporting. for system-wide
reporting this will be de-integrated anyway. supporting video capture is a
can of worms. as i said - single buffer? multiple with metadata? who does
conversion/scaling/transforms? what is the security model? and as i said -
this has major implications for the rendering back-end of a compositor.
> > nasty little thing, and in implementing something like this you are
> > also forcing compositors to work in specific ways - eg screen capture
> > will likely FORCE the compositor to merge it all into a single ARGB
> > buffer for you rather than just assign it to hw layers. or perhaps it
> > would require just exposing all the layers, their config, and have the
> > client "deal with it"? but that means the compositor needs to expose
> > its screen layout. do you include the pointer or not? the compositor
> > may draw the ptr into the framebuffer. it may use a special hw layer.
> > what about if the compositor defers rendering - does a screen capture
> > api force the compositor to render when the client wants? this can have
> > all kinds of nasty effects in the rendering pipeline - for us our
> > rendering pipeline is not in the compositor but via the same libraries
> > clients use, so altering this pipeline affects regular apps as well as
> > the compositor. ... can of worms :)
>
> All of this would still be a problem if you want to support video
> capture at all. You have to get the pixels into your encoder somehow.
> There might be performance costs, but we aren't recording video all the
> time.

there's a difference. when it's an internal detail it can be changed and
adapted to how the compositor and its rendering subsystem work. when it's a
protocol you HAVE to support THAT protocol, and the way THAT protocol defines
things to work, or apps break. keep it internal - you can break at will and
adapt as needed. make it public and you are boxed in by what the public api
allows.

> We can make Wayland support use-cases that are important to our users or
> we can watch them stay on xorg perpetually and end up maintaining two
> graphical stacks forever.

priorities. there are other issues that should be solved first before
worrying about the pandora's box ones.
--
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    [email protected]

_______________________________________________
wayland-devel mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/wayland-devel
