Re: [dev-servo] State of Servo

2012-07-11 Thread Patrick Walton

On 7/11/12 10:09 AM, Ian Melven wrote:

Also, in general, i'm pretty curious about Servo's process model and its 
security architecture,
maybe that's best discussed in a new thread though (I really need to take some 
time
to understand Rust better as well). My particular interest is in how Servo 
relates to the process sandboxing project I'm working on and
any ideas around what Servo's possible addon model might be -
Servo is often proposed as a solution to the needs driving the sandboxing 
project but it seems
there will still be unsafe, possibly exploitable code in certain parts of it.


The memory safety and type safety of Rust isn't a substitute for a 
sandbox. Even with memory safety, it's still possible for someone to 
call os::exec("calc.exe"). And it's still potentially possible to 
exploit kernel32.dll, user32.dll, d3d9.dll, etc.


So we will need to use a sandbox. I think this sandboxing code should be 
part of the Rust cargo ecosystem, so that Rust programs can generally 
use it and the Rust community can contribute to it.


That said, memory safety definitely helps security in a big way. I think 
of memory safety and type safety as just one particularly powerful layer 
of protection. Just like any security layer, it rules out many sources 
of exploits, but other layers of protection are needed.


Patrick
___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] How to deal when layout lags behind DOM updates

2012-07-12 Thread Patrick Walton

On 7/12/12 6:45 PM, Boris Zbarsky wrote:

I was talking to BenWa today, and he asked how the RCU world handles
this situation:

1)  DOM update starts at time t=0
2)  DOM update finishes at time t=1, schedules 4ms setTimeout
3)  Layout starts at time t=1
4)  Timer fires at t=5, updates DOM
5)  DOM update finishes at time t=6, schedules 4ms setTimeout
6)  Timer fires at time t=10, updates DOM
7)  Layout from item 3 finishes at time t=11.

It seems to me like in this case the RCU setup would basically skip
painting the t=5 frame, right?

We could make it delay the t=10 frame until the layout is done instead.

But we can't make it buffer up all the frames in hopes that the slow
layout is transient, afaict.  Correct?


I believe there are three options:

1. Skip the t=5 frame. Instead, at time t=5, set a "needs a new layout 
once this layout is done" flag. At t=11, the DOM sees that the layout is 
done, but the flag is on, so it clears the flag and spawns a new layout 
task. This is what I figured would happen.


2. Delay the t=10 frame until layout is done.

3. Have a fixed number of layout tasks that are permitted to be in 
flight. If a layout task is currently running on an old version but we 
haven't reached our cap yet, spawn a new layout task for a new version. 
This is an extension of the RCU model to some fixed number k of readers 
(with k read pointers for each node). I'm not sure whether this will be 
necessary, but I believe it could be done.
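Option 1's bookkeeping can be sketched as a tiny state machine kept by the content task (a hypothetical illustration in modern Rust; `LayoutScheduler` and its method names are invented for the example, not Servo code):

```rust
// Sketch of option 1: the content task tracks whether a layout is in
// flight, and whether another layout is needed once it finishes.
struct LayoutScheduler {
    layout_in_flight: bool,
    needs_relayout: bool, // the "needs a new layout once this one is done" flag
}

impl LayoutScheduler {
    fn new() -> LayoutScheduler {
        LayoutScheduler { layout_in_flight: false, needs_relayout: false }
    }

    // Called after each DOM mutation (e.g. at t=1, t=5, and t=10).
    // Returns true if a new layout task should be spawned now.
    fn dom_updated(&mut self) -> bool {
        if self.layout_in_flight {
            // A layout is already running on an old snapshot: skip this
            // frame and just set the flag.
            self.needs_relayout = true;
            false
        } else {
            self.layout_in_flight = true;
            true
        }
    }

    // Called when the running layout task finishes (e.g. at t=11).
    // Returns true if a fresh layout should be spawned immediately.
    fn layout_finished(&mut self) -> bool {
        self.layout_in_flight = false;
        if self.needs_relayout {
            self.needs_relayout = false;
            self.layout_in_flight = true;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut sched = LayoutScheduler::new();
    assert!(sched.dom_updated());     // t=1: layout starts
    assert!(!sched.dom_updated());    // t=5: frame skipped, flag set
    assert!(!sched.dom_updated());    // t=10: still skipped
    assert!(sched.layout_finished()); // t=11: flag triggers a fresh layout
}
```

The net effect is that intermediate frames are coalesced: however many DOM updates land while a layout is running, exactly one follow-up layout is spawned when it finishes.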


Patrick


Re: [dev-servo] Handling adoptNode

2012-10-20 Thread Patrick Walton

On 10/19/12 1:55 PM, Boris Zbarsky wrote:

Do we have a plan for adoptNode?  If we're allocating nodes out of the
JS heap, presumably we'll need to brain-transplant the JSObject and then
copy over the implementation into the new compartment... and make sure
we update any pointers to the implementation, right?


I'm not entirely clear on the semantics of adoptNode. Is it possible 
that the node could come from a different origin?


If the node can't come from a different origin, it's still tricky, but 
relatively straightforward: we brain transplant the JS object and 
essentially treat the node as deleted on the document it came from as 
far as layout is concerned. It's an invariant that layout can't directly 
store DOM node handles from layout pass to layout pass, so once layout 
is completed it should be OK to move the node over. We may have to wait 
for layout on the source document to complete before moving the node 
over to the destination document though; this means that adoptNode will 
probably cause a content task stall if layout is happening in the source 
document.


If the node can come from a different origin, then, well, we'll have to 
send it over IPC or something. It seems to me that Google Chrome would 
have the same problems here, since it runs different origins in 
different processes. Might be worth looking at its implementation in 
this scenario.


Patrick



[dev-servo] Submodule split

2012-11-11 Thread Patrick Walton
Just FYI, I've split out GFX and all its dependencies (image, resource, 
text, most of util) into the servo-gfx crate as a submodule. This means 
that if you're just modifying layout you can avoid rebuilding the ~5,000 
lines of code in the graphics module.


The resource crate and util should further be split out; clearly the 
resource manager does not belong in the graphics crate. Possibly we 
should split out layout as well.


Patrick


Re: [dev-servo] Submodule split

2012-11-11 Thread Patrick Walton

On 11/11/12 8:25 PM, Brian Burg wrote:

In what sense are any of these things (outside of utility-like data
structures) reusable? Or, is it possible to swap out one
implementation for another? Is this just a workaround for lacking
incremental compilation? Do we want to encourage modularity at the
task or functionality boundaries?


I'm not sure how reusable they will be in practice; the probability of 
reuse drops as you go up toward the root from the leaves of the 
dependency tree (for example, the servo-* modules will probably never 
be used outside Servo). The crate division instead serves two 
purposes:


(1) It encourages modularity within the codebase to keep things clean 
and understandable. It's easier to understand portions of the code in 
isolation when the codebase is split out into large chunks lacking 
cyclic dependencies. I don't want the dependency graph of the codebase 
to approach the complete graph over time the way Gecko (and as I 
understand it WebKit) did.


(2) It allows incremental recompilation. To be honest, I don't know if 
incremental recompilation in Rust below the crate level is ever going to 
be feasible. It's a very tricky problem because of cyclic dependencies, 
monomorphization, inlining, structural types, and so forth. Most 
ahead-of-time statically-compiled languages either (a) don't allow 
mutually recursive modules at all (ML); (b) greatly restrict their use 
(Haskell compilers, Go); or (c) don't even try to support incremental 
recompilation (C++). I'm not confident that we're going to be able to do 
any better than these languages did, especially since our name 
resolution semantics are extremely expressive.


It's really not so bad though; large projects just need to be vigilant 
about breaking themselves up into crates to achieve good compilation 
times. This really isn't any different from the story in most other 
languages; it's just that modules in Rust are so expressive that 
going back to DAG semantics feels like a bummer.


That said, you do have a good point about the project history; perhaps 
it isn't worth it to make the Servo modules into submodules if we don't 
think they have any use to the rest of the Rust community.


Patrick



Re: [dev-servo] Submodule split

2012-11-12 Thread Patrick Walton

On 11/12/12 8:04 AM, Brian Burg wrote:

In the case of rust bindings to other APIs (which is what most of the
submodules are), separate repositories and crates make sense.

For parts of servo like gfx, css, dom, and layout, I think it makes
sense to put them in separate crates but in the same repository. I'm
mostly averse to splitting repositories when the chance of reuse is
slim (and the admin overhead not worth it).


That's fine by me. Requires a little bit of build system hacking, but 
nothing terrible I don't think.


Patrick


Re: [dev-servo] More planning

2013-02-13 Thread Patrick Walton

On 2/13/13 7:16 PM, Boris Zbarsky wrote:

On 2/13/13 9:47 PM, Brian Anderson wrote:

During a lunch discussion today we decided that we have three areas that
we care most about this year: layout, crow, and dom bindings.


We still need to make basic measurements of round-trip times through our
DOM to layout and back on a sync layout flush.  I haven't been able to
do that yet, not least because I haven't actually managed to get servo
to compile any of the times I've tried it recently


Servo now ships with a known-working version of Rust, so it should be 
easier to build.


With the changes today Servo now uses the faster pipes instead of the 
slower comm, so the communication should be faster than it was. However, 
kicking a sleeping Rust task awake is still more expensive than it 
should be; it involves several locks. Fixing this will require the 
scheduler rewrite that brson is currently working on. As per the current 
plan for the rewrite, content will be able to atomically steal the 
sleeping layout task and context switch into it if it is sleeping, which 
should be very fast; overhead over a function call is one atomic 
operation plus some more register saves/restores.


Maybe for a performance test we could just have the layout spin in a 
separate thread. Communication is lock-free as long as the layout task 
is awake at the time of the send. This may not be as fast as the 
sleeping case under the scheduler rewrite, however, because it increases 
the multi-CPU bus traffic.


Ultimately we will need the scheduler rewrite in order to get meaningful 
numbers out of these performance benchmarks. Essentially this boils down 
to how fast a function call (Gecko/WebKit) performs compared to a 
context switch (Rust scheduler/Servo). We know that some sort of 
architecture that allows a message send to avoid a round trip through 
the scheduler is needed to achieve this, but we won't know for certain 
until the scheduler rewrite is complete.


I could microbenchmark the cost of an atomic operation and a register 
context switch compared to a function call, perhaps...
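A rough sketch of that microbenchmark, comparing a plain function call against the same call plus one atomic read-modify-write (a stand-in for the proposed atomic task steal), might look like this. This is a hypothetical illustration using modern Rust's `std::sync::atomic` and `std::time`, both of which postdate this message:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::Instant;

// Baseline: a trivial function call.
#[inline(never)]
fn plain(x: usize) -> usize {
    x.wrapping_add(1)
}

// The same call with one extra atomic operation, approximating the
// cost the proposed "steal the sleeping task" handoff would add.
#[inline(never)]
fn with_atomic(x: usize, flag: &AtomicUsize) -> usize {
    flag.fetch_add(1, Ordering::SeqCst);
    x.wrapping_add(1)
}

fn main() {
    let flag = AtomicUsize::new(0);
    let n = 1_000_000;

    let t0 = Instant::now();
    let mut a = 0usize;
    for _ in 0..n {
        a = plain(a);
    }
    let base = t0.elapsed();

    let t1 = Instant::now();
    let mut b = 0usize;
    for _ in 0..n {
        b = with_atomic(b, &flag);
    }
    let atomic = t1.elapsed();

    assert_eq!(a, n);
    assert_eq!(b, n);
    println!("plain: {:?}, with atomic: {:?}", base, atomic);
}
```

This measures only the atomic operation's overhead, not the register save/restore of a real context switch, so it would be a lower bound on the cost being discussed.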



I believe that's higher priority than pretty much anything else right
now, because we need that to know whether the general architecture is
getting us the performance we want.

That said, what is "crow"?


A simple frontend to establish the multi-process model. The rationale is 
that getting multi-process working early is better than putting it off 
to avoid the situation we are in with Gecko.



How about deciding on an actual implementation strategy for the DOM?
Simple things like how we actually plan to represent DOM objects in Rust
in a way that allows bindings to work with reasonable performance (i.e.
without having to make everything a virtual call) would be a good
start...  I've tried bringing this up before and keep being told that
the relevant parts of Rust are in flux so we have to decide later.


I plan to write up my thoughts more on the wiki tomorrow, but I am 
currently leaning toward not using copy-on-write and instead just having 
the DOM objects' contents be immutable while layout is running. Thus we 
avoid having to allocate handles and follow multiple pointers. This is 
basically the way Gecko and WebKit do it, so this reduces the 
single-threaded performance risk. For now JS will just block if it wants 
to mutate the DOM while layout is running. If we want JS to be able to 
mutate the DOM while layout is running without blocking, then we could 
add some sort of change queue or journal later on top of this design.


I think we should just ignore the fact that Rust cannot currently 
inherit traits from structs for now and implement the DOM object methods 
that have to be virtual using unsafe Rust code. Basically we can just 
embed a vtable full of "extern fn"s into the structs that represent the 
DOM and manually cast to obtain the self pointer. (Inheritance of traits 
from structs is the missing feature that we would need in Rust to make 
this safe.) Later we can add that feature to Rust and make it safe. But 
for now this would give us the right performance model: a struct with 
flat fields, supporting a combination of nonvirtual and virtual methods.
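The embedded-vtable idea can be sketched roughly as follows. This is a hypothetical illustration in modern Rust rather than the 2013 dialect; the `NodeVTable` type and `node_kind` method are invented for the example and are not Servo's actual DOM layout:

```rust
// A manually managed vtable of `extern fn`s embedded in the struct.
#[repr(C)]
struct NodeVTable {
    // "Virtual" method: takes the base node pointer; the callee casts
    // it back to the concrete type.
    node_kind: extern "C" fn(*const Node) -> u32,
}

#[repr(C)]
struct Node {
    vtable: &'static NodeVTable, // vtable at a known offset
    child_count: u32,            // flat field; nonvirtual access is direct
}

#[repr(C)]
struct Element {
    base: Node, // manual "inheritance": Node is the first field
    attr_count: u32,
}

extern "C" fn element_kind(node: *const Node) -> u32 {
    // Manually cast the base pointer to obtain the concrete self
    // pointer. Sound only because Element begins with Node and both
    // are #[repr(C)] -- this is the unsafe part struct inheritance
    // would eventually make safe.
    let element = unsafe { &*(node as *const Element) };
    element.attr_count + 100
}

static ELEMENT_VTABLE: NodeVTable = NodeVTable { node_kind: element_kind };

// The "virtual" dispatch: one indirect call through the embedded vtable.
fn node_kind(node: &Node) -> u32 {
    (node.vtable.node_kind)(node as *const Node)
}

fn main() {
    let e = Element {
        base: Node { vtable: &ELEMENT_VTABLE, child_count: 0 },
        attr_count: 2,
    };
    assert_eq!(node_kind(&e.base), 102);
    assert_eq!(e.base.child_count, 0); // nonvirtual field access, no indirection
}
```

The point is the memory layout: flat fields at known offsets plus a per-struct vtable, giving a mix of virtual and nonvirtual access without boxing every method behind a trait object.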


Regarding stack switching, I have a proposal that has achieved tentative 
consensus on the Rust team that should allow us to remove the stack 
switching burden from the DOM task entirely for calls from Rust into the 
JS engine and vice versa. So we should be good there. It needs to be 
implemented, of course.


Patrick



Re: [dev-servo] More planning

2013-02-13 Thread Patrick Walton

On 2/13/13 7:23 PM, Robert O'Callahan wrote:

I think it's critical to examine the performance of Servo DOM+layout at its
most vulnerable, which I think is when JS is synchronously pounding on it.
A couple of microbenchmarks spring to mind:
1) DOM bindings performance for simple DOM operations, e.g.
   for (var i = 0; i < 100; ++i) {
 element.setAttribute("class", "foo");
   }


This one should be easier. We've been looking at this microbenchmark for 
a while. We need a couple of things to make this fast:


(1) The DOM representation needs to be made into a Rust struct, using 
the unsafe code scheme I alluded to in my reply to Boris.


(2) Rust needs the manual control over stack switching that has more or 
less achieved consensus on the Rust team. That way we can remove all 
stack switching costs.


I suspect these two will get us down to within 2x of Gecko/WebKit, and 
possibly much better.


Icing on the cake will be moving from the copy on write scheme to the 
new immutability scheme. This will save one pointer load. Additionally, 
being able to tag functions as not needing the stack growth check will 
save 3 instructions.



2) Harder: performance of synchronous layout, e.g.
   for (var i = 0; i < 10; ++i) {
 element.setAttribute("style", "width:" + i + "px");
 element.getBoundingClientRect().width;
   }


This is harder to make performant because of the message passing. When 
the scheduler is rewritten we will have better performance here.


I did a microbenchmark just now to test Brian's proposed scheme that he 
is implementing. I took the naive Fibonacci function (traditionally used 
to test scheduling overheads) and added two context switches, a pointer 
load, and a memory barrier for every single recursive call (source is at 
[1]). Here are my results:


Without the context switches (baseline):

real    0m3.733s
user    0m3.705s
sys     0m0.006s

With the context switches:

real    0m12.249s
user    0m12.057s
sys     0m0.021s

That's a 3.28x slowdown. This matches what projects like Cilk saw in 
their papers for Fibonacci. The Fibonacci function is basically just one 
load and store and add, so this is basically a test of pure overhead 
versus a function call.


I suspect that the function call overhead will be completely swamped by 
the overhead of parsing the modified "width" selector alone. So I'm 
feeling guardedly optimistic about this approach. I can try more 
microbenchmarks later that pretend to parse the CSS selector and whatnot 
if people are interested. Of course, nothing will substitute for testing 
the actual implementation.


Patrick



Re: [dev-servo] More planning

2013-02-13 Thread Patrick Walton

On 2/13/13 7:25 PM, Robert O'Callahan wrote:

I also think that sandboxing the engine is not interesting. Assuming you're
talking about OS-level process sandboxing, there's no risk there; we know
browser engines can be sandboxed that way.


I don't plan to spend time on the OS-specific parts of this. (This is 
basically a sunk cost anyway, at least on the Mac; there is already a 
sandbox file for the Mac in the tree.)


Mostly I want to emphasize that we should avoid architectural decisions 
that make it difficult to sandbox later. For example, writing to trusted 
filesystem locations should be mediated through the browser process so 
that we don't have to go back and patch up all the places that do this 
later when we do decide to sandbox.


Patrick



Re: [dev-servo] More planning

2013-02-13 Thread Patrick Walton

On 2/13/13 9:43 PM, Robert O'Callahan wrote:

(Of course a big sandbox around the entire browser, or independent browsing
contexts, could still be useful as a second line of defense to prevent the
system from being persistently corrupted in case of a Rust exploit.)


That's exactly the purpose. Mostly the threat model I'm concerned with 
consists of these three vectors:


(1) An exploit in OS-supplied native code that we can't control. We 
don't control graphics drivers (and even if we proxied all GL calls they 
could be attacked remotely), and on Windows in particular you have to 
bring kernel32.dll into your address space to talk to the kernel at all.


(2) An exploit in our safe wrappers for OS native code. For example, we 
might incorrectly wrap a Windows COM object, causing it to be freed 
early and a subsequent dereference of the vtable would cause an 
exploitable security vulnerability.


(3) An exploit of a logic error in the safe Rust code. For example, 
someone might trick the download manager into saving a file to 
C:\Windows\System32 without user permission.


Sandboxing can provide an additional line of defense against all of 
these issues. There is also the chance that the Rust implementation or 
type system has exploitable bugs, which sandboxing helps with too.


The memory safety guarantees that the Rust compiler provides vastly 
reduce the surface area of vulnerabilities, of course, and I think the 
engine will be much stronger for that. I think the various Chromium 
exploits have demonstrated that sandboxing is not enough; we need to 
make sure that every virtual call added to the browser isn't a potential 
exploit waiting to happen, and that's what the Rust type system is 
designed to protect against.


Patrick



Re: [dev-servo] More planning

2013-02-13 Thread Patrick Walton

On 2/13/13 10:06 PM, Boris Zbarsky wrote:

That sounds like it should be fast enough, yes.  We should verify that
it actually is in practice.  Which to me suggests that the scheduler
rewrite is a higher priority than, say, floats.  At least if they're
both gated on brson.  ;)  Do we have any ETA on the scheduler rewrite?


Currently, floats aren't really on our priority list; we want them to be 
done, but other things are higher priority. For example, the scheduler 
rewrite. :)


The current status of the rewrite is that the scheduler thread (which is 
the I/O thread in the new design) is starting up tasks, but 
communication and I/O does not work. I/O is important to get right, 
since a major impetus for the creation of the new scheduler is to avoid 
having separate event loops for I/O and scheduling. The two event loops 
proved to be a big performance problem for I/O (see the threads on 
rust-dev if you're curious).



I plan to write up my thoughts more on the wiki tomorrow, but I am
currently leaning toward not using copy-on-write and instead just having
the DOM objects' contents be immutable while layout is running.


Hmm.  This loses us a lot of the "quickly return to the event loop and
page script" benefit we're ideally aiming for, I think.  It also makes
doing any sort of speculative layout much more problematic, right?


Why the change of heart here?  Or should I just wait for your writeup on
the wiki which will cover that?


So as I write this I realize there is a hybrid solution (#3 below) that 
we could possibly employ.


The current/old COW scheme uses "handles". In this scheme, if script 
modifies a DOM object while layout is running, the memory looks like this:


[ JSObject ] -> [ Handle ]
                 /      \
   [ Script Object ]  [ Layout Object ]

This has the problem that there are two allocations (one for the handle) 
and two levels of indirection for each property access. Moreover, these 
costs are paid even in the sequential case, when layout is not running:


[ JSObject ] -> [ Handle ]
                    |
              [ DOM Object ]

Compare this to Gecko/WebKit, where the memory simply looks like this:

[ JSObject ] -> [ DOM Object ]

In Gecko/WebKit we have just one allocation and one level of 
indirection. So ideally we would like Servo to have this property as 
well. The obvious problem is what to do when script modifies the DOM 
while layout is running. There are a number of strategies we can choose:


(1) Have script block on layout whenever it tries to modify the DOM 
while layout is running. While it's blocking try to do *something* 
useful like GC. This can cause jank, however.


(2) Have script append its modification to a journal if layout is 
running. (If layout is not running, the change is committed directly.) 
Script then checks the log when doing a get if layout is running. When 
layout completes, commit all the actions in the journal to the DOM. The 
advantage to this is that sets while layout is running are very fast 
(just add to a buffer). The disadvantage is that gets are slow, unless 
we do something really fancy. (There is an interesting analogy to log 
structured filesystems and databases here.)
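Strategy (2) can be sketched as follows (a hypothetical illustration in modern Rust; `JournaledDom` and its single string-attribute store are invented to show the shape of the scheme, not a real DOM):

```rust
use std::collections::HashMap;

// Sketch of the journal scheme: while layout runs, sets append to a
// log; gets scan the log first; when layout finishes, the log is
// committed to the real DOM data.
struct JournaledDom {
    attrs: HashMap<String, String>, // the "real" DOM data
    journal: Vec<(String, String)>, // pending writes made during layout
    layout_running: bool,
}

impl JournaledDom {
    fn set_attr(&mut self, name: &str, value: &str) {
        if self.layout_running {
            // Fast path: sets during layout just append to a buffer.
            self.journal.push((name.to_string(), value.to_string()));
        } else {
            self.attrs.insert(name.to_string(), value.to_string());
        }
    }

    fn get_attr(&self, name: &str) -> Option<&str> {
        // Slow path: gets must consult the journal (newest entry wins)
        // before falling back to the committed data.
        for (n, v) in self.journal.iter().rev() {
            if n.as_str() == name {
                return Some(v);
            }
        }
        self.attrs.get(name).map(|s| s.as_str())
    }

    fn layout_finished(&mut self) {
        // Commit the journal in order, then clear it.
        for (n, v) in self.journal.drain(..) {
            self.attrs.insert(n, v);
        }
        self.layout_running = false;
    }
}

fn main() {
    let mut dom = JournaledDom {
        attrs: HashMap::new(),
        journal: Vec::new(),
        layout_running: true,
    };
    dom.set_attr("class", "foo");              // buffered, not committed
    assert_eq!(dom.get_attr("class"), Some("foo")); // get scans the journal
    dom.layout_finished();
    assert!(dom.journal.is_empty());
    assert_eq!(dom.get_attr("class"), Some("foo")); // now committed
}
```

The linear journal scan is exactly the "gets are slow" cost mentioned above; the fancier fix would be an index over the journal, at which point the analogy to log-structured storage becomes quite literal.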


(3) Have script copy the DOM node and store a pointer to the new node in 
the header of the DOM object.


                     Layout Object            Script Object

                +-------------------+        +-----------+
[ JSObject ] -> | Script Object Ptr | -----> | fields... |
                +-------------------+        +-----------+
                | fields...         |
                +-------------------+

Whenever script does a get, if layout is running, it follows the script 
object pointer to find the object it should be looking at. When script 
does a set, it copies the DOM node, sets the script object pointer, and 
then modifies the copy. Layout never looks at the script object pointer. 
When layout completes, the layout objects are discarded and the script 
objects become the new layout objects. This has the advantage that gets 
are fast; the only overhead is one extra load (to follow the script 
object pointer). The disadvantage is that sets are slow, because they 
involve copying the DOM node and allocation.
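Strategy (3)'s get/set paths can be sketched like this (a hypothetical illustration in modern Rust; the `Node`/`NodeData` types and single `width` field are invented for the example):

```rust
#[derive(Clone)]
struct NodeData {
    width: u32,
}

// Sketch of option 3: the node header carries an optional pointer to
// a copied "script object" made on the first write during layout.
struct Node {
    layout_data: NodeData,              // what layout reads
    script_copy: Option<Box<NodeData>>, // the script-side copy, if any
}

impl Node {
    // Script-side get: one extra load to follow the script object
    // pointer when a copy exists.
    fn get_width(&self) -> u32 {
        match self.script_copy {
            Some(ref copy) => copy.width,
            None => self.layout_data.width,
        }
    }

    // Script-side set: copy the node lazily, then mutate the copy.
    // This copy + allocation is why sets are the slow path here.
    fn set_width(&mut self, layout_running: bool, width: u32) {
        if layout_running {
            if self.script_copy.is_none() {
                self.script_copy = Some(Box::new(self.layout_data.clone()));
            }
            self.script_copy.as_mut().unwrap().width = width;
        } else {
            self.layout_data.width = width;
        }
    }

    // When layout completes, the script copy becomes the new layout data.
    fn layout_finished(&mut self) {
        if let Some(copy) = self.script_copy.take() {
            self.layout_data = *copy;
        }
    }
}

fn main() {
    let mut node = Node { layout_data: NodeData { width: 10 }, script_copy: None };
    node.set_width(true, 42);               // layout running: copy-on-write
    assert_eq!(node.get_width(), 42);       // script sees the copy
    assert_eq!(node.layout_data.width, 10); // layout still sees the old value
    node.layout_finished();
    assert_eq!(node.layout_data.width, 42);
}
```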


I am leaning toward (3) now.


I think we should just ignore the fact that Rust cannot currently
inherit traits from structs for now


OK.  How temporary is "for now"?


Niko and I have a proposal and we can bring it up at the next Rust 
meeting on Tuesday.



and implement the DOM object methods
that have to be virtual using unsafe Rust code.


You mean non-virtual, right?


I mean virtual. Right now virtual in Rust is all-or-nothing; all your 
methods are virtual or none are. As I understand things, however, the 
DOM needs a finer-grained approach, in which some methods are virtual, 
some methods are nonvirtual, and some fields can be accessed at known 
offsets. So for now we can use structs

Re: [dev-servo] More planning

2013-02-13 Thread Patrick Walton

On 2/13/13 11:11 PM, Boris Zbarsky wrote:

Only sets on clean nodes are slow, right?  Once you've set something,
you just keep writing to the already-allocated bit.


Right.

Patrick



[dev-servo] New DOM implementation

2013-02-16 Thread Patrick Walton
I have started work on a new DOM implementation in the "dom" branch. It is 
currently building except for an internal compiler error, the fix for 
which is currently being upstreamed to Rust.


This new DOM implementation is based on structs instead of enums. The 
base type is `AbstractNode`, which is an opaque pointer with many 
accessors that let you query what type of node it is and downcast 
appropriately. (I'm thinking of changing `AbstractNode` to `Node` and 
changing the concrete `Node` to something else, though, because the word 
`Abstract` is littering the codebase.) When structs are allowed to 
inherit from other structs in Rust, then the implementation will become 
a bit simpler and will use less unsafe code internally.
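The query-and-downcast interface can be sketched roughly as follows (a hypothetical illustration in modern Rust; the `NodeKind` tag and accessor names are invented for the example and are not the actual Servo API):

```rust
#[derive(Clone, Copy, PartialEq)]
enum NodeKind {
    Element,
    Text,
}

#[repr(C)]
struct Node {
    kind: NodeKind,
}

#[repr(C)]
struct Element {
    base: Node, // Node is the first field, so the pointers coincide
    tag: &'static str,
}

// An opaque node pointer with accessors that query the node type and
// downcast appropriately.
#[derive(Clone, Copy)]
struct AbstractNode {
    ptr: *const Node,
}

impl AbstractNode {
    fn is_element(&self) -> bool {
        unsafe { (*self.ptr).kind == NodeKind::Element }
    }

    // Downcast to the concrete type. The assert is the kind of dynamic
    // check the constructors would need to guarantee for this to be safe.
    fn as_element(&self) -> &Element {
        assert!(self.is_element());
        unsafe { &*(self.ptr as *const Element) }
    }
}

fn main() {
    let element = Element { base: Node { kind: NodeKind::Element }, tag: "div" };
    let node = AbstractNode { ptr: &element.base as *const Node };
    assert!(node.is_element());
    assert_eq!(node.as_element().tag, "div");

    let text = Node { kind: NodeKind::Text };
    let node = AbstractNode { ptr: &text as *const Node };
    assert!(!node.is_element());
}
```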


The advantages of the new implementation over the previous one are (a) 
several layers of indirection are gone; (b) the copy-on-write interface 
is dramatically simplified, eliminating the need for answers to annoying 
questions like how to collect dead handles; (c) DOM nodes now only take 
up as much memory as they need to; (d) we can make borrowing of nodes 
sound, because the new interface is amenable to the dynamic checks 
described in Niko's "Imagine Never Hearing the Phrase 'Aliasable, 
Mutable' Again" blog post.


This new DOM implementation currently exposes an unsafe interface in a 
few ways:


(1) When creating a node (converting from the base type to 
AbstractNode), there is no check performed to ensure that the thing you 
passed in actually is a Node, and moreover that it is the node that you 
claimed it was. This can be fixed by providing safe wrappers around node 
constructors and moving the low-level node constructors into the trusted 
computing base (hereafter TCB). These constructor primitives are very 
simple operations, so the TCB should remain easy to audit for security. 
It can also be fixed to some degree by adding struct inheritance to 
Rust, as I plan to propose. But note that node construction must remain 
part of the TCB to some degree, because nodes are owned by the 
SpiderMonkey garbage collector and not the Rust garbage collector.


(2) Borrowing of nodes (i.e. downcasting from AbstractNode to a concrete 
Node subclass) currently does not perform the dynamic checks needed for 
safety. What this means is that it is possible to cause segfaults with 
certain combinations of mutations and pointer borrowing. The fix will 
simply be to add these checks. Once these checks are added to the TCB, 
segfaults should not be possible in the safe Servo code.


(3) There is nothing currently preventing Rust code in the script task 
from accessing (and racing on) layout data structures, and layout from 
accessing dirty nodes. I believe this can be fixed with phantom types: 
layout will see a type that prevents access to the dirty parts of nodes, 
and script will see a type that prevents direct access to layout info.
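The phantom-type idea can be sketched like this (a hypothetical illustration in modern Rust using `std::marker::PhantomData`; the view types and field names are invented for the example):

```rust
use std::marker::PhantomData;

// Zero-sized markers for the two views of a node.
struct ScriptView;
struct LayoutView;

struct NodeData {
    dirty_text: String, // mutated by script; layout must not read it
    layout_info: u32,   // computed by layout; script must not touch it
}

// The same node exposed under different type parameters; each view
// only gets the accessors it is allowed to use.
struct NodeHandle<'a, View> {
    data: &'a NodeData,
    view: PhantomData<View>,
}

fn script_view(data: &NodeData) -> NodeHandle<'_, ScriptView> {
    NodeHandle { data, view: PhantomData }
}

fn layout_view(data: &NodeData) -> NodeHandle<'_, LayoutView> {
    NodeHandle { data, view: PhantomData }
}

impl<'a> NodeHandle<'a, ScriptView> {
    // Script-only accessor: layout handles have no such method.
    fn text(&self) -> &str {
        &self.data.dirty_text
    }
}

impl<'a> NodeHandle<'a, LayoutView> {
    // Layout-only accessor: script handles have no such method.
    fn layout_info(&self) -> u32 {
        self.data.layout_info
    }
}

fn main() {
    let node = NodeData { dirty_text: "hello".to_string(), layout_info: 7 };
    let script = script_view(&node);
    let layout = layout_view(&node);
    assert_eq!(script.text(), "hello");
    assert_eq!(layout.layout_info(), 7);
    // layout.text() or script.layout_info() would be compile errors:
    // the forbidden accessors simply don't exist on the wrong view.
}
```

The enforcement is entirely static: no runtime check is paid, and crossing the script/layout boundary with the wrong accessor fails at compile time.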


I believe there are solutions to each of these problems, and of course 
fixing them is a high priority for the project. But note that, as I 
described before, there will always be some unsafe code relating to node 
memory management as part of the TCB, because SpiderMonkey's garbage 
collector manages the nodes. The goals are (a) to use as little unsafe 
code as possible, and, most importantly, (b) to prevent the unsafeness 
from leaking out into script and layout code.


Finally, note that the copy-on-write scheme is not yet implemented; 
right now script will just block on layout. Fixing this is a high 
priority as well.


Patrick


Re: [dev-servo] New DOM implementation

2013-02-17 Thread Patrick Walton

On 2/17/13 3:16 AM, smaug wrote:

Curious, how does GC manage nodes? Do we explicitly trace even those
nodes which don't have js wrappers or do
we just use GC for everything in Rust?


All the nodes in Servo have JS wrappers at the moment, and their 
lifetimes are managed by the JS garbage collector. The JS garbage 
collector traces all nodes. The Rust garbage collector isn't really 
involved as far as DOM nodes are concerned.



How do we keep GC times fast enough? Will generational GC optimize out
those nodes which aren't used lately?


We don't have to run destructors on the main thread in Servo, so IGC 
should in theory be able to perform better. Also we won't have to visit 
all nodes except during a major collection with generational GC. But 
we'll have to see how it performs in practice, of course.


It's possible to move to some sort of different scheme in the future. 
The DOM implementation is pretty simple at the moment.


Patrick



Re: [dev-servo] New DOM implementation

2013-02-17 Thread Patrick Walton
This is a good point. There are two issues here:

(1) Ensuring that layout does not retain strong references to DOM nodes across 
layout requests. Since JS GC is not concurrent, we need this invariant, as 
layout runs in a separate task. Currently layout places references to DOM nodes 
inside RenderBoxes. These need to become true weak references. Then we need to 
enforce that strong references to DOM nodes live no longer than a layout 
request. The typical way to enforce that some value never leaves a block of 
code is to use the region system, and I believe we can do that here. Will need 
to think it over and discuss with Niko.

(2) Ensuring that the Rust code in the script task does not retain strong 
references to DOM nodes that are invisible to JS GC. One way to do that would 
be to use the JS GC for the Rust GC in that task. Brian and I have talked about 
pluggable per-task garbage collectors; there nothing in Rust's design that 
comes to mind that would preclude using the JS GC for all Rust GC data in that 
one task (while using the Rust GC for other tasks). This is a more long-term 
project though, so perhaps it would be feasible in the short term to prevent 
nodes from being stored in untraced objects at all (perhaps by making them 
noncopyable and using the region system in a similar way to layout above). Will 
need to think about this a bit more.

Patrick

Robert O'Callahan  wrote:

>On Mon, Feb 18, 2013 at 7:27 AM, Patrick Walton 
>wrote:
>
>> On 2/17/13 3:16 AM, smaug wrote:
>>
>>> Curious, how does GC manage nodes? Do we explicitly trace even those
>>> nodes which don't have js wrappers or do
>>> we just use GC for everything in Rust?
>>>
>>
>> All the nodes in Servo have JS wrappers at the moment, and their
>lifetimes
>> are managed by the JS garbage collector. The JS garbage collector
>traces
>> all nodes. The Rust garbage collector isn't really involved as far as
>DOM
>> nodes are concerned.
>
>
>It's pretty important that we handle cycles through JS and Rust in a
>clean
>way without having to write unsafe code. Ideally we could do it without
>having to write any special code. What prevents us from integrating
>Rust
>and JS GC so it Just Works (tm)?
>
>Rob



[dev-servo] Removing Cairo from the tree

2013-03-13 Thread Patrick Walton
Cairo and Pixman are hard to build on some platforms like the Mac, and I 
think Skia-GL, CoreGraphics, and Direct2D are where we want to focus our 
efforts in the future. To that end I'd like to remove Cairo from the 
Servo tree.


It'll still be possible to use Cairo through Azure. It'll just have to 
be preinstalled on the system--we won't build it ourselves. I think this 
is probably OK for Linux systems including Tizen--if it's the native 
graphics library it should be preinstalled via the OS package manager.


Any objections?

Patrick


Re: [dev-servo] Removing Cairo from the tree

2013-03-13 Thread Patrick Walton

On 3/13/13 11:31 AM, Boris Zbarsky wrote:

On 3/13/13 2:09 PM, Patrick Walton wrote:

It'll still be possible to use Cairo through Azure. It'll just have to
be preinstalled on the system--we won't build it ourselves. I think this
is probably OK for Linux systems including Tizen--if it's the native
graphics library it should be preinstalled via the OS package manager.


Are we happy with not having any control over which (possibly buggy)
cairo version we're using?


Well, it looks like at this time (based on a few informal conversations 
with Jeff Gilbert and other GFX folks) it probably makes sense to use 
Skia-GL by default on the Linux desktop. And on Windows, Mac, and 
Android we use Direct2D, CG, and Skia respectively. That basically 
leaves Tizen as the only user of Cairo. I don't foresee any problems 
there, since I imagine they're shipping a new Cairo/Pixman, but if need 
be we can build Cairo only on that system.


In any case I think we should stop building Cairo on systems where it's 
not going to be used; shipping all possible rendering backends with 
Servo just seems like code bloat to me.


Patrick



Re: [dev-servo] Codegen WIP

2013-03-13 Thread Patrick Walton

On 3/13/13 2:06 PM, Josh Matthews wrote:

I just pushed the result of my work to the master branch. The bindings
for ClientRect and ClientRectList should be generated from their
respective webIDL files as part of the build now, and running
test/test_bindings.html should show the (totally faked) result of
calling getClientRects() on the root document element.

The next step involves cleaning up some of the messes I made, such as
dom/bindings/utils.js, followed by expanding the set of generated bindings.


Great!

Patrick



Re: [dev-servo] First synchronous hammering performance numbers

2013-05-27 Thread Patrick Walton
So I guess the problem is that setAttribute() is slow?

Josh Matthews  wrote:

>On 05/27/2013 03:45 PM, Josh Matthews wrote:
>> On 05/27/2013 02:25 PM, Josh Matthews wrote:
>>> The following script (test_hammer_layout.html) can now run in Servo,
>so
>>> I've finally been able to begin obtaining baseline performance data.
>>> Note that this is with the exiting Rust scheduler, not Brian's new
>one:
>>>
>>>  >var divs = document.getElementsByTagName("div");
>>>  >var div = divs[0];
>>>  >
>>>  >var count = 100;
>>>  >var start = new Date();
>>>  >for (var i = 0; i < count; i++) {
>>>  >  div.setAttribute('id', 'styled');
>>>  >  div.getBoundingClientRect();
>>>  >}
>>>  >var stop = new Date();
>>>  >window.alert((stop - start) / count * 1e6);
>>>  >window.close();
>>>
>>> This yields 1239493.
>>>
>>> Cheers,
>>> Josh
>>
>> For the sake of contrast, a current Firefox nightly shows me 1319.
>
>And interestingly enough, the same test with the 
>div.getBoundingClientRect call removed yields a marginally better
>1162253.



[dev-servo] Handling the case where a child becomes non-inline

2013-06-04 Thread Patrick Walton

Hi everyone,

We're starting on incremental layout, in order to improve performance of 
the layout benchmarks we are most focused on (`test_hammer_layout` and 
`test_slam_layout`). Based on past experience, we also felt it important 
to get incremental layout in early (at least in some form) to avoid 
problems later.


We're trying to figure out what to do when an inline RenderBox of an 
inline flow becomes non-inline. This is a vexing issue, as it violates 
the invariant that a flow's descendants and the flow itself are the only 
flows that need to be reconstructed from scratch when a box's style 
changes. This is because the adjacent inlines turn into anonymous block 
flows.


The code in Gecko that is responsible for this is WipeContainingBlock in 
nsCSSFrameConstructor.cpp [1]. The code in WebKit that is responsible 
for this is childBecameNonInline() in RenderInline.cpp [2].


I'm wondering if any layout gurus have any advice as to what to do here. 
What is the most elegant way to handle this case? Is there anything we 
should be aware of in order to optimize this path?


Thanks!

Patrick

[1]: 
http://mxr.mozilla.org/mozilla-central/source/layout/base/nsCSSFrameConstructor.cpp#11280


[2]: 
http://www.opensource.apple.com/source/WebCore/WebCore-955.66.1/rendering/RenderInline.cpp



Re: [dev-servo] Handling the case where a child becomes non-inline

2013-06-05 Thread Patrick Walton

On 6/5/13 8:30 AM, Boris Zbarsky wrote:

On 6/4/13 9:47 PM, Patrick Walton wrote:

We're trying to figure out what to do when an inline RenderBox of an
inline flow becomes non-inline. This is a vexing issue, as it violates
the invariant that a flow's descendants and the flow itself are the only
flows that need to be reconstructed from scratch when a box's style
changes.


This invariant simply isn't.  ;)  Consider this simple testcase:

  <div>
    Some
    <div id="willChange">text</div>
    here
  </div>

and then toggling the display of the id="willChange" div to
"table-cell".  Or to "none", for that matter.


I'm wondering if any layout gurus have any advice as to what to do here.
What is the most elegant way to handle this case? Is there anything we
should be aware of in order to optimize this path?


So first off, this is usually a rare case.  So in practice, I believe
the Gecko approach of "wipe out part of the render tree and recreate it"
is perfectly reasonable here as a complexity/performance tradeoff.


Do you mean that having to wipe the parent flow when a child render box 
changes is a rare case in general or that this specific case (an inline 
becoming non-inline) is rare?


So what I'm thinking is that, in general, we should structure 
incremental layout under the general assumption that parent flows don't 
need to be recreated when their children change (note: they may well 
have to be *reflowed*, but not recreated from scratch), and in the 
corner cases in which this isn't true, wipe and rebuild the parent 
flows. Does this seem OK?


Patrick


[dev-servo] Minutes of the discussion regarding strings

2013-06-05 Thread Patrick Walton
Topics covered: Interning, mutability, cost of creating string objects, 
encoding UTF-8 versus UTF-16.


https://github.com/mozilla/servo/wiki/Strings

Patrick


Re: [dev-servo] Minutes of the discussion regarding strings

2013-06-07 Thread Patrick Walton

On 6/7/13 6:43 AM, Benjamin Smedberg wrote:

On 6/5/2013 8:42 PM, Patrick Walton wrote:

Topics covered: Interning, mutability, cost of creating string
objects, encoding UTF-8 versus UTF-16.

https://github.com/mozilla/servo/wiki/Strings

I would love to have been invited to this meeting. Was it announced
anywhere?


It was kind of an impromptu thing; should have been announced more 
widely, sorry.



  * Gecko has mutable strings and this is bad for performance


I'd like to understand/challenge this statement. Is this bad for
performance because you have to check the extra SHARED flag on write?
With auto-sharing forcing immutability, I have trouble believing this is
a big deal in practice. The noticeable problem with auto-sharing right
now is that it requires threadsafe refcounting, which *does* show up in
benchmarks, but that would continue to be a problem with immutable
strings, if they needed to be thread-shareable. Was there discussion
about whether these strings would be at all threadsafe (and the
interning table)?


The tentative conclusion was that thread safety is not needed for the 
interning table, because the layout thread does not need to intern 
strings (and CSS parsing is handled on the script thread in Servo), only 
to access the contents of interned strings. In the rare cases in which 
layout would need to hang onto a non-static interned string across 
invocations it could just copy the strings, but bz felt that such cases 
would be rare.

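A single-thread interning table of the sort discussed here can be sketched in a few lines of modern Rust. The `Interner` type and its API below are purely illustrative, not Servo's actual design: equal strings share one entry, so callers can compare by id and look up the contents cheaply.

```rust
use std::collections::HashMap;

// Illustrative single-thread interner: owned by one task (here, the
// script task), so no locking or threadsafe refcounting is needed.
#[derive(Default)]
struct Interner {
    ids: HashMap<String, u32>,
    strings: Vec<String>,
}

impl Interner {
    // Return the existing id for `s`, or allocate a new one.
    fn intern(&mut self, s: &str) -> u32 {
        if let Some(&id) = self.ids.get(s) {
            return id;
        }
        let id = self.strings.len() as u32;
        self.strings.push(s.to_string());
        self.ids.insert(s.to_string(), id);
        id
    }

    // Resolve an id back to its string contents.
    fn get(&self, id: u32) -> &str {
        &self.strings[id as usize]
    }
}
```

Another task (e.g. layout) that only needs to read interned contents never mutates the table; in the rare cross-invocation case described above, it would copy the string out instead.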


My primary concern with string builders is that they typically
reallocate when you convert the builder to an immutable string. If we
can avoid that case by reassigning the buffer, then I think most of my
objections go away.


In Rust a "string builder" is just a mutable unique string, and it won't 
reallocate when you convert it to immutable. (This is the 
"freeze"/"thaw" pattern.)

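In today's Rust the same pattern can be sketched like this (a minimal illustration, no Servo types involved): build through a mutable unique owner, then rebind immutably. The rebinding is a move, so the heap buffer is reused as-is rather than reallocated.

```rust
fn build_greeting() -> String {
    // "Builder" phase: a uniquely owned, mutable string.
    let mut builder = String::with_capacity(16);
    builder.push_str("hello");
    builder.push_str(", world");
    // "Freeze": move into an immutable binding. No copy or
    // reallocation happens here; ownership of the buffer transfers.
    let frozen = builder;
    frozen
}
```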


Was there discussion about whether string buffers should be refcounted
or GCed (or copied, but I'm pretty sure that would cause memory explosion)?


Ref counting versus GC is determined on a case-by-case basis in Servo. 
There's no one-size-fits-all solution: we're using threadsafe reference 
counting, or unique strings, or possibly-interned strings, as the 
situation calls for it. Admittedly this is kind of a non-answer. :) I'd 
be curious as to which specific situations you had in mind.


Patrick



Re: [dev-servo] Where do styles live, and how to make it work with CSS fragmentation stuff

2013-06-13 Thread Patrick Walton

On 6/13/13 5:45 PM, Joel Feenstra wrote:

What about an enum of the various style lookup strategies with
inlining for all of them. Rustc/LLVM may need to be able to inline 2
levels of function calls (to call the enum impl and then the value
impl), and I know it isn't great at inlining currently.


Rust is pretty good about inlining from what I've seen. If anything it 
is overly aggressive.


However, I do think that using a closure is probably the wrong approach; 
flags seem more appropriate to me.


Patrick


Re: [dev-servo] Where do styles live, and how to make it work with CSS fragmentation stuff

2013-06-13 Thread Patrick Walton

On 6/13/13 5:48 PM, Joel Feenstra wrote:

I thought inline attributes had to be added to things. Maybe that was a while 
ago.


They do to get cross-crate inlining to work. That's by design, so that 
you can replace the bodies of non-inlined functions without having to 
recompile all downstream crates.


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] Unanswered questions from a Q&A about Servo

2013-07-12 Thread Patrick Walton

Here are my attempts at answers.

On 7/12/13 10:51 AM, Josh Matthews wrote:

* Are there viability criteria? How much better (performance-wise) than
other engines does Servo have to be considered worth continued investment?


No idea. There are both security and performance wins, so both must be 
considered.



* What are the expectations of parallel wins relative to Gecko? Where
are the wins, and how much?


Leo Meyerovich's work (Fast and Parallel Webpage Layout) has some 
numbers that he got from parallel speedups (see section 5.4, Performance 
Evaluation):


http://www.eecs.berkeley.edu/~lmeyerov/projects/pbrowser/pubfiles/playout.pdf


* What's the timeline for determining viability?


Hard to estimate at this stage; as much as possible we're trying to get 
early numbers, but of course we have to ensure that our numbers mean 
something. (A browser that does nothing will be very fast, but that 
number doesn't mean much!)


In general we've been finding that our small microbenchmarks such as 
time spent to perform synchronous message sends (discussed later) are 
encouraging, but until most of the pieces (for example, pure-Rust CSS 
selector matching and dirty bits for reflow) are in, the numbers will 
not be competitive with Gecko for the simple reason that the layout 
engine is incomplete.



* What is the plan to handle cycles between the DOM and JS?


The JavaScript garbage collector handles all DOM objects, so the JS 
garbage collector will trace all DOM-JS cycles and clean them up.



* Do same-origin pages that can't communicate synchronously need to be
on the same script task? (scenario: two tabs pointing at google.com
properties)


No. Chromium uses separate processes for these and the plan as far as I 
understand it is to do the same with tasks/processes.



* What's the status of allocating DOM objects on the JS heap and/or
using the JS GC for the DOM task?


The DOM task doesn't use `@` pointers very much, but where it does, 
they're in the Rust heap and can't have strong references to DOM nodes 
(other than the root of the document).



* Are we ignoring the major known issue on the web (synchronous layout
operations) in favour of fixing the less important ones? (mentioned
running GC during sync layout; audience not particularly impressed)


I wouldn't say we're ignoring them; we've talked about prototyping, and 
proposing for standardization, asynchronous layout operations (e.g. 
getBoundingClientRectAsync). Running GC during sync layout is for 
helping extant Web content; ideally the Web shouldn't be blocking on 
layout operations in the first place, and Servo can help it get there.



* Could we "freeze" background pages (on a language level?) and unload
them in order to conserve memory and avoid extra processing work, with
the expectation of resurrecting them later as needed?


Rust does have good support for serialization of object graphs, so this 
is potentially feasible. However, the JS engine is a question mark here: 
probably work would need to be done to serialize the JS state of a page.



* How can we handle individual task failure, such as layout/rendering?
Could we recover by spawning a new task, or do we need to create a whole
pipeline from scratch?


We should be able to recover by spawning a new task.


* Is abortable layout being considered for short-circuiting in-progress
operations when new results are needed?


We've thought about it, but it isn't currently being done.


* If compositing ends up happening no sooner than Gecko (ie. we're still
a sequential pipeline starting from layout), aren't we wasting effort
being parallel (more cores, more work, same amount of time)?


If we don't use the pipelining, it's true that it's a loss. But the 
hypothesis is that pipelining will kick in enough for it to be useful: 
we want to allow e.g. script to run while layout is running, or for 
layout to be able to return to script while Azure is rendering the new 
tiles for display.



* Do we have data showing how many security bugs we could be avoiding in
Servo in comparison to Gecko? Is the security benefit truly as valuable
if expected performance benefits don't pan out?


We've been talking to some members of the security team (Jesse, Brian). 
In general the main class of security vulnerabilities that Rust offers a 
layer of defense against is memory safety problems in layout, rendering, 
and compositing code. Use-after-free is the big one here, but there are 
others. I'm not in the sg so I can't run the numbers myself, but I am 
told this constitutes a large class of security vulnerabilities.


The pwn2own and Pwnium results demonstrate, at least to me, that memory 
safety is still valuable even in the presence of intricate sandboxing. 
We need defense in depth, and Rust's type system provides a strong layer 
of defense for new code.


Servo is *also* designed to be amenable to OS sandboxing, so that 
processes compromised via unsafe code or the JIT can be stopped from 
taking over 

Re: [dev-servo] Unanswered questions from a Q&A about Servo

2013-07-12 Thread Patrick Walton

On 7/12/13 11:04 AM, Benjamin Smedberg wrote:

I'd like to understand what the parallelism targets are... if DOM and
layout are not parallelized with a single DOM tree but we can run
multiple of these tasks at once (so that tabs are isolated from
eachother) is that good enough for a v1 product? Is that a different
target than the servo team is currently looking at as a research project?


It depends on how much the pipelining buys us. Does the ability to 
return immediately to script events instead of waiting for Azure to 
render tiles (for example) help us a lot? We must measure. (But to have 
reasonable measurements, we also need to get some of the basic 
optimizations implemented, for example incremental reflow, so that we 
stack up reasonably against Gecko in the control case.)


Note that layout is sort of parallelized today, in that image decoding 
happens in parallel, but that's only a small part (and one that Gecko 
does too).


Patrick



Re: [dev-servo] Unanswered questions from a Q&A about Servo

2013-07-12 Thread Patrick Walton

On 7/12/13 12:18 PM, Josh Matthews wrote:

When you say "very much", I get worried. All codegen'ed DOM code uses @
pointers


Does it need to use @ pointers? That seems unfortunate.


and I don't understand what you mean by "can't have strong
references to DOM nodes"; consider events we eventually want to dispatch
that are targeted at DOM nodes and have properties by which this target
can be retrieved.


OK, we'll need strong references there too. Anyway, that shouldn't be a 
deal breaker unless they're cyclic. Anything that can have cycles should 
be traceable by JS.


Patrick



Re: [dev-servo] Unanswered questions from a Q&A about Servo

2013-07-12 Thread Patrick Walton

On 7/12/13 12:43 PM, Josh Matthews wrote:

Yes, this is the bed we have made for ourselves in Servo at this point.


Can we just have the JS trace hook for a DOM node recursively search 
through children for wrappers, even those that don't have wrappers?


To illustrate, suppose we have this DOM:

      O -- 1
     / \
    X   X
   / \   \
  X   X   2
  |
  3
  |
  4

O is the DOM node that has a wrapper which the JS garbage collector has 
invoked the trace hook on. 1, 2, 3, and 4 are DOM nodes with wrappers. 
Xs are DOM nodes without wrappers, because the wrappers have not been 
created yet. When the trace hook is called on O, it searches recursively 
through children and siblings and marks them, so it will mark 1, 2, and 
3 as live. It stops once it finds a child node with a wrapper. Because 3 
was marked, the JS engine then invokes the trace hook on it. In turn, it 
marks 4 as live.

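A hypothetical sketch of that tracing rule (sibling edges omitted for brevity; all names here are illustrative, not Servo's): the trace hook walks transitively into wrapperless children and stops at the first wrapped descendant on each path, whose own trace hook the GC invokes next.

```rust
// A DOM node: `wrapper` is Some(id) if a JS wrapper exists for it.
struct Node {
    wrapper: Option<u32>,
    children: Vec<Node>,
}

// Collect the wrappers this node's trace hook would mark as live.
fn trace_hook(node: &Node, marked: &mut Vec<u32>) {
    for child in &node.children {
        match child.wrapper {
            // Wrapped child: mark it and stop descending this path;
            // the GC will call its trace hook separately.
            Some(id) => marked.push(id),
            // Wrapperless child: keep searching for wrapped
            // descendants below it.
            None => trace_hook(child, marked),
        }
    }
}
```

On the tree above, O's hook would mark 2 and 3 but not 4; node 3's own hook, invoked afterward by the JS engine, marks 4.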

(I would prefer not to go the CC-and-GC route if we can avoid it, 
although if we have to we can do that too, with good support from Rust. 
Note that Chromium is adopting a full-GC model, like Servo's, with the 
Oilpan project, although that's mostly driven by Dart.)


Patrick



Re: [dev-servo] Unanswered questions from a Q&A about Servo

2013-07-12 Thread Patrick Walton

On 7/12/13 1:27 PM, Boris Zbarsky wrote:

What keeps the "X" that is a child of "2" alive?


Interesting question. I had assumed that X's were kept alive by their 
parents. After all, JS will not try to free things that it doesn't know 
about (and presumably JS doesn't know about wrapperless nodes at all).


If there is no wrapper created for a node, then the ways that that node 
can be destroyed seem limited to me (JS can't remove it, since it can't 
see it without first creating a wrapper). But I could be wrong here.


Patrick



Re: [dev-servo] Unanswered questions from a Q&A about Servo

2013-07-12 Thread Patrick Walton

On 7/12/13 1:40 PM, Jack Moffitt wrote:

I can write an example of a script that would produce such a situation, if
desired.


Yes, please. I would love to have a concrete example.


So I think there are basically three options:

1. Use the JS GC for everything; eat the cost of eagerly creating all 
wrappers. As Boris mentioned, maybe this isn't so bad. I would assume 
this is what Oilpan is doing in Blink.


2. Use the JS GC for wrapped objects and reference counting for 
non-wrapped objects. This assumes there are no cycles between them, 
which I believe to be the case (though could be wrong), because strong 
references from Rust to DOM nodes should be fairly minimal and acyclic 
(the reference to the root, events, maybe others?)


3. Use the JS GC for wrapped objects and unique ownership coupled with 
weak pointers for non-wrapped objects. In other words, there is a single 
"strong owner" for each node and all other pointers from Rust to the DOM 
are weak. (For example, event targets for queued events would be weak 
and if the DOM node dies then the event is dropped.) It is not clear to 
me that this is feasible; however, it might be.


Perhaps at this point we should go with (1) and measure what the cost of 
the wrappers is, as Boris suggested.


Patrick



Re: [dev-servo] Unanswered questions from a Q&A about Servo

2013-07-12 Thread Patrick Walton

On 7/12/13 4:45 PM, Patrick Walton wrote:

1. Use the JS GC for everything; eat the cost of eagerly creating all
wrappers. As Boris mentioned, maybe this isn't so bad. I would assume
this is what Oilpan is doing in Blink.


After talking to Terrence from the JS team, apparently with a month or 
two of hacking on SpiderMonkey we might be able to allocate Rust objects 
of 18 64-bit words or fewer into the JS heap directly. In other words, 
fuse the Rust structure with its wrapper. We could spill to the Rust 
heap from the wrapper if and only if the object is more than 18 words.

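The size threshold is easy to state as a check. The 18-word figure is from the conversation above; the helper below is purely illustrative:

```rust
use std::mem::size_of;

// Per the proposal: objects of at most 18 64-bit words would live
// directly in the JS heap, fused with their wrapper; anything larger
// would spill to the Rust heap via a pointer held by the wrapper.
const MAX_INLINE_WORDS: usize = 18;

fn fits_in_js_heap<T>() -> bool {
    size_of::<T>() <= MAX_INLINE_WORDS * size_of::<u64>()
}
```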

This would be similar to Blink's Oilpan project, but better because we 
can take advantage of the compiler to generate the trace hooks instead 
of having to manually write trace hooks and trace all the pointers.


Patrick



Re: [dev-servo] Unanswered questions from a Q&A about Servo

2013-07-12 Thread Patrick Walton

On 7/12/13 5:38 PM, Robert O'Callahan wrote:

What would the developer experience be like with this approach? I mean,
what kind of code would developers have to write to declare classes
managed by JS and to declare references to such objects?


I suspect it would be something like:

#[deriving(JsManagable)]
struct MyObject {
    field_a: int,
    field_b: float,
    field_c: Option<JsManaged<MyObject>>,
    ...
}

fn f(x: JsManaged<MyObject>) {
    ... do something with x ...

    // make a loop, why not?
    x.mutate().field_c = Some(x.clone());
}

fn g(x: &JsManaged<MyObject>) {
    ... do something with x ...
}

fn main() {
    let object = new(JsManaged) MyObject {
        field_a: 10,
        field_b: 20.0,
        field_c: None,
        ...
    };
    f(object.clone()); // make a new reference
    g(&object);        // or don't and just borrow
}

Patrick



Re: [dev-servo] Unanswered questions from a Q&A about Servo

2013-07-12 Thread Patrick Walton

On 7/12/13 5:48 PM, Robert O'Callahan wrote:

Does Option<JsManaged<MyObject>> compile down to a single machine word
with no overhead for dereferencing?


The Rust compiler now implements optimizations to compile Option of a 
pointer down to a nullable pointer (although I would have to verify that 
it indeed works in this case).

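This optimization is directly observable in today's Rust, where it is a guaranteed layout property for non-nullable pointer types (`Box` and references stand in here for the managed pointer under discussion):

```rust
use std::mem::size_of;

fn main() {
    // Option of a non-nullable pointer costs no extra space: None is
    // represented by the otherwise-impossible null bit pattern.
    assert_eq!(size_of::<Option<Box<u64>>>(), size_of::<Box<u64>>());
    assert_eq!(size_of::<Option<&u64>>(), size_of::<usize>());
}
```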

I think in most cases there will be some overhead for dereferencing, due 
to the write barrier required by a generational/incremental GC. There is 
also a write barrier needed by Rust for ensuring soundness of mutation 
(see the "Imagine Never Hearing the Phrase 'Aliasable, Mutable' Again" 
blog post for details).


In the current design reads need a barrier too, something that I just 
realized -- I should probably talk this over with Niko and pick some 
approach to fixing it in the Rust compiler. (There are various 
approaches; probably the easiest one is something like a ReadPtr trait 
known to the compiler.)


In general the overhead for writes and borrows will probably be only a 
couple of instructions over a dereference operation. Reads should be 
unbarriered, although I think we need a little more design to make that 
happen in the current language.


Patrick



Re: [dev-servo] Unanswered questions from a Q&A about Servo

2013-07-12 Thread Patrick Walton

On 7/12/13 7:11 PM, Boris Zbarsky wrote:

Hmm.  18 64-bit words is enough for a basic element, I'd think, though
with the copy-on-write setup we would need to measure carefully.
Subclasses might need to heap-allocate some of their members, though,
which is a bit of a pain.


Right; unique pointers should make that fairly painless in Rust though.

Patrick



Re: [dev-servo] Unanswered questions from a Q&A about Servo

2013-07-12 Thread Patrick Walton

On 7/12/13 7:09 PM, Boris Zbarsky wrote:

On 7/12/13 7:45 PM, Patrick Walton wrote:

2. Use the JS GC for wrapped objects and reference counting for
non-wrapped objects. This assumes there are no cycles between them,
which I believe to be the case (though could be wrong)


If we can have both wrapped and unwrapped DOM nodes, I don't see how we
can have no cycles between the two...


Yeah, you're right. We would need some sort of strongly-connected-cycle 
detector, either ad hoc or a cycle collector. Neither sounds appealing, 
but especially the former. Probably the best thing to do in this case 
would be RC+CC, like Gecko does, but compiler-assisted.


At this point I'm most inclined to try to implement the 
allocate-in-the-JS-GC heap strategy, and see how far that gets us. If 
that isn't feasible, we can try wrapping all DOM nodes and then measure 
the overhead (actually, this is what we're doing now, so this is more 
easily testable). If none of those pan out, then we can investigate 
RC+CC schemes.


Patrick



Re: [dev-servo] Meeting notes - further discussion of selected topics from "Unanswered questions" thread

2013-07-15 Thread Patrick Walton

On 7/15/13 3:58 PM, Robert O'Callahan wrote:

For compositing with IFRAMEs, you might want to consider what we do for
multi-process compositing in Gecko. It's somewhat obvious but it seems like
it works well. Each process publishes a layer tree to the compositor. A
layer tree can contain RefLayers, each with a unique ID. A layer tree root
can also be given an ID. The compositor matches roots to RefLayers to build
a complete layer tree which is composited as a whole.


That was basically what we whiteboarded. (I called a RefLayer an 
IframeProxyLayer but the idea was the same.)


Patrick



Re: [dev-servo] Safe abstractions for unique handles

2013-07-15 Thread Patrick Walton

On 7/15/13 4:32 PM, Robert O'Callahan wrote:

A minor random thought related to that design:

Using integer IDs, as Gecko does, raises the possibility of a bug,
possibly even a security bug, where a rogue IFRAME is able to render
itself at the wrong place in the tree. In Gecko, preventing a
compromised process from doing an attack like that requires explicit
code to track which IDs have been issued to a process and checking to
make sure it doesn't use an ID we didn't give it. Obviously it would be
much better to use an object-capability approach, but of course we can't
do that in Gecko. So it would be nice if Rust+Servo can provide a robust
cross-task object-capability-ID abstraction for use in situations like this.


Right, that's been on our mind as well. We try to use channels and ports 
wherever we can. In general channels and ports are better than IDs in 
every way: they cannot be forged and have better memory management 
properties (as integer IDs can't really be reference counted).

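The capability idea can be sketched with today's standard-library channels (the message and function names here are illustrative, not Servo's): instead of handing a frame's task a forgeable integer layer ID, hand it a sender that can only reach its own slot in the compositor.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

enum CompositorMsg {
    PaintedLayer(String),
}

// The spawned task receives only a Sender: it can paint into its own
// layer, but it has no way to name, let alone address, another
// frame's layer, because it was never given a capability for one.
fn spawn_iframe_task(to_compositor: Sender<CompositorMsg>) {
    thread::spawn(move || {
        to_compositor
            .send(CompositorMsg::PaintedLayer("tile data".to_string()))
            .unwrap();
    });
}

fn main() {
    let (tx, rx) = channel();
    spawn_iframe_task(tx);
    match rx.recv().unwrap() {
        CompositorMsg::PaintedLayer(data) => println!("composited: {}", data),
    }
}
```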

Patrick



[dev-servo] Plans for parallel constraint solving in layout

2013-07-16 Thread Patrick Walton

Hi everyone,

Eric Atkinson and I whiteboarded some ideas for ensuring data race 
freedom and memory safety while performing parallel constraint solving 
in layout. Here are some random notes from that discussion as well as 
previous discussions we've had:


* Parallel constraint solving is about structuring the layout passes as 
tree traversals and performing those tree traversals in parallel. 
Parallelizable tree traversals are bottom-up and top-down. In-order tree 
traversals are not parallelizable in general. Therefore, layout should 
favor bottom-up and top-down traversals wherever possible.


* Floats constrain some subtrees to be in-order traversals, and 
therefore those subtrees are not parallelizable during the assign-widths 
traversal. We didn't discuss any general strategy for making sure that 
memory safety/data race freedom are not violated in the presence of 
floats; that will come later. (I don't foresee any fundamental problems 
with ensuring this though.)


* As Leo showed, table layout can be done with 7 parallel passes, but 
it's probably simpler to just do those as a sequential in-order 
traversal for now.


* In Servo we have three main tree traversals: bubble-widths 
(bottom-up), assign-widths (top-down), and assign-heights (bottom-up).


* The traversals in Servo are designed not to be "ad hoc" traversals; 
rather they are driven by a traversal function. An "ad hoc traversal" is 
what (as I understand it) most browser engines do for reflow: each 
frame/render object is responsible for laying out its children by 
calling virtual methods (`Reflow()` or similar). There is no external 
"driver" to the traversal except by calling `Reflow` on the root frame.


* Instead of ad-hoc traversal, layout in Servo is driven by what I'm 
calling "kernel functions". A kernel function is a small function that 
looks only at the node and its children to perform its work, and never 
recurses to children on its own. An example of a kernel function is the 
"assign height" function for block flows: it computes the height of one 
block flow by summing up the heights of its children and then returns.


* The problem with "ad hoc" traversals is that they give too much power 
to the kernel functions. A `Reflow` virtual method is permitted to do 
anything: it can start top-down, bottom-up, or in-order traversals; it 
can walk up or down the tree; it can race on anything. This is what 
makes ad-hoc traversals difficult to parallelize.


* By contrast, Servo kernel functions are written to operate on only one 
flow at a time, consulting the children of the flow as necessary. This 
means that we need higher-level *driver function* that have the task of 
invoking the kernel functions in the right order. We have two driver 
functions in Servo: bottom-up traversal and top-down traversal. These 
functions take a flow tree and a kernel function and perform the 
appropriate traversal. Today, this traversal is sequential.
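The kernel/driver split described above can be sketched roughly as follows (a minimal illustration with hypothetical types; Servo's real `FlowContext` is far more involved). The kernel looks only at one flow and its children, and the driver owns all recursion:

```rust
// Illustrative flow type, not Servo's actual data structure.
struct Flow {
    height: f32,
    children: Vec<Flow>,
}

// Kernel function: computes one block's height from its children's
// already-assigned heights, and never recurses on its own.
fn assign_height_kernel(flow: &mut Flow) {
    if !flow.children.is_empty() {
        flow.height = flow.children.iter().map(|c| c.height).sum();
    }
}

// Bottom-up driver: visits children before the parent, so the kernel's
// precondition (children are finished) always holds when it runs.
fn traverse_bottom_up(flow: &mut Flow, kernel: &impl Fn(&mut Flow)) {
    for child in flow.children.iter_mut() {
        traverse_bottom_up(child, kernel);
    }
    kernel(flow);
}
```

Because the kernel never walks the tree itself, swapping the sequential driver for a parallel one later does not require touching any kernel code.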


* The key insight to ensuring data race freedom and memory safety in a 
parallel setting (thanks to Eric!) is to observe that, assuming the 
driver function is correct, all we must do to ensure no data races is to 
*limit the kernel function to accessing only the node that it's 
operating on and its children*. If we forbid access to the `get_parent`, 
`get_prev_sibling`, and `get_next_sibling` accessors inside the kernel 
functions, and ensure that kernel functions close over no mutable 
layout-task-specific data, then kernel functions can do whatever they 
want and there will be no data races or memory safety violations.


* We have a mechanism in Rust's type system for conditionally forbidding 
access to methods in certain contexts: phantom types. We are already 
using this for the COW DOM (`AbstractNode`).


* We also have a mechanism to forbid kernel functions from closing over 
mutable state: unique closures (soon to become thunk traits).


* What we can do in Servo is to parameterize flow contexts over a 
phantom `View` type. A `FlowContext` with the unrestricted view permits 
access to parents, siblings, and children. A `FlowContext` with the 
restricted (kernel) view does not permit access to parents or siblings, 
but does permit access to children. (Actually, it'd be a bit more 
complicated than this: the child of a restricted `FlowContext` would 
carry an intermediate view, which would permit access to siblings but 
not to parents. This is because kernels need to be able to access 
siblings of the first child.)
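A rough sketch of the phantom-view idea, with made-up view names (`FullView`, `KernelView` are illustrative, not Servo's actual types): the data is identical in every view, but the parent accessor only exists for the unrestricted one, so a kernel function handed the restricted view simply cannot compile a call to it.

```rust
use std::marker::PhantomData;

// Hypothetical phantom view types; they carry no data at runtime.
struct FullView;
struct KernelView;

struct FlowContext<View> {
    id: u32,
    parent_id: Option<u32>,
    _view: PhantomData<View>,
}

impl<V> FlowContext<V> {
    // Available in every view.
    fn id(&self) -> u32 {
        self.id
    }
}

impl FlowContext<FullView> {
    // Only the unrestricted view exposes the parent; code holding a
    // FlowContext<KernelView> cannot call this, so it cannot race on
    // data reachable through the parent.
    fn parent_id(&self) -> Option<u32> {
        self.parent_id
    }
}
```

The restriction is enforced entirely at compile time and costs nothing at runtime.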


* This scheme depends on the correctness of the traversal functions, as 
well as the correctness of the tree manipulation code. Both are outside 
the capability of Rust's type system to prove. If we want additional 
assurance, we could use fuzz testing, or we could maybe use something 
like Promela to prove the correctness of our parallel traversal, or even 
explore proof assistants like Coq for the tree code...


* That said, the nice thing about this scheme is that the traversal 
functions and tree manipulation code are quite small compared to the 
untrusted kernel functions.

Re: [dev-servo] Plans for parallel constraint solving in layout

2013-07-16 Thread Patrick Walton

We just had another discussion with several outcomes:

# Floats

Height assignment is currently an in-order traversal in Servo, to cope 
with floats. However, this could be turned into a bottom-up traversal. 
The trick is to treat all DOM subtrees that contain floats (and no 
`clear: both`) as single "macro-nodes" for the purposes of layout. 
Consider this picture:


      X
     / \
    X   X
    |
    O   <--
   / \
  O   O
  |   |
  O   O
 / \
Y   X

Suppose that the `O` nodes are potentially impacted by floats and the 
`X` and `Y` nodes are known not to be in a context that floats could 
impact (for example, if `Y` has the `clear: both` style). We need to 
traverse the `O` nodes sequentially to track floats. This can, however, 
be modeled as a bottom-up traversal with a more complex kernel function.


Recall that kernel functions are allowed to inspect and modify their 
children, just not their parents. We can therefore simply make the 
assign-heights kernel function for the node marked by the arrow perform 
the in-order, sequential traversal that accounts for floats on its 
subtree. The kernel functions for the remainder of the `O` nodes become 
no-ops. In other words, for every node in a potentially-float-affected 
subtree, we turn the assign-heights kernel functions into no-ops for 
every node except the parent node.


We could probably optimize this in the future to not even call the no-op 
functions at all.


# Scheduling

The scheme with the concurrent queue that I proposed earlier doesn't 
work, because we have to make sure that children are never visited until 
their parents are (for top-down) and that parents are never visited 
before their children are finished (for bottom-up). Instead, we need 
somewhat more complex synchronization.


Probably the most efficient thing to do would be to use a Chase-Lev 
work-stealing queue for each OS thread. We would have a sequential setup 
phase that does something different depending on whether this is a 
top-down or bottom-up traversal:


* For a top-down traversal, the setup phase counts the total number of 
nodes in the tree and then enqueues the root of the tree.


* For a bottom-up traversal, the setup phase counts the number of 
children of each flow and writes that value into the flow. Then it 
enqueues the leaves of the tree.


After the setup phase, the worker threads start up. Each worker thread 
spins on its work queue until done. When a worker has work to do, it 
removes a node from the work queue and executes the untrusted kernel 
function. Then:


* For a top-down traversal, the worker atomically decrements the total 
number of nodes. If zero, it shuts down and sends a Rust message to the 
main thread to wake up. Otherwise, it pushes children of this node onto 
its work queue.


* For a bottom-up traversal, the worker looks for a parent. If there is 
no parent (it's at the root of the tree), it shuts down, atomically sets 
a "we're done" flag, and sends a Rust message to the main thread to wake 
up. Otherwise, it atomically decrements its parent's child count. If 
nonzero, it looks for more work to do. If zero, it pushes the parent 
onto its work queue.
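The bottom-up completion rule above can be sketched with an atomic child counter (names here are illustrative, not Servo's): the worker whose decrement brings the count to zero is the unique one that gets to enqueue the parent.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Illustrative flow node: just the per-flow count of unfinished children
// that the sequential setup phase would have written in.
struct FlowNode {
    pending_children: AtomicUsize,
}

// Called by a worker when one child of `parent` finishes.
// Returns true iff the caller should now process the parent itself.
fn child_finished(parent: &FlowNode) -> bool {
    // fetch_sub returns the previous value, so 1 means "we were the
    // last unfinished child"; exactly one worker sees that.
    parent.pending_children.fetch_sub(1, Ordering::AcqRel) == 1
}
```

Because `fetch_sub` is atomic, two workers finishing siblings concurrently can never both (or neither) claim the parent.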


# Pointer compression

Chase-Lev queues need work items to be representable by a single 
pointer. This is a problem because flows are two words in Rust (since we 
do not have inheritance). However, we can perform compression to pack 
the two words into one word: the pointer should have its lower 4 bits 
clear, so we can stuff the flow type in there. This can probably be 
folded into the cast that needs to be done between the `Flow` types with 
different phantom `View` parameters anyway.
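The compression trick can be sketched in a few lines: if flow allocations are at least 16-byte aligned, the low 4 bits of the address are always zero and can carry the flow-type tag, squeezing the (pointer, tag) pair into a single word.

```rust
// Pack a 16-byte-aligned address and a 4-bit flow-type tag into one word.
// A sketch of the idea only; real code would use raw pointers.
fn pack(addr: usize, tag: usize) -> usize {
    assert!(addr % 16 == 0, "flow must be 16-byte aligned");
    assert!(tag < 16, "tag must fit in the 4 free low bits");
    addr | tag
}

// Recover the (address, tag) pair by masking.
fn unpack(word: usize) -> (usize, usize) {
    (word & !0xF, word & 0xF)
}
```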


# Work items

The outcome is that the work will probably go like this:

1. Change assign-heights to be a bottom-up traversal.

2. Create the phantom type and add it to flows.

3. Add conversions between the different flow phantom types and make the 
kernel functions use the restricted (kernel-view) `Flow` type.


4. Add the count field to each flow, which will be needed for parallel 
bottom-up traversals.


5. Implement the Chase-Lev deque. As a simpler approach, we could use a 
shared work queue for now.


6. Implement the parallel traversals.

Patrick
___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] Inline DOM storage progress report

2013-07-23 Thread Patrick Walton

On 7/23/13 7:08 PM, Boris Zbarsky wrote:

On 7/23/13 6:34 PM, Josh Matthews wrote:

Accordingly, there are now branches of servo, rust-mozjs,
and mozjs (all named 'inline') that contain my work to date on storing
DOM objects inline in their JS wrappers' fixed slots. There are some
rooting problems that make it crash once in a while, but otherwise
test_bindings.html runs to completion on a regular basis.


How does the resulting rust-side code look?  Is it sufficiently like
idiomatic rust?

Basically, would like to avoid us just pushing the complexity of
implementing a DOM object from CC boilerplate to access-to-members
boilerplate...


Note that the Rust compiler is missing a feature needed for automatic 
generation of trace hooks: external #[deriving] implementations. This is 
a sub-bug of "allow procedural macros to be defined outside the 
compiler" (and should be fairly straightforward once that is 
implemented). This feature is scoped for Rust 1.0 but is not yet 
implemented.


Also, I imagine this feature will benefit from the yet-to-be-implemented 
compiler support for custom smart pointers (and in fact this sort of 
thing in Servo was much of the reason for my pushing this feature). This 
should make accessing members of these DOM nodes more idiomatic. My 
"Removing Garbage Collection from the Rust Language" blog post describes 
the plan very briefly, but Niko and I need to iron out the exact details 
still. Work related to this is underway in the Rust compiler (in 
particular removing `@fn`).


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] Inline DOM storage progress report

2013-07-25 Thread Patrick Walton

On 7/25/13 8:30 AM, Boris Zbarsky wrote:

On 7/24/13 12:16 PM, Josh Matthews wrote:

All DOM objects are represented with `JSManaged<Whatever>`.
When you want to access a property `owner_document` of `Whatever`, you use
`do whatever.with_imm |whatever| { whatever.owner_document }`, and
`do whatever.with_mut |whatever| { whatever.owner_document = other_document }`
to modify it.


Hmm.  That looks pretty annoying to write for every single property
access, or am I just misunderstanding?


It is somewhat annoying (and leads to slow compile times too).

Much of the goal of custom smart pointer work in Rust that's going on is 
to eliminate this boilerplate, so I would expect this to improve in time.
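The quoted with_imm/with_mut pattern boils down to something like the following sketch (hypothetical types, not the actual rust-mozjs API): the wrapper never hands out its payload directly, only through a short-lived closure, so every access is mediated.

```rust
use std::cell::RefCell;

// Illustrative stand-in for a JS-wrapper-managed DOM object.
struct JSManaged<T> {
    value: RefCell<T>,
}

impl<T> JSManaged<T> {
    // Immutable access: the borrow lives only for the closure's duration.
    fn with_imm<R>(&self, f: impl FnOnce(&T) -> R) -> R {
        f(&self.value.borrow())
    }

    // Mutable access, likewise scoped to the closure.
    fn with_mut<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        f(&mut self.value.borrow_mut())
    }
}
```

The annoyance Boris points at is visible even in this sketch: every field read costs a closure rather than a plain `node.field` expression.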


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] cssparser meeting notes

2013-08-02 Thread Patrick Walton

On 8/2/13 1:39 PM, Boris Zbarsky wrote:

1)  How fast is a Reader in practice?  Gecko used to use an input stream
for the parser and switched to a flat character buffer because it was
significantly faster, even if you ignore the cost of Read() calls: the
logic getting the next char became much simpler and compilers became
able to optimize tight loops around it (like number parsing, say) a lot
better.  Before that change, the register spills caused by the possible
function call to fill the buffer from the stream meant the
number-parsing loop code was terrible.

So in practice, parsing from a character buffer might be the right
thing, especially because it's not like we necessarily need incremental
parsing for CSS.


Rust Readers are currently very slow; they're scheduled to be reworked 
after the new scheduler is done. Perhaps we should keep this test case 
as a microbenchmark when we redesign the high-level Rust I/O interface.


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] cssparser meeting notes

2013-08-02 Thread Patrick Walton

On 8/2/13 1:46 PM, Brendan Eich wrote:

It's hard to beat raw buffer access. We shouldn't try to be too smart
here. Do we have *safe* raw buffer access options on the boards? (E.g.,
XDR inline from the old days: a way to ask for contiguous,
inline-function-deserializable memory?)


I'm not sure what XDR inline means, but you can allocate fixed-length 
buffers on the stack or inside any object in Rust.


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] cssparser meeting notes

2013-08-02 Thread Patrick Walton

On 8/2/13 1:47 PM, Jack Moffitt wrote:

When you say they are very slow, do you mean any use of the Reader
trait, or just the FILE and other implementations?


Any use of the Reader trait.

Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] cssparser meeting notes

2013-08-02 Thread Patrick Walton

On 8/2/13 1:50 PM, Boris Zbarsky wrote:

On 8/2/13 4:47 PM, Jack Moffitt wrote:

Using a Reader doesn't imply that we're not reading from a character
buffer.


The question is whether the compiler can devirtualize at the callsite so
it can optimize around the read call, or possibly even inline it


You can use:

fn read<T: Reader>(x: &T) { ... }

which is devirtualized in the compiler front-end; if the methods are 
small, they will be inlined.
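To illustrate the distinction, here is a sketch in modern Rust syntax (the `Reader` trait below is a hypothetical minimal stand-in, not the real std trait): the generic form is monomorphized per concrete type, so the call is a direct, inlinable function call rather than a virtual one through a vtable.

```rust
// Minimal illustrative reader trait.
trait Reader {
    fn read_byte(&mut self) -> Option<u8>;
}

// One concrete implementation: reads from an in-memory slice.
struct SliceReader<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> Reader for SliceReader<'a> {
    fn read_byte(&mut self) -> Option<u8> {
        let byte = self.data.get(self.pos).copied();
        self.pos += 1;
        byte
    }
}

// Static dispatch: compiled separately for each R, so read_byte can be
// inlined into tight loops such as number parsing.
fn count_bytes<R: Reader>(reader: &mut R) -> usize {
    let mut n = 0;
    while reader.read_byte().is_some() {
        n += 1;
    }
    n
}
```

Taking `reader: &mut dyn Reader` instead would force an indirect call on every byte, which is exactly the register-spill cost Boris describes from Gecko's stream-based parser.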


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] 0.1 release criteria

2013-08-02 Thread Patrick Walton

On 8/2/13 12:23 PM, Jack Moffitt wrote:

I've created a page with a straw-man proposal of these criteria, and
any input is welcome. Should we have more things? Or less?

https://github.com/mozilla/servo/wiki/Version-0.1


You missed the all-important code name. Added.

Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] 0.1 release criteria

2013-08-02 Thread Patrick Walton

On 8/2/13 12:23 PM, Jack Moffitt wrote:

We are starting to plan what will go in the first numbered version of
Servo. The proposal is to gate the initial release on a set of
criteria, and after that releases will be time based and include
whatever features are done by the release date (much like Rust).

We need to define what the release criteria are for this 0.1 release.
I'm not sure what specific timeline others were thinking, but I expect
we want to get this out there soon, perhaps the latest target would be
early next year.

I've created a page with a straw-man proposal of these criteria, and
any input is welcome. Should we have more things? Or less?


This sounds basically good; the main thing I considered missing is 
invalidation (which I added). As I said on the page, I think that it's 
going to be hard to get any meaningful numbers out of dynamic HTML 
benchmarks while we're still repainting the entire page whenever 
anything changes.


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] 8/5 meeting - quadtree leak; element binding conversion; parallel layout; 0.1 criteria; benchmarks; acid1 status

2013-08-06 Thread Patrick Walton

On 8/6/13 2:19 AM, David Bruant wrote:

I have never even considered writing a page with hundreds of rendered
iframes and I don't think I have seen it in practice.
Testing perf edge cases is important to be sure the browser doesn't suck
in these circumstances, but the average case should receive more focus
IMHO. Especially in early stages.


Pages with lots of social media "like" buttons are the canonical case 
here. (Although recently pages have started to lazy load them -- it'd be 
interesting to see whether this has changed.)


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] 8/5 meeting - quadtree leak; element binding conversion; parallel layout; 0.1 criteria; benchmarks; acid1 status

2013-08-06 Thread Patrick Walton

On 8/6/13 2:19 AM, David Bruant wrote:

Testing perf edge cases is important to be sure the browser doesn't suck
in these circumstances, but the average case should receive more focus
IMHO. Especially in early stages.


Also I'm not sure I fully agree with this statement, unqualified. We 
should absolutely optimize what is in use now, but that needs to be 
balanced against optimizing for what pages might do once parallel 
browsers are commonplace. Maybe when iframes are cheap, pages will 
routinely have hundreds of them: after all, they're quite useful for 
sandboxing.


To give a concrete example of this sort of thing, if you looked at what 
"average" pages did and picked that as the sole criterion to determine 
what to focus on, JavaScript wasn't worth optimizing before Ajax hit the 
scene. But Ajax wouldn't have happened unless JavaScript was reasonably 
fast.


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] Parallel builds

2013-08-09 Thread Patrick Walton

On 8/9/13 4:00 PM, Josh Matthews wrote:

A brief twitter observation convinced me to dig into our build system
efficiency. First off, some numbers from a -j10 build after a `make clean`:

origin/master:
real    3m51.867s
user    3m41.425s
sys     0m10.434s

jdm/parbuild [1]:
real    2m36.697s
user    4m9.334s
sys     0m10.177s

With the changes from my parbuild branch, we build all subdirectories in
parallel, then libmsg and libutil together, libnet by itself, then
libgfx and libscript build together, followed by the main servo crate.
It's probably possible to decouple libgfx from libnet (there are
precisely two references to the Image type from libnet), but that
wouldn't actually affect the total build time in a parallel setting,
since libgfx takes 12 seconds, libscript takes 60s, the servo crate
takes 46s, and the rest are on the low end of less than 10 seconds.
libscript is obviously the long tail, and it's going to get worse as
time goes on.


These numbers seem awfully high. It's probably worth digging into what 
the problem is and filing bugs against rustc. Is this with optimization 
on or off?


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


[dev-servo] The road to Acid2

2013-08-26 Thread Patrick Walton

Hi everyone,

Now that we've essentially passed Acid1 (once one remaining pull request 
is merged), Acid2 is on the horizon! I've identified the issues that 
were immediately apparent between us and Acid2 and created a wiki page 
and a milestone to track our progress:


https://github.com/mozilla/servo/wiki/Acid-test-features

https://github.com/mozilla/servo/issues?milestone=4&state=open

Please feel free to add to these lists.

Many of these issues will be fixed by Simon's "newnewcss" 
infrastructure, which involves moving away from `rust-css` and the 
NetSurf CSS library to a pure Rust implementation. Other than that, the 
largest tasks will be implementing tables, followed by `position: 
absolute` and `position: fixed`. There are several other, smaller tasks 
which are cleanly separable and would be great for contributors if 
anyone is interested: negative margins, negative clearance, transparent 
PNGs, data URLs, negative heights, and the `:hover` selector come to mind.


As always, feel free to hop by `#servo` on irc.mozilla.org with any 
questions. Thanks!


Patrick
___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] The road to Acid2

2013-08-26 Thread Patrick Walton

On 8/26/13 9:11 PM, L. David Baron wrote:

Is it worth prioritizing having working incremental layout so that
you can benchmark the architecture for doing incremental updates
(and make sure it performs acceptably) over these tasks that you
list?  (As somebody following from a distance, I'm more concerned
about that, because it seems likely to be the sort of thing that
might require changes in architecture and thus rewriting code that
would implement the above.)


We have a plan for incremental layout which is partially implemented 
already. Right now, I believe what is implemented is that changes that 
require only repaints do not reflow, and there is some initial work on 
style struct diffing and dirty bit propagation. Also intrinsic width 
bubbling is skipped if it is not needed. DLBI is underway but needs the 
flow tree construction to be rewritten to be useful.


It definitely makes sense to get incremental layout (and perhaps bidi) 
working before redoing too much code that would have to be rewritten. At 
this point I think that the "visitor + kernel function" pattern will 
work with incremental reflow, so any changes that can fit into this 
framework are likely to be incremental-safe. This includes stuff like 
margin collapsing, negative margins, and negative heights. Also, areas 
that are pure CSS parsing strike me as fine to work on.


Areas where we will have to be careful to make incremental-safe are 
tables, absolute positioning, and fixed positioning. I think some of 
that stuff should wait on the flow tree construction phase to be 
rewritten to be incremental. I would advise against making sweeping 
changes to flow tree construction in the current framework; better to 
just push on the flow tree construction rewrite to be bottom-up, 
parallelizable, and to propagate dirty bits efficiently (i.e. be 
incremental). However, as I said I think constraint solving changes, 
especially if they're just to kernel functions, are fine.


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] The road to Acid2

2013-08-27 Thread Patrick Walton

On 8/27/13 1:48 AM, Simon Sapin wrote:

Maybe also writing modes / vertical text?

Even if the actual feature is not a priority, it might be easier to
rewrite the layout code to use logical keywords (eg. measure/extent
instead of width/height) while we don’t have much layout code.

http://dev.w3.org/csswg/css-writing-modes/
http://dev.w3.org/csswg/css-writing-modes/#abstract-box



Agreed. I actually filed a bug for this yesterday:

https://github.com/mozilla/servo/issues/799

Patrick
___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


[dev-servo] Ideas for incremental flow tree construction

2013-08-27 Thread Patrick Walton

Hi everyone,

I wrote up some rough ideas for parallel, incremental flow tree 
construction in Servo. Please feel free to add or remove data, poke 
holes in it, etc. :)


https://github.com/mozilla/servo/wiki/Incremental-flow-tree-construction

Patrick
___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] team meeting agenda bashing

2013-08-31 Thread Patrick Walton

On 8/30/13 6:01 PM, Jack Moffitt wrote:

I'm working on the agenda for the upcoming team meeting in Mountain View.

What do people think about focusing our efforts getting some subset of
Dromaeo working? I suspect this would involve implementing DOM APIs
and measuring performance.


Benchmarks of browsers tend to be pretty disappointing: they tend to 
overfocus on JavaScript (which is totally irrelevant for Servo, at least 
at this stage) and canvas and WebGL performance (which are not relevant 
now, and also not particularly interesting for us).


Dromaeo is pretty old and is basically SunSpider.

Peacekeeper may be better, as it has some DOM and rendering stuff, 
although it's also full of JavaScript:


http://hg.mozilla.org/users/mpalmgren_mozilla.com/peacekeeper/file/1f813e4a8ff5/js/conf.js

There is RoboHornet, but it's basically failed at gaining any traction:

http://www.robohornet.org/

There is the maze solver benchmark which might be fun to optimize for. 
This is a selector matching benchmark AIUI. People actually do run this one:


http://ie.microsoft.com/testdrive/performance/mazesolver/

Patrick
___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] Partial layout worth thinking about?

2013-09-10 Thread Patrick Walton

On 9/10/13 12:31 PM, Boris Zbarsky wrote:

I generally consider it pretty intractable, but the Blink  folks are
trying it: http://code.google.com/p/chromium/issues/detail?id=283623

The idea is that a layout flush should only do layout until the part we
care about is stable.  Or something.


Could we do the particular optimization that Chrome is doing in Servo by 
just threading "what part of layout are we interested in?" (the 
LayoutGoal in Servo terms) throughout layout and sending the answer over 
a channel as soon as we know it?


For example, if we want offsetWidth of a div, then we could make the 
LayoutGoal something like "offsetWidth of div so-and-so" and then as 
soon as we know the answer send it to the content task. We would 
continue doing the layout as usual in the layout thread.


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] Partial layout worth thinking about?

2013-09-11 Thread Patrick Walton

On 9/11/13 9:12 AM, Boris Zbarsky wrote:

Or bail out of it, either way.  For the use cases here bailing out would
be better, since people tend to do this sort of thing in a loop.


Yes. But ideally it would be nice to be able to resume layout where you 
left off. According to the bug, Chrome's implementation doesn't do this 
and requires layout to be restarted the next time around. This would 
regress performance in the case in which script queries `offsetWidth` 
and then the unchanged page is laid out for rendering, since two layouts 
(one partial, one full) must be done.


I think solving this the right way (i.e. in a way that doesn't regress 
perf for the above case) is much easier in a multithreaded architecture 
like Servo than in a single-threaded architecture like Blink's, since 
suspending the state of the layout task is easy. The tricky bit is just 
being able to bail out of layout at any point. We discussed this 
yesterday and it may not actually be that hard to do on a coarse-grained 
level: just insert a potential bailout point after each traversal.



But again, the hard part is knowing when you have a final value for the
thing of interest.


Well, the easiest, but imprecise, thing to do would be to send the 
message after the node is visited during the final assign-heights 
traversal. No more layout computation happens for each node at that 
point. (This final traversal roughly corresponds to 
`FinishAndStoreOverflow()` in Gecko AIUI, though I'm not intimately 
familiar with Gecko's reflow algorithm.) Of course this is very 
conservative, but it could still avoid performing assign-heights (which 
is where shaping is performed) for nodes later in the document than the 
node of interest.


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] Partial layout worth thinking about?

2013-09-11 Thread Patrick Walton

On 9/11/13 9:35 AM, Boris Zbarsky wrote:

This fails if the node is inside an overflow:auto block that does
multipass layout to figure out which scrollbars to show: in that
situation there will be multiple FinishAndStoreOverflow() calls in Gecko
and presumably multiple assign-heights traversals in servo...


Not with overlay scrollbars. :)

Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


[dev-servo] Parallel layout in Blink

2013-09-12 Thread Patrick Walton
Interesting work on Blink (thanks to Jack): 
https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/-TBnz3LJiGY


This overlaps with much of the layout and parallel constraint solving 
stuff that Servo is doing. I am not sure that performing parallel layout 
purely on the block level will provide enough fine-grained parallelism 
wins, but maybe.


Note that I'm somewhat skeptical of the suggestion made in that thread 
regarding running layout in a separate task--namely, that it's not worth 
it because current Web content won't benefit from it. Web authors are 
not able to reap the benefits of parallel layout partially *because* all 
browser engines are single-threaded. But with a multithreaded engine, we 
can enable new kinds of Web content to benefit by introducing APIs such 
as `getBoundingClientRectAsync` and moving authors away from the 
existing synchronous APIs.


Patrick
___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


[dev-servo] Fwd: Re: Fast path for text shaping--worth it?

2013-09-17 Thread Patrick Walton
cc-ing the list here. This is relevant to the "shaping fast path" 
discussion that came up in the meeting.


Patrick

 Original Message 
Subject: Re: Fast path for text shaping--worth it?
Date: Tue, 17 Sep 2013 09:02:34 +0100
From: Jonathan Kew 
To: Patrick Walton 
CC: John Daggett 

On 17/9/13 01:21, Patrick Walton wrote:

Hi Jonathan,

Samsung pointed out that WebKit has a fast path for text shaping for
text runs that are "simple" (all Latin text, I guess?) that does not
call into HarfBuzz to perform shaping; instead such runs are passed
directly to the underlying graphics subsystem to render. Is this
something that Gecko has an equivalent for, and is it worth it in your
view?

Thanks!
Patrick


cc-ing John Daggett, who is currently looking into text performance in
various ways

In Gecko, we used to have a distinction, at least on some platforms,
between a "fast" path that bypassed text-shaping features such as
ligatures and kerning, and a "quality" path that implemented these
features. We've dropped that (a couple years back, iirc?) so that all
text now goes through a "quality" path and receives the same layout
treatment.

The main way we try to address the performance impact at the moment is
by caching "shaped words" on a per-font basis, to avoid the cost of
re-shaping the same word repeatedly (either when it occurs many times on
the page, or when we need to re-flow the page for any reason). Largely
because of this caching, I suspect, we didn't see any significant
performance impact when we simplified the text system to only have the
"quality" path.

In practice, I think very little "normal" web content would be eligible
for a no-shaping fast path, if we want to maintain rendering quality;
we'd need to check that the font being used does not include any layout
features (including simple ligatures such as "fi" and "fl", or kerning
for pairs like "To" or "AV") that apply to the text in question. Most
fonts these days - even for plain Latin text - do have such features, so
they'd need the harfbuzz path.

For a simple example, compare the rendering on OS X of

  data:text/html,AWAKE! To Tell
You The Truth

between Firefox (which applies the kerning defined in the font) and
Chrome/Safari (which don't) -- the Firefox rendering looks way better.
You can force Chrome or Safari to match Firefox here by adding the
"hint" text-rendering:optimizeLegibility to the style, but our position
is that authors should not have to do this in order to get good text
rendering; it should be the standard behavior.

Determining exactly what text could safely be sent through a "fast path"
that bypasses harfbuzz shaping *without* loss of quality is tricky -
except for the corner case where the font simply does not have any
layout features at all. But in that case harfbuzz itself will be
relatively fast too, as it won't find much work to do. It's probably
true that we could get a (small) perf win by taking some shortcuts if we
detect that the text is purely 8-bit (hence no complex-script or
combining characters present) *and* the font being used has no layout
features to apply, but IMO the benefits in practice are likely to be
sufficiently minor (and rare) that it's unlikely to be worth the added
complexity.

JK



___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] 9/23 meeting - leaks, crashes, reflow performance, demos

2013-09-24 Thread Patrick Walton

On 9/23/13 11:57 AM, Boris Zbarsky wrote:

It's hard to say more here without seeing the actual testcase being
measured.


Here it is:

https://gist.github.com/pcwalton/6695691

I deliberately crafted it to avoid shaping and floats so that, to the 
greatest extent possible, it's testing the core block reflow algorithms 
only. (There is still some whitespace in it though.)


BTW, with my patches (both to Servo and to the Rust compiler) I have 
sequential Servo within 16% of Gecko (Gecko Reflow() is 398ms for me 
while Servo is 462ms, comprised of 166ms assign-heights + 150ms 
assign-widths + 146ms bubble-widths). The remaining major issues are the 
lack of a real CSS library for querying styles (this is the biggest one) 
and too many dynamic borrows (which is partially a missing feature in 
Rust that I plan to propose--shared fields--and partially an 
unimplemented one--borrowing `@mut Trait` to `&mut Trait`). I suspect a 
few percent can still be squeezed out with micro-optimizations. 
Furthermore, there may well be algorithmic improvements we can look at. 
(In particular I am not sure that guessing widths during the 
bubble-widths phase is optimal.)


Patrick
___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] 9/23 meeting - leaks, crashes, reflow performance, demos

2013-09-26 Thread Patrick Walton

On 9/25/13 3:09 PM, L. David Baron wrote:

So I think it's probably possible to do a good bit better than
current Gecko.  And I also think it might be worth seeing how Gecko
does on your test before/after the 2004 margin collapsing changes.


I just did some more work to optimize our method dispatch and 
representation of flows.


I realized that in most of my tests we were counting two Servo 
from-scratch reflows instead of one, making the numbers seem twice as 
bad as they were (as Servo has no incremental reflow yet). Here are my 
numbers:


Warm Servo: 83ms assign-heights + 72ms bubble-widths + 63ms 
assign-widths = 218ms

Warm WebKit (Safari 5): 2 calls to layout(), total 282ms
Warm Gecko (Nightly 27): 2 calls to Reflow(), total 287ms

WebKit seems to have a lot of variability. Sometimes I see +/- 50ms. 
Servo and Gecko are pretty consistent.


In any case we *seem* to be beating WebKit and Gecko by about 30% in 
each case. This is of course very early and could be totally wrong in 
myriad ways, as is always the case with benchmarks.


Patrick



Re: [dev-servo] update layout-free CSS from workers

2013-10-13 Thread Patrick Walton

On 10/13/13 10:57 AM, Dave Herman wrote:

Interesting proposal from Google:

https://github.com/ianvollick/animation-proxy/blob/master/Explainer.md


Some random thoughts:

* I guess the obvious cross-thread GC problem is remedied by making 
animation proxies weak, so they decay to no-ops when their DOM objects die.


* I foresee tricky synchronization problems to keep the DOM objects that 
proxied objects in workers refer to in sync with what's happening on the 
main thread. In Gecko (and Servo) layers can appear or vanish in 
response to various events that are main-thread-specific.


* This is very short-term thinking, in that it handles only one case 
(animations), and lacks the capability to respond to events. For 
example, for touch-sensitive scrolling you want the ability to stop the 
scrolling immediately when a touch-up event comes in. But events are 
sent to the main thread, not the worker thread. So in the case of main 
thread contention there will be lag as the event gets routed to the 
worker thread and then the animation proxy stops. This could be fixed by 
making an "event proxy" or something, but this is a slippery slope 
toward an unmaintainable, unwieldy API.


* For the reason above and similar reasons, hacks like this are not the 
way native apps achieve their responsiveness. Native apps are fast 
because they have every native event handler execute very quickly (in 
under 16ms for 60 FPS), despite the fact that they execute on the main 
thread. Current browser engines are not suited to this because they run 
lots of things on the main thread that don't have to be there (for 
example, layout). I would rather fix *that* issue and allow Web apps to 
be architected like native apps, instead of piling hacks upon hacks to 
allow the Web to approximate native apps. Chromium seems uninterested in 
some aspects of this, like allowing layout to run on a separate thread, 
however...


For these reasons I'm lukewarm--I'm OK with this as a temporary hack, 
but ultimately we need to accept that getting rid of the contention on 
the script task by multithreading our browser engines is the only way 
we're going to achieve the performance we need.


Patrick



Re: [dev-servo] update layout-free CSS from workers

2013-10-13 Thread Patrick Walton
This did make me realize something, however: if there is a subset of 
CSS properties that can be changed from a worker thread, then those 
layout properties can be changed from the main thread as well, without a 
trip through layout. In other words: When changing something like 
transform, if the node in question is layerized, we could talk directly 
to the compositor from the script task. One could imagine at least some 
subset of events also working this way, for example scrolling: the 
compositor could send a scroll event directly to the script task if it 
knows about a scrollable frame and the DOM node it maps to. If layout is 
asynchronous and the script task is not blocked on it, then it can 
respond directly to those events on the main thread, as native apps do.
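A rough sketch of this idea in today's Rust, to make it concrete (this is not Servo's actual API; `LayerId`, `CompositorMsg`, and the message fields are invented for illustration):

```rust
use std::sync::mpsc;

// Invented types for the sketch: the script task updates a layerized
// node's transform by messaging the compositor directly, with no trip
// through layout.
type LayerId = u32;

enum CompositorMsg {
    SetTransform { layer: LayerId, translate_x: f32 },
    Exit,
}

// The compositor applies transform updates as they arrive; here we just
// record them so the effect is observable.
fn run_compositor(rx: mpsc::Receiver<CompositorMsg>) -> Vec<(LayerId, f32)> {
    let mut applied = Vec::new();
    for msg in rx {
        match msg {
            CompositorMsg::SetTransform { layer, translate_x } => {
                applied.push((layer, translate_x));
            }
            CompositorMsg::Exit => break,
        }
    }
    applied
}

// The script task skips layout entirely for an already-layerized node.
fn script_task(to_compositor: mpsc::Sender<CompositorMsg>) {
    to_compositor
        .send(CompositorMsg::SetTransform { layer: 7, translate_x: 100.0 })
        .unwrap();
    to_compositor.send(CompositorMsg::Exit).unwrap();
}
```

In a real engine the non-layerized case would fall back to scheduling a reflow; the point of the sketch is only that the fast path never touches layout.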


This strikes me as a technically better solution, because it provides 
some mechanism for receiving events, as well as all the other 
main-thread-only APIs (such as reading from the DOM!) in the thread 
that's responsible for the animation. It's also hugely more usable to 
Web authors, because it doesn't require use of Web workers and the 
message passing that comes with it to get good performance. The 
downsides are that it requires Web authors to not use the blocking 
layout APIs to avoid shooting themselves in the foot, and that it 
requires Servo's off-main-thread-layout design, which Chromium seems to 
have unfortunately decided against.


Patrick


Re: [dev-servo] update layout-free CSS from workers

2013-10-13 Thread Patrick Walton
That's a good point. The additional functionality that Google's proposal 
enables over the already-quite-powerful CSS Animations seems pretty small. The 
document brings up scrolling, but as I argued before, scrolling needs events, 
which this proposal does not address. So this really adds very little in 
practical terms over CSS animations except more CPU consumption and GC pauses.

I sympathize with the "extensible Web" part of this, but it seems to me that 
CSS animations are *already* quite low-level and extensible. What this proposal 
allows is procedural computation for the animations, which might be useful for 
things like complex particle simulations, but this won't be enough for the 
really important use cases from a UI perspective.

Ultimately, as I said before, I think there is no sidestepping the need for a 
truly multithreaded browser engine to eliminate jank.

Patrick

Andreas Gal  wrote:
>
>How does this relate to the async animations we are already performing
>on the compositor thread today? Isn't that sufficient for most
>animations people want? (transform, opacity).
>
>Andreas
>
>On Oct 13, 2013, at 1:43 PM, Patrick Walton 
>wrote:
>
>> This did make me realize something, however: If there are a subset of
>CSS properties that can be changed from a worker thread, then those
>layout properties can be changed from the main thread as well, without
>a trip through layout. In other words: When changing something like
>transform, if the node in question is layerized, we could talk directly
>to the compositor from the script task. One could imagine at least some
>subset of events also working this way, for example scrolling: the
>compositor could send a scroll event directly to the script task if it
>knows about a scrollable frame and the DOM node it maps to. If layout
>is asynchronous and the script task is not blocked on it, then it can
>respond directly to those events on the main thread, as native apps do.
>> 
>> This strikes me as a technically better solution, because it provides
>some mechanism for receiving events, as well as all the other
>main-thread-only APIs (such as reading from the DOM!) in the thread
>that's responsible for the animation. It's also hugely more usable to
>Web authors, because it doesn't require use of Web workers and the
>message passing that comes with it to get good performance. The
>downsides are that it requires Web authors to not use the blocking
>layout APIs to avoid shooting themselves in the foot, and that it
>requires Servo's off-main-thread-layout design, which Chromium seems to
>have unfortunately decided against.
>> 
>> Patrick

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.


Re: [dev-servo] update layout-free CSS from workers

2013-10-13 Thread Patrick Walton

On 10/13/13 6:23 PM, Robert O'Callahan wrote:

Gecko (and Webkit/Blink) already do stuff like that, insofar as scripted
transform changes and changes to certain other properties do not cause
reflow. When there are multiple changes, some of which require reflow
and some of which don't, you need to wait for the reflow to happen
because otherwise you could render an invalid frame.


Hmm. So I guess this means that animation proxies actually have behavior 
that's different from setting the property directly: they can cause 
properties of the layer to change immediately, irrespective of any 
pending reflows.


This seems like a useful capability to have. In fact it seems to me that 
it's somewhat orthogonal to off-main-thread animations. In a deeply 
multithreaded engine like Servo layout happens in the background, so it 
may be useful to have an operation like "update this transform now, I 
don't care whether the contents of the layer are up to date".



The biggest downside is that if the main thread gets busy with script
activity, the animation janks.


Sure. But I think that we needn't presuppose that script activity is 
unpredictable and uncontrollable by the Web app. In the native app 
world, the mantra is "put everything long-running into a background 
thread, and use the main thread only for actions that you know will be 
fast". I don't see any reason why this idea is incompatible with Web 
apps, and in fact I don't see any other way for performant Web apps to 
be structured, for reasons such as the fact that events are delivered to 
the main thread.


One of the high-level goals of Servo is to have nothing happening on the 
script task besides scripts under the control of the Web app (by running 
layout and cross-domain tabs and iframes in the background). If we 
accomplish this, then I think Web authors can "take the main thread 
back": there should be nothing running on the script thread except 
JavaScript that the Web developer explicitly wrote. With that and an 
easy story for Web developers to punt long-running computations onto a 
background thread, we should have a workable, proven solution for 
responsive apps.


Patrick



Re: [dev-servo] update layout-free CSS from workers

2013-10-13 Thread Patrick Walton

On 10/13/13 6:20 PM, Robert O'Callahan wrote:

On Sun, Oct 13, 2013 at 4:03 PM, Patrick Walton <pwal...@mozilla.com> wrote:

* I foresee tricky synchronization problems to keep the DOM objects
that proxied objects in workers refer to in sync with what's
happening on the main thread. In Gecko (and Servo) layers can appear
or vanish in response to various events that are main-thread-specific.


That's not a problem; the creation of an animation proxy would force
layerization of an element.


Didn't we resist allowing Web authors to force layerization in the past, 
out of fear that they would use it everywhere?


Patrick



Re: [dev-servo] update layout-free CSS from workers

2013-10-13 Thread Patrick Walton

On 10/13/13 3:30 PM, Brendan Eich wrote:

Patrick Walton wrote:

Ultimately, as I said before, I think there is no sidestepping the
need for a truly multithreaded browser engine to eliminate jank.


Has this been rejected just for now, in a pragmatic way, or more
decisively (for a longer term), by Blink/chromium people? Appreciate any
links.


https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/-TBnz3LJiGY

"From this data, we concluded that we might as well block the main 
thread during layout because the main thread is going to synchronously 
need access to layout data soon anyway."


No offense to Chrome folks, of course, but I think the methodology was 
somewhat suspect: they went to the top X Web sites and, if there was a 
search box, entered "cats" into it. This is (a) an event-heavy 
sequential bottleneck; (b) not particularly interactive (no scrolling or 
anything more interesting than entering text); (c) only looking at 
existing content and patterns, not new content that authors could create.


Patrick


Re: [dev-servo] New style system and regressions

2013-10-23 Thread Patrick Walton
Huge kudos! This is an extremely exciting milestone. Thanks so much for all 
your hard work!

Patrick

Simon Sapin  wrote:
>Hi dev-servo,
>
>The new style system (CSS parsing, selector matching, cascading, 
>inheritance and computed values) that I’ve been working on for a few 
>months is finally here!
>
>This morning, we landed the final changes to switch the rest of Servo 
>over and remove NetSurf’s libcss. Please report on github any
>regression 
>you find. I’m already aware that selector parsing is broken, I’ll look 
>into this next.
>
>Cheers,
>-- 
>Simon Sapin



Re: [dev-servo] 10/28 meeting - os x, rustpkg, ssl, interning, retina, reflow performance, android compositing

2013-10-29 Thread Patrick Walton

On 10/29/13 8:52 AM, Simon Sapin wrote:

On 29/10/2013 at 15:06, Josh Matthews wrote:

https://github.com/mozilla/servo/wiki/Meeting-2013-10-28



Retina display
P: In the short term, we should change the Au that CSS reports.


I disagree. Au as I understand it is one 60th of CSS px, always.

High-DPI displays change the relationship between CSS px and device
pixels, exactly like zooming. Shouldn’t it just be different default zoom?


And different window size, I guess. Sure, that makes sense.
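To make the invariant concrete, here is a minimal sketch: Au stays fixed at 1/60 of a CSS px, and a Retina display only changes the CSS-px-to-device-pixel ratio, exactly like zoom. (The 1/60 constant matches the figure above; the function names are mine, not Servo's.)

```rust
// Au is always 1/60 of a CSS px; hidpi only changes the ratio of CSS px
// to device pixels, exactly like zoom. Function names are illustrative.
const AU_PER_CSS_PX: i32 = 60;

fn au_from_css_px(px: i32) -> i32 {
    px * AU_PER_CSS_PX
}

fn device_px_from_au(au: i32, device_pixel_ratio: f32) -> f32 {
    (au as f32 / AU_PER_CSS_PX as f32) * device_pixel_ratio
}
```

So a 10-CSS-px length is 600 Au on every display; only the final mapping to device pixels differs between a 1x and a 2x screen.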

Patrick



Re: [dev-servo] TreeNode and the script/style split

2013-10-30 Thread Patrick Walton

On 10/30/13 3:26 PM, Jack Moffitt wrote:

Right now we have this TreeNode abstraction that has been around for a
long time and has lost its original purpose (whatever that was). Its
only job right now appears to be to break the cyclic dependency
between the script and style crates.

Does anyone have strong feelings as to whether this is a worthwhile
reason to keep TreeNode around? If we get rid of it, I assume that
style will have to live in the script crate as it did before.


As the author of that code: Nuke it from orbit!

Patrick



Re: [dev-servo] Segmentation fault is caused by 'No appropriate framebuffer config found!'

2013-10-31 Thread Patrick Walton

On 10/31/13 12:27 AM, Deokjin Kim wrote:

Dear all,

When I executed servo, I met segmentation fault. It's caused by 'No
appropriate framebuffer config found!'.


We've seen problems with some GPU drivers on Linux not being able to 
handle binding X pixmaps to RGBA textures. I assumed that in 2013 
everyone would support this seemingly basic feature but that does not 
seem to be the case :(


I see three options:

1. EGLStream. Maybe that will work on those GPUs.

2. Investigate if there's some way to hack around it (using RGB pixmaps 
but somehow storing and reconstructing the alpha channel maybe?)


3. Shared memory with texture upload on the compositor side.

#3 is guaranteed to work, but has bad performance implications (though 
no worse than the browsers of today).


For now we can back off to RGB textures on those GPUs. This obviously 
won't work in the long run, though.


Patrick



[dev-servo] Event loop and compositor communication

2013-11-01 Thread Patrick Walton

Hi everyone,

Currently Servo's event loop is a poll model on a fixed timer. This is 
done for historical reasons: originally the Rust runtime did not 
interoperate very well with native UI event loops and we had no choice. 
This is very bad—it precludes putting the browser engine into production 
for countless reasons and actually hinders our results when comparing 
performance against other engines because the polling chews CPU that 
could be used for other things. And, as we know with Gecko, event loop 
technical debt accumulates interest very rapidly. So I'd like to fix 
this soon.


After talking some with Alex Crichton, who has been responsible for much 
of the modern Rust I/O library, I think that there's a reasonable path 
forward that also dovetails quite nicely with the plans for 
multi-process Servo. Specifically, these are the steps I'm considering:


1. Enforce the `Encodable` bound on the compositor-to-renderer messages, 
adding serialization implementations where necessary. This is much 
easier in Rust than in C++ because (a) the compiler tells you where you 
need the implementations; (b) with the `deriving` feature the compiler 
will auto-generate the necessary serialization code for most data types 
for you. (Unlike slow reflection-based systems, the automatic 
serialization mechanism of Rust is based on generating highly optimized 
code at compile time, so it should be fast enough for our purposes.)


2. Change the compositor-to-renderer protocol to run over a local Unix 
socket instead of a Rust pipe. Because Unix sockets are integrated into 
`libuv` and the Rust standard library now has idiomatic bindings for 
them, this should be fairly straightforward. (When we do a Windows port 
this will be replaced with Windows' equivalent; I think `libuv` supports 
Windows Named Pipes or whatever.)


3. Attach the file descriptor for the compositor's end of the pipe to 
the compositor event loop, and eliminate the polling.


Thus at the end of this, the renderer end of the Unix socket will be in 
`libuv` land, and the compositor end will be in the OS's native event loop.


The interesting thing about this approach is that we take out two 
problems in one fell swoop: the event loop integration issue and the 
multiprocess/sandboxing issue. The remaining issues for sandboxing are 
(a) adding the code to spawn the processes, which will probably want 
Rust code; (b) actually dropping into the sandbox; (c) figuring out what 
we want to do with GL, which is probably "don't worry about it for now".


Does this sound reasonable to everyone?

Patrick


Re: [dev-servo] Event loop and compositor communication

2013-11-02 Thread Patrick Walton

On 11/2/13 8:26 AM, Niko Matsakis wrote:

At a high level, it seems reasonable, but I don't have a good grasp
over what kinds of messages are going over these channels and with
what frequency. That seems to be crucial with respect to judging
the perf effects of moving to a serialization scheme.


Note that Google Chrome and Safari already do this (in Chrome's case, 
with handwritten C++ code and a somewhat idiosyncratic event loop 
abstraction). I don't expect our perf to be substantially worse.


Patrick


Re: [dev-servo] Meeting 2013-11-04: mac builder failure, priorities (C++, layout, parallelism, rustpkg, embedding, graphics)

2013-11-05 Thread Patrick Walton

On 11/5/13 2:11 PM, Robert O'Callahan wrote:

I had a discussion with pcwalton and kmc yesterday about string interning.
The concurrent cuckoo hash table is cool but maybe not the right fit for
the browser workload for a couple of reasons:
a) interning lookups are mostly not on the hottest paths. They will occur
during parsing, and that can mostly happen off the main thread. Parsing
selectors for querySelectorAll etc would happen on the main thread (or at
least synchronously with it), but we cache parsed selectors in Gecko anyway
(as of recently). Also, we have the option of wrapping a thread-local cache
around the global intern table.
b) it's important to efficiently resize the table and shrink it by removing
unused atoms. That may not be easy to add to the concurrent cuckoo table.


Agreed. Also:

(c) We aren't really interested in a hash *table*, more of a hash *set*. 
So lookup speed of *values* doesn't actually seem to me to matter much.



Regarding new layout features, I think it's important to tackle some
architectural issues before adding more features:
-- Bidi and vertical text
-- Fragmentation (page/column/overflow:fragments breaking)
The former introduces abstractions that will be easier to introduce now
than any time in the future, since all your layout features will build on
them. The latter is a big architectural thing that will also be easier to
introduce now than in the future, and also may impact your design in
fundamental ways.


I think I have a handle on what needs to be done for the former, but the 
latter is a bit opaque to me. Maybe a tutorial on what we need to watch 
out for would be helpful :) I don't want to take up too much of 
your/dbaron's time, but it'd be great to know what we need to watch out 
for to avoid baking in bad assumptions.


Patrick



[dev-servo] Preliminary parallel-layout branch

2013-11-27 Thread Patrick Walton

Hi everyone,

I have a preliminary parallel-layout branch here:

https://github.com/pcwalton/servo/tree/parallel-layout

Currently it only runs bubble-widths (computation of minimum and 
preferred widths) in parallel, and some things (such as text) are turned 
off due to not being thread-safe yet. (They use @ boxes, so I need to 
clean them up before this can land.) In the meantime I have some 
preliminary numbers on `perf-rainbow.html` in the tests directory (a 
page full of nested divs). Tested on a 4-way HyperThreaded 2.7 GHz Core 
i7 with 6 worker threads:


Sequential bubble-widths: mean 29.01ms, median 32.58ms
Parallel bubble-widths: mean 16.14ms, median 17.53ms
- Parallel portion: mean 13.47ms, median 14.59ms
- Sequential warmup portion: mean 2.67ms, median 2.94ms

The median parallel bubble-widths is 86% faster than the sequential 
bubble-widths. Also, I believe these numbers are reasonably realistic, 
because our reflow traversals are already comparable to Gecko, WebKit, 
and Blink in sequential performance. Thus this represents a significant 
speedup over the current state of the art, if these numbers are accurate.


Keep in mind that these numbers are very preliminary and are *highly* 
subject to change. They could well be entirely wrong, so please don't 
read too much into them.


On very small pages (where reflow is < 1 ms) the overhead swamps these 
parallel traversals and I observed 10x slowdowns. However, I'm not too 
concerned about these pages because if reflow is less than 1 ms we will 
usually not have much trouble hitting 60 FPS anyhow. We can back off to 
sequential for those cases with the same code. (This is the advantage of 
our "kernel function" approach.)


Patrick


Re: [dev-servo] Removing shared boxes from the DOM

2013-12-03 Thread Patrick Walton

On 12/3/13 12:07 AM, Josh Matthews wrote:

Ms2ger and I have been working on this on and off, and the Event
hierarchy is looking very nice so far:
https://github.com/jdm/servo/commits/jsmanaged . It even builds and
passes tests, so we should be able to continue converting this
piece-by-piece. There is an absolute minimum amount of boilerplate
required now, which is lovely.


I just got a chance to look this over.

* Is `transmute()` safe? It doesn't look safe at a glance. Perhaps it 
should be marked as an unsafe function?


* `mut_val()` may need a dynamic borrow check to be memory safe. 
Probably we should wait on the new `Cell` to fix this, but it warrants a 
FIXME.


Patrick



Re: [dev-servo] Removing shared boxes from the DOM

2013-12-03 Thread Patrick Walton

On 12/3/13 12:54 AM, Josh Matthews wrote:

I don't entirely understand what this means, either in theory or in
practice.


So the issue is that the Rust compiler assumes a couple of things around 
`&mut`:


1. No `&mut` aliases any `&` (because then `&` wouldn't be immutable).

2. No `&mut` aliases any other `&mut` (i.e. no two `&mut`s to the same 
location).


Violating this will lead to undefined behavior, in particular iterator 
invalidation. So these rules need to be enforced dynamically.


The easiest way, and the way we'd like to do it in the future, is to not 
have mutable borrows of reference-counted or garbage-collected boxes, 
and to 
instead wrap individual fields that should be mutable in the new 
`RefCell` (today called `Slot` in Servo), which exposes convenient 
`.get()` and `.set()` methods. But the problem is that `RefCell`/`Slot` 
currently bloats the field by adding an extra byte. I have a plan to fix 
that for most fields, but this is not yet implemented. So it's probably 
best to just `FIXME` it for now.
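In today's names, the per-field approach looks like this (`Slot` corresponds to what is now `RefCell`, which carries the extra borrow flag; `Cell` suffices for Copy fields with no flag at all; the `Node` fields are invented for illustration):

```rust
use std::cell::{Cell, RefCell};

// Shared DOM-ish struct: wrap the mutable fields instead of taking
// `&mut` to the whole thing.
struct Node {
    child_count: Cell<u32>,          // Copy data: plain get/set
    class_list: RefCell<Vec<String>>, // non-Copy: dynamic borrow check
}

// try_borrow_mut() is the non-panicking form of the dynamic check that
// enforces the two &mut aliasing rules at runtime.
fn add_class(node: &Node, class: &str) -> bool {
    match node.class_list.try_borrow_mut() {
        Ok(mut classes) => {
            classes.push(class.to_string());
            true
        }
        Err(_) => false, // an outstanding borrow exists
    }
}

fn note_child_added(node: &Node) {
    node.child_count.set(node.child_count.get() + 1);
}
```

The dynamic check is what rules out iterator invalidation: a mutable borrow is refused while any other borrow of the same field is live.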


Patrick


Re: [dev-servo] Removing shared boxes from the DOM

2013-12-05 Thread Patrick Walton

On 12/5/13 9:05 AM, Josh Matthews wrote:

In my mind, the biggest improvement here is that we can actually have
lists of JSManaged<Element>, whereas before we could only store
~[AbstractNode] with the handwave-y guarantee that all of the nodes
should also be elements. I also find the new checked casts much nicer to
read (and yes, the downcasts assert at runtime if the value passed is
not an instance of the desired type).


This is great stuff, thanks. Have you given thought to how the rooting 
API would work? That's one of the last major pieces to making the Servo 
DOM type- and memory-safe.


Patrick



Re: [dev-servo] Removing shared boxes from the DOM

2013-12-05 Thread Patrick Walton
Right, it's not done yet. This is a nicer workaround than the workaround we had 
before, essentially.

Patrick

Bobby Holley  wrote:
>To clarify, this stuff doesn't use the new inheritance stuff that's
>going
>into Rust, right? I assume that stuff isn't done yet?
>
>
>On Thu, Dec 5, 2013 at 9:05 AM, Josh Matthews 
>wrote:
>
>> For those who just want to see a before/after summary, here's an
>example.
>>
>> BEFORE: we have to handwrite conversion routines for each downcast or
>> upcast we want to use. Furthermore, we have the ugly
>> AbstractDocument/AbstractNode/AbstractEvent/AbstractEventTarget
>types, in
>> addition to all of the concrete @mut SomeDOMType that are scattered
>> everywhere.
>>
>> impl AbstractNode {
>>  /// Allow consumers to upcast from derived classes.
>>  pub fn from_document(doc: AbstractDocument) ->
>AbstractNode {
>>  unsafe {
>>  cast::transmute(doc)
>>  }
>>  }
>>
>>  pub fn from_eventtarget(target: AbstractEventTarget) ->
>> AbstractNode {
>>  assert!(target.is_node());
>>  unsafe {
>>  cast::transmute(target)
>>  }
>>  }
>> }
>>
>> 788 let doctarget =
>AbstractEventTarget::from_document(document);
>> 789 let wintarget = AbstractEventTarget::from_window(window);
>> 790 window.eventtarget.dispatch_event_with_target(wintarget,
>> Some(doctarget), event);
>>
>> 307 if child.is_element() {
>> 308 do child.with_imm_element |elem| {
>> 309 if callback(elem) {
>> 310 elements.push(child);
>> 311 }
>> 312 }
>> 313 }
>>
>>
>> AFTER all casting methods are automatically generated; all that's
>left is
>> writing the is_foo method and ensuring that we can tell the concrete
>type
>> of a DOM object based on the root type in an inheritance chain:
>>
>> impl UIEventDerived for Event {
>> fn is_uievent(&self) -> bool {
>> self.type_id == UIEventTypeId
>> }
>> }
>>
>> let doctarget = EventTargetCast::from(document);
>> let wintarget = EventTargetCast::from(window);
>> window.eventtarget.dispatch_event_with_target(wintarget,
>Some(doctarget),
>> event);
>>
>> if child.is_element() {
>> let elem = ElementCast::to(child);
>> if callback(elem) {
>>   elements.push(elem);
>> }
>> }
>>
>> In my mind, the biggest improvement here is that we can actually have
>> lists of JSManaged<Element>, whereas before we could only store
>> ~[AbstractNode] with the handwave-y guarantee that all of the nodes
>should
>> also be elements. I also find the new checked casts much nicer to
>read (and
>> yes, the downcasts assert at runtime if the value passed is not an
>instance
>> of the desired type).
>>
>> Cheers,
>> Josh
>>
>>
>> On 12/03/2013 03:07 AM, Josh Matthews wrote:
>>
>>> Ms2ger and I have been working on this on and off, and the Event
>>> hierarchy is looking very nice so far:
>>> https://github.com/jdm/servo/commits/jsmanaged . It even builds and
>>> passes tests, so we should be able to continue converting this
>>> piece-by-piece. There is an absolute minimum amount of boilerplate
>>> required now, which is lovely.
>>>
>>> Cheers,
>>> Josh
>>>
>>> On 11/28/2013 10:54 AM, Josh Matthews wrote:
>>>
 I've finally got a sketch of my plans to remove all of the @mut
 annotations from Servo's DOM. You can see it at
 https://gist.github.com/jdm/7693770, but here's the breakdown of
>the
 improvements:

 * no more AbstractNode (\o/) - we can actually use JSManaged<Element>
 when we want to refer to something derived from Element now.
 * actually fulfils the contract that the SpiderMonkey GC owns the
>sole
 reference
 * no need to add as_foo methods for downcasting

 Breaking it down further, one of the biggest changes is in the
 implementation of downcasting and upcasting. Upcasting a
 JSManaged<Derived> to a JSManaged<Base> is a statically-checked
 operation that should be free at runtime. This is enforced by a
 series of empty traits that each derived class must implement
 (i.e. see EventTargetBase), such that any value passed to
 Bar::from() must be a type that implements BarBase. In a
 similar fashion, downcasting is also statically-checked (does the
 cast even make sense?), before performing a dynamic assertion (is
 this base class actually an instance of the derived type?).
 Therefore, when calling Foo::to(), the value passed must implement
 the FooDerived trait with an appropriate boolean is_foo method
 implemented.

 These casting changes may not look like a big improvement at first,
 but the important consideration is that the upcasting traits can be
 automatically generated completely from WebIDL definitions, so that
 boilerplate should be essentially free. For downcasting, each type
 that is on an inheritance chain wi
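[A hedged modern-Rust model of the scheme sketched above, for readers following along: the type tags and struct layout here are invented, and the real code is generated from WebIDL. Upcasting is a free field borrow; downcasting checks a type tag before a pointer cast that is sound only because the derived struct is #[repr(C)] with its base as the first field.]

```rust
// Invented type tag; the real ones are derived from the WebIDL hierarchy.
const UIEVENT_TYPE_ID: u32 = 1;

#[repr(C)]
struct Event {
    type_id: u32,
}

// Base as first field + repr(C) makes the downcast pointer cast sound.
#[repr(C)]
struct UIEvent {
    event: Event,
    detail: i32,
}

impl UIEvent {
    fn new(detail: i32) -> UIEvent {
        UIEvent { event: Event { type_id: UIEVENT_TYPE_ID }, detail }
    }
}

// Statically-checked upcast: free at runtime.
fn event_cast_from(ui: &UIEvent) -> &Event {
    &ui.event
}

// Dynamically-checked downcast: tag check, then pointer cast.
fn uievent_cast_to(event: &Event) -> Option<&UIEvent> {
    if event.type_id == UIEVENT_TYPE_ID {
        // Sound only if every Event tagged UIEVENT_TYPE_ID really is
        // the first field of a UIEvent, an invariant upheld by
        // construction in this sketch.
        Some(unsafe { &*(event as *const Event as *const UIEvent) })
    } else {
        None
    }
}
```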

Re: [dev-servo] Removing shared boxes from the DOM

2013-12-06 Thread Patrick Walton

On 12/6/13 10:12 AM, Niko Matsakis wrote:

I am not especially happy with these two changes. They feel hokey and
special purpose. I guess the best sol'n depends on the degree of static
safety we want. Without something like the `'return` lifetime, I'm not
sure how we can guarantee that `Root` values follow a stack discipline
-- and maybe we don't want to, because it does remove some flexibility
(e.g., roots in data structures that transitively are tied to the
stack and hence do follow a stack discipline).


Wouldn't these data structures be instead traced by the JS GC? ISTM if 
you are putting roots in a data structure it is best to make the data 
structure itself traced.


I think it's important to have this be safe in an ironclad sense--this 
is one of the most fundamental pieces of the Servo project--and as a 
result the `'return` lifetime is the solution I favor right now.
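One way to back the static story with an ironclad dynamic check, sketched in today's Rust (all names invented; Servo's eventual rooting design differed in details): tie each Root to its collection by a borrow, and assert the LIFO stack discipline on drop.

```rust
use std::cell::RefCell;

// The GC would scan `roots` during collection; here they are just ids.
struct RootCollection {
    roots: RefCell<Vec<usize>>,
}

// A Root cannot outlive its collection (enforced by the 'a borrow).
struct Root<'a> {
    collection: &'a RootCollection,
    addr: usize,
}

impl RootCollection {
    fn new() -> RootCollection {
        RootCollection { roots: RefCell::new(Vec::new()) }
    }

    fn root(&self, addr: usize) -> Root<'_> {
        self.roots.borrow_mut().push(addr);
        Root { collection: self, addr }
    }

    fn live_roots(&self) -> usize {
        self.roots.borrow().len()
    }
}

impl<'a> Drop for Root<'a> {
    fn drop(&mut self) {
        // Dynamic check that roots are dropped in LIFO (stack) order.
        let popped = self.collection.roots.borrow_mut().pop();
        assert_eq!(popped, Some(self.addr));
    }
}
```

Because Rust drops values in reverse declaration order within a scope, ordinary stack usage satisfies the assertion automatically; only roots smuggled into longer-lived structures would trip it.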


Patrick

___
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo


Re: [dev-servo] Removing shared boxes from the DOM

2013-12-06 Thread Patrick Walton
Devirtualization isn't necessary if we use unboxed closures.

Patrick

Niko Matsakis  wrote:
>On Fri, Dec 06, 2013 at 10:17:40AM -0800, Patrick Walton wrote:
>> Wouldn't these data structures be instead traced by the JS GC? ISTM
>> if you are putting roots in a data structure it is best to make the
>> data structure itself traced.
>
>This is not necessarily the case -- creating a new kind of GCThing is
>rather hard, and wrapping in an object is heavyweight compared to
>allocating a data structure on the stack. But thinking about this
>more, I think that Rust offers some nice solutions here.
>
>First off, SpiderMonkey has an AutoObjectVector (I think that's what
>it's called) that is used to store an arbitrary # of references. I
>haven't looked at how this is implemented, but I don't see how it
>could build on the Rooted types, so I guess it's some separate
>thing. If we can implement Rooted I imagine we can implement this.
>
>The only case where Rooted is really workable is when you have a
>statically known number of things to root (statically known for any
>given stack frame, at least). A simple example would be a struct
>with multiple fields:
>
>struct Context {
>a: RootedObject,
>b: RootedObject
>}
>
>Of course it's probably possible to have the compiler permit
>something like:
>
>let mut x = Context { a: root(), b: root() }
>
>even though the return value of `root()` is not being stored
>directly into an lvalue, but another option would be to use
>lifetimes:
>
>struct Context<'a> {
>a: &'a mut RootedObject,
>b: &'a mut RootedObject,
>}
>
>let mut a = root();
>let mut b = root();
>let mut x = Context { a: &mut a, b: &mut b };
>
>> I think it's important to have this be safe in an ironclad
>> sense--this is one of the most fundamental pieces of the Servo
>> project--and as a result the `'return` lifetime is the solution I
>> favor right now.
>
>I tend to agree that we need to get this right and safe. The fact that
>return has multiple use cases is appealing. However, it's also worth
>pointing out that once closures are also an ideal solution here,
>perhaps with some Haskell-do-like sugar to reduce rightward drift. The
>main problem is the fact that LLVM or the Rust compiler must
>devirtualize.
>
>
>Niko

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.


[dev-servo] Storing pointers to flows and boxes inside the DOM

2013-12-14 Thread Patrick Walton

Hi everyone,

I filed an issue with thoughts and ideas on how to do this in Servo 
going forward: https://github.com/mozilla/servo/issues/1412


Feedback and comments, either here or on the issue tracker, are very 
welcomed! Checking my reasoning and ideas for the TBD areas are 
particularly valuable.


Thanks!
Patrick


Re: [dev-servo] Storing pointers to flows and boxes inside the DOM

2013-12-16 Thread Patrick Walton
The flow representing the body is semantically reconstructed from scratch. Now 
that doesn't have to actually be what happens at the machine level--I have a 
TODO that we should reuse parts of flows where possible to accommodate 
efficient handling of cases like this. This shouldn't affect the safety of the 
design as long as the "flow-to-be-reused" is appropriately sanitized at the 
Rust type system level to avoid use-after-free of dead child flows/boxes.

Patrick



Boris Zbarsky  wrote:
>On 12/14/13 5:18 PM, Patrick Walton wrote:
>> Feedback and comments, either here or on the issue tracker, are very
>> welcomed! Checking my reasoning and ideas for the TBD areas are
>> particularly valuable.
>
>I'm trying to understand what "The number of primary boxes in each flow
>never changes throughout the flow's lifetime" means in practice.
>
>Say I have something like this:
>
>  <div>
>    <span>Text</span>
>    <span>Text</span>
>  </div>
>
>what happens when I remove one of those <span>s from the DOM?  What
>happens if I add a new <span> to the DOM?  What happens if I toggle
>"display" on one of those <span>s to "none"?
>
>-Boris

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.


[dev-servo] RFC: Rename "rendering" to "painting"

2014-01-30 Thread Patrick Walton

Hi everyone,

Preface: This is a total bikeshed.

I was wondering whether we might want to rename "render" to "paint". The 
issue is that "render" means different things in different browser 
engines and so the term is kind of confusing. As far as I can tell 
"render" in Gecko means either "layout" or "paint" depending on the 
circumstances. This is also true in WebKit, where "render" usually means 
"layout". But in Servo "render" means "paint" only, which is somewhat at 
odds with how the term is used in other engines.


So I'd like to propose renaming "render" to "paint" in Servo. More 
broadly, I was thinking about using these headings for the various 
rendering phases, which are a hopefully-clear intersection of the Gecko 
and WebKit terminology:


* "Layout" (from Gecko; WebKit calls this "rendering")
  - "Style recalc" (from WebKit; I'm not sure if Gecko has a name for 
this specifically)

o Initialization of layout data
o CSS selector matching/cascading
o "Dirty marking" (from Gecko)
  - "Reflow" (from Gecko; WebKit calls this "layout")
o Bubble widths (perhaps rename to "compute min/pref widths"?)
o Assign widths
o Assign heights/store overflow
  - Display list building (from Gecko; WebKit ports that use display 
lists call this "painting")

* "Painting" (from WebKit ports that do not use display lists)
  - Buffer preparation
  - Drawing
* Compositing

Thoughts?

Patrick


Re: [dev-servo] Speculative parallel CSS lexing

2014-02-04 Thread Patrick Walton

On 2/4/14 9:15 AM, Daniel Hackney wrote:

My message was delayed for a few months in the purgatory of moderation,
so the info in the original email is somewhat out of date. I wrote a
summary of my findings in the file "capstone.pdf" here:

https://db.tt/JNcSdrIe

The moral of the story is that I got up to a 2x speedup with around 10
tasks (in Rust 0.8). There is a sequential bottleneck somewhere,
probably when it is collecting the results from the separate tasks into
a single array. I've thought of some ways it might be possible to
parallelize that, but haven't implemented it.


This is very cool. Have you tried using the workqueue [1]? This is a new 
feature that we're using throughout Servo that reduces task spawning 
overhead for data-parallel operations. It also keeps the number of tasks 
down to a minimum, to reduce memory usage.


Eliminating the bottleneck from collecting the results from the separate 
tasks into one array might be accomplished by using a concurrent array 
data structure. (One doesn't exist for Rust yet, but it'd be an 
excellent opportunity to write one.) :)
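
For illustration, here is a plain-`std::thread` sketch of the pattern being discussed (not Servo's workqueue, and the "lexer" is just whitespace splitting): each task fills its own private vector, and the sequential bottleneck is the final splice of per-task results into one array.

```rust
use std::thread;

// Stand-in lexer: real CSS lexing replaced by whitespace splitting.
fn lex_chunk(chunk: &str) -> Vec<String> {
    chunk.split_whitespace().map(str::to_string).collect()
}

fn main() {
    let chunks = vec!["a { color:", "red; }", "b { margin: 0 }"];

    // One task per chunk; each task fills its own private Vec, so no
    // synchronization happens while lexing.
    let handles: Vec<_> = chunks
        .into_iter()
        .map(|chunk| thread::spawn(move || lex_chunk(chunk)))
        .collect();

    // The sequential bottleneck described above: splicing the per-task
    // results back into a single array, in task order.
    let tokens: Vec<String> = handles
        .into_iter()
        .flat_map(|handle| handle.join().unwrap())
        .collect();

    assert_eq!(tokens.len(), 10);
}
```

A concurrent array would let the final step write each task's slice into a pre-reserved region instead of copying sequentially.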


Patrick

[1]: 
https://github.com/mozilla/servo/blob/master/src/components/util/workqueue.rs




[dev-servo] Leaf set construction is probably unnecessary

2014-02-09 Thread Patrick Walton
(Copy and paste from GitHub issue #1650 because I figure mailing list 
discussion is potentially more fruitful than GitHub issues for design 
discussions.)


I realized this morning that we can probably avoid the need for a leaf 
set by intertwining selector matching and flow construction, and flow 
construction and intrinsic-width-bubbling, to some degree. This is 
similar to what Gecko and WebKit do, but potentially somewhat cleaner 
because they are still separate functions and the implementations are 
strictly separate (no bouncing back and forth to handle special 
incremental-reflow cases); the parallel driver just knows how to invoke 
them. This works thanks to the heterogeneous nature of the `WorkQueue`: 
it can accept heterogeneous tasks and can run them all in parallel.


* Once the selectors have been matched for a leaf node, we can 
immediately start constructing its flows. Just call `construct_flows` 
once the node has been matched.
* Trickier, but also likely possible: Once flows have been constructed 
for a leaf node, immediately call `bubble_widths` on it. This works 
because we always know when a flow is going to be a leaf since 
e579daefc2956a2eb151588b628c51342de236d0.
* Once `assign_widths` has been called on a leaf, immediately start 
assigning its heights via `assign_heights`.
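
As a sketch of the chaining idea, here is a single-threaded stand-in for the work queue (the `WorkItem`/`drive` names are hypothetical; the real `WorkQueue` would run these items in parallel). The point is only that a leaf enqueues the next phase for itself, so no leaf set is ever recorded.

```rust
use std::collections::VecDeque;

// Heterogeneous work items, as the message notes the WorkQueue can accept.
enum WorkItem {
    MatchSelectors(usize),
    ConstructFlows(usize),
}

// Single-threaded stand-in for the parallel WorkQueue; `children[n]` lists
// the child nodes of `n`. Returns the order in which flow construction ran.
fn drive(children: &[Vec<usize>]) -> Vec<usize> {
    let mut queue = VecDeque::new();
    queue.push_back(WorkItem::MatchSelectors(0));
    let mut flows_constructed = Vec::new();
    while let Some(item) = queue.pop_front() {
        match item {
            WorkItem::MatchSelectors(node) => {
                // ...selector matching for `node` would happen here...
                for &kid in &children[node] {
                    queue.push_back(WorkItem::MatchSelectors(kid));
                }
                if children[node].is_empty() {
                    // Leaf: chain straight into flow construction rather
                    // than recording the node in a leaf set.
                    queue.push_back(WorkItem::ConstructFlows(node));
                }
            }
            WorkItem::ConstructFlows(node) => flows_constructed.push(node),
        }
    }
    flows_constructed
}

fn main() {
    // Node 0 has children 1 and 2; both are leaves.
    assert_eq!(drive(&[vec![1, 2], vec![], vec![]]), vec![1, 2]);
}
```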


Assuming this works out, all parallel traversals will start from the 
root and go down, eliminating the need for a leaf set. We will probably 
still want a "backdoor" that sequentially computes bubble-widths for two 
reasons: (1) during incremental reflow, min/pref widths may have been 
invalidated without invalidating the flow; (2) it's easier to benchmark 
style recalc against Gecko and WebKit when it's not intertwined with 
intrinsic width calculation.


This would have numerous benefits:

1. Leaf set construction is expensive. On Wikipedia it's 16% of selector 
matching time on 4 cores. For comparison, that's the difference between 
getting a 2.4x speedup for selector matching and getting a 2.9x speedup 
on 4 cores.
2. Eliminates one or two parallel traversals, reducing overhead. (In 
particular the warmup phase will go to essentially zero.)
3. Eliminates the synchronization point between selector matching and 
flow construction, allowing better multicore utilization.
4. Eliminates the necessity of ensuring that DOM nodes in the leaf set 
are alive which will be a bit of a pain when we start doing incremental 
reflow.

5. Better memory usage since the leaf set data structures will go away.

Patrick


Re: [dev-servo] Leaf set construction is probably unnecessary

2014-02-09 Thread Patrick Walton
Interesting. That's a very good point that seems to argue in favor of keeping 
the distinction between the selector matching/cascading functions and the flow 
construction functions that we have, and making any optimization that combines 
the two passes operate at a higher level (as I was figuring we'd do anyway). 
That is, combining the two passes should be done with a routine that simply 
invokes the two functions one after another in the proper order, instead of 
ripping apart and intertwining the two functions together into one. That way we 
can have separate paths that invoke them separately without having to duplicate 
code.

Patrick

Boris Zbarsky  wrote:
>On 2/9/14 1:24 PM, Patrick Walton wrote:
>> (2) it's easier to benchmark
>> style recalc against Gecko and WebKit when it's not intertwined with
>> intrinsic width calculation.
>
>One note here.  It's not uncommon for pages to do things that need a
>style recalc (e.g. a getComputedStyle() call) but don't need any sort of
>layout at all, including intrinsic width calculation.
>
>In Gecko such calls do perform box construction right now but that's due
>to Gecko storing the nsStyleContext in the nsIFrame.  In an ideal world
>a pure-style flush would just recompute styles and not touch the box
>tree at all.  At least in Gecko, where you can't do the box tree bits in
>parallel with running the part of the script after the
>getComputedStyle() call.
>
>Gecko also has internal code that wants to trigger box construction but
>not any layout activity.
>
>Realistically, I think you want two slightly different kinds of style
>recalc.  The ones triggered by the refresh driver (or whatever servo's
>equivalent), which want to immediately segue into layout.  And the ones
>triggered by someone asking for up-to-date style information, probably
>because they plan to then change it.  Doing any unnecessary work here is
>probably a waste of time.
>
>-Boris

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.


Re: [dev-servo] 2/10 meeting notes (interning; embedding; beforeunload; acid2 status; testing status; parallelism status; easy bugs; rust upgrade; docs)

2014-02-11 Thread Patrick Walton

On 2/10/14 12:41 PM, Robert O'Callahan wrote:

What exactly do you mean by "font collection loading"? If you ask the right
questions I can explain how it works in Gecko :-).


Sure thing. Sorry for the delay here, I needed to profile and 
re-familiarize myself with Servo's code :)


I'm on a Mac, so I'll describe it in terms of those APIs. The hot 
font-related functions in Wikipedia flow construction are as follows:


* `CTFontGetBoundingBox` - called during frame construction for text 
boxes when we hit a font we haven't seen yet. I think this is for 
vertical align.


* `CTFontGetAdvancesForGlyphs`/`CTFontGetGlyphsForCharacters` - core 
shaping functions, called during text run construction, part of frame 
construction, as part of a HarfBuzz callback.


* `CTFontCollectionCreateMatchingFontDescriptors` - constructs the query 
used to ask the OS which fonts are available. This is ultimately called 
from Servo `FontContext::get_resolved_font_for_style()`, called during 
frame construction for text boxes.


* `CTFontManagerCopyAvailableFamilyNames` - gets the list of available 
families from the OS. Called when we create the thread-local font 
context. (There is one font context per frame construction thread.)


So I guess my question is: Where does Gecko call the analogues to these 
functions?


Patrick



Re: [dev-servo] 2/10 meeting notes (interning; embedding; beforeunload; acid2 status; testing status; parallelism status; easy bugs; rust upgrade; docs)

2014-02-11 Thread Patrick Walton

On 2/11/14 8:04 PM, Jack Moffitt wrote:

* `CTFontCollectionCreateMatchingFontDescriptors` - constructs the query
used to ask the OS which fonts are available. This is ultimately called from
Servo `FontContext::get_resolved_font_for_style()`, called during frame
construction for text boxes.

* `CTFontManagerCopyAvailableFamilyNames` - gets the list of available
families from the OS. Called when we create the thread-local font context.
(There is one font context per frame construction thread.)


I think these we can stuff into an Arc and share. At some point the
system may get new fonts added, at which point we'll need to create
and pass new ones. It seems a little silly that we recalculate the
font list and all the font groups every time, but it's mostly that way
because we never had a reason to mess with it until now.


Need to check whether these are OS objects, and if so, whether they're 
thread safe on all platforms. Linux fontconfig is particularly bad, as 
we've discovered. (You need a lock around the whole 
thing unless you have a *very* recent version, newer than the one in 
many distros.) :(
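
A minimal sketch of the Arc-sharing idea, assuming a hypothetical pre-computed family-name list (not Servo's actual FontContext): the list is built once, and each "frame construction thread" only bumps a reference count instead of re-querying the OS.

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    // Hypothetical family list, computed once up front.
    let families: Arc<Vec<String>> =
        Arc::new(vec!["Helvetica".to_string(), "Arial".to_string()]);

    // One clone per worker; cloning an Arc is just a refcount bump,
    // so no thread recomputes the list.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let families = Arc::clone(&families);
            thread::spawn(move || families.len())
        })
        .collect();

    for handle in handles {
        assert_eq!(handle.join().unwrap(), 2);
    }
}
```

This only works if the shared value is a plain Rust structure; if it wraps thread-unsafe OS handles (as with older fontconfig), a lock around the whole thing is still needed.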


Patrick



Re: [dev-servo] 2/10 meeting notes (interning; embedding; beforeunload; acid2 status; testing status; parallelism status; easy bugs; rust upgrade; docs)

2014-02-12 Thread Patrick Walton

On 2/11/14 9:42 PM, Robert O'Callahan wrote:

Obtaining the list of platform fonts is incredibly
performance-sensitive. We have a gfxPlatformFontList object for this,
which caches the font list indefinitely. When the system notifies us of
a font list change, we rebuild our font list from scratch, but that's
very rare of course. The tricky part is how to build this list without
slowing down startup. The key thing is that we don't always need the
list. We can look up simple fonts by family name without having a
complete list of the platform fonts, and for each language we have a
list of default font names that have a good chance of supporting the
characters of that language; all that means we can render many pages
without needing the platform font list.


Aha. That last one is probably going to be the biggest win in Servo. 
Everything I've tested uses pretty standard fonts, so probably the other 
browsers aren't hitting the platform font list at all. Filed an issue:


https://github.com/mozilla/servo/issues/1677

Patrick



Re: [dev-servo] 2/10 meeting notes (interning; embedding; beforeunload; acid2 status; testing status; parallelism status; easy bugs; rust upgrade; docs)

2014-02-12 Thread Patrick Walton

On 2/12/14 12:06 PM, Robert O'Callahan wrote:

For Servo, I would pull the font.name  prefs from
all.js to get a fixed list of per-language font names to use for
fallback and default fonts, and not implement platform font lists for
now and hopefully not ever.


Sadly we already have them implemented. But we could rip them out. :)

Patrick



[dev-servo] Parallel hazards with absolutely positioned blocks

2014-02-13 Thread Patrick Walton

Hi everyone,

Pradeep's excellent pull request [1] for absolute positioning raised 
some thorny issues relating to solving geometry constraints for 
absolutely positioned blocks in a parallel setting. The fundamental 
issue is that absolute positioning can depend not only on width and 
height of the containing block, but also the top/left/right position of 
the hypothetical box created by its static position. This is in conflict 
with our two-direction approach for computing widths and heights: widths 
are computed *top-down*, so that kids' widths depend on parents' widths, 
but heights are computed *bottom-up*, so that parents' heights depend on 
kids' heights.


There are two basic approaches to dealing with absolutely-positioned 
frames: (1) parent the absolutely-positioned frame to its containing 
block, and keep track of a separate "hypothetical frame" (which is what 
Gecko seems to do, with `nsHTMLReflowState::CalculateHypotheticalBox()` 
and `nsFrameManager::GetPlaceholderFrameFor()`), or (2) keep track of 
the containing block separately from the parent and use the 
"hypothetical frame" as the real frame, which is WebKit's approach (see 
the comments in `RenderObject::container()` and 
`RenderBox::computePositionedLogicalWidth()`).


The pull request #1681 takes approach #2. It deals with the parallel 
hazard by deferring height computation of absolutely positioned blocks 
until display list construction time (which is top-down). However, this 
has a problem in that the overflow for the containing block cannot be 
correctly calculated, because the height is not yet known. This will 
make it impossible for us to construct the display list only for frames 
in the visible region, since that depends on having correct overflow 
information.


After some thought I've begun to think that approach #2 (WebKit's 
approach) is a dead-end in a parallel setting. The fundamental problem 
is that the height of an absolutely-positioned block cannot be 
determined without knowing the height of its containing block (if `top` 
and `bottom` are specified). Yet knowing the overflow region of *at 
least* the containing block depends on knowing the height of the 
absolutely-positioned blocks for which it is the containing block. I say 
"at least" because if we were to adopt WebKit's approach (putting the 
absolutely-positioned frame at its hypothetical static location) then 
potentially many more intervening blocks would have their overflow 
regions depend on their absolutely positioned descendants:


 ^height +--+ overflow
 |   +---+ containing block |<+
 |   |   ++-+ |
 |   || overflow  |
 |   |v   |
 |   |+---+   |
 |   || block |   |
direction|   |+---+---+   |
   of|   || overflow  |
traversal|   |v   |
 |   |+---+   |
 |   || block |   |
 |   |+---+---+   |
 |   || overflow  |
 |   |v   |
 |   |  ++|
 |   +->+ absolutely positioned flow ++
 |  ++

That's a potentially unbounded amount of extra information that needs to 
travel in the opposite direction to the traversal.


By contrast, approach #1 (Gecko's approach) avoids the unbounded amount 
of information flow in the opposite direction to the traversal. The 
height and overflow region only need to travel one step: from containing 
block to its immediate absolutely positioned flow child and back. This 
can be implemented by simply moving the computation of height and 
overflow of absolutely-positioned frames to the parent. This loses a 
small amount of parallelism during the assign-heights phase, but it 
should be small with fixed overhead.


Width assignment becomes potentially trickier with absolute positioning. 
Recall that in Servo actual width computation is a top-down traversal. 
But because the width of an absolutely-positioned frame can depend on 
its hypothetical frame's static position, it is possible that an 
absolutely-positioned frame will not have its width ready by the time 
the traversal wants to compute it, because its hypothetical frame has 
not yet been reached. This can be remedied by simply (a) having 
containing blocks *not* enqueue their absolutely-positioned kids in the 
usual manner during the parallel top-down traversal and (b) having the 
*hypothetical* frame be the frame responsible for computing its 
associated absolutely positioned frame's width, and enqueuing its kids.


Unfortunately, I don't know o

Re: [dev-servo] Parallel hazards with absolutely positioned blocks

2014-02-13 Thread Patrick Walton

On 2/13/14 2:42 PM, Boris Zbarsky wrote:

Then you go through this list (this part should parallelize pretty
nicely, btw, except for the overflow area updates on parents) and reflow
them all, since now you know their containing block sizes, and you
update overflow areas.  Again, if there are abs pos kids you add to a
list, etc.


I have a couple of concerns about this from a performance point of view.

1. This creates multiple sequential barriers, which can result in a 
parallelism hazard. As an extreme case, consider:


div { position: absolute; }

...

This will effectively result in fully sequential layout, because you 
have to wait for layout to finish before starting work on the new 
absolutely positioned kids you found, and so on.


2. The overhead of updating parents' overflow areas over and over again. 
In fact a page like above will result in O(n^2) updates of parents' 
overflow areas where n is the number of nodes, unless I'm missing something.


Can we tweak the algorithm in some way to avoid these hazards?

Patrick



Re: [dev-servo] Parallel hazards with absolutely positioned blocks

2014-02-13 Thread Patrick Walton
For the first concern, I realized that pages like that are going to be 
sequential no matter what we do, of course, because of the way parallel tree 
traversals work. So never mind there. The O(n^2) update of overflow areas still 
concerns me though.

Patrick Walton  wrote:
>On 2/13/14 2:42 PM, Boris Zbarsky wrote:
>> Then you go through this list (this part should parallelize pretty
>> nicely, btw, except for the overflow area updates on parents) and
>> reflow
>> them all, since now you know their containing block sizes, and you
>> update overflow areas.  Again, if there are abs pos kids you add to a
>> list, etc.
>
>I have a couple of concerns about this from a performance point of
>view.
>
>1. This creates multiple sequential barriers, which can result in a 
>parallelism hazard. As an extreme case, consider:
>
> div { position: absolute; }
>
> ...
>
>This will effectively result in fully sequential layout because you have
>to wait until layout to finish to start working on the new absolutely
>positioned kids you found, and so on.
>
>2. The overhead of updating parents' overflow areas over and over again.
>In fact a page like above will result in O(n^2) updates of parents'
>overflow areas where n is the number of nodes, unless I'm missing
>something.
>
>Can we tweak the algorithm in some way to avoid these hazards?
>
>Patrick
>

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.


Re: [dev-servo] Parallel hazards with absolutely positioned blocks

2014-02-13 Thread Patrick Walton
Here's an idea for how to leverage the leaf-set-removal proposal to optimize 
this algorithm:

1. When assigning widths for parents, don't enqueue absolutely positioned kids 
to have their widths assigned.

2. Absolutely positioned kids of a node should not be counted as outstanding 
children-to-be-reflowed for the purposes of height assignment. The effect of 
this is that height may be computed for a node once all of its in-flow children 
have had their heights computed; it need not wait on absolutely positioned 
children.

3. After assigning heights for parents that have absolutely positioned kids, 
don't store overflow for them (or their ancestors) right away, because their 
overflow regions *do* depend on the overflow regions of their absolutely 
positioned kids. Instead, for each such parent, store an atomic count X of 
absolutely positioned kids inside it, and enqueue assign-*widths* for all of 
the parent's absolutely positioned kids.

4. After assigning widths for a leaf, start assigning heights for it 
immediately. (This is the leaf-set removal proposal.)

5. After assigning heights for an absolutely positioned frame, do *not* 
decrement its parent's count of outstanding children-to-be-reflowed (as that 
number should be zero). Instead decrement X (the atomic count of 
absolutely-positioned kids) on the parent. If zero, then compute overflow for 
the parent and all of its ancestors.

I believe this algorithm or something like it can achieve the goal of 
performing layout for absolutely positioned frames after all of the frames it 
could possibly depend on, without the subtle invariants of my first proposal. 
It's essentially the same as bz's algorithm but without the explicit list; 
rather it uses the parallel work queue as the list. It also avoids the O(n^2) 
hazard by computing overflow for each frame exactly once.
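
Step 5's atomic count might look roughly like this (hypothetical `Parent`/`abs_kid_finished_heights` names, not actual Servo code): exactly one kid, the last to finish assigning heights, observes the count reaching zero and triggers overflow computation for the parent.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Each parent holds an atomic count (the "X" of step 3) of absolutely
// positioned kids whose heights are still outstanding.
struct Parent {
    abs_kids_remaining: AtomicUsize,
}

impl Parent {
    // Called by each absolutely positioned kid after its heights are
    // assigned (step 5). Returns true for exactly one caller -- the last
    // kid -- which must then compute overflow for the parent and its
    // ancestors.
    fn abs_kid_finished_heights(&self) -> bool {
        // fetch_sub returns the previous value; 1 means we were last.
        self.abs_kids_remaining.fetch_sub(1, Ordering::AcqRel) == 1
    }
}

fn main() {
    let parent = Parent { abs_kids_remaining: AtomicUsize::new(2) };
    assert!(!parent.abs_kid_finished_heights()); // first kid: not last
    assert!(parent.abs_kid_finished_heights()); // last kid: compute overflow
}
```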

How does this sound?

Patrick

Patrick Walton  wrote:
>For the first concern, I realized that pages like that are going to be
>sequential no matter what we do, of course, because of the way parallel
>tree traversals work. So never mind there. The O(n^2) update of
>overflow areas still concerns me though.
>
>Patrick Walton  wrote:
>>On 2/13/14 2:42 PM, Boris Zbarsky wrote:
>>> Then you go through this list (this part should parallelize pretty
>>> nicely, btw, except for the overflow area updates on parents) and
>>reflow
>>> them all, since now you know their containing block sizes, and you
>>> update overflow areas.  Again, if there are abs pos kids you add to a
>>> list, etc.
>>
>>I have a couple of concerns about this from a performance point of
>>view.
>>
>>1. This creates multiple sequential barriers, which can result in a 
>>parallelism hazard. As an extreme case, consider:
>>
>> div { position: absolute; }
>>
>> ...
>>
>>This will effectively result in fully sequential layout because you
>>have to wait until layout to finish to start working on the new
>>absolutely positioned kids you found, and so on.
>>
>>2. The overhead of updating parents' overflow areas over and over
>>again.  In fact a page like above will result in O(n^2) updates of
>>parents' overflow areas where n is the number of nodes, unless I'm
>>missing something.
>>
>>Can we tweak the algorithm in some way to avoid these hazards?
>>
>>Patrick
>>
>
>-- 
>Sent from my Android phone with K-9 Mail. Please excuse my brevity.

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.


Re: [dev-servo] Parallel hazards with absolutely positioned blocks

2014-02-14 Thread Patrick Walton

On 2/13/14 4:54 PM, Robert O'Callahan wrote:

On Fri, Feb 14, 2014 at 1:43 PM, Boris Zbarsky  wrote:


On 2/13/14 5:56 PM, Robert O'Callahan wrote:


2) Fragmentation. With something like overflow:fragments, absolute
positioning can affect the number of fragments you generate, which can
affect the size of the container of the fragments.



Ugh.  I thought one of the points of absolute positioning was to not
affect the layout of anything  If that's not the case anymore, that's
_really_ annoying.



Well, I'm not 100% sure. I wonder what David thinks.


So I just looked at the relevant piece of this spec and I'm confused:

http://dev.w3.org/csswg/css-overflow-3/#fragment-pseudo-element

Suppose that I have a document like this:


<style>
.a { position: relative; }
.b {
    position: absolute;
    overflow: fragments;
    top: 5%;
    bottom: 5%;
    right: 0;
    width: 100px;
}
.b::nth-fragment(2) { position: static; }
</style>

<div class="a">
  Foo
  <div class="b">
    Four score and seven years ago blah blah blah...
  </div>
</div>


So we lay out `a` to determine its height. Now we know the height of 
`b`. Then we lay out `b`. Because there isn't enough room for its 
multiple lines to fit given its height limit, we create some fragments. 
Fragment 2 has `position: static` (which is explicitly allowed by the 
spec), so now it has to be inserted as a block kid of `a`. So we go to 
reflow `a` again. Now it has a different, taller height, which of course 
affects `b`. Now `b` has a larger height--suppose that it's now tall 
enough to not generate any fragments! So we remove the static kid of 
`a`, making `a` shorter. We go to reflow it again, and we're back 
where we started. The layout seems nonterminating...


Is there something I'm missing?

Patrick



Re: [dev-servo] Parallel hazards with absolutely positioned blocks

2014-02-14 Thread Patrick Walton

On 2/14/14 4:15 PM, L. David Baron wrote:

The overflow:fragments spec should probably either (a) say the same
or (b) require fragments following an absolutely positioned fragment
to also be absolutely positioned.  I'm in favor of (b), I think.  (I
think there might also need to be a constraint about floats.)


I like (b) as well, from the standpoint of avoiding sequential 
synchronization points in Servo's layout. That would allow the algorithm 
I proposed in the previous message to Boris to work even for 
fragmentation, I believe.


Thanks for clarifying.

Patrick



[dev-servo] Quantifying the impact of floats on parallelism

2014-02-17 Thread Patrick Walton

Hi everyone,

I did some preliminary investigation into how much floats are likely to 
impact parallelism in the wild. I wrote a quick script [1] that looks at 
how many nodes could have their heights assigned in parallel and ran it 
on the Alexa Top 50 in the US.


I assume that a node can be laid out in parallel unless it is "impacted 
by floats". A node is said to be "impacted by floats" in a direction 
(left or right) if any of the following is true:


(1) The node contains a float descendant that is part of the same block 
formatting context.


(2) Any of its previous siblings could be impacted by floats, unless the 
sibling has the "clear" property set and there are no siblings impacted 
by floats in the direction(s) cleared.


(3) A parent in a separate block formatting context is itself impacted 
by floats. (This accounts for the fact that "clear" does not clear 
floats in separate block formatting contexts.)
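
As a sketch, rule (2) by itself can be modeled as a single in-order pass over a flow's children. This is a simplification of the script's actual logic: one float direction only, with rules (1) and (3) reduced to a precomputed flag, and `clear` collapsed to a single boolean.

```rust
#[derive(Clone, Copy)]
struct Kid {
    contains_same_bfc_float: bool, // rule (1), assumed precomputed
    clears: bool,                  // the `clear` escape hatch in rule (2)
}

// Rule (2): a kid is impacted if it contains a same-BFC float or if any
// previous sibling is impacted, unless clearance cuts the chain.
fn impacted_flags(kids: &[Kid]) -> Vec<bool> {
    let mut flags = Vec::with_capacity(kids.len());
    let mut any_prev_impacted = false;
    for kid in kids {
        if kid.clears {
            any_prev_impacted = false; // clearance stops float propagation
        }
        let impacted = kid.contains_same_bfc_float || any_prev_impacted;
        flags.push(impacted);
        any_prev_impacted = any_prev_impacted || impacted;
    }
    flags
}

fn main() {
    let kids = [
        Kid { contains_same_bfc_float: true, clears: false },
        Kid { contains_same_bfc_float: false, clears: false },
        Kid { contains_same_bfc_float: false, clears: true },
    ];
    // The float inside kid 0 impacts kid 1; kid 2 clears it.
    assert_eq!(impacted_flags(&kids), vec![true, true, false]);
}
```

A node whose flag is false here is one that could have its height assigned in parallel with its siblings.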


I believe these rules are sufficient to capture the idea of a node that 
can be laid out in parallel with its siblings, although I could 
definitely be wrong here. Take these numbers with a grain of salt, 
because it's very possible I'm missing something in the algorithm.


Anyway, the results are in this Google Doc [2]. In summary: On average 
85.16% of nodes can be laid out in parallel. There is some variability: 
two of the sites (reddit.com and buzzfeed.com) were below 60% parallel. 
However, the numbers seemed overall promising: 73% of the sites had > 
80% of the nodes able to be laid out in parallel.


Patrick

[1]: https://gist.github.com/pcwalton/9064215

[2]: 
https://docs.google.com/spreadsheet/ccc?key=0AmYOd6rv-l1OdHA0bVVRd3I4OW1xTEV1SERYUldrU2c&usp=sharing


