> How would you create a `JS<T>` on the stack? I'll provide some context. I'm designing the HTML parser API to be compatible with different DOM representations. This is desirable for a few reasons:
- I want the library to be useful outside of Servo. It will have the ability to output a simple static parse tree, for users who don't have their own DOM type. - Decoupling the parser from the details of Servo's DOM will make both systems much easier to modify. - For off-thread parsing, we want to build a sequence of tree operations, but for on-thread parsing we should manipulate the DOM directly. My API mock-up at the moment looks like pub trait TreeSink<Handle> { fn create_element(&mut self, name: Atom) -> Handle; fn detach_from_parent(&mut self, child: Handle); fn append_element(&mut self, parent: Handle, child: Handle); // ... } (In the real API there will be more parameters, e.g. a namespace and attributes for create_element.) The library client will provide an implementation of TreeSink with an appropriate Handle type, and the parser will call these methods to manipulate the DOM during parsing. A Handle represents a reference to a mutable node in the DOM. They're required to be Clone, because the parser will hold internal references to nodes, e.g. in the stack of open elements. Implementing this trait for a refcounted DOM is straightforward: struct Node { pub name: Atom, pub parent: Option<WeakHandle>, pub children: Vec<Handle>, } #[deriving(Clone)] struct Handle { ptr: Rc<RefCell<Node>>, } struct WeakHandle { ptr: Weak<RefCell<Node>>, } struct Sink { root: Option<Handle>, } impl TreeSink<Handle> for Sink { ... } In Servo we have a JS-managed DOM, and on-main-thread parsing should manipulate it directly. I have something like #[deriving(Encodable)] struct Node { pub name: StrBuf, pub parent: Option<Handle>, pub children: Vec<Handle>, } type Handle = JS<Node>; impl TreeSink<Handle> for Sink { fn create_element(&mut self, name: Atom) -> Handle { let owned = ~Node { name: name, children: vec!(), parent: None, }; // Not shown: also build a JS wrapper object. unsafe { JS::from_raw(cast::transmute::<~Node, *mut Node>(owned)) } } fn append_element(&mut self, parent_hdl: Handle, child_hdl: Handle) { let mut parent = parent_hdl.root(); let mut child = child_hdl.root(); (*child).parent = Some(parent_hdl.clone()); parent.children.push(child_hdl.clone()); } } This (approximately) compiles, but I think it's not memory-safe, because we pass and return un-rooted JS<T> values. To fix this we need two handle types: pub trait TreeSink<InHandle, OutHandle> { fn create_element(&mut self, name: Atom) -> OutHandle; fn detach_from_parent(&mut self, child: InHandle); fn append_element(&mut self, parent: InHandle, child: InHandle); // ... } which will be instantiated as &JSRef<Node> and Temporary<Node> respectively. (And the lifetime of the JSRef will be inferred as the lifetime of each call, as in DOM bindings, but I'm not sure how to express this within the trait impl.) We'll need another trait to let the generic parser code convert between these types. There is also the question of what handles to store within the parser. - We could root every node as it's created, and unroot when the parser is destroyed. We'd store JSRef<Node>, transmuting away the lifetimes. - We could root the parser itself, make it traceable, and store JS<Node>. This seems safer, but would complicate the generic interface further. This is all a bit moot if a parser never lives across a JS operation that could GC. But I wouldn't bet on that always being the case. The current Hubbub bindings basically make this assumption, though; see http://irclog.gr/#show/irc.mozilla.org/servo/103713 I also thought about something like pub trait TreeSink<InHandle, OutHandle> { fn create_element<'t>(&'t mut self, name: Atom) -> OutHandle<'t>; fn detach_from_parent<'t>(&'t mut self, child: InHandle<'t>); but this would require higher-kinded polymorphism. It will never really be possible for safe code to use a handle which stores an &'t mut Node, because that would completely break Rust's mutable-pointer aliasing rules. For off-thread parsing, the TreeSink methods just record tree operations to be executed later. In that situation, I think handles should be sequential integer IDs. The tree op executor will use them as indexes into a vector of the nodes that it has created. This is similar to Gecko's approach, where a handle is an nsIContent**, i.e. a pointer to a slot where a node pointer will eventually be stored. By the way, it's not relevant to Servo, but I think we can parse into an owning tree without refcounting or copying. During tree building we'll have struct BuildNode { pub name: Atom, pub parent: Option<*mut BuildNode>, pub children: Vec<*mut BuildNode>, } and every node will be owned by the TreeSink itself. When parsing is finished we transmute the root to struct Node { pub name: Atom, /*priv*/ unused_parent: Option<uint>, pub children: Vec<~Node>, } transferring ownership of each node to its parent. Then we free any nodes that didn't make it into the final tree. Designing a library to be generic over its client's memory management approach in a statically safe way seems to be quite the challenge. I'd be very happy to hear thoughts regarding any of the above. keegan _______________________________________________ dev-servo mailing list dev-servo@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-servo