Problem
The AST nodes produced by the ClojureScript analyzer have a :children key to allow generic traversals. Unfortunately, this key is currently a denormalized view of child ast nodes.
For example, an :if node has :test, :then, and :else keys whose values are also contained in :children. This presents some difficulty when writing AST transformations because every transform function must be aware of this duplication. To correctly replace the :test of an :if node, you must also replace the first element of the :children vector.
Discussion
- https://groups.google.com/d/topic/clojure-dev/vZLVKmKX0oc/discussion
- https://docs.google.com/document/d/1DRN-tBIqhqVVyoHIDs7CMkduBFk-vqd_958ojeIBLHQ/edit
- http://dev.clojure.org/jira/browse/CLJS-289
Design
- Solution must retain "AST is just data" property. ie No API should be required to interpret it
- Idea: store :children as keys into the AST map
- would be nice: (defn children [ast] ((apply juxt (:children ast)) ast))
- but need to handle implicit do blocks: (defn children [ast](mapcat #(if (coll? %) % [%]) ((apply juxt (:children ast)) ast)))
- ticket: http://dev.clojure.org/jira/browse/CLJS-289
- outstanding issue: interleaved bindings
- would be nice: (defn children [ast] ((apply juxt (:children ast)) ast))
- Alternate idea: all child nodes are moved from the top level into a map in :children
- For "if" this would change {:op :if :env {env} :test test-expr :then then-expr :else else-expr :children [test-expr then-expr else-expr]} into {:op :if :env {env} :children {:test test-expr :then then-expr :else else-expr}}
- This idea doesn't address blocks or bindings.
- dnolen: What about forms like let where :children is a bit complex?
- :children (into (vec (map :init bes))(conj (vec statements) ret))acohen: I think that should become: {:op :let :children {:init [inits] :statements [statements] :ret ret}We might want to discuss whether the inits are truly a child here, at the moment I think they are though.dnolen: So if a :children value entry is a vector it's something you need to iterate over? What about :ret? So you'll always need to look at :children if you want to know :ret?acohen: Yes, if :children entry is a seq, it must be a seq of nodes. For :let, I forgot that this is something else I changed in CinC. I made bindings an :op.So it should actually look like:{:op :let:children {:inits [{:op :binding :name x :children {:init init-expr}}...]:statements [{...} ...]:ret ret}}dnolen: What about :ret? So you'll always need to look at :children if you want to know :ret?acohen: Yep, that's right. Tree traversal would _always_ go through :children, rather than having this separation of "syntax-aware walkers go through the nodes that they know the names of" and "generic walkers go through :children"bbloom: I think this approach suffers another issue: Generic walkers might need to know evaluation order. I should be able to post-walk the :childrens nodes to get a strict top-to-bottom, inside-out, left-to-right sequence of nodes, ignoring conditional and repeated evaluation. Otherwise, I need knowledge of nodes to tell if side effects from :init may happen before those from :statementsacohen: Can we mandate that the children map be a map-like structure that preserves order?bbloom: I don't think that's reasonable unless core gets such a map-like structure.
Labels:
7 Comments
Hide/Show CommentsAug 29, 2012
Brandon Bloom
I'm interested in re-invigorating this discussion. I'm currently working on robust ANF/CPS transforms and running into various complexity caused by this. It's impossible for any AST transformer to know if a down-stream compiler phase will use :children or not. The first issue I hit was modifying :items in the :vector op. Turns out that the emit phase looks at :children instead of :items. Whoops! I've been resorting to building a form and then calling analyze on that.
Mar 09, 2013
Ambrose Bonnaire-Sergeant
Is there really enough utility in :children to build it into the AST spec? What's wrong with providing a `children` multimethod for people to use when needed?
We obviously need to completely eliminate redundancies. Redundant :children simply makes all but the most trivial code transformations nearly impossible to perform correctly. I have no idea how I'd start handling the subtleties of :children in a hygienic AST transformation, which is a simple transformation compared to CPS.
I think it would be a difficult task to include the children keys in each :op. There are at least three different types of children expressions: single expressions, sequence of expressions (implicit do) and local-binding inits. Distinguishing them would most likely result in a convoluted interface for children. Even if we found a solution, do we expect each consumer of an AST to define their own `children` function? Is this any better than just providing a multimethod to :refer as needed?
In my (biased) opinion: children are strictly convenience and they hinder much more than they help.
Mar 12, 2013
Aaron Cohen
I'm trying to be guided by principles that Rich outlined in the email thread here:
Mar 12, 2013
Brandon Bloom
At first, I was in favor of :children as a vector of keys, but now I'm not so sure.
Transformations inherently involve creating new nodes. "assoc" is useless in the presence of :env, :children, or other synthesized, non-essential attributes, since you need to get out the essential fields and re-analyze that new AST node or re-analyze a raw form. Any cruft data from a previous analyze call is inherently a bug in an AST transformation, if you change an essential attribute. So this means that when constructing new nodes, you have three choices:
But #3 is just gonna delegate to analyze anyway, since the result is a map, not some object with accessor methods. So that's right out. Given this reality, I think it's perfectly fine to include full copies of :children instead of going with a vector of keys. It's a little ugly for printing, but a helper function that strips out non-essential attributes.
Day dreaming for a moment: I wish that I could have a map that had representational and computed attributes, such that if you update a representational key via assoc or whatever, all the computed attributes get thrown out.
Ideally: The only API methods would be analyze-ast and analyze-form. The analyze-ast function should, optionally, validate that the ast passed in only contains essential attributes. Both analyze methods should be configurable for what passes and data to include. Children should be one such flag, just like type hints could be another such flag. The default would be "include everything!" such that the average client only needs to know a single API call: (analyze-form top-level)
Mar 12, 2013
Brandon Bloom
I should note that I've written a working ANF transform and an incomplete CPS transform. My experience has shown:
https://github.com/brandonbloom/cljs-cps/blob/2c8b9e9f5980aff5e62ea0922c9f30101584fcf9/src/cps.clj#L54-L59
Mar 12, 2013
Aaron Cohen
Hmm, assoc being useless doesn't match my experience.
My big use-case in CinC is making several passes over the AST doing nothing more than adding attributes to the tree. (:box tells me if I need to emit a cast, :type tells me the java type of this expression)
The need for those attributes to be propagated up and down the tree correctly is what causes the current :children redundancy to make me pull my hair out so much. There's no way for me to put the attributes in the correct place if I don't know which child is which.
Mar 12, 2013
Brandon Bloom
assoc is useful for adding analysis results, but is useless for "changing" nodes.
The only valid way to "change" a node is to create a completely new node and re-analyze it.