Rationale
Clojure's dynamic vars and binding mechanism mimic traditional dynamic variables. However, the traditional semantics don't interact well with laziness or thread pools. The goal is to improve binding so that it plays better with other Clojure constructs.
The problems:
- When work is sent to thread-pool threads, e.g. via agent sends or future calls, the work isn't done with the same binding set as the invocation (without manual effort)
- Yet that is often the expectation/desire
- If a lazy sequence is created within a certain scope of bindings and returned outside it, its elements are realized in the dynamic scope of the consumer
- This too is not what is expected
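Both problems can be reproduced in a few lines. The following is a minimal sketch; the names `*scale*`, `scaled`, and `seen-by-thread` are made up for illustration. A raw `Thread` stands in for a thread-pool thread, because in Clojure 1.3 and later `future` and agent sends already convey bindings.

```clojure
(def ^:dynamic *scale* 1)

;; Problem 1: work handed to another thread does not run with the
;; caller's bindings. A plain Thread is used here as a stand-in for a
;; pool thread.
(def seen-by-thread (promise))

(binding [*scale* 10]
  (doto (Thread. (fn [] (deliver seen-by-thread *scale*)))
    (.start)
    (.join)))

@seen-by-thread ; => 1, the root value, not 10

;; Problem 2: a lazy seq created inside a binding scope but consumed
;; outside it is realized in the consumer's dynamic scope.
(defn scaled [xs]
  (map (fn [x] (* *scale* x)) xs))

(def lazy-result
  (binding [*scale* 10]
    (scaled [1 2 3])))          ; nothing is computed yet

(def eager-result
  (binding [*scale* 10]
    (doall (scaled [1 2 3]))))  ; forced while the binding is in effect

(vec lazy-result)  ; => [1 2 3]
(vec eager-result) ; => [10 20 30]
```

Forcing the seq with `doall` inside the binding is the usual manual workaround, at the cost of giving up laziness.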
The objective:
- Convey bindings to agent and future thread-pool threads
- Must be cheap!
- Means you can't pay the cost of establishing truly new bindings (copies)
- Instead, adopt binding map of point of call wholesale, including mutable cells
- Make adoption as simple as assignment of map to thread local
- This means these threads truly share the bindings with the caller
- just like nested code in the caller thread would
- i.e. can see changes
- But - must avoid concurrency mess
- Give bindings the semantics of volatiles
- and check thread identity on set! calls
- thus only the thread creating the binding can do set!s
- thread pool threads will see effects of set!s in launching thread
- this isn't really a feature to advertise, but is race-free and safe, at least
- merely a side effect of doing it fast
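The intended semantics can be observed with a small experiment. This is a sketch that assumes a Clojure version (1.3 or later) in which `future` conveys bindings by sharing the caller's binding cells; `*level*` and `observed` are hypothetical names.

```clojure
(def ^:dynamic *level* :root)

(def observed
  (binding [*level* :initial]
    (let [go   (promise)
          task (future
                 @go ; wait until the launching thread has called set!
                 {:seen        *level*
                  ;; set! from a thread that did not create the binding
                  ;; is expected to be rejected
                  :set-attempt (try
                                 (set! *level* :from-pool)
                                 (catch IllegalStateException _
                                   :rejected))})]
      (set! *level* :changed) ; allowed: this thread created the binding
      (deliver go true)
      @task)))

observed ; => {:seen :changed, :set-attempt :rejected}
```

The pool thread sees the launching thread's `set!` because the cell is shared rather than copied, and its own `set!` fails the thread-identity check.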
- Need a way to adopt a binding set wholesale
- Need volatile semantics on binding boxes
- Need thread ids in binding boxes
- new nested TBox type in Var
- Need to check matching thread on set!
- Build conveyance into send/send-off and future?
- Must be cheap!
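The requirements on binding boxes (volatile value, owning thread id, thread check on set!) can be sketched in Clojure itself. This is an illustration only, not the actual nested `TBox` class in `Var`; `Cell`, `BoxSketch`, and `new-box` are hypothetical names.

```clojure
(defprotocol Cell
  (cell-get  [cell])
  (cell-set! [cell v]))

;; A box that remembers the thread that created it and holds its value
;; in a volatile field, so reads from other threads see the latest write.
(deftype BoxSketch [^Thread owner ^:volatile-mutable v]
  Cell
  (cell-get [_] v)
  (cell-set! [_ new-v]
    ;; only the thread that created the box may change it
    (when-not (identical? owner (Thread/currentThread))
      (throw (IllegalStateException. "set! from non-binding thread")))
    (set! v new-v)
    new-v))

(defn new-box [v]
  (BoxSketch. (Thread/currentThread) v))

(def b (new-box 1))
(cell-set! b 2) ; allowed on the creating thread

;; another thread can read the box but cannot write to it
(def read-elsewhere @(future (cell-get b)))
(def write-elsewhere
  @(future (try
             (cell-set! b 3)
             (catch IllegalStateException _ :rejected))))
```

Because there is a single writer and the field is volatile, sharing the box across threads is race-free without any locking.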
- Lazy seqs + bindings
- this is trickier; it is quite unlikely we can afford even binding adoption per seq step
- perhaps another flavor of lazy-seq that does binding adoption
- use only when you need this
- since must establish bindings every step
- only affordable for i/o bound logic
- Deliver this separately from thread support
- Ditto delay?
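A separate binding-adopting flavor of lazy-seq might look like the sketch below. It is an assumption-laden illustration: `adopting-lazy-seq`, `*factor*`, and `times-factor` are hypothetical names, and `getThreadBindingFrame` / `resetThreadBindingFrame` are implementation-level static methods on `clojure.lang.Var` (present in Clojure 1.3 and later as far as I know), not a supported public API.

```clojure
(def ^:dynamic *factor* 1)

(defmacro adopting-lazy-seq
  "Like lazy-seq, but each step runs with the binding frame that was
  current when the seq was constructed (hypothetical sketch)."
  [& body]
  `(let [captured# (clojure.lang.Var/getThreadBindingFrame)]
     (lazy-seq
       (let [saved# (clojure.lang.Var/getThreadBindingFrame)]
         ;; adopt the creator's frame for the duration of this step
         (clojure.lang.Var/resetThreadBindingFrame captured#)
         (try
           (do ~@body)
           (finally
             ;; restore the consumer's frame afterwards
             (clojure.lang.Var/resetThreadBindingFrame saved#)))))))

(defn times-factor [xs]
  (adopting-lazy-seq
    (when-let [s (seq xs)]
      (cons (* *factor* (first s))
            (times-factor (rest s))))))

(def result
  (binding [*factor* 10]
    (times-factor [1 2 3])))

(vec result) ; => [10 20 30], although it is realized outside the binding
```

Every step pays for one frame swap and one restore, which is why such a construct would be opt-in rather than the default behavior of lazy-seq.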
Issues
- Var counts
- maintaining these adds per-bound-var costs
- and requires bindings cleanup on termination
- if we didn't need to clean up then conveyance could just be assignment of same Frame to thread's dvals
- in a world where few vars are ever bound dynamically, and those that are almost always are, perhaps we can do away with the counter?
- simple flag - has ever been dynamically bound
- In all cases, people who don't care about or use bindings in the async/delayed work won't want to pay for the overhead of this
- how to avoid parallel set of constructs or flags everywhere?
- Think ahead to fork/join
- if all of our forks are via our APIs then we can do same propagation there
2 Comments
Jan 23, 2011
David Powell
Now that we have binding Frames, and binding counters have gone, what is the issue with having LazySeq capture the Frame at construction, and set and restore the Frame before and after calling the generating fn?
Would this behaviour be flawed in some way, or is the concern just that it would be too slow?
Number crunching mapping over vectors is based on chunked-cons, so perhaps the overhead of swapping the frame wouldn't be too bad; i/o applications are probably item at a time, but they probably wouldn't notice the overhead.
Mar 15, 2012
Lars Rune Nøstdal