Motivation
To add extensibility to the Clojure reader in a composable way.
General Goals
- new syntax for "tagged" literals in the reader
- tags can be namespaced
- non-namespace-qualified tags are reserved for Clojure
- when the reader encounters a tagged literal, it invokes a function associated with that tag
- users may define new tags and tag-reader functions
- users may override functions for built-in tags
Non-goals
- Common Lisp-style reader macros
- Custom parsing of the character stream
  - Reader will parse tagged literals as data and pass that to the tag function
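To illustrate the last point, here is a minimal sketch using `*data-readers*` and `read-string` as they later shipped in Clojure 1.4. The tag `geo/point` and the function `read-point` are hypothetical names chosen for this example. The tag function never sees characters; it receives the vector the reader has already parsed.

```clojure
;; Hypothetical tag function: `form` is already-read data (here a vector),
;; not a character stream.
(defn read-point [form]
  {:x (first form) :y (second form)})

;; Bind the hypothetical tag geo/point to read-point for one read.
(def parsed
  (binding [*data-readers* {'geo/point read-point}]
    (read-string "#geo/point [0 1]")))

;; parsed is now the map {:x 0 :y 1}
```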
Open Questions
- What is the syntax for tags?
  - Keywords, like #:tag form
    - One character longer
    - Slightly easier to parse
  - Symbols, like #tag form
    - One character shorter
    - Slightly harder to parse
    - Ambiguous with record constructors?
      - Not if we assume that:
        - 1) record names always contain a dot
          - Which they must, because deftypes occur in namespaces
        - 2) user-defined tags are always namespace-qualified
        - 3) built-in tags do not contain a dot
- What happens when the reader encounters a tag for which no function is defined?
  - Error (YES)
  - Return form unchanged?
  - Return form with extra metadata?
    - Only possible if form supports metadata, ignore otherwise?
- How to associate tags with reader functions?
  - file containing mapping from tags to Vars
    - "magic" file name gets loaded before other files
    - doesn't load any code directly
    - If 2 JARs on classpath have a magic file, who wins?
      - throw an exception
  - map stored in dynamic var
    - thread-local binding for each source file, created in Compiler.load and Compiler.compile
    - can be set! within a source file
    - can be rebound when invoking the reader at runtime
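For reference, the mechanism that shipped in Clojure 1.4 appears to combine both options: a `data_readers.clj` file at the root of the classpath, whose contents are merged into the `*data-readers*` dynamic var. A minimal sketch of such a file follows; the tag `geo/point` and the Var `my.project.geo/read-point` are hypothetical names.

```clojure
;; data_readers.clj, placed at the root of the classpath.
;; The file holds one literal map: tag symbol -> fully qualified Var symbol.
;; geo/point and my.project.geo/read-point are hypothetical names.
{geo/point my.project.geo/read-point}
```

The file only names the Var and does not load any code, so the namespace that defines the Var still has to be loaded before the tag is read.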
References / See Also
4 Comments
Aug 13, 2012
Steve Miner
In Clojure 1.4 it is an error to have an unknown tagged literal. The open issue CLJ-927 discusses a possible way to support unknown tags. My suggestion is to return a map that preserves the unknown tag and the data. For example, if there's no data-reader for #point [0 1], the result would be something like {:unknown-literal point :value [0 1]}. The map can also take metadata if you want to preserve some information about the source.
Sep 18, 2012
Steve Miner
There was a related discussion on the mailing list: "Tagged literals: undefined tags blow up reader"
https://groups.google.com/forum/?fromgroups=#!topic/clojure/-MZ0w0sQ4eM
Another way to handle an unknown tag would be to support a catch-all key (say 'default) in *data-readers* that lets the programmer handle unknown tags. The associated function should take two arguments: the tag and the literal value. If there is no 'default key in *data-readers*, then an error would be thrown (same as Clojure 1.4).
Dec 10, 2012
Steve Miner
Clojure 1.5 beta1 adds *default-data-reader-fn* as a way to handle undefined tags.
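A minimal sketch of how this can be used, assuming Clojure 1.5 or later. The handler name `unknown-literal` is hypothetical; it returns a map of the shape suggested in the first comment above. The function bound to `*default-data-reader-fn*` receives two arguments: the tag (a symbol) and the already-read value.

```clojure
;; Hypothetical catch-all handler for tags that have no registered reader.
(defn unknown-literal [tag value]
  {:unknown-literal tag :value value})

;; #point has no entry in *data-readers*, so the default fn is called.
(def result
  (binding [*default-data-reader-fn* unknown-literal]
    (read-string "#point [0 1]")))

;; result is now {:unknown-literal point :value [0 1]}
```

Without the binding, the same `read-string` call throws because no reader function is defined for the tag.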
Dec 20, 2012
Steve Miner
Regarding "Ambiguous with record constructors", see CLJ-1100 "Reader literals cannot contain periods". As of Clojure 1.5 beta2, # followed by a qualified symbol with a period in the name is considered a record and causes an exception for the missing record class. With the patch for CLJ-1100, only non-qualified symbols containing periods are considered records. That allows user-defined qualified symbols with periods in their names to be used as data reader tags.