Motivation

To add extensibility to the Clojure reader in a composable way.

General Goals

  • new syntax for "tagged" literals in the reader
  • tags can be namespaced
  • non-namespace-qualified tags are reserved for Clojure
  • when the reader encounters a tagged literal, it invokes a function associated with that tag
  • users may define new tags and tag-reader functions
  • users may override functions for built-in tags
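
As a rough illustration of the last two goals, the sketch below uses the `*data-readers*` var that shipped in Clojure 1.4 (it is not named in the outline above). The tag `example/point` and the function `read-point` are hypothetical names chosen for the example.

```clojure
;; Hypothetical reader function for a user-defined, namespace-qualified tag.
(defn read-point [[x y]]
  {:x x :y y})

;; User-defined tag: the reader calls read-point with the vector [1 2].
(def point
  (binding [*data-readers* {'example/point #'read-point}]
    (read-string "#example/point [1 2]")))

;; Overriding a built-in tag: #inst normally yields a java.util.Date.
;; With this override the reader returns the timestamp string untouched.
(def raw-inst
  (binding [*data-readers* {'inst identity}]
    (read-string "#inst \"2012-08-13T00:00:00Z\"")))
```

Here `point` is the map `{:x 1 :y 2}` and `raw-inst` is the original string, because user-supplied entries are consulted before the built-in readers.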

Non-goals

  • Common Lisp-style reader macros
  • Custom parsing of the character stream
    • Reader will parse tagged literals as data and pass that to the tag function
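
The point that tag functions receive parsed data, not characters, can be seen with nested literals. This is a minimal sketch assuming the `*data-readers*` var from Clojure 1.4; `example/wrap` and `wrap` are made-up names.

```clojure
;; Hypothetical tag function: it only ever sees an already-read form.
(defn wrap [form]
  {:wrapped form})

;; The inner literal is read (and wrapped) first, so the outer call
;; receives a vector that already contains the inner result.
(def nested
  (binding [*data-readers* {'example/wrap #'wrap}]
    (read-string "#example/wrap [1 #example/wrap :a]")))
;; nested is {:wrapped [1 {:wrapped :a}]}
```

Because each function works on data, tags from different libraries compose without any of them touching the character stream.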

Open Questions

  • What is the syntax for tags?
    • Keywords, like #:tag form
      • One character longer
      • Slightly easier to parse
    • Symbols, like #tag form
      • One character shorter
      • Slightly harder to parse
      • Ambiguous with record constructors?
        • Not if we assume that: 
          • 1) record names always contain a dot
            • Which they must, because deftypes occur in namespaces
          • 2) user-defined tags are always namespace-qualified
          • 3) built-in tags do not contain a dot
  • What happens when the reader encounters a tag for which no function is defined?
    • Error (YES)
    • Return form unchanged?
    • Return form with extra metadata?
      • Only possible if form supports metadata, ignore otherwise?
  • How to associate tags with reader functions?
    • file containing mapping from tags to Vars
      • "magic" file name gets loaded before other files
      • doesn't load any code directly
      • If 2 JARs on classpath have a magic file, who wins?
        • throw an exception
    • map stored in dynamic var
      • thread-local binding for each source file
        • created in Compiler.load and Compiler.compile
      • can be set! within a source file
      • can be rebound when invoking the reader at runtime
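
For reference, Clojure 1.4 ended up shipping both mechanisms: a `data_readers.clj` file at the classpath root and the dynamic var `*data-readers*`. The sketch below covers only the runtime-rebinding option and the "Error" answer to the undefined-tag question. The helper `read-with`, the tag `example/celsius`, and `read-celsius` are illustrative assumptions.

```clojure
;; Hypothetical tag function.
(defn read-celsius [n]
  {:unit :celsius :value n})

;; Hypothetical helper: read a string with a caller-supplied tag map.
(defn read-with [readers s]
  (binding [*data-readers* readers]
    (read-string s)))

;; Tag present in the map: the associated function is invoked.
(def known
  (read-with {'example/celsius #'read-celsius} "#example/celsius 21"))

;; Tag absent and no default handler installed: the reader throws.
(def unknown-result
  (try
    (read-with {} "#example/celsius 21")
    (catch Exception _ :reader-error)))
```

Rebinding per call keeps the tag map local to the thread doing the read, which is what makes it safe to use different maps for different inputs.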

References / See Also

  1. Aug 13, 2012

    In Clojure 1.4 it is an error to have an unknown tagged literal. The open issue CLJ-927 discusses a possible way to support unknown tags. My suggestion is to return a map that preserves the unknown tag and the data. For example, if there's no data reader for #point [0 1], the result would be something like {:unknown-literal point :value [0 1]}. The map can also take metadata if you want to preserve some information about the source.

  2. Sep 18, 2012

    There was a related discussion on the mailing list: "Tagged literals: undefined tags blow up reader"

    https://groups.google.com/forum/?fromgroups=#!topic/clojure/-MZ0w0sQ4eM

    Another way to handle an unknown tag would be to support a catch-all key (say 'default) that allows the programmer to handle unknown tags. The associated function should take two arguments: the tag and the literal value. If there is no 'default key in *data-readers*, then an error would be thrown (same as Clojure 1.4).

  3. Dec 10, 2012

    Clojure 1.5 beta1 adds *default-data-reader-fn* as a way to handle undefined tags.

  4. Dec 20, 2012

    Regarding "Ambiguous with record constructors", see CLJ-1100 "Reader literals cannot contain periods".  As of Clojure 1.5 beta2, # followed by a qualified symbol with a period in the name is considered a record and causes an exception for the missing record class. With the patch for CLJ-1100, only non-qualified symbols containing periods are considered records. That allows user-defined qualified symbols with periods in their names to be used as data reader tags.
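
The map-preserving fallback suggested in the first comment can be sketched with the `*default-data-reader-fn*` var mentioned in the third comment (Clojure 1.5 and later). The function name `unknown-tag->map` is hypothetical, and the map keys are taken from that comment's example.

```clojure
;; Hypothetical fallback: called with the tag symbol and the parsed value
;; whenever no reader function is registered for a tag.
(defn unknown-tag->map [tag value]
  {:unknown-literal tag :value value})

(def preserved
  (binding [*default-data-reader-fn* unknown-tag->map]
    (read-string "#point [0 1]")))
;; preserved is {:unknown-literal point :value [0 1]}
```

Without the binding, the same `read-string` call throws, which matches the Clojure 1.4 behaviour described above.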