defrecord improvements

`defrecord` and `deftype` improvements for Clojure 1.3

Motivation

The Java unification of records prevents them from being first class, in either the data or fn sense:

record data is not first class
- can't read/write them
  - crummy choice: maps are good as data, need records for protocol polymorphism
- user code cannot fix this
  - anything that requires EvalReader is not a fix
record creation is not first class
- no per-record factory fn (or access to any associated fn plumbing, e.g. apply)
- Clojure level use/require doesn't get you access to records
- user code can mostly fix this (defrecord+factory macro)
Symmetrically, plain old Java classes are also not first class
- A unified reader form would be ideal
Reduced import complexities

Solutions

For the sake of discussion, focus will revolve around example defrecords and deftypes defined as

(ns myns)

(defrecord MyRecord [a b])
(deftype MyType [a b])

Semantics of records as first class data

The semantics of record reader forms and record factory functions are defined as follows:

(-> (MyRecord. <initialization value> <initialization value>)
    (into {:a 1, :b 2})
    <validation>)

Note: the semantics illustrated above should not be taken as implementation detail. At the moment <validation> is undefined and should be considered a no-op.

The <initialization value> is the default value for a field: the Java default for a primitive type (as determined by type hints on the record fields), or nil for a reference type. For record reader forms, the keys and values must be constants, because the semantics require that the readable form coincide with the evaluable form.
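The pipeline above can be tried at a REPL on a Clojure version that ships these features (1.3 or later). This is only a sketch of the stated semantics, not of how the factory is implemented. The helper name `build-by-semantics` is made up for the example, and `<validation>` is left out since it is currently a no-op.

```clojure
(defrecord MyRecord [a b])

;; Hypothetical helper mirroring the semantic definition: start from a
;; record whose fields hold their initialization values (nil here, since
;; the fields are unhinted) and pour the supplied map into it.
(defn build-by-semantics [m]
  (into (MyRecord. nil nil) m))

(def by-semantics (build-by-semantics {:a 1, :b 2}))
(def by-factory   (map->MyRecord {:a 1, :b 2}))

;; Both routes should produce equal records of the same type.
(println (= by-semantics by-factory))
```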

Record and Type reader forms

There would be two additional reader forms added to Clojure.

Labelled record reader form

#myns.MyRecord{:a 1, :b 2}

Positional record and type reader forms

#myns.MyRecord[1 2]

and

#myns.MyType[1 2]

This syntax satisfies the need for a general-purpose Java class construction reader form. However, not all Java classes are considered fully constructed after the use of their constructors. Therefore, serialization support is not provided for any Java classes by default. For classes such as these, Clojure will continue to provide facilities via print-dup in the known ways.
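As a rough illustration, both forms can be exercised through `read-string` once the classes are loaded. The sketch below builds the tag from the class name at runtime, which is an assumption made so that it runs in whatever namespace it is pasted into; in the `myns` example the tags would simply be `#myns.MyRecord` and `#myns.MyType`.

```clojure
(defrecord MyRecord [a b])
(deftype MyType [a b])

;; The tag is the fully qualified class name. It is computed here, rather
;; than hard-coded as myns.MyRecord, so the sketch runs in any namespace.
(def record-tag (str "#" (.getName MyRecord)))
(def type-tag   (str "#" (.getName MyType)))

;; Labelled form: a map from field keywords to constant values.
(def labelled   (read-string (str record-tag "{:a 1, :b 2}")))

;; Positional form: constants in constructor order.
(def positional (read-string (str record-tag "[1 2]")))

;; Types are read with the positional form only.
(def a-type     (read-string (str type-tag "[1 2]")))

;; The two record forms should denote the same value.
(println (= labelled positional (MyRecord. 1 2)))
(println (.a a-type) (.b a-type))
```

Note that both forms are handled entirely by the reader, so the values inside them are read as data and never evaluated.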

Generated factory functions

When defining a new defrecord, two functions will also be defined in the same namespace as the record itself. For new deftypes, only the positional factory function outlined below is generated.

Factory function taking a map (`defrecord` only)

A factory function named map->MyRecord taking a map is defined by defrecord.

(myns/map->MyRecord {:a 1, :b 2})

;=> #myns.MyRecord{:a 1, :b 2}

Factory function taking positional values (`defrecord` and `deftype`)

A factory function named ->MyRecord taking positional values (as defined by the record ctor) is also defined by defrecord. Likewise, deftype defines ->MyType.

(myns/->MyRecord 1 2)

;=> #myns.MyRecord{:a 1, :b 2}

and

(myns/->MyType 1 2)

;=> #<MyType myns.MyType@2ed277f2>
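The point of generating real functions is that they can be passed around like any other fn, which the motivation section notes is not possible with constructors (e.g. `apply`). A small sketch, with made-up sample data:

```clojure
(defrecord MyRecord [a b])

;; Because ->MyRecord is an ordinary function, it composes with the rest
;; of the language in ways the MyRecord. constructor form cannot.
(def applied (apply ->MyRecord [1 2]))

;; Zip two columns of values into a sequence of records.
(def rows (map ->MyRecord [1 2 3] [:x :y :z]))

;; map->MyRecord works the same way over a sequence of maps.
(def from-maps (map map->MyRecord [{:a 1, :b 2} {:a 3, :b 4}]))

(println applied)
(println (map :b rows))
(println (map :a from-maps))
```

The factories are also reachable through ordinary use/require of the defining namespace, with no Java import needed.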

Writing records

When writing record data for the purposes of serialization, the positional reader form is used by default:

(binding [*print-dup* true]
  (pr-str (MyRecord. 1 2)))

;=> "#myns.MyRecord[1, 2]"

However, if you wish to use the map reader form instead, then the following would work:

(binding [*print-dup* true
          *verbose-defrecords* true]
  (pr-str (MyRecord. 1 2)))

;=> "#myns.MyRecord{:a 1, :b 2}"

Note: print forms for types (deftype) are not provided by default.
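A round trip through both written forms might look like the following sketch. It relies only on `*print-dup*`, `*verbose-defrecords*`, `pr-str` and `read-string`; the var names `positional-text` and `labelled-text` are invented for the example.

```clojure
(defrecord MyRecord [a b])

(def original (MyRecord. 1 2))

;; Default under *print-dup*: the positional form.
(def positional-text
  (binding [*print-dup* true]
    (pr-str original)))

;; With *verbose-defrecords* also bound to true: the labelled (map) form.
(def labelled-text
  (binding [*print-dup* true
            *verbose-defrecords* true]
    (pr-str original)))

(println positional-text)
(println labelled-text)

;; Either string should read back to a record equal to the one written.
(println (= original
            (read-string positional-text)
            (read-string labelled-text)))
```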

Tool support

Defining Clojure defrecords will also expose static class methods usable at the Java API level. These methods are not intended for public consumption and are considered implementation details.

Static factory for defrecords

The static factory exposed will mirror the map->MyRecord function:

(MyRecord/create aMap)

Basis access

A static method allowing access to the basis fields will also be provided:

(MyRecord/getBasis)
;=> [a b]

and

(MyType/getBasis)
;=> [a b]

The getBasis method will return a PersistentVector of Symbols with (potentially) attached metadata for each field.
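A tool could use the basis roughly as follows. The `Point` record and its hints are invented for this sketch, and since these static methods are described above as implementation details, application code should not lean on them.

```clojure
(defrecord MyRecord [a b])
(deftype MyType [a b])

;; A record with hinted fields (the name Point is made up for this sketch).
(defrecord Point [^long x ^String label])

(println (MyRecord/getBasis))
(println (MyType/getBasis))

;; Tooling can derive the keyword view of a record's fields from the basis.
(def field-keys (map keyword (Point/getBasis)))
(println field-keys)

;; Any hints attached to the field symbols can be inspected as metadata.
(println (map meta (Point/getBasis)))
```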

Old Ideas

Lesser Problems:

generic factory fn
- like factory fn, but generic with name
- introduces weak-referencing, modularity issues, etc.
- don't have a good problem statement, so ignoring this for now
- which comes first: generic or specific?
support for common creation patterns
- named arguments
  - with more than a few slots, record construction is difficult to read
- default values
  - maybe needs to be a property of factory fn, not record
  - different factory fns can have different defaults
- validations
- are the patterns truly common?
- very solvable in user space, esp. if per-record factory fn available
application code needing to know record fields
- synthesizing data
- creating factory fns if we don't provide them

Challenges:

how evaluative should record read/write be?
- option 1: records are data++: no EvalReader needed, no non-data semantics
- option 2: records are more:
  - maybe EvalReader required?
  - maybe special eval loopholes for constructor fns?
- option 1 wins
what happens when readers and writers disagree about a record's fields?
- positional approach would either fail or silently do the wrong thing
- k/v approach lets you get back to the data
  - still on you to fix it
does this have to be a breaking change?
- data print/read: no
- constructor fn: yes
  - any good generated name likely to collide with what people are using
what if defrecord is not present on the read side?
- fail?
- create a plain map instead
  - plus tag in data?
  - plus tag in metadata?
  - reify in a tagging interface
- attempt to load
  - no – could lead to arbitrary code injection during read
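The reader/writer disagreement case can be simulated with the map factory standing in for a k/v read. In this sketch the writer is assumed to have used an older definition with fields `a` and `c`; `stale-data` and `recovered` are made-up names.

```clojure
;; The reader's current definition has fields a and b ...
(defrecord MyRecord [a b])

;; ... but the data was written when the record had fields a and c.
(def stale-data {:a 1, :c 3})

(def recovered (map->MyRecord stale-data))

(println (:a recovered))  ; known field, carried over
(println (:b recovered))  ; field added since the write, left as nil
(println (:c recovered))  ; field removed since the write, kept as an extra key
```

Nothing is lost, but as noted above it is still up to the application to reconcile the old shape with the new one. A positional read of `[1 3]` would instead have bound 3 to `b` without complaint.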

Some Options:

create reader/writer positional syntax, no constructor fn
- pros
  - easy to deliver efficiently
  - non-breaking
  - introduces no logic (user or clojure) into print/read
- cons
  - what happens if defrecord field count changes?
  - what happens if field names change?
    - no way to know
- feels like a non-starter
create reader/writer kv syntax, no constructor fn
- pros
  - non-breaking
  - introduces no logic (user or Clojure) into print/read
  - can still recover data if defrecord structure has changed
- cons
  - how to deliver read efficiently?
    - create empty object + merge
      - cache the empty object we merge against?
    - reflect against object and manufacture reader fn
      - who keeps track of this?
      - how would this interact with constructor, if we add that separately?
    - add a map-based constructor to defrecord classes
      - what would its signature be?
    - add a static map based factory fn to defrecord classes
reader/writer syntax that depends on a new factory fn
- pros
  - can be efficient
  - can implement any policy in handling defrecord changes
- cons
  - likely breaking (what will the fn names be?)
  - read/write now depends on fns
positional constructor fn
- no
- replicates the weakness of existing constructors
kv constructor fn
- open questions
  - autogenerated for all defrecords?
  - optional?
  - conveniences (defaults, etc.)
    - no

Tentative Proposal 1:

Define a k/v syntax for read and write that does not require a factory fn.

adopt the existing print syntax as legal read syntax?
- "#:user.P{:x 1, :y 2}"
get Rich's input on efficient reader approach (4 possibilities listed above)
if reader defrecord fields are different, merge and move on
Undecided: if record class not loaded:
- TBD: error or make a plain ol map?
- hm, could fix on writer side: option to dumb records down to maps?

Tentative Proposal 2:

Autogenerate a k/v factory fn for all defrecords.

(new-foo :x 1 :y 2)
class constructor is an interop detail
factory fn is the Clojure way
people can build their own defaults, validation, etc. easily with macros, given this
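As a sketch of what such user-level code could look like, the following wraps a generated map factory with defaults and a validation check. It uses the `map->` factory naming from the final design rather than `new-foo`, and the `Account` record, `account-defaults` and `make-account` are all hypothetical. A plain function is enough here; a macro could generate wrappers of this shape.

```clojure
(defrecord Account [owner balance currency])

;; Defaults live with the factory, not the record, so different factories
;; over the same record can choose different defaults.
(def account-defaults {:balance 0, :currency :usd})

(defn make-account
  "Hypothetical user-level factory taking k/v arguments."
  [& {:as attrs}]
  (let [account (map->Account (merge account-defaults attrs))]
    ;; Validation is ordinary user code wrapped around the generated factory.
    (when-not (string? (:owner account))
      (throw (IllegalArgumentException. ":owner must be a string")))
    account))

(println (make-account :owner "ann"))
(println (make-account :owner "bob" :balance 10))
```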

Some history:

The record multimethod was almost ready to go when Rich raised the GC issue. What happens when somebody creates a ton of record classes over time? GC can collect records that are no longer in use, but doesn't clean up the old multimethod functions.

Additional Reading

Some (non-contributed) code that demonstrates people's need for this:

cemerick's defrecord slot defaults
David McNeil's enhanced clojure records
