Forest 1.0: A Language and Toolkit for Programming with Filestores
A filestore is a structured collection of data files housed in a
conventional hierarchical file system. Many applications use filestores
as a poor-mans database, and the correct execution of these
applications requires that the collection of files, directories, and
symbolic links stored on disk satisfy a variety of precise
invariants. Moreover, all of these structures must have acceptable
ownership, permission, and timestamp attributes. Unfortunately,
current programming languages do not provide support for documenting
assumptions about filestores, detecting errors, or safely loading from
and storing to them.
This paper describes the design, implementation, and semantics of
Forest, a novel domain-specific language for describing
filestores. The language uses a type-based metaphor to specify the
expected structure, attributes, and invariants of filestores.
Forest generates loading and storing functions that make it easy to
connect data on disk to an isomorphic representation in memory that
can be manipulated as if it were any other data structure. Forest
also generates metadata that describes the degree to which the
structures on the disk conform to the specification, making error
detection easy. Hence, in a nutshell, Forest extends the
rigorous discipline of typed programming languages and many of
their benefits to the untyped world of file systems.
We have implemented Forest as an embedded domain-specific language
in Haskell. In addition to generating infrastructure for reading,
writing and checking file systems, our implementation generates a
type class instances that make it easy to build generic tools that
operate over arbitrary filestores. We illustrate the utility of
this infrastructure by building a file system visualizer, a file access
checker, a generic query interface, description-directed variants of
several standard UNIX shell tools and (circularly) a simple Forest
description inference engine. Finally, we formalize a core fragment
of Forest in a semantics inspired by classical tree logics and prove round-tripping laws showing that the loading and storing functions behave sensibly.