]> Bytecode Verification for Haskell


Bytecode Verification for Haskell

Robert Dockins, <robdockins@fastmail.fm>
Samuel Guyer, <sguyer@cs.tufts.edu>

Dept. of Computer Science, Tufts University, Medford, MA

What is bytecode verification?

Why Verify Haskell?

Untrusted code? How does that work?

Haskell is purely functional. Side effects only happen in the IO monad. Any code which type-checks and is not in the IO monad is always safe to run.*

Hey, but wait! I need side effects to do anything useful!

Define an IO-like monad (e.g., ProxyIO) which only exposes the functionality needed by client code. Interpret ProxyIO in a trusted security kernel that performs I/O actions on behalf of the client only if the requested actions are allowed by the security policy.

Because verification guarantees that untrusted client code cannot perform IO calls directly, our security policies have a "default deny" flavor. Compare with the "default allow" behavior of Java, where the standard libraries have to be carefully coded to check security rights at every potentially sensitive method call.

* Provided you eliminate bad actors like unsafePerformIO.

Yhc (York Haskell Compiler)

Example malicious program

So what's wrong with this program?

The bogus POP 255 instruction breaks assumptions that the VM makes that programs will only access in-bounds stack locations.

This program pops off a bunch of the stack (which it doesn't own), writes a constant into whatever location that is, and then calls itself building a new stack frame lower down in the stack. The stack pointer will eventually be driven off the bottom end of the stack and into an unallocated memory region. When the VM attempts to write to this location, it will segfault.

Definition of function evil

Constants:

     0   Function        evil
     1   StringConstant  "ASDF"
  

Bytecodes:

     NEED_HEAP 1
     POP 255
     PUSH_CONST 1
     PUSH_ARG 0
     MK_AP 0
     EVAL
     RETURN
     END_CODE
  

Motivating type certificates

A Type-certifying Haskell compilation system

Compilation Pipeline

Haskell is Hard to Parse and Typecheck!

Solution: define a more tractable intermediate language and use that.

HaskellIL
LayoutNo layout
LR grammarLL(1) grammar
Fixity resolutionNo infix operators
DefaultingNo defaulting
Compound pattern matchingPrimitive case analysis
Implicit type abstraction/applicationExplicit type abstraction/application
Curry-style "external" typesChurch-style "internal" types
Complicated type-inferenceTop-down typechecking
TypeclassesDictionary-passing

Design of the (simplified) IL

Start with the solid base of System F ω , and add:

You end up with an intermediate language capable of encoding most of Haskell which has a strong metatheory.

We conjecture that this language is confluent and type-safe and that type-checking is decidable.

Compiler Block Diagram

Compilation Data Transformations

Stage 1 Bytecode

Type rewrite rules

Stage 2 bytecode

Final steps

Verification algorithm

  foreach f in combinator-definitions
    let code be the bytecode sequence for f
    let args be the types of arguments to f
    let result be the the result type of f
    let s be an array of length (len(code) + 1)
    s[0] := empty_stack; s[1..len(code)+1] := void

    for i in 0 .. len(code)
       if (s[i] = void) fail
       if (! preconditions(s[i],code[i],args,result)) fail

       foreach j in next-locations(code[i])
         s[j] := calculate-effect(s[i],code[i],j) or fail

         foreach r in rewrite-rules[i]
           s[j] := rewrite-type(r,s[j]) or fail
         done
       done
    done
  done

Properties of the verification algorithm

Future Directions

Wrap-up

What we did

Questions?

Appendix

This presentation is coded entirely in XHTML+SVG+MathML+Javascript! You too can code your presentations in a web-accessible, standards-compliant, cross-platform way!

Recent versions of Mozilla Suite and Firefox are able to directly render SVG and MathML embedded content. Hopefully other browser vendors will follow Mozilla's lead in the future.

Check out the W3C Slidy demo at http://www.w3.org/Talks/Tools/Slidy/ for details on how to create your own XHTML presentations.

The original HTML Slidy code and stylesheets were written by Dave Raggett, copyright © 2005 W3C (MIT, ERCIM, Keio). They were made avaliable under the W3C's software and document licenses.

The content of this presentation is © Robert Dockins, 2007.

Valid XHTML1.1