This article is my give on the relationship between mathematical proofs and programming languages. Many details on specific implementation have been left out with the aim for clarity and conceptual coherency.
The source used in this article is available as a Gist.
Proofs and Programming
First, what is a proof? A proof is a series of deductive arguments, such that the proposition is justified. This description might seem quite abstract, so let us look at a concrete example using the Peano naturals for representing natural numbers.
As the proposition has no quantifiers, we can directly use the first argument to reduce the expression. That is, the one given by the usual definition of addition over Peano numbers. We then have the following expression.
We still have a mathematical object, a propositional claim. We know from the Peano axioms, that syntactical equivalence satisfies reflexivity, symmetry, and transitivity. Henceforth the properties of equality are satisfied, and we may end our deductive sequence.
Next, we want to translate above into programming.
From the Curry–Howard correspondence we know that propositions are types and proofs are programs. Alright, so we need to make an expression that has above expression as its type, and a program which inhibits this type.
First, we need to define our objects: Peano naturals and Equality. We implement Peano naturals the usual way. I elaborate on this in a previous post. Equality is defined as follows.
data Refl a b where Refl :: Refl a a
Evidently, we the value
Refl can only inhibit the type
Refl a b
if the types
b are identical.
Concerning the Curry-Howard correspondence, the
Refl value also
has the unit type - it is not possible to attach further data to
We have defined equality in terms of reflection. However, we need equality also to satisfy symmetry and transitivity. In the source, we have the code needed for that.
We now want to make the type1 for our proof, or, the equivalent to the proposition stated above:
onePlusOneEqualsTwo :: Refl (Add (S Z) (S Z)) (S (S Z))
Add in the type definition is a type family defined in the source. It is
defined as we would usually define addition over Peano naturals.
To prove it we need to make an inhibitor to that type. The program is very simple for this case as the Haskell compiler reduces the type level expression per semantics of type families.
onePlusOneEqualsTwo = Refl
As it compiles2 it shows that Haskell is content with the proof.
We want to abstract our proofs. In proving terminology, we do this through quantifiers.
To have a more graspable problem that includes only quantification, without induction, we detour to boolean algebra. Here we can try to formalize De Morgan’s theorem:
here a and b can only assume two values, true and false.
deMorgan :: SBool a -> SBool b -> Refl (Not (And a b)) (Or (Not a) (Not b)) deMorgan STrue STrue = Refl -- The first case deMorgan STrue SFalse = Refl deMorgan SFalse STrue = Refl deMorgan SFalse SFalse = Refl
We simply provide an inhibitor to the type based on pattern matching. This is the same as proving by case analysis.
To understand what goes on we instantiate the type expression
in each case. Afterward, the compiler reduces per the semantics
Or type families. In the first case we instantiate
the type expression such that.
deMorgan :: SBool Tru -> SBool Tru -> Refl (Not (And Tru Tru)) (Or (Not Tru) (Not Tru))
This instantiation is from the value
STrue which has the type
The compiler then reduces the expression and derives that
Refl :: Refl Fls Fls. From the interpreter, we get that.
*Proof> :t deMorgan STrue STrue deMorgan STrue STrue :: Refl 'Fls 'Fls
Many interesting properties we want to reason about includes unbound data. That is, the data we think about is inductively defined. We now go back to the examples considering natural numbers as they are a good medium for discussing inductively defined data.
plus_id_r is the property that adding zero to
n on the right side is the identity of n.
plus_id_l is when we add 0
on the left side.
plus_id_l is given directly from our definition of addition. But
needs to be proven, and we can do this inductively using following code.
plus_id_r :: forall n. SNat n -> Refl (Add n Z) n plus_id_r Zero = Refl plus_id_r (Succ x) = gcastWith (plus_id_r x) Refl
The first case is the base case. We know that the value is
hence we can derive the type to
Z. It is immediately visible that
inhibits the type
Refl Z Z.
The next case is the induction case. Here we fold out the value such that if
n = Succ x, then
x = n-1 - we reduce this on our argument. We justify that
Refl also is an inhabitant in this case by calling
plus_id_r on the
It is indeed possible to prove stuff in Haskell. But it is not further practical. The reason is in particular because the language is not designed with the constructions we need, such as dependent types. We simulate them through GADTs.
The closest languages to Haskell that are suited for this is languages such as Idris and Gallina (Coq). They have all the facilities needed for incorporating proofs into one’s code.
If one has a software development background firmly grounded in OOP (Java, C#), it requires quite some time to wrap one’s head around the new way to understand types.
That we can do above is mostly of academic interest: How do make sure that certain compilers indeed do what they should do etc. But the techniques are becoming steadily more accessible to all programmers.
New languages, like Idris, come with type constructions to formally reason about our software.
First, thanks to Aslan Askarov for providing valuable feedback on this article. It has been incorporated to provide a more coherent article.
This article was a precursor for a presentation at a spare time teaching event at Aarhus university. Thanks to them for letting me host the presentation.