Smart constructors that cannot fail
haskellPublished on June 16, 2018
Wicked.
—Arnaud Spiwack
In type theory, a refinement type is a type endowed with a predicate which is assumed to hold for any element of the refined type. That’s a pretty useful thing and is achieved in Haskell typically by using what is called smart constructors.
For example, I could create a new type called GreaterThanFive
and only
export mkGreaterThanFive
and not its “normal” data constructor:
newtype GreaterThanFive = GreaterThanFive Int
mkGreaterThanFive :: Int -> Maybe GreaterThanFive
mkGreaterThanFive n =
if n > 5
then Just (GreaterThanFive n)
else Nothing
unGreaterThanFive :: GreaterThanFive -> Int
unGreaterThanFive (GreaterThanFive n) = n
-- BTW, be careful to do this instead of exposing a record selector, which
-- could be used to change the inner value defeating the purpose of this
-- technique.
Now to create a value of the GreaterThanFive
type you have to go through
the hoops check. This ensures that if you ever get a value from
GreaterThanFive
it’ll be indeed greater than 5. What a trick!
An important thing to note right away is that by putting additional
constraints on arguments of a function, often times that function itself can
be made total (and pure, if we otherwise used MonadThrow
or similar to
report errors):
-- Division by zero cannot happen here.
myDivide :: Int -> GreaterThanFive -> Int
myDivide n m = n `div` unGreaterThanFive m
Purity and totality are good, so we would like to push the checks to the
boundaries of our systems and inside do the neat stuff in the spirit of
myDivide
.
So far so good. I was wondering the other day though why refinement types
are so underrated and underused in Haskell? Well, let’s face it, they are
annoying to construct exactly because the check can fail, and so the smart
constructors have to live in an environment that gives us the ability to
report errors in some way. And that’s not always handy. Even if you can push
some checks for some types to the boundaries of your system, you have
not done. What about things that are produced deep inside your system? To
use the trick there, you again have to drop into the nasty realm of
MonadThrow
.
That can’t be helped, can it? There are ways.
- Liquid Haskell, which is great, but not first-class in Haskell (yeah, it could be).
-
The
refined
library, which is sort of OK now that they have dropped theFunctor
instance ofRefined p x
.
Can we do better? Well, let’s try an alternative design now.
Basics
The first thing I’d like to propose is to separate primitives for
construction of Refined
types and actual establishing of properties.
Because well, we could establish some properties this morning and then
establish some more in the afternoon. We should be able to! This also
suggests that if we want to have many different properties we could actually
store them in a phantom type that is a list on the type level, so here we
go:
-- | @'Refined' ps a@ is a wrapper around @a@ proving that it has properties
-- @ps@. @ps@ is a type-level list containing /properties/, that is, void
-- data types symbolizing various concepts.
newtype Refined (ps :: [*]) a = Refined a
-- | 'refined' creates a refined type with no associated properties. We
-- don't demand anything, and so quite conveniently this is a pure function.
refined :: a -> Refined '[] a
refined = Refined
-- | We can erase information we know by using 'unrefined'.
unrefined :: Refined ps a -> a
unrefined (Refined a) = a
That looks like a solid foundation to me. What’s about properties?
class Prop a p where
type PropProjection a p :: *
checkProp :: Proxy p -> a -> Either String (PropProjection a p)
Here, let p
be a property, something like this:
data NotEmpty
But this NotEmpty
thing is only a name for some concept. I can easily have
a non-empty String
, Text
, ByteString
, or a vector. So we really need
the other ingredient, the type that has the property, a
. There is no
functional dependency between them as they don’t determine each other.
You might be thinking now though, “I thought we’re talking about types
equipped with predicates, what the heck is PropProjection
though”?
Projections and a bit of category theory
Well yeah, but why not imagine properties (also) as some sort of morphisms
that connect types? If we consider property p
as a morphism, then a
will
be its domain and PropProjection a p
will be its codomain!
Examples should help.
We could have a property telling that a Text
value is not empty, then:
type PropProjection Text NotEmpty = NonEmptyText -- e.g. a newtype
We could do the same for linked lists (a
can only have the kind *
currently):
type PropProjection [Char] NotEmpty = NonEmpty Char
This connects the NonEmpty
type (non-empty lists), normally obtainable via
the smart constructor nonEmpty
to our list. Once we have proven that our
list is NotEmpty
(that the name of the property, p
thing), we’ll be able
to get NonEmpty
projection purely without possibility of failure.
We could have a property called Length
and in that case we could say:
type PropProjection Text Length = Int
We could also have a property IsURI
telling if a Text
value is a valid
URI
. In that case we could say (assuming that URI
is such a type that
can represent only valid URI
s, as it should):
type PropProjection Text IsURI = URI
Yes, that’s right. The idea is simple and not new: knowing properties of types enables us to do a lot more purely and totally.
So we know now domain and codomain, but what is morphism itself? Properties
are morphisms (IsURI
) connecting types (such as Text
and URI
), and
they do that via:
checkProp :: Proxy p -> a -> Either String (PropProjection a p)
And actual category may be viewed as the Kleisli category for the Either String
monad, if you’re interested in that stuff.
The upshot is that we can do the following:
-
We can establish properties by “probing” them with
checkProp
. - If we have established a property, we can use it to get to the property projection purely and that can’t fail.
This “getting to” part is often not the main thing. It’s just a nice and
very practical thing to be able to do, as we’ll discover shortly. Some
properties don’t have very meaningful projections in this setup, such as for
example GreaterThan
:
data GreaterThan (n :: Nat)
instance (Integral a, KnownNat n) => Prop a (GreaterThan n) where
type PropProjection a (GreaterThan n) = a -- could be () as well
checkProp Proxy n =
if n > fromIntegral (natVal (Proxy :: Proxy n))
then Right n
else Left "not your day"
Establishing properties
There are ways to establish properties. We’ll start with relatively uninteresting, “brute-force” methods.
First, we could just assume stuff to be true:
assumeProp
:: forall q ps a. (Prop a q)
=> Refined ps a
-> Refined (ps `AddProp` q) a
assumeProp = coerce
With TypeApplications
it’s quite a nice thing to write:
myText :: Refined '[NotEmpty] Text
myText = assumeProp @NotEmpty (refined "foo")
Sometimes it’s a handy thing to do too, but needless to say, it’s unsafe and may be a source of bugs.
Next, in various monads such as MonadThrow
, MonadFail
, and MonadError
you could just check:
estPropThrow
:: forall q ps m a. ( Prop a q
, KnownSymbol (PropName q)
, MonadThrow m
)
=> Refined ps a
-> m (Refined (ps `AddProp` q) a)
-- estPropFail, estPropError are in the same spirit so they are not shown
Slightly better, you could use Template Haskell to check something at the compile time:
estPropTH
:: forall q ps a. ( Prop a q
, KnownSymbol (PropName q)
, TH.Lift a
)
=> Refined ps a
-> TH.Q TH.Exp -- returns (Refined (ps `AddProp` q) a) as usual
That’s it. Quite a few ways. We could use for example estPropError
in your
parser or what have you, and since Either
is an instance of MonadError
,
we could indeed match on Either
and report parse error in typed and nice
ways.
If something goes wrong, this is what you get:
data RefinedException = RefinedException
{ rexpCallStack :: !CallStack
-- ^ Location where the check failed
, rexpValue :: !String
-- ^ Value that failed to satisfy a property check
, rexpPropName :: !String
-- ^ Name of the property in question
, rexpIssue :: !String
-- ^ Description of why the property check failed
} deriving (Show, Typeable)
That concludes the boring part.
Identity property and composition of properties
I said that we have a category of refined types where properties are morphisms. Well, questions you want to ask are probably “what is the identity in that category” and “how to compose those properties”?
I’m glad you asked.
IdProp
is our identity property. As it happens with this sort of thing, it
doesn’t tell us much about anything at all:
data IdProp
instance Prop a IdProp where
type PropProjection a IdProp = a
checkProp Proxy = Right
Composition is a bit more interesting. Intuitively, it should compose two checks and get us to the final property projection.
-- | 'Via' is the composition in the category of refined types.
data (t :: *) `Via` (p :: *)
instance (Prop (PropProjection a p) t, Prop a p) => Prop a (t `Via` p) where
type PropProjection a (t `Via` p) = PropProjection (PropProjection a p) t
checkProp Proxy =
checkProp (Proxy :: Proxy p) >=> checkProp (Proxy :: Proxy t)
infixl 5 `Via`
Yeah. So t `Via` p
works as a property. It means that p
holds and
then t
holds on projection of p
. What can we do with this?
Well, remember I said we could have Length
property. Let’s define it for
Text
:
data Length
instance Text => Prop Text Length where
type PropProjection Text Length = Int -- yeah, not so refined, but real
-- world will be happier with this
checkProp Proxy = Right . T.length
An interesting thing here of course that Length
always “holds” for any
Text
. So we could fearlessly assumeProp @Length
, but I’ll show you a
better way soon.
So now, I can state this:
rText0 :: Refined '[] Text
rText0 = refined "foobar"
-- In real programs use 'estMonadThrow' or similar, don't just blindly
-- assume things!
rText1 :: Refined '[GreaterThan 5 `Via` Length] Text
rText1 = assumeProp @(GreaterThan 5 `Via` Length) rText0
How about:
rText :: Refined '[UriAbsolute `Via` IsURI] Text
We can establish properties of projections, or properties of any types that have “connection” to our current refined type via other properties.
Take a moment to understand. GreaterThan 5 `Via` Length
is a property
of something I can obtain (purely) from my Text
. So it’s also a property
of Text
in a way. Nothing bad happened.
Now of course, I want to get length of my Text
and it better be equipped
with GreaterThan 5
property by construction!
Let’s just follow the morphism:
-- | Obtain a projection as a refined value, i.e. follow a morphism created
-- by a property.
followProp :: forall p ps a. (Prop a p, ps `HasProp` p)
=> Refined ps a
-> Refined (ProjectionProps ps p) (PropProjection a p)
-- ...
rLength :: Refined '[GreaterThan 5] Int
rLength = followProp @Length rText1
Notice how the `Via` Length
part disappeared.
Just in case you’re wondering if you can follow composite properties (with
Via
in them), the answer is “yes, you can”. t `Via` p
will first
follow p
and then t
as you’d expect.
Are we done yet?
No.
We really just started. No jokes, the whole thing was a preparation for this part. So watch closely now.
Simple things first. We can select a subset of properties or re-order them:
-- | Select some properties from known properties.
selectProps :: forall qs ps a. (ps `HasProps` qs)
=> Refined ps a
-> Refined qs a
selectProps = coerce
So I guess it’s best to use concrete type-level lists in API instead of
constraints like HasProps
. But I haven’t decided on this yet.
Now. Why not have ways to reason about stuff?
-- | An @'Axiom' name vs qs p@ allows us to prove property @p@ if properties
-- @qs@ are already proven. @name@ and arguments @vs@ determine both @qs@
-- and @p@.
class Axiom (name :: Symbol) (vs :: [*]) (qs :: [*]) (p :: *) | name vs -> qs p
(It’s a pity we can’t have proper theorems, but I’ll leave that to Coq for now.)
We could apply an Axiom
with this:
applyAxiom
:: forall name vs p qs ps a. (Prop a p, Axiom name vs qs p, ps `HasProps` qs)
=> Refined ps a
-> Refined (ps `AddProp` p) a
…to learn a new fact about a refined value.
The clever thing about name
here is that it allows us to determine both
qs
and p
without enumerating them explicitly. You may be wondering what
vs
is for, but that will be clearer from examples.
First, IpProp
demands something to be formalized:
-- | We always can assume that a value has 'IdProp'.
instance Axiom "id_prop" '[] '[] IdProp
-- name args need conclusion
-- | An existing property can be pre-composed with 'IdProp'.
instance Axiom "id_prop_pre" '[a] '[a] (a `Via` IdProp)
-- name args need conclusion
-- | Pre-composition of 'IdProp' can be dropped.
instance Axiom "id_prop_pre'" '[a] '[a `Via` IdProp] a
-- name args need conclusion
-- | An existing prperty can be post-composed with 'IdProp'.
instance Axiom "id_prop_post" '[a] '[a] (IdProp `Via` a)
-- name args need conclusion
-- | Post-composition of 'IdProp' can be dropped.
instance Axiom "id_prop_post'" '[a] '[IdProp `Via` a] a
-- name args need conclusion
So vs
, or arguments is just a helper collection of types (to help
determine qs
and p
because of the functional dependency) that is
sometimes necessary in order to state more interesting axioms, such as this
one:
instance CmpNat n m ~ GT => Axiom "weaken_gt"
'[V n, V m] '[GreaterThan n] (GreaterThan m)
Where V
is just a helper to allow us to have heterogeneous type-level
lists with respect to kinds of elements:
data V (a :: k)
So that’s how it works:
n :: Refined '[GreaterThan 5] Int
n = assumeProp @(GreaterThan 5) (refined 10)
m :: Refined '[GreaterThan 4] Int
m = selectProps $ applyAxiom @"weaken_gt" @'[V 5, V 4] n
Axioms may be quite interesting.
For example, we could demand that a URI
is absolute, let’s call that
property UriAbsolute
. Then we could have the property stating that URI
is convertible to Text
, let’s call it IsText
.
Then what can we say about this?
instance Axiom "abs_uri_not_empty" '[] '[UriAbsolute] (NotEmpty `Via` IsText)
Which reads: if you know what your URI
is absolute, then once you render
it to Text
, that rendering will be not empty.
Conclusion
I guess we could deduce of lot of stuff this way. Then add whatever assumptions we need to the checks at the boundaries of our systems, e.g. in parsers.
I have coded up the whole thing here (will put it on Hackage shortly):
Check it out and maybe try to use, I have no idea how it feels if used at scale. Maybe not very good!
Thanks
To Arnaud Spiwack, Facundo Dominguez, and Mathieu Boespflug for discussing the design with me.