Language Validation/Constraint Support For Types #939

wilbennett · 2020-12-05T02:01:09Z

Disclaimer

I'm not an expert F# developer so please forgive any syntactic mistakes in the following as I'm not writing this in an IDE.

Overview

One of the big draws to F# for me is the promise of domain modeling in the type system. An area where this seems to fall short, IMHO, is supporting validation on types. Again, I'm not an expert F# developer so please don't hesitate to correct any of my misconceptions.

My understanding is, looking at declared types, you should be able to easily reason about how they are handled and the compiler should prevent invalid usage.

Issue

Let's examine a simple case:

I want to create a string subtype that only allows strings of length 3 to 5:

The existing way of approaching this problem in F# is ...

type String3To5 = private String3To5 of string

module String3To5 =
  let create str =
      if String.length str < 3 then failwith "str should be at least 3 characters"
      if String.length str > 5 then failwith "str should be no more than 5 characters"
      String3To5 str

// OR

  let create str =
      if String.length str < 3 then None
      elif String.length str > 5 then None
      else Some (String3To5 str)

  let value (String3To5 str) = str

There are several things I don't like about this solution:

Why are both construction and reading private?
- The data is immutable. Reading should not be restricted.
- I'm forced to use the "value" helper everywhere (or active pattern)
The validation is separate from the declaration
I can't reason about the type by looking at it
- The name may give an indication but names lie

I propose we use one of the following options or something similar...

type String3To5 =
  | InvalidString3To5 of InvalidValue: string * Errors: string list
  | ValidString3To5 of Value: string
  with Validations
    Value: String.length Value < 3, "Value should be at least 3 characters"
    Value: String.length Value > 5, "Value should be no more than 5 characters"
  onfail errors: InvalidString3To5 (Value, errors) // Return type is String3To5

type String3To5 = String3To5 of Value: string
  with Validations
    Value: String.length Value < 3, "Value should be at least 3 characters"
    Value: String.length Value > 5, "Value should be no more than 5 characters"
  // One of:
  onfail errors: raise errors // Return type is String3To5
  onfail errors: None         // Return type is Option<String3To5>
  onfail errors: Error errors // Return type is Result<String3To5, string list>

I omitted some cases and cut some corners for the sake of brevity, but hopefully the intent is clear.

Pros and Cons

The advantages of making this adjustment to F# are ...

It is clear by looking at the declaration what the constraints on a type are
No need to create a module just for handling validation
Values are publicly accessible

The disadvantages of making this adjustment to F# are ...

Potential confusion, e.g. in the case where "new" returns Option instead of the type

Extra information

Estimated cost (XS, S, M, L, XL, XXL):

Related suggestions:

Affidavit (please submit!)

Please tick this by placing a cross in the box:

This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
I have searched both open and closed suggestions on this site and believe this is not a duplicate
This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

This is not a breaking change to the F# language design
I or my company would be willing to help implement and/or test this

For Readers

If you would like to see this issue implemented, please click the 👍 emoji on this issue. These counts are used to generally order the suggestions by engagement.


cartermp · 2020-12-05T02:20:45Z

The validation is separate from the declaration

Note that another way to do this is via a static member, like so:

type private String3To5 =
    | String3To5 of string
    
    static member create str =
        if String.length str < 3 then None
        elif String.length str > 5 then None
        else Some (String3To5 str)

    static member value (String3To5 str) = str

match String3To5.create "123" with
| Some s -> s |> String3To5.value
| None -> ":("

Though I would probably use a normal class at this point, just to make convenient use of dot notation (.), like so:

type String3To5 private (str) =
    static member Create str =
        if String.length str > 5 || String.length str < 3 then
            None
        else
            Some (String3To5(str))
    member this.Value = str


match String3To5.Create("123") with
| Some s -> s.Value
| None -> ":("

So at the very least you can ensure that it's all contained in the same entity.

wilbennett · 2020-12-05T21:04:30Z

@cartermp

Thank you so much for your prompt and awesome reply. I especially like the class approach and will be using that instead from this point (at least instead of single case DU).

That being said, there are a number of benefits that baking this into the language would provide. As I stated, I cut some corners in the interest of making this brief (What can I say? I'm a lazy programmer). I also emphasized the domain modeling aspect and insidiously inserted a tiny bit about usage. Your awesome alternative addresses most of the domain modeling concern.

I will create a more comprehensive example to illustrate what I mean and post here (hopefully later today). Until then, here is a sample list (partially based on what I've seen from Scott Wlaschin & other sources):

Declarations easily understood by non programmers i.e. domain experts
- Declaring members and such isn't friendly to them (Yes, blame me. I had code in my example validations)
Don't allow invalid usage
- By using a class, we've lost exhaustive DU checking, for example
By using a class, we've lost value equality - e.g. if needing to validate records
Having a list of validations, the compiler/runtime would do the aggregation of constraint failures for us
- With the class, we have to do that ourselves further obscuring the declaration (and not non-coder friendly)
- Can get list of validations from the runtime without having to write all that boilerplate ourselves
  - This can be useful when using the Railway Oriented Pattern - validation on the type used to validate input and return Result<'a, 'b>
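
To make the aggregation point concrete, here is a minimal sketch of collecting every failed rule by hand in today's F#. The `rules` list and the `create` and `value` helpers are illustrative names chosen for this example, not an existing API.

```fsharp
// Sketch only: `rules`, `create` and `value` are hypothetical helper names.
type String3To5 = private String3To5 of string

module String3To5 =
    // Each rule pairs a "fails when" predicate with its error message.
    let private rules : ((string -> bool) * string) list =
        [ (fun s -> String.length s < 3), "Value should be at least 3 characters"
          (fun s -> String.length s > 5), "Value should be no more than 5 characters" ]

    // Collect the message of every failing rule instead of stopping at the first.
    let create (str: string) : Result<String3To5, string list> =
        match rules |> List.filter (fun (fails, _) -> fails str) |> List.map snd with
        | [] -> Ok (String3To5 str)
        | errors -> Error errors

    // Unwrap the validated string.
    let value (String3To5 str) = str
```

This is the boilerplate the proposal would like the compiler or runtime to generate: the rule list is data, but the filtering and the `Result` plumbing still have to be written for each type.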

Again, thanks for your awesome response and apologies for not using a more thorough example. I will post one shortly.

RealmPlume · 2020-12-26T00:43:54Z

What you need is refinement types; this reminds me of the now-defunct F7.
If you need a DU, you can write a function with the same name right after it in the module to shadow the constructor, like this:

   type String3To5 =
      | String3To5 of string   

   let String3To5 txt =
      match String.length txt with
      | 3 | 4 | 5 -> Some(String3To5 txt)
      | _ -> None
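
A small usage sketch of the shadowing trick above, assuming the definitions live in one module or script. The `describe` function is a hypothetical name added only for illustration. It shows that the name calls the validating function in expressions while still matching the union case in patterns.

```fsharp
type String3To5 =
    | String3To5 of string

// From here on, `String3To5` in expression position means this function.
let String3To5 txt =
    match String.length txt with
    | 3 | 4 | 5 -> Some (String3To5 txt) // still the union case inside its own definition
    | _ -> None

// Hypothetical caller, for illustration only.
let describe input =
    match String3To5 input with              // calls the validating function
    | Some (String3To5 s) -> "valid: " + s   // the pattern refers to the union case
    | None -> "invalid"
```

As far as I know, the shadowing only hides the unqualified name; a caller who qualifies the case with the type name can still reach the raw constructor, so this is a convention rather than a guarantee.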

wilbennett · 2020-12-26T03:56:26Z

Sorry for the delayed response - had some computer problems.

Here's a more detailed example of what I'm proposing:

// These would be in a separate file and possibly some included in F# library
let [Constrain] ``Should be at least 3 characters`` (value: string) = ...
let [Constrain] ``Should be no more than 5 characters`` (value: string) = ...
let [Constrain] ``Should be properly cased`` (value: string) = ...
let [Constrain] ``Should be at least `` target value = ...

// This would be in the file that domain experts could review
type String3To5 =
  | InvalidString3To5 of InvalidValue: string * Errors: string list
  | ValidString3To5 of Value: string
    with Constraints
        Value: ``Should be at least 3 characters``
        Value: ``Should be no more than 5 characters``
    onfail errors: InvalidString3To5 ("Value", errors) // Return type is String3To5

type InvalidPerson = { Errors: string list }

type Person = {
    FirstName: string
    LastName: string
    Age: int
  }
  with Constraints
    FirstName: ``Should be properly cased``
    LastName: ``Should be properly cased``
    Age: ``Should be at least `` 18
  onfail errors: { InvalidPerson.Errors = errors }

// Or with some syntactic sugar to make it more non programmer friendly

type Person = {
    FirstName: string
    LastName: string
    Age: int
  }
  with Constraints
    FirstName: Should be properly cased
    LastName: Should be properly cased
    Age: Should be at least 18
  onfail: InvalidPerson.Errors = errors

// Or inlined

type String3To5 =
  | InvalidString3To5 of InvalidValue: string * Errors: string list
  | ValidString3To5 of Value: string
        Requirement: ``Should be at least 3 characters``
        Requirement: ``Should be no more than 5 characters``
    onfail: InvalidString3To5 (Value, errors) // Return type is String3To5

type Person = {
    FirstName: string
         Requirement: Should be properly cased
    LastName: string
         Requirement: Should be properly cased
    Age: int
         Requirement: Should be at least 18
  }
  onfail: InvalidPerson.Errors = errors

abelbraaksma · 2020-12-27T13:18:09Z

I can sympathize with your suggestion, I guess we've all been there. But functional languages separate data from business logic, and validation is part of business logic. Typically you'd use a module with the same name as your type that contains tryCreate functions that return an option or a result type.

It's not uncommon to create a type with a hidden constructor that encapsulates creation behavior for data that should be validated. But I believe it's far more common to use a DU with the core types and combine with result for validation. There's a trade off for each approach and it depends on the domain which is most suitable.

There are some suggestions out there that could make this easier to bake into a DU, like allowing private constructor overrides. But I'm not sure they'll fully cover this, as when you construct a type, there's no mechanism, other than raising an exception, to return an invalid instance. Though, you could make the invalid instance part of the DU, but then we're back at the separation of concerns issue: that's what Result is for.

abelbraaksma · 2020-12-27T13:21:07Z

Btw, your last post reminds me of Design By Contract, which is an OO concept. I believe there are some libraries that inject such code for you, and you use attributes to add the "contract" requirements. Such libraries should work with F# just the same.

wilbennett · 2020-12-27T15:17:40Z

Thanks a lot for your input @abelbraaksma! I can definitely understand the separation of concerns idea and benefits. In my mind, though, given the promotion of domain driven design with F#, it makes sense for constraints to be a part of the definition.

Everything I've been reading touts being able to look at a type definition and having even mere mortals have a good intuition about what it does. I believe constraints fit squarely into that idea.

I'm not saying the way I presented it is the best (it's probably horrible), just that I think this is an integral part of a type and should be a first-class citizen. I don't see how a domain design can be considered complete without specifying the constraints. Yes, there are a bunch of code "workarounds" to add constraints, as shown by your post, where you suggest yet another. My concern is just that: it's not standard. You have to learn what each project is doing. Looking at the design/definitions tells you little. All these workarounds end up compromising the use of the original intended type in some way. They are also not meant to be understood by mere mortals.

On the topic of separation of concerns, I believe that can be a bit subjective. If we look at this from the perspective of the single responsibility principle, I can argue that the one reason for a "change" (in the case of types, a new type to be created) is if the constraints or constituent types change (in type or number). IOW, I'm arguing that structure and constraints form the single responsibility. By splitting them into two separate entities, we've achieved little, as a change to one breaks or requires a change in the other (or changes the entire contract).

As an aside: that is one of my go-to rules for deciding single responsibility. If I split something into two, can I update the two new pieces independently? If yes, these are two single responsibilities. If no, I already had a single responsibility. In the case of my String3To5 example, changing the name or the constituent type would require changing the constraint and vice versa. That is one reason I don't see the need for separation. For argument's sake, let's say someone changes the name to String3To6 but forgets to change the constraint. This is much easier to spot with the integrated constraint than it is with the separated workarounds.

At the risk of this response getting overwhelmingly long, I'll address your "business logic" concern. I lean in the direction of disagreeing with this. One, because the point of DDD is to model the domain and two, because types themselves, by definition, are constraints. If the domain expert tells me a ProductID can be 0..255, using "ProductID: byte" is effectively a constraint. I wouldn't use "ProductID: float". IOW, I'm arguing we are allowing some constraints in the definition but not others. Other examples, "Speed: float<m/s>", "Temperature: float<fahrenheit/.>".
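
As a small illustration of constraints that the type definition already carries today, here is a sketch using `byte` and units of measure. The `Reading` record and the measure names `m` and `s` are made up for this example.

```fsharp
// Hypothetical measures and record, for illustration only.
[<Measure>] type m
[<Measure>] type s

type Reading = { ProductID: byte; Speed: float<m/s> }

let distance = 100.0<m>
let time = 9.58<s>

// Compiles: 255 fits in a byte and m divided by s has unit m/s.
let reading = { ProductID = 255uy; Speed = distance / time }

// Does not compile: 256uy is out of range for byte, and `distance` is float<m>, not float<m/s>.
// let bad = { ProductID = 256uy; Speed = distance }
```

Both violations in the commented-out line are rejected at compile time, which is the kind of guarantee the proposal wants to extend to ranges and lengths.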

Thanks again and I look forward to your additional feedback.

abelbraaksma · 2020-12-27T17:01:10Z

If the domain expert tells me a ProductID can be 0..255, using "ProductID: byte" is effectively a constraint.

Exactly. And then, your input gets an int. So you create a function tryCreate which takes an int and checks the range.

Similarly, you can have a dedicated type that is restricted to a string of 2 to 5 chars. When your input comes from strings that aren't restricted, you create a function tryCreate.

In both cases it's a design decision whether or not you create a type that validates its input on creation. But in this approach, you would consider it an exception if the type is created with an out-of-range value. The byte conversion would throw an overflow exception (once you open the Checked module), and your custom type could do the same. I generally hide the constructor altogether.
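
A minimal sketch of the byte case, assuming the `Checked` operators module is opened so that the `byte` conversion raises instead of silently wrapping. The `tryToByte` name is hypothetical.

```fsharp
open Microsoft.FSharp.Core.Operators.Checked

// Hypothetical helper: turns the checked conversion's exception into an option.
let tryToByte (n: int) : byte option =
    try
        Some (byte n) // with Checked opened, out-of-range values raise OverflowException
    with :? System.OverflowException ->
        None
```

Without the `open`, `byte 256` would wrap around to `0uy` rather than fail, so the exception-based behavior is opt-in.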

I don't disagree with your points, but as is often the case, it depends on which approach is most sensible. You have all the freedom to create types that can only be instantiated with valid inputs. You have the freedom to put this logic 'on the type' using with or classes. Or you can do it how most functional libraries are designed, and create a same-named module that provides all the interaction with the type in a safe way. Or a bit of both.

I'm not trying to suggest your proposal doesn't have merit. It has. But I'm unsure it should be part of the language. It may be a better fit for a library instead. But I could be wrong, and I do see some benefit in adding this in some way to the language.

wilbennett · 2020-12-27T17:14:21Z

Thanks @abelbraaksma. I just want to clarify one thing.

Exactly. And then, your input gets an int. So you create a function tryCreate which takes an int and checks the range.

This is the same in either version. The point I was trying to make is that given the domain expert understands what "byte" means, or just from the dev perspective, looking at "ProductID: byte", you know there is no way possible to construct this type with an out of range ProductID. This is regardless of tryCreate. I look at this definition and I know: this record contains a product ID, the product ID is constrained to be 0..255, this is the contract and I cannot create an instance that violates these constraints, period.

jackfoxy · 2020-12-27T17:42:38Z

@wilbennett you may be interested in this for your needs: DependentTypes.
As @greatim pointed out, many prefer to call types like this Refinement Types, reserving the term dependent types for languages that support proofs rather than DTs by construction.

wilbennett · 2020-12-27T18:19:37Z

Oh yes, thank you for your input @greatim.

Thanks @jackfoxy! That seems very interesting. I don't fully understand the entire thing yet but looking at the example, it's not quite the same as what I'm proposing. I'll definitely look into it some more though.

wilbennett · 2020-12-27T20:56:10Z

Let me do a compare and contrast that will hopefully shed more light on this idea.

I'll use @greatim's suggestion. I'm not picking on you @greatim. Your suggestion is elegant and succinct. I'm adding a private qualifier as I believe you accidentally left that off.

The domain expert says a product has an ID and a name:

type Product = {
    ProductID: int
    Name: string
}

You explain that "int" means a number and "string" means a sequence of characters. He/she says, well the product ID must be between 200 and 999 and the name must be 3 to 5 characters. So you create additional types:

type private String3To5 =
   | String3To5 of string   

let String3To5 txt =
   match String.length txt with
   | 3 | 4 | 5 -> Some(String3To5 txt)
   | _ -> None

type Int200To999 = ...
let Int200To999 = ...

type Product = {
    ProductID: Int200To999
    Name: String3To5
}

The expert says, oops, I meant 3 to 7 characters:

type private String3To7 =
   | String3To7 of string   

let String3To7 txt =
   match String.length txt with
   | 3 | 4 | 5 -> Some(String3To7 txt)
   | _ -> None

type Product = {
    ProductID: Int200To999
    Name: String3To7
}

Here are the issues I see with this approach:

We had to create new types to satisfy the constraints
We are mixing declarative (type) and imperative/procedural (in general - let) code
- This can no longer just be in a namespace, we are forced to use a module
This is no longer domain expert friendly
Without additional coding, the DU does not act like a regular DU
This code is not DRY or more specifically there is not a single point of truth
- Can we really trust what the type name says or must we trust but verify?
- We changed the name from "String3To5" to "String3To7" but forgot to change the pattern
- A new dev takes over for us. Which does he think is correct? The name or the pattern?

Now let's look at the alternative:

// In a separate file and possibly some included in the F# library
let [Constraint] ``Must be at least`` target value = ...
let [Constraint] ``Must be no more than`` target value = ...
let [Constraint] ``Minimum characters`` target value = ...
let [Constraint] ``Maximum characters`` target value = ...

// In the domain file
type Product = {
    ProductID: int
    Name: string
}

The expert says, well the product ID must be between 200 and 999 and the name must be 3 to 5 characters:

type Product = {
    ProductID: int
        Requirement: ``Must be at least`` 200
        Requirement: ``Must be no more than`` 999
    Name: string
        Requirement: ``Minimum characters`` 3
        Requirement: ``Maximum characters`` 5
}

The expert says, oops, I meant 3 to 7 characters:

type Product = {
    ProductID: int
        Requirement: ``Must be at least`` 200
        Requirement: ``Must be no more than`` 999
    Name: string
        Requirement: ``Minimum characters`` 3
        Requirement: ``Maximum characters`` 7
}

So what have we achieved?

We didn't have to create new types. The compiler can choose to create new types if it wants to
All code is declarative and succinct
The code is arguably still understandable by the domain expert
The code is DRY. There is a single point of truth for each constraint
The record still behaves like a record in every respect
If we applied this to a DU, it would still behave like a DU in every respect

How do we create these types?

We can use regular new semantics
- If any constraints fail, an exception is thrown
- When we do let a = { ProductID = 1; Name = "BFG" } at design time, the compiler could run constraints for us so we get notified of violations just as with other types (let a : int = 0.0).
We can have additional keywords/functions in the language. For example:
- optnew { ProductID = 1; Name = "BFG" }, returns Option<Product>
- resultnew { ProductID = 2; Name = "Staff" }, returns Result<Product, string list>
- Can be augmented by what I showed with the "onfail" example

I know you are probably saying, "But Wil, can you really trust what the constraint says or must you trust but verify". Very clever my friend but the difference is, the constraints are "global" and need only be verified once. They can then be used in any number of definitions with confidence. And remember, the compiler could run constraints at design time so we get the additional benefits we do with regular type constraints.

I don't believe this to be a foreign concept - just an extension of the constraints we already express when we create types. With just the plain definition, we are constraining ProductID to contain only numbers and to be within a specific range. All we are doing here is further constraining the range. Just by using the Product record, we are constraining products to consist of only a ProductID and a Name.
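
For comparison, here is a rough sketch of how reusable constraints plus `optnew`/`resultnew`-style creation could be approximated in today's F# with ordinary functions. Every name below (`Constraint`, `mustBeAtLeast`, `resultNew`, `optNew`, and so on) is hypothetical and stands in for the proposed syntax. Nothing here is an existing language feature.

```fsharp
// Sketch only: all names are hypothetical stand-ins for the proposed syntax.
type Constraint<'T> = 'T -> string option // None = satisfied, Some message = violated

let mustBeAtLeast (target: int) : Constraint<int> =
    fun v -> if v < target then Some (sprintf "Must be at least %d" target) else None

let mustBeNoMoreThan (target: int) : Constraint<int> =
    fun v -> if v > target then Some (sprintf "Must be no more than %d" target) else None

let minimumCharacters (target: int) : Constraint<string> =
    fun v -> if String.length v < target then Some (sprintf "Minimum characters %d" target) else None

let maximumCharacters (target: int) : Constraint<string> =
    fun v -> if String.length v > target then Some (sprintf "Maximum characters %d" target) else None

type Product = { ProductID: int; Name: string }

module Product =
    // Run every constraint and keep the messages of those that fail.
    let private check (constraints: Constraint<'T> list) (value: 'T) =
        constraints |> List.choose (fun c -> c value)

    // Rough analogue of the proposed `resultnew`.
    let resultNew (p: Product) : Result<Product, string list> =
        let errors =
            check [ mustBeAtLeast 200; mustBeNoMoreThan 999 ] p.ProductID
            @ check [ minimumCharacters 3; maximumCharacters 5 ] p.Name
        if List.isEmpty errors then Ok p else Error errors

    // Rough analogue of the proposed `optnew`.
    let optNew (p: Product) : Product option =
        match resultNew p with
        | Ok valid -> Some valid
        | Error _ -> None
```

The gap the proposal points at is visible here: the constraints are reusable, but they live beside the record rather than in it, and nothing stops a caller from writing `{ ProductID = 1; Name = "BFG" }` directly and bypassing them.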

wilbennett · 2020-12-27T23:56:27Z

@abelbraaksma, another thought on why I do not consider these constraints "business logic".

F# isn't as rich in type constraints as other languages. If we take Delphi/Object Pascal, for example. You can define types like the following:

type
    TWeekDay = 1..7;
    TString30 = string[30]; // In Delphi this actually makes all these 30 chars.

If we could do something similar in F#, it would be like:

type WeekDay = 1..7
type String30 = string[30] // We would treat 30 as a limit.
// And maybe an enhanced version like:
type String3To5 = string[3..5]
// In the case of the example I'm using, I would do:
type ProductName = string[3..5] // Included in the domain file

Hopefully you would now agree that using these types would not be considered "business logic".

If F# would allow constraints in this form, it would be even more acceptable. The caveat being we end up manually defining more types than the proposal and not be as flexible. Using the proposal, for example, we can have a constraint that uses a regular expression to constrain a string to being in a valid email address format. Then again, maybe we could do something like type EmailAddress = string match "some regex".
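
For reference, a sketch of how the regular-expression case is typically written by hand today. The pattern is deliberately simplistic and is an assumption for illustration only, not a real email validator. `tryCreate` and `value` are hypothetical helper names.

```fsharp
open System.Text.RegularExpressions

type EmailAddress = private EmailAddress of string

module EmailAddress =
    // Deliberately naive pattern: something@something.something, no whitespace.
    let private pattern = Regex(@"^[^@\s]+@[^@\s]+\.[^@\s]+$")

    // Returns Some only when the input matches the pattern.
    let tryCreate (str: string) : EmailAddress option =
        if not (isNull str) && pattern.IsMatch str then Some (EmailAddress str) else None

    // Unwrap the validated address.
    let value (EmailAddress str) = str
```

A declaration such as type EmailAddress = string match "some regex" would fold the pattern into the type definition itself, which is the part this hand-written version cannot express.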

dsyme · 2022-06-16T17:32:35Z

I'll convert this to a discussion. It's a great discussion about validation techniques, but there's not a specific concrete proposal that is viable for the language, though one might emerge.

#516 is related btw

cartermp added area: syntax area: type-checking-and-inference labels Dec 5, 2020

dsyme removed the area: syntax label Jun 16, 2022

fsharp locked and limited conversation to collaborators Jun 16, 2022

dsyme converted this issue into discussion #1155 Jun 16, 2022
