Validated beginners doc #1903

Merged
merged 6 commits into from
Sep 26, 2017
Changes from 5 commits
351 changes: 345 additions & 6 deletions docs/src/main/tut/datatypes/validated.md
@@ -12,7 +12,351 @@ Response comes back saying your username can't have dashes in it, so you make so
have special characters either. Change, resubmit. Passwords need to have at least one capital letter. Change,
resubmit. Password needs to have at least one number.

Or perhaps you're reading from a configuration file. One could imagine the configuration library you're using returns
It would be nice to have all of these errors be reported simultaneously. That the username can't have dashes can
be validated separately from it not having special characters, as well as from the password needing to have certain
requirements. A misspelled (or missing) field in a config can be validated separately from another field not being
well-formed.

Enter `Validated`.

## A first approach

You'll notice right away that `Validated` is very similar to `Either`: it also has two possible values, errors on the left side or successful computations on the right side.

The signature of the structure is as follows:

```scala
sealed abstract class Validated[+E, +A] extends Product with Serializable {
// Implementation elided
}
```

And its two subtypes, referred to here as _projections_:

```scala
final case class Valid[+A](a: A) extends Validated[Nothing, A]
final case class Invalid[+E](e: E) extends Validated[E, Nothing]
```
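
As a minimal sketch of how these two cases are built and consumed (assuming cats is on the classpath; `ValidatedShapeSketch`, `positive` and `describe` are hypothetical names used only for illustration):

```scala
import cats.data.Validated
import cats.data.Validated.{Invalid, Valid}

object ValidatedShapeSketch {
  // Hypothetical helper, not part of cats: accepts only positive numbers.
  def positive(n: Int): Validated[String, Int] =
    if (n > 0) Valid(n) else Invalid(s"$n is not positive")

  // Pattern matching covers both cases, just as it would with Left and Right.
  def describe(v: Validated[String, Int]): String = v match {
    case Valid(a)   => s"valid: $a"
    case Invalid(e) => s"invalid: $e"
  }
}
```

Because `Validated` is covariant in both type parameters, a `Valid[Int]` can be returned where a `Validated[String, Int]` is expected.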

Before diving into `Validated`, let's take a look at an `Either`-based first approach to our validation needs.

Our data will be represented this way:

```tut:silent
case class RegistrationData(username: String, password: String, firstName: String, lastName: String, age: Int)
Collaborator

Maybe add final here. Probably good to show best practices in the documentation?

```

And our error model:

```tut:silent
sealed trait DomainValidation {
  def errorMessage: String
}

case object UsernameHasSpecialCharacters extends DomainValidation {
  def errorMessage: String = "Username cannot contain special characters."
}

case object PasswordDoesNotMeetCriteria extends DomainValidation {
  def errorMessage: String = "Password must be at least 10 characters long, including an uppercase and a lowercase letter, one number and one special character."
}

case object FirstNameHasSpecialCharacters extends DomainValidation {
  def errorMessage: String = "First name cannot contain spaces, numbers or special characters."
}

case object LastNameHasSpecialCharacters extends DomainValidation {
  def errorMessage: String = "Last name cannot contain spaces, numbers or special characters."
}

case object AgeIsInvalid extends DomainValidation {
  def errorMessage: String = "You must be aged 18 and not older than 75 to use our services."
}
```

We have our `RegistrationData` case class that will hold the information the user has submitted, alongside the error model that we'll use to display the possible errors for each field. Now, let's explore the proposed implementation:

```tut:silent
import cats.syntax.either._

sealed trait FormValidator {
  private def validateUserName(userName: String): Either[DomainValidation, String] =
    if (userName.matches("^[a-zA-Z0-9]+$")) Right(userName) else Left(UsernameHasSpecialCharacters)
Collaborator

You could use Either.cond instead of if (...) Right(...) else Left(...).

Contributor Author

This is new for me! I'll change it in all the conditions :)


  private def validatePassword(password: String): Either[DomainValidation, String] =
    if (password.matches("(?=^.{10,}$)((?=.*\\d)|(?=.*\\W+))(?![.\\n])(?=.*[A-Z])(?=.*[a-z]).*$")) Right(password)
    else Left(PasswordDoesNotMeetCriteria)

  private def validateFirstName(firstName: String): Either[DomainValidation, String] =
    if (firstName.matches("^[a-zA-Z]+$")) Right(firstName) else Left(FirstNameHasSpecialCharacters)

  private def validateLastName(lastName: String): Either[DomainValidation, String] =
    if (lastName.matches("^[a-zA-Z]+$")) Right(lastName) else Left(LastNameHasSpecialCharacters)

  private def validateAge(age: Int): Either[DomainValidation, Int] =
    if (age >= 18 && age <= 75) Right(age) else Left(AgeIsInvalid)

  def validateForm(username: String, password: String, firstName: String, lastName: String, age: Int): Either[DomainValidation, RegistrationData] = {

    for {
      validatedUserName <- validateUserName(username)
      validatedPassword <- validatePassword(password)
      validatedFirstName <- validateFirstName(firstName)
      validatedLastName <- validateLastName(lastName)
      validatedAge <- validateAge(age)
    }
    yield RegistrationData(validatedUserName, validatedPassword, validatedFirstName, validatedLastName, validatedAge)
  }

}

object FormValidator extends FormValidator

```

The logic of the validation process is as follows: **check every individual field based on the established rules for each one of them. If the validation is successful, then return the field wrapped in a `Right` instance; if not, then return a `DomainValidation` with the respective message, wrapped in a `Left` instance**.
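
The same rule can be written more compactly with the standard library's `Either.cond`, as suggested in review. The following is only a sketch: `EitherCondSketch` is a hypothetical wrapper, and the error model is trimmed to a single case so the block stands on its own.

```scala
object EitherCondSketch {
  sealed trait DomainValidation {
    def errorMessage: String
  }

  case object AgeIsInvalid extends DomainValidation {
    def errorMessage: String = "You must be aged 18 and not older than 75 to use our services."
  }

  // Either.cond(test, right, left) returns Right(right) when the test holds
  // and Left(left) otherwise.
  def validateAge(age: Int): Either[DomainValidation, Int] =
    Either.cond(age >= 18 && age <= 75, age, AgeIsInvalid)
}
```

Both branches are passed by name, so only the one that is needed gets evaluated.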

Our service has the `validateForm` method for checking all the fields and, if the process succeeds, it will create an instance of `RegistrationData`, right?

Well, yes, but the error reporting part will have the downside of showing only the first error.

Let's look at this part in detail:

```tut:silent:fail
for {
  validatedUserName <- validateUserName(username)
  validatedPassword <- validatePassword(password)
  validatedFirstName <- validateFirstName(firstName)
  validatedLastName <- validateLastName(lastName)
  validatedAge <- validateAge(age)
}
yield RegistrationData(validatedUserName, validatedPassword, validatedFirstName, validatedLastName, validatedAge)
```

A for-comprehension is _fail-fast_. If any of the evaluations in the `for` block fails, the remaining ones are skipped and the `yield` expression is never evaluated. In our case, that means we only ever get the first error, not the accumulated list of errors.

If we run our code:

```tut:book
FormValidator.validateForm(
  username = "fakeUs3rname",
  password = "password",
  firstName = "John",
  lastName = "Doe",
  age = 15
)
```

Only the password error is reported. We should also have gotten another `DomainValidation` object denoting the invalid age.
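
To see why only the first error survives, it helps to look at what a for-comprehension is rewritten to. The sketch below uses plain `String` errors and a hypothetical `check` helper, and assumes Scala 2.12 or later, where `Either` is right-biased:

```scala
object FailFastSketch {
  // Hypothetical stand-in for a field validator.
  def check(label: String, ok: Boolean): Either[String, String] =
    if (ok) Right(label) else Left(s"$label is invalid")

  // The for-comprehension form.
  def viaFor(passwordOk: Boolean, ageOk: Boolean): Either[String, (String, String)] =
    for {
      p <- check("password", passwordOk)
      a <- check("age", ageOk)
    } yield (p, a)

  // Roughly what the compiler turns it into: nested flatMap/map calls.
  // If the first check is a Left, the function passed to flatMap never runs,
  // so the age check is not even evaluated.
  def viaFlatMap(passwordOk: Boolean, ageOk: Boolean): Either[String, (String, String)] =
    check("password", passwordOk).flatMap { p =>
      check("age", ageOk).map { a => (p, a) }
    }
}
```

With both checks failing, either version returns only the password error, because the second step depends on the result of the first.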

### An iteration with `Validated`

Time to do some refactoring! We're going to try a `Validated` approach:

```tut:silent
import cats.data._
import cats.data.Validated._
import cats.implicits._

def validateUserName(userName: String): Validated[DomainValidation, String] = {
  if (userName.matches("^[a-zA-Z0-9]+$")) Valid(userName) else Invalid(UsernameHasSpecialCharacters)
}
Collaborator

Maybe reuse the Either methods here and call toValidated ?

Contributor Author

Good suggestion. I'll take a look at it!


def validatePassword(password: String): Validated[DomainValidation, String] = {
  if (password.matches("(?=^.{10,}$)((?=.*\\d)|(?=.*\\W+))(?![.\\n])(?=.*[A-Z])(?=.*[a-z]).*$")) Valid(password)
  else Invalid(PasswordDoesNotMeetCriteria)
}

def validateFirstName(firstName: String): Validated[DomainValidation, String] = {
  if (firstName.matches("^[a-zA-Z]+$")) Valid(firstName) else Invalid(FirstNameHasSpecialCharacters)
}

def validateLastName(lastName: String): Validated[DomainValidation, String] = {
  if (lastName.matches("^[a-zA-Z]+$")) Valid(lastName) else Invalid(LastNameHasSpecialCharacters)
}

def validateAge(age: Int): Validated[DomainValidation, Int] = {
  if (age >= 18 && age <= 75) Valid(age) else Invalid(AgeIsInvalid)
}
```
```tut:book:fail
def validateForm(username: String, password: String, firstName: String, lastName: String, age: Int): Validated[DomainValidation, RegistrationData] = {
  for {
    validatedUserName <- validateUserName(username)
    validatedPassword <- validatePassword(password)
    validatedFirstName <- validateFirstName(firstName)
    validatedLastName <- validateLastName(lastName)
    validatedAge <- validateAge(age)
  }
  yield RegistrationData(validatedUserName, validatedPassword, validatedFirstName, validatedLastName, validatedAge)
}
```

Looks similar to the first version. What we've done here is use `Validated` instead of `Either`. Please note that our `Right` is now a `Valid` and `Left` is an `Invalid`.
Remember, our goal is to get all the validation errors so we can display them to the user.

But this approach won't compile, as you can see in the previous snippet. Why?

Without diving into details about monads, a for-comprehension uses the `flatMap` method for composition. Monads like `Either` can be composed in that way, but the thing with `Validated` is that it isn't a monad, but an [_Applicative Functor_](../typeclasses/applicativetraverse.html).
That's why you see the message: `error: value flatMap is not a member of cats.data.Validated[DomainValidation,String]`.

So, what do we do here?

### Meeting applicative

We have to look in another direction: a for-comprehension works well in a fail-fast scenario, but the structure in our previous example was designed to catch one error at a time. Our next step is to tweak the implementation a bit.
Member

Maybe we should mention somewhere here, that in order to get the Applicative instance for Validated the left side needs to have a Semigroup instance :)

Contributor Author

Working on it! :)


```tut:silent
import cats.data._
import cats.data.Validated._
import cats.implicits._
Collaborator

You don't need to import these again.


sealed trait FormValidatorNel {

  type ValidationResult[A] = Validated[NonEmptyList[DomainValidation], A]
Collaborator

Since you are using validNel and invalidNel below, I think you can use ValidatedNel here as well.


  private def validateUserName(userName: String): ValidationResult[String] =
    if (userName.matches("^[a-zA-Z0-9]+$")) userName.validNel else UsernameHasSpecialCharacters.invalidNel
Member

Maybe we could give a very quick heads-up on the use of .validNel and .invalidNel? :)

Member

Nevermind, I see you're doing it afterwards 😄


  private def validatePassword(password: String): ValidationResult[String] =
    if (password.matches("(?=^.{10,}$)((?=.*\\d)|(?=.*\\W+))(?![.\\n])(?=.*[A-Z])(?=.*[a-z]).*$")) password.validNel
    else PasswordDoesNotMeetCriteria.invalidNel

  private def validateFirstName(firstName: String): ValidationResult[String] =
    if (firstName.matches("^[a-zA-Z]+$")) firstName.validNel else FirstNameHasSpecialCharacters.invalidNel

  private def validateLastName(lastName: String): ValidationResult[String] =
    if (lastName.matches("^[a-zA-Z]+$")) lastName.validNel else LastNameHasSpecialCharacters.invalidNel

  private def validateAge(age: Int): ValidationResult[Int] =
    if (age >= 18 && age <= 75) age.validNel else AgeIsInvalid.invalidNel

  def validateForm(username: String, password: String, firstName: String, lastName: String, age: Int): ValidationResult[RegistrationData] = {
Member

Can you perhaps put all the validateN functions into another snippet that's checked by tut, and then only have the validateForm go without tut? I don't think you need the surrounding trait btw :)

Member

Meant to put this on the snippet above 😅

Contributor Author

Got it! Working on it and also on the support of <2.11.x Either for-comprehensions (Travis complained about flatMap on this versions).

Thank you for your patience with this! :)

Member

Just import cats.syntax.either._ to get that working :)

Thank you, for you help with the docs!

    (validateUserName(username),
    validatePassword(password),
    validateFirstName(firstName),
    validateLastName(lastName),
    validateAge(age)).mapN(RegistrationData)
  }

}

object FormValidatorNel extends FormValidatorNel
```

Let's see what changed here:

1. In this new implementation, we're using a `NonEmptyList`, a data structure that guarantees that at least one element will be present. If multiple errors arise, you'll get a list of `DomainValidation`.
2. We've declared the type alias `ValidationResult`, which conveniently expresses the return type of our validation.
3. The `.validNel` and `.invalidNel` combinators let us _lift_ the success or failure into their respective container (either a `Valid` or an `Invalid[NonEmptyList[A]]`).
4. The [applicative](../typeclasses/applicative.html) syntax `(a, b, c, ...).mapN(...)` gives us a way to apply the validation functions accumulatively and yield a product with either their successful results or the accumulated errors in the `NonEmptyList`. Then, we transform that product with `mapN` into a valid instance of `RegistrationData`. This works because `NonEmptyList` has a `Semigroup` instance, which the `Applicative` instance for `Validated` requires in order to combine errors.

**Deprecation notice:** since cats `1.0.0-MF` the cartesian syntax `|@|` for applicatives is deprecated. If you're using `0.9.0` or earlier, you can use the syntax: `(a |@| b |@| ...).map(...)`.

Note that, at the end, we expect to lift the result of the validation functions in a `RegistrationData` instance. If the process fails, we'll get our `NonEmptyList` detailing what went wrong.

For example:

```tut:book
FormValidatorNel.validateForm(
  username = "Joe",
  password = "Passw0r$1234",
  firstName = "John",
  lastName = "Doe",
  age = 21
)

FormValidatorNel.validateForm(
  username = "Joe%%%",
  password = "password",
  firstName = "John",
  lastName = "Doe",
  age = 21
)
```

Sweet success! Now you can take your validation process to the next level!
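
For reference, here is a reduced sketch of the same accumulation with just two fields and plain `String` errors. It assumes cats 1.x or later on the classpath (and, on Scala 2.12 and earlier, the `-Ypartial-unification` compiler flag); `MapNSketch`, `Person` and the validators are hypothetical names, not part of the form example above.

```scala
import cats.data.ValidatedNel
import cats.implicits._

object MapNSketch {
  final case class Person(name: String, age: Int)

  def validateName(name: String): ValidatedNel[String, String] =
    if (name.nonEmpty) name.validNel else "name is empty".invalidNel

  def validateAge(age: Int): ValidatedNel[String, Int] =
    if (age >= 18) age.validNel else "age is below 18".invalidNel

  // Both validators always run; their errors are combined left to right.
  def validatePerson(name: String, age: Int): ValidatedNel[String, Person] =
    (validateName(name), validateAge(age)).mapN(Person.apply)
}
```

Calling `MapNSketch.validatePerson("", 3)` yields an `Invalid` holding both messages, whereas the `Either` version would have stopped at the first one.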

### A short detour

Typically, you'll see `Validated` accompanied by a `NonEmptyList` when it comes to accumulation. However, you're not limited to that construction: you can define your own accumulative data structure.

To do this, you have to provide a `Semigroup` instance. `NonEmptyList` already has its own `Semigroup`. For those who don't know what a `Semigroup` is, let's take a brief look.
Member

This is really great, thank you! However, you didn't really include an example?

Contributor Author

It was more about how ap works for Invalid, but I can also provide a simple example of Semigroup. What do you think about giving the example first and then point to the actual behavior, as it is written now?


#### Accumulative Structures

According to [Wikipedia](https://en.wikipedia.org/wiki/Semigroup):
Member

Instead of linking to wikipedia, I think the link to cats documentation should be enough :)
Maybe we could just say " For those who don't know what a Semigroup is, you can find out more here."

What do you think? :)

Contributor Author

Sounds good! I'll fix it.


> A semigroup is an algebraic structure consisting of a set together with an associative binary operation.

You can find more about how `Semigroup` works in cats [here](../typeclasses/semigroup.html).
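
As a small illustration of that definition, along the lines suggested in review, the `Semigroup` instance for `NonEmptyList` combines two lists by concatenating them. This sketch assumes cats 1.x or later; `NelSemigroupSketch` is a hypothetical wrapper object.

```scala
import cats.data.NonEmptyList
import cats.implicits._

object NelSemigroupSketch {
  // |+| is syntax for Semigroup#combine; for NonEmptyList it concatenates.
  val combined: NonEmptyList[String] =
    NonEmptyList.one("error 1") |+| NonEmptyList.of("error 2", "error 3")
}
```

The result keeps the elements of the left operand first, followed by those of the right operand.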

Let's take a look at `ap` method of `Validated`:

```tut:silent:fail
/**
 * From Apply:
 * if both the function and this value are Valid, apply the function
 */
def ap[EE >: E, B](f: Validated[EE, A => B])(implicit EE: Semigroup[EE]): Validated[EE, B] =
  (this, f) match {
    // ...
    case (Invalid(e1), Invalid(e2)) => Invalid(EE.combine(e2, e1))
    // ...
  }
```

We've omitted the complete implementation and show only the case where two failures are combined. Note the `implicit EE: Semigroup[EE]` parameter and the usage of its `.combine` operation. `NonEmptyList` is a list-like structure that is guaranteed to hold at least one element, and its `Semigroup` instance concatenates two such lists, which is what lets the errors accumulate.
Collaborator

I find this section a bit confusing.

Maybe something like :

We've omitted the complete implementation and only show the part where two failures are combined using the combine method of the Semigroup instance of EE.

I don't think we need the part about NonEmptyList. Maybe you can add an example of NonEmptyList's Semigroup to the Accumulative Structures section ?

Something like :

NonEmptyList.one("error 1") |+| NonEmptyList("error 2", "error 3")

"error 1".invalidNel[Int] |+| "error 2".invalidNel
("error 1".invalidNel[Int], "error 2".invalidNel[Int]).mapN(_ + _)


As we've said before: if you need another data type for processing the failures, you can use it, providing an instance of a `Semigroup` with the `.combine` logic.
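
A minimal sketch of that idea, assuming cats 1.x or later: `Errors` is a hypothetical accumulator type invented for this example, and its `Semigroup` instance appends the messages of both sides.

```scala
import cats.Semigroup
import cats.data.Validated
import cats.implicits._

object CustomAccumulatorSketch {
  // Hypothetical error container; any type with a Semigroup would do.
  final case class Errors(messages: Vector[String])

  implicit val errorsSemigroup: Semigroup[Errors] = new Semigroup[Errors] {
    def combine(x: Errors, y: Errors): Errors = Errors(x.messages ++ y.messages)
  }

  type Result[A] = Validated[Errors, A]

  final case class FullName(first: String, last: String)

  def nonEmpty(field: String, value: String): Result[String] =
    if (value.nonEmpty) Validated.valid(value)
    else Validated.invalid(Errors(Vector(s"$field is empty")))

  // mapN finds the Applicative for Validated[Errors, *] through errorsSemigroup.
  def fullName(first: String, last: String): Result[FullName] =
    (nonEmpty("first name", first), nonEmpty("last name", last)).mapN(FullName.apply)
}
```

Without the implicit `Semigroup[Errors]` in scope, the `mapN` call would not compile, since there would be no way to merge two `Invalid` values.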

### Going back and forth

cats offers you a nice set of combinators to transform your `Validated`-based approach into an `Either` one and vice versa.
Please note that if you're using an `Either`-based approach as seen in our first example and you choose to convert its result to a `Validated`, you're still constrained by the fail-fast nature of `Either`, although you gain access to the combinators that `Validated` provides.
Collaborator

What do you mean with

you're gaining a broader set of features with Validated.

Contributor Author

This is the section in where I have doubts about it. I was trying to express that, coming from Either to Validated, the person will gain more combinators (Features), even if you transfer the Either behavior (fail-fast) to Validated.

With your previous comments, I have an idea for simplifying this section (or even delete it). Let me work on it and I'll reach you out again :)


#### From `Validated` to `Either`

To do this, use the `.toEither` combinator:

```tut:book
FormValidatorNel.validateForm(
  username = "Joe",
  password = "Passw0r$1234",
  firstName = "John",
  lastName = "Doe",
  age = 21
).toEither
```

#### From `Either` to `Validated`

To do this, you'll need to use either `.toValidated` or `.toValidatedNel`:

```tut:book
FormValidator.validateForm(
Member

I'm not sure how good this example really is, the fail-fast behaviour of Either is already done at this point, so converting to Validated does not somehow enable the Error accumulation we might want, as you probably know. :)
Maybe we could include an example that doesn't drop the error accumulation?

Contributor Author

@AlejandroME Sep 11, 2017

I was thinking the same. I've done this little example for making use of .invalid/.invalidNel but nothing, apart from this snippet came to my mind. I'll think about this and I'll reach you out again :)

  username = "MrJohnDoe$$",
  password = "password",
  firstName = "John",
  lastName = "Doe",
  age = 31
).toValidated

FormValidator.validateForm(
  username = "MrJohnDoe$$",
  password = "password",
  firstName = "John",
  lastName = "Doe",
  age = 31
).toValidatedNel
```

The difference between the previous examples is that `.toValidated` gives you an `Invalid` holding the single error in case of failure, whereas `.toValidatedNel` wraps that error in a `NonEmptyList`. Don't forget the caveat with `Either`-based approaches mentioned before: only the first failure is ever present.
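
One way to keep error accumulation, sketched here under the assumption of cats 1.x or later, is to convert each field-level `Either` separately and only then combine the results. The validators and the `EitherToValidatedSketch` object are hypothetical and use plain `String` errors for brevity.

```scala
import cats.data.ValidatedNel
import cats.implicits._

object EitherToValidatedSketch {
  def validateName(name: String): Either[String, String] =
    Either.cond(name.matches("^[a-zA-Z]+$"), name, "name has invalid characters")

  def validateAge(age: Int): Either[String, Int] =
    Either.cond(age >= 18 && age <= 75, age, "age is out of range")

  // Each Either is converted on its own, before any fail-fast composition
  // has had a chance to discard the later errors.
  def validate(name: String, age: Int): ValidatedNel[String, (String, Int)] = {
    val validatedName: ValidatedNel[String, String] = validateName(name).toValidatedNel
    val validatedAge: ValidatedNel[String, Int] = validateAge(age).toValidatedNel
    (validatedName, validatedAge).mapN((n, a) => (n, a))
  }
}
```

The conversion happens per field rather than on the final result, so both failures are still available when `mapN` combines them.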

## Another case

Perhaps you're reading from a configuration file. One could imagine the configuration library you're using returns
a `scala.util.Try`, or maybe a `scala.util.Either`. Your parsing may look something like:

```scala
@@ -25,12 +369,7 @@ for {
You run your program and it says key "url" not found, turns out the key was "endpoint". So you change your code
and re-run. Now it says the "port" key was not a well-formed integer.

It would be nice to have all of these errors be reported simultaneously. That the username can't have dashes can
be validated separately from it not having special characters, as well as from the password needing to have certain
requirements. A misspelled (or missing) field in a config can be validated separately from another field not being
well-formed.

Enter `Validated`.

## Parallel validation
Our goal is to report any and all errors across independent bits of data. For instance, when we ask for several