Serialization toolbox for Pekko messages, events and persistent state that helps achieve compile-time guarantees of serializability. It helps with the following Pekko serialization caveats:
- Missing serialization binding
- Incompatibility of persistent data
- Jackson Pekko Serializer
- Missing Codec registration
Add the following line to plugins.sbt (take Version from the above Maven badge or GitHub Releases):
addSbtPlugin("org.virtuslab.psh" % "sbt-pekko-serialization-helper" % Version)
and enable the sbt plugin in the target project:
lazy val app = (project in file("app"))
.enablePlugins(PekkoSerializationHelperPlugin)
To serialize a message, persistent state or event in Pekko, a Scala trait needs to be defined:
package org
trait MySer
Also, a serializer needs to be bound to this trait in a configuration file:
pekko.actor {
serializers {
jackson-json = "org.apache.pekko.serialization.jackson.JacksonJsonSerializer"
}
serialization-bindings {
"org.MySer" = jackson-json
}
}
The problem occurs if a class is not extended with the base trait bound to the serializer:
trait MySer
case class MyMessage() // extends MySer
pekko-serialization-helper to the rescue! It detects messages, events and persistent states, checks whether they extend the given base trait, and reports an error when they don't. This ensures that the specified serializer is used by Pekko and protects against an unintended fallback to Java serialization or outright serialization failure.
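The binding problem can be illustrated without Pekko at all. The following is a minimal sketch, assuming a simplified lookup that maps a base type to a serializer name; `BindingLookupSketch`, `bindings` and `serializerFor` are hypothetical names invented for this illustration and are not part of Pekko or of this plugin.

```scala
object BindingLookupSketch {
  // Marker trait that is bound to a serializer in the configuration.
  trait MySer

  final case class BoundMessage(text: String) extends MySer
  final case class UnboundMessage(text: String) // forgot `extends MySer`

  // Hypothetical stand-in for the `serialization-bindings` section:
  // base type -> serializer name.
  val bindings: Map[Class[_], String] = Map(classOf[MySer] -> "jackson-json")

  // Returns the serializer bound to a supertype of the message, if any.
  def serializerFor(message: AnyRef): Option[String] =
    bindings.collectFirst {
      case (boundType, name) if boundType.isAssignableFrom(message.getClass) => name
    }
}
```

In this sketch, `serializerFor(BoundMessage("hi"))` finds the `jackson-json` binding, while `serializerFor(UnboundMessage("hi"))` finds nothing. In a real application that second case is where the fallback to Java serialization, or a failure, would happen at runtime.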
To use it, annotate the base trait with @org.virtuslab.psh.SerializabilityTrait:
@SerializabilityTrait
trait MySerializable
It allows catching errors like these:
import org.apache.pekko.actor.typed.Behavior
object BehaviorTest {
sealed trait Command //extends MySerializable
def method(msg: Command): Behavior[Command] = ???
}
This results in a compile error, preventing code that is unsafe at runtime from being executed:
test0.scala:7: error: org.random.project.BehaviorTest.Command is used as Pekko message
but does not extend a trait annotated with org.virtuslab.psh.annotation.SerializabilityTrait.
Passing an object of a class that does NOT extend a trait annotated with SerializabilityTrait as a message may cause Pekko to
fall back to Java serialization during runtime.
def method(msg: Command): Behavior[Command] = ???
^
test0.scala:6: error: Make sure this type is itself annotated, or extends a type annotated
with @org.virtuslab.psh.annotation.SerializabilityTrait.
sealed trait Command extends MySerializable
^
The compiler plugin only checks the classes in the sbt modules where PekkoSerializationHelperPlugin is explicitly enabled.
It may happen that the base trait (like MySerializable in the example) lives in an sbt module like core where the plugin should not be enabled (e.g. for compilation performance reasons).
However, MySerializable needs to be annotated with org.virtuslab.psh.SerializabilityTrait.
In order to have access to the SerializabilityTrait annotation without enabling the entire suite of compiler plugins, add PekkoSerializationHelperPlugin.annotation to libraryDependencies:
import org.virtuslab.psh.PekkoSerializationHelperPlugin
lazy val core = (project in file("core"))
.settings(libraryDependencies += PekkoSerializationHelperPlugin.annotation)
A typical problem with persistence is that already persisted data is not compatible with the schemas defined in a new version of the application.
To solve this, a mix of a compiler plugin and an sbt task can be used for dumping the schema of pekko-persistence to a file. The dump can be used for detecting accidental changes of events (journal) and states (snapshots) with a simple diff.
To dump the persistence schema for each sbt module where PekkoSerializationHelperPlugin is enabled, run:
sbt ashDumpPersistenceSchema
The default file is target/<sbt-module-name>-dump-persistence-schema-<version>.yaml (target/ of the top-level module!) but it can be changed using sbt keys:
ashDumpPersistenceSchemaOutputFilename := "file.yaml" // Changes filename
ashDumpPersistenceSchemaOutputDirectoryPath := "~" // Changes directory
Example dump:
- name: org.random.project.Data
  typeSymbol: trait
- name: org.random.project.Data.ClassTest
  typeSymbol: class
  fields:
  - name: a
    typeName: java.lang.String
  - name: b
    typeName: scala.Int
  - name: c
    typeName: scala.Double
  parents:
  - org.random.project.Data
- name: org.random.project.Data.ClassWithAdditionData
  typeSymbol: class
  fields:
  - name: ad
    typeName: org.random.project.Data.AdditionalData
  parents:
  - org.random.project.Data
A diff command can be used to check the difference between the version of a schema from the develop/main branch and the version from the current commit.
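The same comparison can also be scripted, for instance as part of a CI check. The following is a minimal sketch, assuming both dumps have already been read into strings; `SchemaDiffSketch`, `DumpDiff` and `diffDumps` are hypothetical names for this illustration, and the comparison is a plain line-based one rather than anything the plugin itself provides.

```scala
object SchemaDiffSketch {
  // Lines present only in the old dump and lines present only in the new dump.
  final case class DumpDiff(removed: Vector[String], added: Vector[String]) {
    def isEmpty: Boolean = removed.isEmpty && added.isEmpty
  }

  // Compares two schema dumps line by line, ignoring indentation and blank lines.
  def diffDumps(oldDump: String, newDump: String): DumpDiff = {
    val oldLines = oldDump.linesIterator.map(_.trim).filter(_.nonEmpty).toVector
    val newLines = newDump.linesIterator.map(_.trim).filter(_.nonEmpty).toVector
    DumpDiff(removed = oldLines.diff(newLines), added = newLines.diff(oldLines))
  }
}
```

A non-empty `DumpDiff` signals that a persisted event or state changed shape, for example a field whose `typeName` went from `scala.Int` to `scala.Long`, and that the change deserves a review before release.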
Using the Jackson serializer for pekko-persistence is also one of the pitfalls, and this plugin provides an alternative: a serializer that uses Circe.
Dangerous code for Jackson:
case class Message(animal: Animal) extends MySer
sealed trait Animal
final case class Lion(name: String) extends Animal
final case class Tiger(name: String) extends Animal
To make this code work, a lot of Jackson annotations should be added:
case class Message(animal: Animal) extends MySer
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "type")
@JsonSubTypes(
Array(
new JsonSubTypes.Type(value = classOf[Lion], name = "lion"),
new JsonSubTypes.Type(value = classOf[Tiger], name = "tiger")))
sealed trait Animal
final case class Lion(name: String) extends Animal
final case class Tiger(name: String) extends Animal
Also if an object is defined:
case object Tick
There will be no exceptions during serialization, but on deserialization Jackson will create another instance of Tick instead of restoring the object Tick's underlying singleton.
actorRef ! Tick
// Inside the actor:
def receive = {
case Tick => // this won't get matched
} // message will be unhandled
A Circe-based Pekko serializer can be used instead. It uses Circe codecs, derived using Shapeless, that are generated at compile time (so the serializer won't crash at runtime as reflection-based serializers may). For a comparison of Circe with other serializers, read Appendix A.
Note that it is not obligatory to use this serializer for the other features (serializability checker, persistence schema dump) to work. They work equally well when, for example, the Jackson serializer is selected.
Add the following to project dependencies:
import org.virtuslab.psh.PekkoSerializationHelperPlugin
lazy val app = (project in file("app"))
// ...
.settings(libraryDependencies += PekkoSerializationHelperPlugin.circePekkoSerializer)
Note that enabling PekkoSerializationHelperPlugin does not add the Circe serializer to the classpath automatically, as this sbt plugin can be used to improve safety of other serializers as well.
Create a custom serializer by extending the CircePekkoSerializer base class:
import org.apache.pekko.actor.ExtendedActorSystem
import org.virtuslab.psh.circe.CircePekkoSerializer
import org.virtuslab.psh.circe.Register

class ExampleSerializer(actorSystem: ExtendedActorSystem)
  extends CircePekkoSerializer[MySerializable](actorSystem) {
  // Unique serializer identifier, as required by Pekko serialization
  override def identifier: Int = 41
  override lazy val codecs = Seq(Register[CommandOne], Register[CommandTwo])
  override lazy val manifestMigrations = Nil
  override lazy val packagePrefix = "org.project"
}
CircePekkoSerializer can be configured to use Gzip compression when serializing payloads greater than a defined size (by default, compression is off).
See the default reference.conf file with comments for more details about CircePekkoSerializer configuration.
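The idea of size-based compression can be sketched with the JDK alone. The code below is an illustration of the technique under stated assumptions, not the implementation used by CircePekkoSerializer: `ThresholdCompressionSketch`, `maybeCompress`, `isGzipped` and the `thresholdBytes` parameter are hypothetical names chosen for this example.

```scala
import java.io.ByteArrayOutputStream
import java.util.zip.GZIPOutputStream

object ThresholdCompressionSketch {
  // Gzips the payload only when it is larger than the threshold;
  // smaller payloads are returned unchanged.
  def maybeCompress(payload: Array[Byte], thresholdBytes: Int): Array[Byte] =
    if (payload.length <= thresholdBytes) payload
    else {
      val buffer = new ByteArrayOutputStream()
      val gzip = new GZIPOutputStream(buffer)
      try gzip.write(payload)
      finally gzip.close() // closing flushes the Gzip trailer into the buffer
      buffer.toByteArray
    }

  // A reader can tell the two cases apart by the Gzip magic bytes 0x1f 0x8b.
  def isGzipped(bytes: Array[Byte]): Boolean =
    bytes.length >= 2 && bytes(0) == 0x1f.toByte && bytes(1) == 0x8b.toByte
}
```

Skipping compression for small payloads avoids paying the Gzip header and CPU cost where there is little to gain, which is the usual motivation for a size threshold.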
For more guidelines on how to use the serializer, read the Pekko documentation about serialization and the CircePekkoSerializer Scaladoc, and look at the examples.
If a codec is not registered, a runtime exception will occur.
import org.virtuslab.psh.circe.CircePekkoSerializer
import org.virtuslab.psh.circe.Register
class ExampleSerializer(actorSystem: ExtendedActorSystem)
extends CircePekkoSerializer[MySerializable](actorSystem) {
// ...
override lazy val codecs = Seq(Register[CommandOne]) // WHOOPS someone forgot to register CommandTwo...
}
java.lang.RuntimeException: Serialization of [CommandTwo] failed. Call Register[A]
for this class or its supertype and append result to `def codecs`.
To solve that, the annotation @org.virtuslab.psh.Serializer can be used.
During compilation, the plugin gathers all direct descendants of the class marked with @org.virtuslab.psh.SerializabilityTrait and checks whether the bodies of classes annotated with @org.virtuslab.psh.Serializer reference all these direct descendants in any way.
In practice, this is used for checking a class extending CircePekkoSerializer, like this:
import org.virtuslab.psh.circe.CircePekkoSerializer
import org.virtuslab.psh.circe.Register
@Serializer(
classOf[MySerializable],
typeRegexPattern = Register.REGISTRATION_REGEX)
class ExampleSerializer(actorSystem: ExtendedActorSystem)
extends CircePekkoSerializer[MySerializable](actorSystem) {
// ...
override lazy val codecs = Seq(Register[CommandOne]) // WHOOPS someone forgot to register CommandTwo...
// ... but Codec Registration Checker will report a compilation error here:
// `No codec for `CommandTwo` is registered in a class annotated with @org.virtuslab.psh.annotation.Serializer`
}
Note that, as with Serializability Checker and Dump Persistence Schema, this compiler plugin only runs in the sbt modules where PekkoSerializationHelperPlugin is explicitly enabled.
For more information, read the @Serializer Scaladoc.
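Conceptually, the registration check boils down to a set difference between the direct descendants of the annotated trait and the types referenced in the serializer body. The following is a minimal sketch of that idea only; `RegistrationCheckSketch`, `check` and its message format are hypothetical, and the real compiler plugin works on compiler symbols rather than on strings.

```scala
object RegistrationCheckSketch {
  // directDescendants: names of types directly extending the annotated base trait.
  // referencedInSerializer: names of types mentioned in the serializer body.
  def check(
      directDescendants: Set[String],
      referencedInSerializer: Set[String]): Either[String, Unit] = {
    val missing = directDescendants.diff(referencedInSerializer)
    if (missing.isEmpty) Right(())
    else Left(s"No codec registered for: ${missing.toSeq.sorted.mkString(", ")}")
  }
}
```

With `CommandOne` and `CommandTwo` as descendants and only `CommandOne` referenced, the sketch reports `CommandTwo` as missing, which mirrors the situation the compile error is meant to catch.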
All compiler plugins and their verbose modes can be enabled/disabled using two sbt keys:
ashCompilerPluginEnable := false // default is true
ashCompilerPluginVerbose := true // default is false
This can be done for all compiler plugins, like above, or just one:
ashCodecRegistrationCheckerCompilerPlugin / ashCompilerPluginEnable := false
ashDumpPersistenceSchemaCompilerPlugin / ashCompilerPluginVerbose := true
Additionally, the Compile and Test scopes can be specified:
Compile / ashDumpPersistenceSchemaCompilerPlugin / ashCompilerPluginVerbose := true
Test / ashCompilerPluginEnable := false
For the full list of sbt keys, check org.virtuslab.psh.PekkoSerializationHelperKeys.
The simplest example is the pekko-cluster application which uses Pekko Serialization Helper: pekko-cluster-app.
The second example is the pekko-persistence application which shows usage of the Dump Persistence Schema Compiler Plugin: pekko-persistence-app.
See the full step-by-step guide on Pekko Serialization Helper usage.
If you want to contribute to this project, see the Contributing Guide.
Appendix A: comparison of available Pekko serializers
Serializer | Jackson | Circe | Protobuf v3 | Avro | Borer | Kryo |
---|---|---|---|---|---|---|
Data formats | JSON or CBOR | JSON | JSON or custom binary | JSON or custom binary | JSON or CBOR | custom binary |
Scala support | very poor, even with jackson-module-scala | perfect out of the box | perfect with ScalaPB | perfect with Avro4s | perfect out of the box | perfect out of the box |
Pekko support | pekko-serialization-jackson | serializer provided by this project | used by pekko-remote internally | requires custom serializer | requires custom serializer | pekko-kryo |
Compile-time mechanics | nothing happens at compile time; everything is based on runtime reflection | derives codecs via Shapeless | with ScalaPB, generates Scala classes based on *.proto files | with Avro4s, derives Avro schemas using Magnolia | derives codecs without Magnolia | with pekko-kryo, optionally derives codecs at compile time, but otherwise uses reflection at runtime |
Runtime safety | none, uses reflection | encoders and decoders are created during compilation | *.proto files are validated before compilation | Avro schema is created during compilation | encoders and decoders are created during compilation | depends on whether codecs were derived at compile time (then standard for Scala code), or not (then none) |
Boilerplate | a lot | every top-level sealed trait must be registered manually - but see Codec Registration Checker | in case of custom types, a second layer of models is needed | sometimes requires annotations | every top-level sealed trait must be registered manually; every transitively included class must have an explicitly defined codec | every top-level sealed trait must be registered manually |
Schema evolution | JacksonMigration | | | | | |
Appendix B: what happens with serialization of Messages / Events / States that do not extend a base trait bound to a serializer
At runtime, if a given Message/Event/State class does not extend a base trait bound to a serializer, the following problems will occur (depending on Pekko version and settings):
Note: the default setting for Pekko 2.5 is pekko.actor.allow-java-serialization=on, whereas pekko.actor.allow-java-serialization=off is the default setting for Pekko 2.6.
Pekko version | pekko.actor.allow-java-serialization=off | pekko.actor.allow-java-serialization=on |
---|---|---|
2.5 | Serialization will fail with logs like below: [WARN] [...] Outgoing message attempted to use Java Serialization even though `pekko.actor.allow-java-serialization = off` was set! Message type was: [class sample.cluster.transformation.BackendRegistration$] [ERROR] [...] Failed to serialize remote message [class pekko.actor.ActorSelectionMessage] using serializer [class pekko.remote.serialization.MessageContainerSerializer]. Transient association error (association remains live) pekko.remote.MessageSerializer$SerializationException: Failed to serialize remote message [class pekko.actor.ActorSelectionMessage] using serializer [class pekko.remote.serialization.MessageContainerSerializer]. at org.apache.pekko.remote.MessageSerializer$.serialize(MessageSerializer.scala:67) at org.apache.pekko.remote.EndpointWriter.$anonfun$serializeMessage$1(Endpoint.scala:1021) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at org.apache.pekko.remote.EndpointWriter.serializeMessage(Endpoint.scala:1021) at org.apache.pekko.remote.EndpointWriter.writeSend(Endpoint.scala:887) at org.apache.pekko.remote.EndpointWriter$$anonfun$4.applyOrElse(Endpoint.scala:859) at org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:539) at org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:537) at org.apache.pekko.remote.EndpointActor.aroundReceive(Endpoint.scala:536) at org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:612) at org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:581) at org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:268) at org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:229) at org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:241) at org.apache.pekko.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at org.apache.pekko.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at org.apache.pekko.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at org.apache.pekko.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: pekko.serialization.DisabledJavaSerializer$JavaSerializationException: Attempted to serialize message using Java serialization while `pekko.actor.allow-java-serialization` was disabled. Check WARNING logs for more details. | Serialization succeeds - but using Java serialization (which is not a good choice for production). A warning log like below will appear on startup: [WARN] [...] Using the default Java serializer for class [sample.cluster.transformation.TransformationResult] which is not recommended because of performance implications. Use another serializer or disable this warning using the setting 'pekko.actor.warn-about-java-serializer-usage' |
2.6 | Serialization will fail with logs like below: [WARN] [...] - Outgoing message attempted to use Java Serialization even though `pekko.actor.allow-java-serialization = off` was set! Message type was: [class sample.cluster.transformation.Worker$TransformText] [ERROR] [...] - Failed to serialize message [sample.cluster.transformation.Worker$TransformText]. pekko.serialization.DisabledJavaSerializer$JavaSerializationException: Attempted to serialize message using Java serialization while `pekko.actor.allow-java-serialization` was disabled. Check WARNING logs for more details | Serialization succeeds - but using Java serialization (which is not a good choice for production). A warning log like below will appear on startup: [WARN] [...] Using the default Java serializer for class [sample.cluster.transformation.TransformationResult] which is not recommended because of performance implications. Use another serializer or disable this warning using the setting 'pekko.actor.warn-about-java-serializer-usage' |