Schemata
The VLINGO XOOM platform schema registry.
The XOOM Schemata component is a schema registry. It provides the means for Bounded Contexts, that is, services and applications built using VLINGO XOOM, to publish standard types that are made available to dependent Bounded Contexts as client services. The published standard types are known as schemas, and the registry hosts the schemas for given organizations and the services within them.
One of the big problems with exchanging information across Bounded Contexts, implemented as microservices and applications, is that changes to the information structure, data types, and attribute/property names cause incompatibilities for dependents. Experience proves that such changes are infrequently known to consuming dependents. When that is the case, essential integrations break down.
Another challenge is maintaining type safety of the exchanged information across Bounded Contexts when a consumer deserializes (or unmarshals) data from the exchange format into a consumable local object. Further, various Bounded Contexts may be implemented in a number of different programming languages, and the process of deserializing is tedious or impossible due to incompatibilities in formats between producer and consumer data types.
XOOM Schemata provides the means to eliminate such problems and minimize these challenges by maintaining semantic versioning across the range of standard types known as schemas. This tool also facilitates consumer build dependencies on the schema types according to each consumer's local programming language.
Published Language
XOOM Schemata supports what is known in DDD as the Published Language. If you have either of Vaughn Vernon's DDD books, you can read more about Published Language in those. Still, we provide a basic overview of its uses here.
A few very important points in conjunction with developing a Published Language are:

1. A Published Language should not be directly related to the internal domain model of your Bounded Context, and the internal domain model of your Bounded Context should not depend on the types defined in your Published Language. The types defined in your Published Language may be fundamentally the same or similar, but they are not the same things. Separate the two.
2. A Published Language is used for presenting API types and data in an open and well-documented way. These are used by clients to communicate with a Bounded Context, and for a Bounded Context to communicate with client services outside its boundary.
3. Your domain model is more closely tied to the concepts learned and discovered by your team that are related to its shared mental model and specific Ubiquitous Language. A Published Language is driven by the needs of Bounded Contexts (clients, or otherwise collaborating/integrating applications and services) outside your Bounded Context that need to use data to communicate and/or understand the outcomes that your Bounded Context produces. Your Published Language should be based on your Ubiquitous Language, but it may not (and often should not) share everything about its internal structure, typing, and data.
4. There is a Context Mapping strategic design pattern of DDD known as Conformist. Closely related to the previous point #3, the goal of Published Language is to prevent collaborating/integrating Bounded Contexts from being required to conform to your internal domain model and be affected by its ongoing changes. Instead, consumers would adhere to a common standard Published Language, or even translate from a standard Published Language into their own Ubiquitous Language.
If collaborators/integrators did conform directly to your Ubiquitous Language, every change in your domain model would ripple into their external Bounded Contexts, having negative maintenance impacts. There are times when being a Conformist can be advantageous. It requires less conceptual design to adhere to another model, but with less flexibility in your Context. In such cases, data types used by conforming collaborators/integrators can likewise be defined inside XOOM Schemata. Even so, here focus is given to the more inviting and flexible Published Language.
One exception to the strong suggestion that your domain model not consume types from your Published Language may be your Domain Events. It may make sense to use these in your domain model because it can reduce the amount of mapping between Domain Events defined in your domain model and those used for persistence and messaging, for example. Still, sharing types could be problematic, and good judgment should be used in deciding whether or not to do so. This is generally a tradeoff between the development overhead of maintaining separate types and mapping them, and the runtime overhead of mapping between types and the memory-management garbage that this produces.
Use Cases
The following provides some typical use cases that are supported by the Published Language of a given Bounded Context. There may be concepts and schema structuring that are unfamiliar, but these will be explained soon. It's most important that you now understand why XOOM Schemata exists and why it is used.
1. A client sends a `Command` request to a Bounded Context. The client must communicate that request using types and data that the Bounded Context understands. The Bounded Context defines a schema inside XOOM Schemata such as `Org.Unit.Context.Commands.DoSomethingForMe`. That `Command` has some data structure, such as for the REST HTTP request body payload, or for a message payload if using messaging.
2. A client sends a `Query` request to a Bounded Context. For example, a client sends a `GET` request using REST over HTTP. The Bounded Context must respond with a result, and the `200 OK` response body definition is defined as a `Document`. That `Document` result is defined in XOOM Schemata and may have a name in the following format: `Org.Unit.Context.Documents.TypeThatWasQueried`.
3. After use case #1 above completes, the Bounded Context emits a `DomainEvent`. The type and data of that outgoing `DomainEvent` is defined in XOOM Schemata and may have a name in the following format: `Org.Unit.Context.Events.SomethingCompleted`.
4. It is possible that any one of #1, #2, and/or #3 uses additional complex data types within its definition. These additional complex data types would be defined by the Bounded Context under `Org.Unit.Context.Data`, perhaps as `Org.Unit.Context.Data.SomethingDataType`.
5. It is possible (even likely) that any one of #1, #2, and/or #3, if based on messaging (or possibly even REST), will define one or more types within `Org.Unit.Context.Envelope`, such as `Org.Unit.Context.Envelope.Notification`. Such an `Envelope` type "wraps" a `Command`, a `Document`, and/or a `DomainEvent`, and is used to communicate metadata about the incoming `Command` or the resulting `Document` and published `DomainEvent`.
6. The publisher of a schema may need to change it. Clients/consumers could then fail to correctly consume the exchanged information because they are uninformed about the changes. To protect against such failures, XOOM Schemata manages schema versions by disallowing breaking changes based on semantic versioning. All breaking changes must be made only under the next highest major version; minor and patch versions permit only nonbreaking changes.
7. The teams working in a given Bounded Context are free to choose whichever programming language and technology stack they deem best for their work. When any Bounded Context must consume information from any other, each could use a different language and set of technologies. The publisher of a schema and any consumers must be able to exchange the information in a type-safe manner even when languages and technologies differ. XOOM Schemata facilitates generating consumer-side types based on a language-neutral format.
The functionality that accommodates these use cases is described next.
Concepts and Design
The XOOM Schemata presents the following basic logical interface and hierarchy:
From the top of the hierarchy the nodes are defined as follows.
Organization: The top-level division. This may be the name of a company or the name of a prominent business division within a company. If there is only one company using this registry then the Organization could be a major division within the implied company. There may be any number of Organizations defined, but there must be at least one.
Unit: The second-level division. This may be the name of a business division within the Organization, or, if the Organization is itself a business division, then the Unit may be a department or team within it. Note that there is no restrictive limit on the name of the Unit, so it may contain dot notation in order to provide additional organizational levels. In an attempt to maintain simplicity we don't provide nested Unit types, because the Units themselves can become obsolete with corporate and team reorganizations. It's best to name a Unit according to some non-changing business function rather than physical departments.
Context: The logical application or (micro)service within which schemas are to be defined and for which the schemas are published to potential consumers. You may think of this as the name of the Bounded Context, and it may even be appropriate to name it after the top-level namespace used by the Context, e.g. `com.saasovation.agilepm`. Within each Context there may be a number of category types used to describe its Published Language served by its Open-Host Service. Currently these include: Commands, Data, Documents, Envelopes, and Events. Some of these categories are building blocks meant to help define the others; the rest form the highest level of the Published Language. These are called out in the following definitions.
Commands: This is a top-level schema type where Command operations, such as those supporting CQRS, are defined by schemas. If the Context's Open-Host Service is REST-based, these define the payload schema submitted as the HTTP request body of `POST`, `PATCH`, and `PUT` methods. If the Open-Host Service is an asynchronous-message-based mechanism (e.g. RabbitMQ or Kafka), these define the payload of Command messages sent through the messaging mechanism.
Data: This is a building-block schema type where general-purpose data records, structures, or objects are defined, and that may be used inside any of the other schema types (e.g. type `Token`). You may also place metadata types here (e.g. type `Metadata` or, more specifically, type `CauseMetadata`).
Documents: This is a top-level schema type that defines the full payload of document-based operations, such as the query results of CQRS queries. These documents are suitable for use as REST response bodies and messaging mechanism payloads.
Envelopes: This is a building-block schema type meant to define the small number of message envelopes that wrap message-based schemas. When sending any kind of message, such as Command messages and Event messages, it is common to wrap these in an Envelope that defines some high-level metadata about the messages being sent by a sender and received by a receiver.
Events: This is a top-level schema type that conveys the facts about happenings within the Context that are important for other Contexts to consume. These are known as Domain Events but may also be named Business Events. The reason for the distinction is that some viewpoints consider Domain Events to be internal-only events; that is, events only of interest to the owning Context. Those holding that viewpoint think of events of interest outside the owning Context as Business Events. To avoid any confusion, the term Event is used for this schema type, and it may be used to define any event that is of interest inside the owning Context, outside it, or both.
Schema: Under every top-level schema category (or type, such as Commands and Events) are any number of Schema definitions. Besides a category, a Schema has a name and description. Every Schema has at least one Schema Version, which holds the actual Specification for each version of the Schema. Thus, the Schema itself is a container for an ordered collection of Schema Versions that each have a Specification.
Schema Version: Every Schema has at least one Schema Version, and may have several versions. A Schema Version holds the Specification of a particular version of the Schema, and also holds a Description, a Semantic Version number, and a Status. The Description is a textual/prose description of the purpose of the Schema Version.
Specification: A Schema Version's Specification is a textual external DSL (code block) that declares the data types and shape of the Schema at a given version. Any new version's Specification must be backward compatible with previous versions of the given Schema if the new version falls within the same major version. The DSL is shown in detail below.
Semantic Version: A semantic version is a three-part version, with a major, minor, and patch value, each part separated by a dot (decimal point), for example `1.2.3`. Here `1` is the major version, `2` is the minor version, and `3` is the patch version. If any two Schema Versions share the same major version, then it is required that their Specifications be compatible with each other. Thus, the newer version, such as `1.2.0`, must be compatible with the Specification of `1.1.3`, and `1.1.3` must be compatible with `1.2.0`. On the other hand, version `2.0.0` may be incompatible with version `1.2.0`, using the change in major version to make necessary breaking changes. When `2.0.0` becomes the published production version, all dependents must have upgraded to safely consume it.
Status: The Schema Version Status has four possible values: `Draft`, `Published`, `Deprecated`, and `Removed`. `Draft` is the initial status and means that the Specification is unofficial and may change. Dependents may still use a `Draft` status Schema Version for test purposes, but with the understanding that the Specification may change at any time. When a Schema Version is considered production-ready, its status is upgraded to `Published`. Marking a Schema Version as `Published` is performed manually by the Context team after it has satisfied its team's and consumers' dependency requirements. When your team decides to transition from one Schema Version to another, you might want to mark the old version as `Deprecated`, which produces a console warning when that version is consumed. If, for some reason, it is necessary to forever remove a Schema Version, it can be marked as `Removed`. It may then still be viewed but not used. It can only be "restored" by defining a new Schema Version with its specification, with the understanding that it may require modification to become backward compatible with any now-previous version(s).
Schema Version Specification DSL
The following demonstrates all the features supported by the typing language:
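As a sketch with illustrative names (the schema name, field names, and the referenced `data.Token` and `data.Metadata` types are assumptions, not from an official example), a specification exercising the features documented in the table below might look like:

```
event FeatureShowcase {
    type eventType
    version eventVersion
    timestamp occurredOn

    boolean flag = true
    boolean[] flags = { true, false, true }
    byte small = 123
    char initial = 'A'
    double stat = 3.1416
    float ratio = 3.14
    int count = 885886279
    long total = 15329885886279
    short year = 12986
    string name = "ABC"
    string[] initials = { "ABC", "DEF", "GHI" }

    data.Token token
    data.Token[] tokens
    data.Metadata:1.0.0 metadata
}
```

Here `event` takes the place of the `{category}` keyword, and the `data.*` references assume that `Token` and `Metadata` schemas exist in the same Context's Data category.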
The following table describes the available types, each with a description. Unless noted otherwise, every declared value is included in the message under the attribute name given alongside its type.

| Datatype | Description |
| --- | --- |
| `{category}` | Must be replaced by one of the concrete category types: `command`, `data`, `document`, `envelope`, or `event`. |
| `type` | Declares that the type name of the specification type is included in the message itself under the given attribute name. (Note that this may instead be placed on the Envelope.) |
| `version` | Declares that the semantic version of the given Schema Version is included in the message itself under the given attribute name. (Note that this may instead be placed on the Envelope.) |
| `timestamp` | Declares that the timestamp of when the given instance was created is included in the message itself under the given attribute name. (Note that this may instead be placed on the Envelope.) |
| `boolean` | The boolean datatype with values of `true` and `false` only. May be defaulted by following the declaration with an equals sign and a boolean literal: `boolean flag = true` |
| `boolean[]` | The boolean array datatype. May be defaulted with an array literal: `boolean[] flags = { true, false, true }` |
| `byte` | The 8-bit signed byte datatype with values of -128 to 127. May be defaulted with a byte literal: `byte small = 123` |
| `byte[]` | The 8-bit signed byte array datatype. May be defaulted with an array literal: `byte[] smalls = { 1, 12, 123 }` |
| `char` | The character datatype supporting UTF-8 values. May be defaulted with a character literal: `char initial = 'A'` |
| `char[]` | The character array datatype. May be defaulted with an array literal: `char[] initials = { 'A', 'B', 'C' }` |
| `double` | The double-precision floating point datatype. May be defaulted with a double literal: `double pi = 3.1416` |
| `double[]` | The double-precision floating point array datatype. May be defaulted with an array literal: `double[] stats = { 1.54179, 7.929254, 32.882777091 }` |
| `float` | The single-precision floating point datatype. May be defaulted with a float literal: `float pi = 3.14` |
| `float[]` | The single-precision floating point array datatype. May be defaulted with an array literal: `float[] stats = { 1.54, 7.92, 32.88 }` |
| `int` | The 32-bit signed integer datatype with values of -2,147,483,648 to 2,147,483,647. May be defaulted with an integer literal: `int value = 885886279` |
| `int[]` | The 32-bit signed integer array datatype. May be defaulted with an array literal: `int[] values = { 885886279, 77241514, 9772531 }` |
| `long` | The 64-bit signed integer datatype with values of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. May be defaulted with a long literal: `long value = 15329885886279` |
| `long[]` | The 64-bit signed integer array datatype. May be defaulted with an array literal: `long[] values = { 15329885886279, 24389775639272, 45336993791291 }` |
| `short` | The 16-bit signed integer datatype with values of -32,768 to 32,767. May be defaulted with a short literal: `short value = 12986` |
| `short[]` | The 16-bit signed integer array datatype. May be defaulted with an array literal: `short[] values = { 12986, 3772, 10994 }` |
| `string` | The string datatype supporting multi-character UTF-8 strings. May be defaulted with a string literal: `string value = "ABC"` |
| `string[]` | The string array datatype. May be defaulted with an array literal: `string[] initials = { "ABC", "DEF", "GHI" }` |
| `category.TypeName` | The explicit complex Schema type of a given Category in the current Context. The version used is the tip, that is, the most recent version. No default values are supported other than `null`, which may be supported using the Null Object pattern. |
| `category.TypeName[]` | The explicit complex Schema type array of a given Category in the current Context. The version used is the tip. No default values are supported. |
| `category.TypeName:1.2.3` | The explicit complex Schema type of a given Category in the current Context. The version used is the one declared following the colon (`:`). No default values are supported other than `null`, which may be supported using the Null Object pattern. |
| `category.TypeName:1.2.3[]` | The explicit complex Schema type array of a given Category in the current Context. The version used is the one declared following the colon (`:`). No default values are supported. |
Any given complex Schema type may be included in the Specification, but doing so may to some extent limit consumption across multiple collaborating technical platforms. We make every effort to ensure cross-platform compatibility, but the chosen serialization type may be a limiting factor. We thus consider this an unknown until full compatibility can be confirmed by you and your team.
An additional warning is appropriate regarding direct domain model usage of Schema types. These Schema types are not meant to be used as first-class domain model Entities, Aggregates, or Value Objects. The Events category types may be used as Domain Events in the domain model, but if so we strongly suggest keeping the specifications simple (not including complex types). Thus:

- Define your domain model Entities and Value Objects strictly in your domain model code, not using a Schema Specification.
- Determine the positive and negative consequences of defining Domain Events only in the schema registry and using them both in the domain model and for your Published Language. It may or may not work well in your case.
- Schema Specifications are primarily about data and expressing present and past intent, not behavior. Consider Schema Specifications to be more about local-Context migrations of supported Domain Events and inter-Context collaboration and integration of all other Schema types.
Running XOOM Schemata
There are a few steps required to run XOOM Schemata.
Download Distribution
[Available soon]
Docker
Docker images are published for every Schemata build (stable and snapshots). Stable Docker images follow the XOOM Platform's version (e.g. `1.7.7`), while the latest snapshot is published under the `latest` tag.
The `XOOM_ENV` environment variable can be used to specify the configuration Schemata should be run with. To run Schemata with the in-memory database, use the `dev` configuration:
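A minimal sketch, assuming the image is published as `vlingo/xoom-schemata` on Docker Hub and that the service listens on port `9019`:

```
docker run -d -p 9019:9019 -e XOOM_ENV=dev vlingo/xoom-schemata:latest
```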
The `prod` configuration requires a PostgreSQL database. You can learn about running in developer and other modes in the project repository's `README`.
Build
Use git to check out the repository, then build and run:
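A sketch, assuming the repository is hosted at `github.com/vlingo/xoom-schemata` and using the `{version}` placeholder explained below:

```
git clone https://github.com/vlingo/xoom-schemata.git
cd xoom-schemata
mvn clean install
java -jar target/xoom-schemata-{version}-jar-with-dependencies.jar dev
```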
If you do not have Maven installed and thus don't have the `mvn` command, you can use the provided Maven Wrapper. The following is an abstract example for *nix followed by a concrete example:
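A sketch, where `{goal}` stands for whatever Maven goals you would otherwise pass to `mvn`:

```
./mvnw {goal}
./mvnw clean install
```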
The following is an abstract example for Windows followed by a concrete example:
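A sketch, again with `{goal}` standing for the Maven goals:

```
mvnw.cmd {goal}
mvnw.cmd clean install
```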
You must use the specific version to identify the code version that you checked out. Replace `{version}` with a given release version, such as `1.7.7`. The jar would be named `xoom-schemata-1.7.7-jar-with-dependencies.jar`, and thus the full command line would be:
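As a sketch, assuming the jar sits in Maven's `target` directory and using the `prod` argument described next:

```
java -jar target/xoom-schemata-1.7.7-jar-with-dependencies.jar prod
```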
The `prod` command-line argument is used to run with the production database rather than the development database. The production database is PostgreSQL. You can learn about running in developer and other modes in the project repository's `README`.
Working with Schema Specifications and Schema Dependencies
XOOM Schemata provides an HTTP API and a web user interface. Both can be used to manage master data, like organizations and units, as well as schema definitions. Typically, you'll use the GUI to edit master data and browse existing schemata, and the API to integrate schema registry interactions with your development tooling and build pipelines. Maven users can also use the XOOM Build Plugins to publish and consume schemas.
Using the GUI
The UI provides a tree view used to browse the available data and a view for each level in the hierarchy described above: Organizations, Units, Contexts, Schemas, and Schema Versions. These are accessible via the menu to the left. We also have Dark Mode (top-right).
Defining elements
The following shows the process of defining one Organization containing one Unit with a single Context. Once you have done this, you can go ahead and define your Schemas along with their Schema Versions.
When defining a Context, you need to use namespace syntax (e.g. com.example.demo):
To be able to create concrete specifications (in Schema Versions), you'll first need to define the Schema metadata. You can choose between all the categories mentioned in Use Cases. When defining a Schema, use an initial capital letter (e.g. SomethingDefined):
When defining a Schema Version, we suggest always keeping semantic versions in order and without version gaps, so you should only use the three buttons for their respective purposes:
While there are many benefits in keeping your specification sources with your project's source code, the GUI still provides an editor to work with specifications. One example use case for this is if you want to describe a contract or small API surface of an external system outside your control and consume the events it publishes.
After now having defined one of every hierarchy element, you can switch over to `Home`.
Browsing Schemata
In the home view, you can browse existing Schema Versions by drilling down the hierarchy. Once you've selected the version you're interested in, you can:
- Review its specification
- Update its specification, as long as the Schema Version is still a `Draft`
- Transition between the four lifecycle states `Draft`, `Published`, `Deprecated`, and `Removed`
- Review source code generated from the specification (click on `Code`)
- Review and update its description (click on `Description`, then click on `Preview`)

When you've made some changes to the description and decide not to save them, you can use `Revert` to set it back to its initial state.
Redefining elements
After having defined a hierarchy element, you can also redefine it:
This works with every hierarchy element other than Schema Version, which must not be modified unless it is still a `Draft`. If it is a `Draft`, you can modify it on `Home`. If not, you can define a new Schema Version.
When publishing a new version of an existing schema, the updated specification is validated in regard to the new semantic version according to the following rules:
- New patch version (e.g. `1.2.5` to `1.2.6`): the specification needs to remain unchanged; only metadata can be updated.
- New minor version (e.g. `1.2.5` to `1.3.0`): only new fields may be added; there must be no removals, type changes, or reordering of fields.
- New major version (e.g. `1.2.5` to `2.0.0`): no restrictions.

If these rules are violated, you'll be presented with a list of additions, removals, reorderings, and type changes. The colors correspond to the `New Major`/`New Minor`/`New Patch` button colors.
After changing to a new major version, the new specification can be defined without problems.
Using the Maven plugin
The XOOM Build Plugins provides goals to talk to the schema registry as part of the build. To use it, include it in the build section of your project's `pom.xml` and configure the goals as shown below.
Schemata within the registry are identified by references consisting of organization, unit, context namespace, schema name, and schema version. A schema reference pointing to the Schema `MySchema` `1.0.5` in the namespace `com.example` of the unit `RnD` within the `ACME` organization would look like this: `ACME:RnD:com.example:MySchema:1.0.5`.
Publishing schema to the registry
By default, the plugin expects your schema specification `.vss` files to be in the `src/main/vlingo/schemata` folder within your project. To publish these to the registry, you need to configure the `push-schema` goal with:

- the target registry URL, your organization, and unit
- for each schema, the schema reference and, in case you're updating a previous version, that previous version
A complete configuration for this goal might look like the example below. For additional details on configuration parameters and defaults, please refer to XOOM Build Plugins.
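As a hedged sketch only: the plugin coordinates and configuration element names below are assumptions, so refer to XOOM Build Plugins for the authoritative names and defaults.

```xml
<plugin>
  <groupId>io.vlingo.xoom</groupId>
  <artifactId>xoom-build-plugins</artifactId>
  <version>1.7.7</version>
  <executions>
    <execution>
      <goals>
        <goal>push-schema</goal>
      </goals>
      <configuration>
        <!-- target registry URL (element name is illustrative) -->
        <schemataUrl>http://localhost:9019</schemataUrl>
        <schemata>
          <!-- one entry per schema: its reference and, when updating, the previous version -->
          <schema>
            <srcDirectory>${basedir}/src/main/vlingo/schemata</srcDirectory>
            <reference>ACME:RnD:com.example:MySchema:1.0.5</reference>
            <previousVersion>1.0.4</previousVersion>
          </schema>
        </schemata>
      </configuration>
    </execution>
  </executions>
</plugin>
```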
Consuming source code from the registry
The `pull-schema` goal provides for retrieving sources generated from schemata stored in the registry. By default, the generated sources are written to `target/generated-sources/vlingo` and included in the project's compile path. The goal needs to be configured with:

- the schemata instance URL, your organization, and unit
- the reference of each schema version to consume
The following example makes the build put a `SchemaDefined.java` file, generated from the schema version identified by the reference `Vlingo:examples:io.vlingo.xoom.examples.schemata:SchemaDefined:2.0.1`, into `target/...`
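A sketch of what the corresponding configuration might look like; the coordinates and element names are assumptions, and XOOM Build Plugins documents the authoritative ones:

```xml
<plugin>
  <groupId>io.vlingo.xoom</groupId>
  <artifactId>xoom-build-plugins</artifactId>
  <version>1.7.7</version>
  <executions>
    <execution>
      <goals>
        <goal>pull-schema</goal>
      </goals>
      <configuration>
        <!-- registry instance URL (element name is illustrative) -->
        <schemataUrl>http://localhost:9019</schemataUrl>
        <schemata>
          <!-- the reference of each schema version to consume -->
          <schema>
            <reference>Vlingo:examples:io.vlingo.xoom.examples.schemata:SchemaDefined:2.0.1</reference>
          </schema>
        </schemata>
      </configuration>
    </execution>
  </executions>
</plugin>
```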
Round-trip example
This example shows how to integrate two Bounded Contexts mediated by the schema registry. In this section, we'll set up two Maven projects, one publishing schemata to the registry and one consuming them.
First, make sure you have a schema registry instance running on `localhost:9019`. Please refer to xoom-schemata-integration on how to set this up.
Open the GUI and create an organization and one unit with two contexts, `io.vlingo.xoom.examples.consumer` and `io.vlingo.xoom.examples.producer`. In the producer context, also create a schema called `MyFirstEvent`. Refer to the Using the GUI section above on how to do this.
Set up one project called `consumer` and one called `producer` by running the following twice, once using `consumer` and once `producer` as `artifactId` and in `package`.
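A sketch using the standard Maven quickstart archetype (the `groupId` is an assumption), shown here for the producer; repeat it with `consumer`:

```
mvn archetype:generate \
    -DarchetypeGroupId=org.apache.maven.archetypes \
    -DarchetypeArtifactId=maven-archetype-quickstart \
    -DgroupId=io.vlingo.xoom.examples \
    -DartifactId=producer \
    -Dpackage=io.vlingo.xoom.examples.producer \
    -DinteractiveMode=false
```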
In the producer project, include the `xoom-build-plugins` configuration for pushing schemata as described above. Make sure the schema reference correctly points to your organization, unit, and context.
Now create a schema specification file called `MyFirstEvent.vss` in `producer/src/main/vlingo/schemata` with the following contents.
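A minimal sketch, assuming a single illustrative `string` field alongside the standard `type`, `version`, and `timestamp` attributes:

```
event MyFirstEvent {
    type eventType
    version eventVersion
    timestamp occurredOn
    string name
}
```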
Within the producer project, run:
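For example:

```
mvn install
```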
The build output should indicate the schema was pushed successfully.
Open the GUI and review the schema version you just pushed.
Now we are set up to consume this schema. Open the `pom.xml` in the consumer project and configure the build plugin to pull sources generated from this schema. Note that you'll also need to include a dependency on `xoom-lattice`, as the code currently generated is tied to it. But don't fret, we'll have the possibility to plug in custom code generators soon.
Run `mvn install` in the `consumer` project and have a look into `target/generated-sources/vlingo` and into `target/classes` to make sure the code was pulled and compiled. The build output indicates that as well.
You might have noted that the build emitted a warning that you're using a `Draft` version, so let's fix that. Head over to the GUI, publish the schema from the `Home` view (as seen in Using the GUI), and re-run the build. Now the warning is gone.
XOOM Schemata does not only validate the version lifecycle, but also whether changes are valid given semantic versioning. To see what this looks like, make incompatible changes to your specification and try to publish them as a new `minor` version.
In this example, we'll remove the `version` attribute, change the type of the `timestamp` attribute, and change the name of the `type` attribute.
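As a sketch, assuming the original specification declared `type eventType`, `version eventVersion`, and `timestamp occurredOn`, such an incompatible revision might look like this: the `version` attribute is removed, the timestamp attribute becomes a plain `long`, and the `type` attribute is renamed (the new names are illustrative):

```
event MyFirstEvent {
    type kind
    long occurredOn
    string name
}
```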
In the `producer` pom, change the version number in the schema reference to `1.1.0`, add a line indicating the `previousVersion`, and run `mvn install` in the `producer` project. You'll note that pushing the schema fails with the list of changes you made. Also try to create the new version via the GUI and review the validation messages there.
Once you update the major version (`2.0.0`), the build will run fine again. You can now update your consumer to pull the new version.