Schema Definition

API designers often need a mechanism to describe how the data exposed by their API is defined. This helps both computers (to automatically generate client models, validators or documentation) and humans (to easily understand how to interact with an API). Let's see how each API style deals with schema definition.

REST

The self-descriptive messages constraint states that each message must be self-descriptive; this covers the payload as well as the metadata.

In RESTful Web Services, we rely on HTTP to specify the metadata:

  • Content-Type: the Media Type plus an optional charset. Clients use the related Accept header to specify the desired representation.
  • Last-Modified: last modification date and time of a resource.
  • Content-Encoding: compression method: gzip, compress, deflate, identity, br...
  • Content-Length: size in bytes of the body.
  • Content-Language: describes the language intended for the audience.
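As a rough illustration of how this metadata can be read, the sketch below uses Python's standard-library email.message.Message to split a Content-Type value into its media type and charset. The helper name parse_content_type is hypothetical and only exists for this example.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(header_value: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type value into (media type, charset or None).

    Hypothetical helper for illustration; both parts come back lowercased.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


media_type, charset = parse_content_type("application/json; charset=UTF-8")
print(media_type, charset)
```

A client or server would typically use the media type to pick a parser and the charset to decode the body.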

To choose the appropriate representation of the payload, the Content-Type entity header is used. See the IANA document on Media-Types for a comprehensive list of the Media Types approved by the IANA, for example text/plain or image/png. An extensible format, like application/xml or application/json, can be used as well. Even a binary representation such as Protobuf can be used in REST.

To describe JSON resources, several specifications can be used, including JSON Schema and OpenAPI.

GitHub defines its own media types, as in:

application/vnd.github+json
application/vnd.github.v3+json
application/vnd.github.v3.raw+json
application/vnd.github.v3.text+json
application/vnd.github.v3.html+json
application/vnd.github.v3.full+json
application/vnd.github.v3.diff
application/vnd.github.v3.patch

However, some people argue that minting new Media Types should be avoided.

Let's see an example from the JSON Schema (the language used to describe schemas in OpenAPI) documentation:

{
    "$id": "https://example.com/person.schema.json",
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Person",
    "type": "object",
    "properties": {
        "firstName": {
            "type": "string",
            "description": "The person's first name."
        },
        "lastName": {
            "type": "string",
            "description": "The person's last name."
        },
        "age": {
            "description": "Age in years which must be equal to or greater than zero.",
            "type": "integer",
            "minimum": 0
        }
    }
}

An object implementing the Person schema could look like this:

{
    "firstName": "John",
    "lastName": "Doe",
    "age": 21
}
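To make the relationship between the schema and the object concrete, here is a minimal sketch of a validator written in plain Python. It is not a JSON Schema implementation: it only understands the three keywords used by the Person schema above (type, properties and minimum), and the names validate and PERSON_SCHEMA are assumptions made for this example. A real project would more likely rely on an existing JSON Schema library.

```python
# Subset of the Person schema above, limited to the keywords this sketch handles.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "age": {"type": "integer", "minimum": 0},
    },
}

# Maps the JSON Schema type names used above to Python types.
_PY_TYPES = {"object": dict, "string": str, "integer": int}


def validate(instance, schema, path="$"):
    """Return a list of error strings; an empty list means the instance is valid."""
    expected = schema.get("type")
    if expected is not None:
        matches = isinstance(instance, _PY_TYPES[expected])
        # bool is a subclass of int in Python, but a JSON boolean is not an integer.
        if expected == "integer" and isinstance(instance, bool):
            matches = False
        if not matches:
            return [f"{path}: expected {expected}"]

    errors = []
    minimum = schema.get("minimum")
    if minimum is not None and isinstance(instance, (int, float)) and not isinstance(instance, bool):
        if instance < minimum:
            errors.append(f"{path}: must be >= {minimum}")

    # Properties are optional here because the schema declares no "required" list.
    if isinstance(instance, dict):
        for name, subschema in schema.get("properties", {}).items():
            if name in instance:
                errors.extend(validate(instance[name], subschema, f"{path}.{name}"))
    return errors


print(validate({"firstName": "John", "lastName": "Doe", "age": 21}, PERSON_SCHEMA))
print(validate({"firstName": "John", "age": -1}, PERSON_SCHEMA))
```

The first call reports no errors, while the second one reports that age violates the minimum constraint.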

GraphQL

Defining a schema as a graph is without doubt GraphQL's trademark. As the official documentation states:

With GraphQL, you model your business domain as a graph

This resembles the mental model most object-oriented developers follow, in which a business is modeled as a set of domain objects.

GraphQL then lets API designers express their schema using a small yet powerful set of primitives in the GraphQL Schema Language:

Scalar types

Scalar types define the types of Object type fields and of operation arguments. The built-in scalar types are:

  • Int
  • Float
  • String
  • Boolean
  • ID. Represents an identifier.

Note: most GraphQL implementations let developers define their own new scalar types.

Enumeration type

Enumeration types restrict a field to a fixed list of values:

enum STATUS {
    READY,
    WAITING,
    RUNNING,
    TERMINATED
}

Object type

Defines a structure that is made up of other types. A field is marked as non-nullable using the ! symbol; otherwise, it is nullable:

type Article {
    id: ID!
    title: String!
    author: User!
    thumbnail: String
    comments: [Comment!]
}

Each field in a type can optionally accept arguments. They can be used, for example, to customize the representation of a resource, to limit the response, or to filter a result:

type Article {
    id: ID!
    title: String!
    date(format: DateFormats = ISO8601): String!
    author: User!
    thumbnail: String
    comments(limit: Int = 20, offset: Int = 0): [Comment!]
}

Here we let the client application specify how it wants the date to be represented (let's assume DateFormats is an existing Enumeration type), with ISO8601 as the default format. The client can also paginate the returned comments.
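On the server side, field arguments and their defaults usually end up as parameters of the functions that resolve each field. The following Python sketch mimics that idea without any GraphQL library. The resolver names, the US member of DateFormats and the strftime patterns are assumptions made for this example only.

```python
from datetime import date
from enum import Enum


class DateFormats(Enum):
    # Each member carries a strftime pattern; US is a hypothetical extra format.
    ISO8601 = "%Y-%m-%d"
    US = "%m/%d/%Y"


def resolve_date(published: date, format: DateFormats = DateFormats.ISO8601) -> str:
    """Mirror of `date(format: DateFormats = ISO8601): String!`."""
    return published.strftime(format.value)


def resolve_comments(comments: list, limit: int = 20, offset: int = 0) -> list:
    """Mirror of `comments(limit: Int = 20, offset: Int = 0): [Comment!]`."""
    return comments[offset:offset + limit]


published = date(2020, 1, 31)
print(resolve_date(published))                        # default format applies
print(resolve_date(published, DateFormats.US))        # client-chosen format
print(resolve_comments(list(range(50)), limit=5, offset=10))
```

When the client omits an argument, the default declared in the schema applies, which is what the default parameter values stand for here.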

Union type

A Union type declares a list of possible types. When a union type is used as the type of a field, the field can hold a value of any of its concrete types:

union SearchResult = Professor | Student | Subject

If a query returns an array of SearchResult, we can specify the fields wanted for each type, as in:

{
    search(keyword: "Smith")  {
        __typename
        ... on Professor {
            name
            last_name
        }
        ... on Student {
            name
            last_name
        }
        ... on Subject {
            name
        }
    }
}
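On the client side, the __typename field is what lets the application tell the concrete types apart. The Python sketch below processes a response shaped like the one this query could return. The sample data and the describe helper are invented for illustration.

```python
def describe(result: dict) -> str:
    """Build a label for one SearchResult item based on its concrete type."""
    kind = result["__typename"]
    if kind in ("Professor", "Student"):
        return f"{kind}: {result['name']} {result['last_name']}"
    if kind == "Subject":
        return f"Subject: {result['name']}"
    raise ValueError(f"unexpected type {kind!r}")


# Hypothetical response payload; each item only carries the fields
# requested for its own type in the query above.
response = {
    "data": {
        "search": [
            {"__typename": "Professor", "name": "Anna", "last_name": "Smith"},
            {"__typename": "Student", "name": "Bob", "last_name": "Smith"},
            {"__typename": "Subject", "name": "Smith charts"},
        ]
    }
}

labels = [describe(item) for item in response["data"]["search"]]
print(labels)
```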

Interface type

Object types can also make use of Interface types:

interface Page {
    id: ID!
    title: String!
}

type Article implements Page {
    id: ID!
    title: String!
    author: User!
    thumbnail: String
    comments: [Comment!]
}

A type that implements an interface must redeclare every field of that interface. Interfaces are useful when an operation can return different types that all implement the same interface. Besides interfaces, an operation can also return a union type:

union SearchResult = Professor | Student

Input type

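Finally, GraphQL allows defining object types that are meant to be used as the input of an operation. These are used, for example, to provide the object that should be created in the system. An input type follows the same syntax as a regular object type, using the input keyword instead of type. Note that the fields of an input type can only be scalars, enumerations or other input types, so PersonInput is assumed here to be another input type:

input ArticleInput {
    title: String!
    body: String!
    author: PersonInput!
}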

gRPC

Although gRPC can be used with any extensible format, such as JSON, most of the time it uses Protocol Buffers to define the schema of its entities. Protocol Buffers consist of a platform-agnostic language for serializing data, together with a compiler that generates language-specific code from the schema definition. A schema is defined in a .proto file.

There are several versions of Protocol Buffers. Here we cover the proto3 version, which is selected in the first line of the .proto file, as in:

syntax = "proto3";

Scalar types

Up to 15 scalar types are accepted, with transport concerns to take into consideration. For the sake of simplicity, only some of them are listed here, with their Java equivalents:

| .proto type | Java type |
| ----------- | --------- |
| double      | double    |
| float       | float     |
| int32       | int       |
| int64       | long      |
| bool        | boolean   |
| string      | String    |

Enumeration type

Enumeration types can be used to restrict the allowed values:

enum Status {
    READY = 0;
    WAITING = 1;
    RUNNING = 2;
    TERMINATED = 3;
}

Message type

Finally, to define a composite type, we can use the Message type:

syntax = "proto3";

message Article {
    int32 id = 1;
    string title = 2;
    User author = 3;
    string thumbnail = 4;
    repeated Comment comment = 5;
}

There are several things to note:

  • Field Rules: message fields can be either: (1) singular, the default; or (2) repeated, an ordered list of elements.
  • Field numbers: each field is identified with a unique number. If a message is updated and a field is removed, its field number should be marked as forbidden using the reserved statement.
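Field numbers matter because they, rather than the field names, are what travels on the wire. As a sketch of the Protocol Buffers binary encoding, each field is preceded by a key computed as (field_number << 3) | wire_type and written as a varint. The helper names below are made up for this example and it only covers varint-encoded, non-negative integer fields. Generated code handles all of this in practice.

```python
WIRE_VARINT = 0  # wire type used by int32, int64, bool and enum fields


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_key(field_number: int, wire_type: int) -> bytes:
    """Encode the key that precedes every field value."""
    return encode_varint((field_number << 3) | wire_type)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Encode one non-negative integer field: key followed by the value."""
    return encode_key(field_number, WIRE_VARINT) + encode_varint(value)


# `int32 id = 1;` set to 150 in the Article message above.
print(encode_int_field(1, 150).hex())
# Keys for field numbers 1 to 15 fit in one byte; 16 and above need more.
print(len(encode_key(15, WIRE_VARINT)), len(encode_key(16, WIRE_VARINT)))
```

Since a decoder identifies fields only by these numbers, reusing the number of a removed field for something else would make old and new messages ambiguous, which is what the reserved statement prevents.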

Any message type

Any adds support to embed a field of an unknown type. For example, our API might have an abstract Operation message type as a response to any requested operation. The Operation message might have a concrete status field, plus a response field of type Any:

import "google/protobuf/any.proto";

message Operation {
    int32 id = 1;
    Status status = 2;
    google.protobuf.Any response = 3;
}

Note we need to import google/protobuf/any.proto in order to use it.

Oneof message type

Similar to the GraphQL Union type, and sometimes referred to as union fields, Oneof allows setting a list of possible fields where at most one of them will be specified. For example, if a request can either succeed or fail, the message might contain a result field of type Oneof with two possible fields:

  • response - with the response, when the request was successfully run. For example, of type Any.
  • error - with an object representing the error, in case of any failure.

Which might result in something like this:

import "google/protobuf/any.proto";

message Operation {
    int32 id = 1;
    Status status = 2;
    oneof result {
        google.protobuf.Any response = 3;
        Error error = 4;
    }
}
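The "at most one" rule means that setting one member of the oneof clears any other member that was set before. The plain-Python class below only imitates that behaviour so it can be run without the protobuf toolchain. It is not the code that the compiler generates, and the method names set_result and which_result are assumptions for this sketch (generated Python messages expose a similar query through WhichOneof).

```python
class Operation:
    """Hand-written stand-in for the Operation message and its `result` oneof."""

    _RESULT_FIELDS = ("response", "error")

    def __init__(self, id: int, status: str):
        self.id = id
        self.status = status
        self._result = None  # either None or a (field name, value) pair

    def set_result(self, name: str, value) -> None:
        """Set one member of the oneof, replacing whichever was set before."""
        if name not in self._RESULT_FIELDS:
            raise ValueError(f"{name!r} is not part of the result oneof")
        self._result = (name, value)

    def which_result(self):
        """Return the name of the member currently set, or None."""
        return self._result[0] if self._result else None

    def get(self, name: str):
        """Return the value of a member, or None when it is not the one set."""
        if self._result and self._result[0] == name:
            return self._result[1]
        return None


op = Operation(id=1, status="RUNNING")
op.set_result("response", {"rows": 3})
op.set_result("error", "timeout")  # replaces the response
print(op.which_result(), op.get("response"), op.get("error"))
```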

There is much more with regard to Protocol Buffers:

  • Nested types: we can define a Message or an Enumeration type within another message declaration.
  • Importing definitions: a .proto file can reference another .proto file to import its definitions. Note that re-exporting definitions with import public is not available in Java.
  • Maps: we can create hash tables, to map a key_type to a value_type.