GraphQL read model

Warning

The GraphQL read model has known shortcomings. Please be aware of them before using it.

Introduction

When working with an NPL system, one will often want to query the states of the various system components. In addition to the standard read model offered by the Core and Streams, the platform offers a more flexible and powerful GraphQL-based read model. Its primary purpose is to query the states (i.e. the variables/fields) of protocols.

Background and motivation

The existing standard read model stores the entire state of each protocol as a JSON document, and this document is what a user receives when querying protocol states through the standard APIs.

The following is an example of a typical (prettified) JSON document representing a protocol state:

{
  "slots": {
    "this": {
      "ref": {
        "name": "1fcf4a3a-82a2-4648-9471-9b481edb13e3",
        "typeName": "/itpkg/OtherProtocol"
      }
    },
    "party": {
      "scalar": {
        "party": {
          "party": "party1"
        }
      }
    },
    "states": {
      "states": {
        "name": "/itpkg/OtherProtocol#states"
      }
    },
    "otherText": {
      "scalar": {
        "text": "Sdtoy8GEsk"
      }
    },
    "currentState": {
      "state": {
        "modifier": "INITIAL"
      }
    },
    "otherDateTime": {
      "scalar": {
        "dateTime": "2021-09-10T14:27:52.191949+02:00[Europe/Zurich]"
      }
    }
  }
}

That was just a protocol with very few fields -- more complex examples are left to the reader's imagination.

After acquiring the JSON state document, the user then also has to parse it in order to extract the desired data. Another limitation of the standard read model (which may seem obvious, but is significant nonetheless) is that the JSON documents have no relational structure. This forces the user to make multiple successive queries when looking for data associated with the contents of the first query. A typical example of this -- which we will discuss further below -- is finding data stored in a protocol that is itself a field of another protocol.
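To make the parsing burden concrete, here is a minimal Python sketch that pulls two values out of a trimmed copy of the state document shown above. The helper name `scalar_text` is hypothetical, and the sketch assumes every text field follows the `slots` / `scalar` / `text` nesting seen in the example.

```python
import json

# Trimmed copy of the example state document shown above.
STATE_JSON = """
{
  "slots": {
    "otherText": {"scalar": {"text": "Sdtoy8GEsk"}},
    "currentState": {"state": {"modifier": "INITIAL"}},
    "party": {"scalar": {"party": {"party": "party1"}}}
  }
}
"""


def scalar_text(state: dict, field: str) -> str:
    """Return the value of a scalar text field (hypothetical helper)."""
    return state["slots"][field]["scalar"]["text"]


state = json.loads(STATE_JSON)
other_text = scalar_text(state, "otherText")
# Each type has its own nesting, so a party value needs a different path.
party = state["slots"]["party"]["scalar"]["party"]["party"]
print(other_text, party)
```

Every field type needs its own access path, which is part of what the normalized GraphQL read model is meant to avoid.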

GraphQL and PostGraphile

GraphQL, as the name suggests, is a query language for graphs. A typical example of a graph query comes from the domain of social networks: "give me a list of the friends of the friends of my friends". For us, that would be "give me a list of the protocols of the protocols of my protocols". The language itself is typed and declarative, and makes it very easy for users to ask for just the data they are interested in (as illustrated by the examples further down).

We have chosen to use PostGraphile (together with a filtering plugin) to provide a GraphQL read model that is generated from our pre-existing PostgreSQL persistence layer.

Read more about how to set up PostGraphile here.

Protocol state tables

In order to accommodate the sort of protocol state queries that users are typically interested in, we have normalized parts of the aforementioned JSON blob into a number of new PostgreSQL tables. The schemas for a couple of these tables are shown below. There are similar schemas for all the regular types supported by NPL (except unions, and subject to the other limitations listed below). The schemas and their corresponding types can also be explored with a GraphQL client, via the documentation that PostGraphile generates automatically, as detailed below.

Protocol_fields_blob(protocol_id, field, value)      protocol_id → Protocol_states.protocol_id

Protocol_fields_struct(protocol_id, field, value)      protocol_id → Protocol_states.protocol_id

Struct_fields_datetime(struct_id, field, value, zone_id)      struct_id → Protocol_fields_struct.value

Protocol_fields_collections_text(protocol_id, field, key, value, collection_type)      protocol_id → Protocol_states.protocol_id

Some things to note:

  • Collections are one-to-many relations from a (protocol id, field) pair to the elements of the collection. The type of the collection, e.g. Map<Number, Text>, is indicated by collection_type.
  • SQL uses snake_case and GraphQL uses camelCase, but other than that the column and table names are directly translated from SQL to GraphQL.
  • Please be aware that all timestamps have been converted to UTC.
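Because each collection element is stored as its own row, a client usually has to regroup the rows it gets back. The following is a minimal Python sketch of that step. It assumes nodes that carry the camelCase fields `protocolId`, `field`, `key` and `value` (derived from the column names above); the function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def group_collection_rows(nodes):
    """Group flat collection rows into {(protocolId, field): [(key, value), ...]}."""
    grouped = defaultdict(list)
    for node in nodes:
        grouped[(node["protocolId"], node["field"])].append((node["key"], node["value"]))
    return dict(grouped)


# Hypothetical rows, shaped like the nodes of a collection table query.
nodes = [
    {"protocolId": "p1", "field": "labels", "key": "1", "value": "first"},
    {"protocolId": "p1", "field": "labels", "key": "2", "value": "second"},
    {"protocolId": "p2", "field": "labels", "key": "1", "value": "other"},
]

grouped = group_collection_rows(nodes)
print(grouped[("p1", "labels")])
```

The sketch deliberately leaves keys and values as returned and does not interpret `collectionType`.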

Access control

We require that requests made to the PostGraphile endpoint contain an Authorization header carrying a JSON Web Token (JWT) as a bearer token. The token is obtained from the token endpoint of the IAM (e.g. Keycloak).

PostGraphile executes its queries against the PostgreSQL database as a specified user, and sets the access and entity claims for the PostgreSQL transaction to those contained within the JWT (provided that the token is valid and was properly signed by the IAM). The database then checks these claims against those associated with the protocols that own the data being queried.
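When a query unexpectedly returns no data, it can help to look at the claims a token actually carries. The sketch below decodes the payload segment of a JWT using only the Python standard library. It does not verify the signature, so it is suitable for debugging only. The claim name `party` and the sample token are hypothetical; the real claim names depend on the IAM configuration.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (for the demo token)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hypothetical unsigned token, built locally for illustration only.
sample_token = ".".join([_b64url({"alg": "none"}), _b64url({"party": ["party1"]}), ""])
claims = jwt_claims(sample_token)
print(claims)
```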

Known shortcomings

  • The current implementation of the GraphQL read model only supports querying top-level struct fields. If we have a struct SomeStruct { a: Number, b: SomeOtherStruct, c: SomeProtocol }, we can query a (as it is a top-level field within a struct) and c (as it is a foreign key/reference to a protocol and thus a graph connection is generated), but not b (as it is a nested struct).
  • Similarly, it is not possible to query nested collections. We only support querying collection elements on the top level of protocols (hence collections that are nested within structs or other collections are not queryable).
  • It is not possible to query union type fields.
  • Because NPL itself is supposed to be in control of the write model, GraphQL mutations are not allowed. This is strictly a read model and not a write model.
  • Deeply nested queries (e.g. querying a field contained within a protocol referenced by a protocol referenced by a protocol ... x50) may be very computationally expensive, so make sure you are aware of the performance characteristics when you use the read model (or make it available to others). Remember that you are reading directly from the database, so performance issues will have an impact on the entire system.
  • Filtering on anything but the top level of nested queries will result in empty lists for non-matching entries within the response.
  • Map keys use the same representation as NPL's .toText() function applied to the corresponding values, and as such will in some cases be serialized differently from values of the same type found elsewhere in the read model.
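Since deeply nested queries can be expensive, one option is to estimate the nesting depth of a query before sending it. The following Python sketch is a rough guard rather than a GraphQL parser: it counts curly braces, so braces inside arguments (such as filter objects) also count towards the depth, and braces inside string literals are not handled. The function name and the limit are hypothetical.

```python
def query_depth(query: str) -> int:
    """Return the maximum curly-brace nesting depth of a GraphQL query string."""
    depth = 0
    max_depth = 0
    for ch in query:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth


MAX_DEPTH = 10  # hypothetical limit; tune to your own performance budget

query = "{ protocolFieldsCollectionsBlobs { nodes { field key value } } }"
depth = query_depth(query)
within_limit = depth <= MAX_DEPTH
print(depth, within_limit)
```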

Endpoints

Querying

The following is an example of how the GraphQL endpoint can be queried using cURL:

curl -X POST \
-H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" \
-d '{"query": "{ protocolFieldsCollectionsBlobs { nodes { field key value } } }"}' \
http://localhost:5555/graphql
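The same request can be issued from a script. Below is a minimal Python sketch using only the standard library. It assumes the endpoint from the cURL example above and a token that has already been obtained; the function names are hypothetical.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:5555/graphql"  # endpoint from the cURL example


def build_graphql_request(query: str, token: str, url: str = GRAPHQL_URL) -> urllib.request.Request:
    """Build a POST request carrying the query as JSON and the JWT as a bearer token."""
    body = json.dumps({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def run_query(query: str, token: str) -> dict:
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(build_graphql_request(query, token)) as response:
        return json.load(response)


# Usage (requires a running endpoint and a valid token):
# result = run_query("{ protocolFieldsCollectionsBlobs { nodes { field key value } } }", token)
# print(result["data"])
```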

Keep in mind that the TOKEN environment variable has to be populated. The token must first be retrieved from an IAM (e.g. Keycloak). The following is an example of how that might be done using the cURL and jq utilities in a development environment with Keycloak:

export TOKEN=$(curl -s "http://localhost:11000/realms/noumena/protocol/openid-connect/token" \
-d 'username=someuser' \
-d 'password=somepassword' \
-d 'grant_type=password' \
-d 'client_id=nm-platform-service-client' \
-d 'client_secret=5008dee5-77ee-4645-9d1f-93ebe3ea4311' | jq -j .access_token)
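For completeness, the token request can also be sketched in Python with the standard library. This assumes the same development Keycloak URL and password grant as the cURL example above; the function names are hypothetical, and the credentials are passed in as parameters instead of being hard-coded.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "http://localhost:11000/realms/noumena/protocol/openid-connect/token"


def build_token_request(username, password, client_id, client_secret, url=TOKEN_URL):
    """Build the form-encoded password-grant request used in the cURL example."""
    form = urllib.parse.urlencode({
        "username": username,
        "password": password,
        "grant_type": "password",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return urllib.request.Request(url, data=form, headers=headers, method="POST")


def fetch_token(username, password, client_id, client_secret):
    """Send the request and return the access_token field of the response."""
    request = build_token_request(username, password, client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]
```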

Clients

There are countless clients to choose from when it comes to accessing GraphQL APIs. When prototyping and testing queries it can be very handy to use one that provides the ability to set headers and browse the automatically generated documentation (which is useful for understanding the types and connections). One such client that we can recommend is Altair.