
Querying Data with GraphQL

Jaren M. Brownlee edited this page Feb 18, 2022 · 29 revisions

Using GraphQL to query DeepLynx

Deep Lynx ships with the ability to query data that you have ingested in previous steps. This capability is still under active construction, so if a feature you need is missing from this write-up, please reach out to the development team to discuss its absence and whether it is something you could add.

This guide assumes basic knowledge of querying using GraphQL. You can find a good tutorial and write-up here - https://graphql.org/learn/. You should pay particular attention to how queries are built, how GraphQL returns only the data you explicitly request, and any documentation specific to the language you hope to use to interact with this functionality.

Dynamic GraphQL Schema

In GraphQL, the schema is generally defined before the application runs. The schema determines what kinds of queries can be made, how they're structured, and what functions they access in the program to populate the user's results. The first iteration of this functionality followed this pattern, but we found the system unwieldy and hard to understand. Instead, Deep Lynx dynamically generates a schema each time you interact with the GraphQL endpoint for a given container.

Currently, each type in the generated schema maps 1:1 to a Metatype in the container you are querying. For example, if your ontology had the Car, Maintenance, and Entry Metatypes, your GraphQL schema would also have a type for Car, Maintenance, and Entry. These types are used for querying nodes with these specific Metatypes. More information on how to write these queries can be found in the next major section.

Introspection

You can use introspection at any time to get a complete list of all potential types, along with other information. This is useful because your schema might differ from another user's, depending on the container you are querying. It is also helpful because we must perform a slight conversion on your Metatype names to make them work with GraphQL, and introspection allows you to see exactly how your custom Metatype has been named.

As with normal queries, you POST your introspection GraphQL query to {{yourURL}}/containers/{{containerID}}/data. The POST body should be a JSON object with the following two fields: query and variables. query is an escaped string containing your GraphQL query. variables is a JSON object containing any variables which were included in your query.

More information on formatting a query can be found here - https://graphql.org/learn/serving-over-http/.
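The body format described above can be sketched in Python. This is a minimal illustration, not part of Deep Lynx: the helper name build_request_body is hypothetical, and it assumes only what this page states, namely that the body is a JSON object with a query string and a variables object.

```python
import json
from typing import Optional


def build_request_body(query: str, variables: Optional[dict] = None) -> str:
    """Serialize a GraphQL query into a JSON body with `query` and `variables`.

    Hypothetical helper for illustration; json.dumps takes care of escaping
    quotes and newlines inside the query string.
    """
    return json.dumps({"query": query, "variables": variables or {}})


# A small introspection query used as sample input.
body = build_request_body("{ __schema { types { name } } }")
print(body)
```

Letting the JSON library do the escaping avoids hand-escaping quotes, which matters once queries start to contain string filters.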

Here is an example introspective query - this one simply returns a list of all possible GraphQL types by name.

{ 
  __schema { 
    types { 
      name 
    } 
  } 
} 

Note: Not all types that are returned by this query are Metatypes. Some might be native GraphQL types, or enumerable types needed by a Metatype. When in doubt, refer back to the container’s ontology to differentiate between what’s a standard GraphQL type and what’s a Metatype.
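As a rough aid for the note above, the following Python sketch drops the names that are certainly not Metatypes: GraphQL's built-in scalars and the introspection types, whose names start with two underscores. The function name and the sample response are hypothetical. Whatever remains is only a list of candidates, so helper types such as recordInfo still need to be checked against the container's ontology.

```python
from typing import List

# Scalars defined by the GraphQL specification itself.
BUILT_IN_SCALARS = {"Int", "Float", "String", "Boolean", "ID"}


def candidate_metatype_names(response: dict) -> List[str]:
    """Return type names that are neither built-in scalars nor introspection types."""
    names = [t["name"] for t in response["data"]["__schema"]["types"]]
    return [
        n for n in names
        if not n.startswith("__") and n not in BUILT_IN_SCALARS
    ]


# Hypothetical response shaped like the result of the query above.
sample_response = {
    "data": {
        "__schema": {
            "types": [
                {"name": "Car"},
                {"name": "Maintenance"},
                {"name": "Entry"},
                {"name": "recordInfo"},
                {"name": "String"},
                {"name": "Boolean"},
                {"name": "__Schema"},
                {"name": "__Type"},
            ]
        }
    }
}

print(candidate_metatype_names(sample_response))
```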

Writing Queries

Please keep in mind that this is actively under construction, meaning things may change slightly and new features may be added. We will endeavor to keep this guide up to date, but if in doubt please contact the development team.

Currently (1/10/2022), you can only write queries which return a list of node records from the Deep Lynx graph database. Plans have been made to extend this functionality to edges in the future, but that functionality currently doesn’t exist. There are also plans to include an HTTP endpoint to which you would pass a node/edge ID and a filter object to receive a response which represents the actual graph structure in which Deep Lynx stores its data.

We will not spend much of this guide on how to make a GraphQL request. This topic is covered well in other guides, as well as in the official documentation. We follow all standard practices in how a GraphQL query is structured as a JSON object for POSTing to an HTTP endpoint. Although this guide displays GraphQL queries in their object form, each query must be encoded as a string and included in the JSON body as the query property.

The most important thing to remember is that the endpoint to which you send all GraphQL queries (via HTTP POST requests) is the following:

{{yourURL}}/containers/{{yourContainerID}}/data
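One way to assemble a POST request for this endpoint, sketched with Python's standard library. The base URL and container ID below are placeholder values, and build_graphql_request is a hypothetical helper. Authentication is not covered on this page, so the sketch sets no credentials; your deployment will likely require them.

```python
import json
import urllib.request


def build_graphql_request(base_url: str, container_id: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the container's GraphQL endpoint."""
    url = base_url.rstrip("/") + "/containers/" + container_id + "/data"
    body = json.dumps({"query": query, "variables": {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values; substitute your own URL and container ID.
request = build_graphql_request(
    "http://localhost:8090", "1", "{ __schema { types { name } } }"
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it; that step is left out here so the sketch runs without a live server.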

Querying for Nodes based on Metatype

The primary objective of our GraphQL endpoint is to let a user quickly retrieve a list of all nodes of a given Metatype that match a set of filters based on that Metatype’s properties, or keys.

The following steps will demonstrate how a user might query and receive nodes of the Requirement type.

A user must know the name of the GraphQL type for “Requirement”. This is done most easily by running the introspective query listed earlier in this page and searching for a type by, or close to, that name. In this case the name of the GraphQL type should be the exact same – “Requirement”. Note: Metatype names that consist of more than one word will have their spaces replaced by the _ character. So “Maintenance Entry” would become “Maintenance_Entry” in the GraphQL schema. A metatype whose name starts with a number will have _ prepended to its name. So “1 Maintenance” would become “_1_Maintenance”.
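The two naming rules in the note above can be approximated in a few lines of Python. This sketch covers only those two rules; Deep Lynx may apply further conversions that this page does not describe, so introspection remains the authoritative way to find a type's name. The function name is hypothetical.

```python
def to_graphql_type_name(metatype_name: str) -> str:
    """Approximate the Metatype-to-GraphQL name conversion described above."""
    # Rule 1: spaces are replaced by underscores.
    name = metatype_name.replace(" ", "_")
    # Rule 2: a name starting with a number gets an underscore prepended.
    if name[:1].isdigit():
        name = "_" + name
    return name


for original in ("Requirement", "Maintenance Entry", "1 Maintenance"):
    print(original, "->", to_graphql_type_name(original))
```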

Optional - Just like on the main GraphQL schema, you can run an introspective query on the “Requirement” type to see what fields may exist for you to request and query against. The following query illustrates how to accomplish this. The fields returned from this query represent all valid fields upon which you can filter or request, as well as their datatypes.

{ 
  __type(name: "Requirement") { 
    name 
    fields { 
      name 
      type { 
        name 
        kind 
      } 
    } 
  } 
} 

You might see the following response (explanations of the fields are in the comments next to them):

{ 
    "data": { 
        "__type": { 
            "name": "Requirement", # represents which Metatype you queried 
            "fields": [ 
                { 
                    "name": "_record", # this is a special object which 
                    "type": {          # contains metadata about the object  
                        "name": "recordInfo", 
                        "kind": "OBJECT" 
                    } 
                }, 
                { 
                    "name": "type", # name of the field itself
                    "type": { # datatype of the field
                        "name": "String",  
                        "kind": "SCALAR" 
                    } 
                }, 
                { 
                    "name": "active", 
                    "type": { 
                        "name": "Boolean", 
                        "kind": "SCALAR" 
                    } 
                }, 
                { 
                    "name": "id", 
                    "type": { 
                        "name": "Float", 
                        "kind": "SCALAR" 
                    } 
                } 
            ] 
        } 
    } 
} 
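Once a response like the one above has been parsed from JSON, a small Python helper can turn it into a lookup of field names to datatypes. The sample dictionary below copies the response shown above without the comments; the helper name field_types is hypothetical.

```python
from typing import Dict, Tuple


def field_types(introspection_response: dict) -> Dict[str, Tuple[str, str]]:
    """Map each field name to a (type name, kind) pair."""
    fields = introspection_response["data"]["__type"]["fields"]
    return {f["name"]: (f["type"]["name"], f["type"]["kind"]) for f in fields}


sample = {
    "data": {
        "__type": {
            "name": "Requirement",
            "fields": [
                {"name": "_record", "type": {"name": "recordInfo", "kind": "OBJECT"}},
                {"name": "type", "type": {"name": "String", "kind": "SCALAR"}},
                {"name": "active", "type": {"name": "Boolean", "kind": "SCALAR"}},
                {"name": "id", "type": {"name": "Float", "kind": "SCALAR"}},
            ],
        }
    }
}

types_by_field = field_types(sample)
# Fields of kind OBJECT are the ones worth a follow-up introspection query.
object_fields = [name for name, (_, kind) in types_by_field.items() if kind == "OBJECT"]
print(types_by_field)
print(object_fields)
```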

We can further interrogate the _record objects (and any other fields of the kind OBJECT) using another introspective query based on the type name of these objects:

{ 
    __type(
        name: "recordInfo"  #name field of "_record" type above
    ){ 
        fields{
            name
            type{
                name
            }
        }
    }
} 

This will allow us to see which sub-fields may be queried in these OBJECT-typed fields.

Valid Metatype Field Names

The Metatype field names generally map directly to your Metatype’s properties/keys. However, the names are taken from the Metatype Key’s "property name" field, not from the name of the Metatype Key itself. For example, the Requirement Metatype has a key named “Active” whose property name is “active”, so “active” is what you would use to request that field in the return object.

Once a user knows the Metatype they’re querying for and the possible fields, they can then craft the query. Remember that GraphQL only returns the data you explicitly request. This means that if you leave a field off your request, the return value will also omit this field.

The query below will request all possible fields for the “Requirement” Metatype – comments next to the query help explain each part.

{ 
  Requirement { # start the query with the GraphQL name of your metatype
    active 
    type   # these fields map directly to the Metatype’s properties and 
    id     # match those returned as part of the introspective query 
    name
    _record { # this is a special object which contains metadata about the
      id    # node itself, such as its Deep Lynx id, original id, etc 
      data_source_id 
      original_id 
      import_id 
      metatype_id 
      metatype_name 
      created_at 
      created_by 
      modified_at 
      modified_by 
      metadata 
      page    # both page and limit are currently under construction 
      limit   # and do not affect the data 
    } 
  } 
} 

Filtering Results

There are many cases where, instead of simply listing all the instances of a given metatype, the user will want to filter their results to meet certain criteria. Filters are specified per field: pass the field name as an argument, with the value to match given as a string, as follows:

{
    Requirement(name: "M21 - Total Cesium Loading"){
        id
        name
        type
        basis
    }
}

This query would return all requirements with the specified name from DeepLynx, as well as their id, type, and basis.

Using Operators

The string passed as the argument actually consists of two parts: a search operator and the data to search on. As the query above shows, the search operator defaults to an implicit "eq". The same query could be rewritten as

{
    Requirement(name: "eq M21 - Total Cesium Loading"){
        ... <fields>
    }
}

and would return identical results. The search operators are as follows:

| Operator | Description | Returns | Example |
| --- | --- | --- | --- |
| eq | Equals or equal to. This is the implicit behavior. | All nodes with specified fields matching the expression | (name: "eq M21 - Total Cesium Loading") |
| neq | Non-equal or not equal to. | All nodes except those with specified fields matching the expression | (name: "neq M21 - Total Cesium Loading") |
| like | Matches results against a passed-in pattern. Uses wildcard % for any number of characters and _ to match 1 character. Wildcard behavior mimics that of Postgres. | All nodes whose specified fields match the given pattern | (name: "like %M21%") searches for "M21" anywhere in the name |
| in | Matches one result from an array of options. | All nodes whose specified fields match one option in the array | (id: "in 179,306") |
| >, < | Check if a numerical field is greater than or less than a certain value. Please note that >= and <= are not currently supported. | All nodes whose specified fields are greater than or less than the given value | (id: "> 200") |
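The operator strings listed above follow one pattern: the operator, a space, then the value. The Python sketch below builds such strings and rejects operators that are not in the list. The helper name filter_value is hypothetical, and joining list values with commas for in follows the single example given above.

```python
SUPPORTED_OPERATORS = ("eq", "neq", "like", "in", ">", "<")


def filter_value(operator: str, value) -> str:
    """Build the string passed as a filter argument, e.g. 'like %M21%'."""
    if operator not in SUPPORTED_OPERATORS:
        # >= and <= are listed as unsupported, so they are rejected here too.
        raise ValueError("unsupported operator: " + operator)
    if operator == "in" and isinstance(value, (list, tuple)):
        # Options are comma-separated with no spaces, as in "in 179,306".
        value = ",".join(str(v) for v in value)
    return operator + " " + str(value)


print(filter_value("like", "%M21%"))
print(filter_value("in", [179, 306]))
print(filter_value(">", 200))
```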

Filtering on Multiple Fields

There will likely be situations in which you want to filter on multiple fields within a metatype. This is also possible through GraphQL. To filter on multiple fields, simply separate the fields with whitespace like so:

{
    Requirement(
        id: "> 200"
        name: "like %M21%"
    ){
        id
        name
        type
        basis
    }
}

This query would return all requirements whose id matches the condition "> 200" and whose name contains the string "M21". This behavior mimics that of an AND conjunction. Support for OR is not currently available.
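A query with several filters can also be generated programmatically. The Python sketch below renders a dictionary of filters as whitespace-separated arguments. The function name build_query is hypothetical, and it assumes JSON string escaping is acceptable for the quoted filter values, which holds for plain text like the examples on this page.

```python
import json
from typing import Dict, List


def build_query(metatype: str, filters: Dict[str, str], fields: List[str]) -> str:
    """Render a single-Metatype query whose filters are combined with AND."""
    # Each filter becomes `name: "operator value"`; json.dumps adds the quotes.
    args = " ".join(name + ": " + json.dumps(value) for name, value in filters.items())
    arg_part = "(" + args + ")" if args else ""
    return "{ " + metatype + arg_part + " { " + " ".join(fields) + " } }"


query = build_query(
    "Requirement",
    {"id": "> 200", "name": "like %M21%"},
    ["id", "name", "type", "basis"],
)
print(query)
```

The resulting string is what goes into the query property of the POST body.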

Filtering on Record Information

As was covered in a previous section, each metatype node will contain metadata about the node record as stored in the _record field:

{ 
  Requirement { 
    _record { # this is a special object which contains metadata about the
      id    # node itself, such as its Deep Lynx id, original id, etc 
      data_source_id 
      original_id 
      import_id 
      metatype_id 
      metatype_name 
      created_at 
      created_by 
      modified_at 
      modified_by 
      metadata 
      page    # both page and limit are currently under construction 
      limit   # and do not affect the data 
    } 
  } 
} 

This data will be returned according to what was requested, just like other fields in GraphQL syntax. A few sub-fields within this record information can be used as filters: data_source_id, original_id, import_id, page, and limit. Currently, page and limit are under construction, so filtering on them would not be useful. In future updates, these attributes will be used to implement pagination of results.

To query on a sub-property of _record, the syntax is as follows:

{
    Requirement(
        _record: {data_source_id: "2"}
    ){
        _record{
            data_source_id
            original_id
            import_id
        }
        name
        type
    }
}

The main difference here is that, because _record is an Object, its sub-properties are queried using an additional set of curly braces ({}) to specify the sub-field. You can also query for multiple sub-fields just like with a normal query:

{
    Requirement(
        _record: {data_source_id: "2" import_id: "1"}
    ){
        ... <fields>
    }
}
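Because _record filters are nested one level deeper than ordinary field filters, rendering them from a dictionary needs a recursive step. The Python sketch below shows one way to do that. The function name render_arguments is hypothetical, and quoting values with json.dumps is an assumption that fits the plain string values used on this page.

```python
import json


def render_arguments(filters: dict) -> str:
    """Render filters as GraphQL arguments, wrapping nested dicts in braces."""
    parts = []
    for name, value in filters.items():
        if isinstance(value, dict):
            # Object-typed fields such as _record get their own set of braces.
            parts.append(name + ": {" + render_arguments(value) + "}")
        else:
            parts.append(name + ": " + json.dumps(value))
    return " ".join(parts)


args = render_arguments({"_record": {"data_source_id": "2", "import_id": "1"}})
query = "{ Requirement(" + args + ") { name type } }"
print(query)
```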

Future Implementation

In a future update, we anticipate the ability to query a node based on its relationship to other metatypes. This feature is still under construction. Expected future behavior can be seen in the diagram below:

[Image: diagram of the expected future query behavior]
