Herrick Fang

Dataloaders in Flask and Graphene

  • Created: 04 Apr 2021
  • Modified: 04 Jun 2021

GraphQL Dataloaders in Flask

Earlier in the year, we saw some really slow queries (2-5+ seconds) executing on our most frequently visited pages at work. The user experience was becoming unbearable, so we decided to focus our efforts on potential performance improvements.

At work, we run a Flask app connected to a Postgres database deployed on Render. We were using GraphQL with graphene and hit the common N+1 query problem. After looking up solutions online, we concluded that dataloaders would solve the issue for our use case.
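
To make the N+1 pattern concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than our Postgres setup. The `unit` and `sample` tables, the `run` helper, and both function names are illustrative assumptions, not our actual schema or code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE unit (id INTEGER PRIMARY KEY);
    CREATE TABLE sample (id INTEGER PRIMARY KEY, unit_id INTEGER);
    INSERT INTO unit (id) VALUES (1), (2), (3);
    INSERT INTO sample (unit_id) VALUES (1), (1), (2);
    """
)

queries = []  # every SQL statement sent through run(), for counting round trips


def run(sql, params=()):
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def samples_n_plus_one():
    # 1 query for the parents, then 1 more query per parent.
    unit_ids = [row[0] for row in run("SELECT id FROM unit ORDER BY id")]
    return {
        uid: [r[0] for r in run("SELECT id FROM sample WHERE unit_id = ? ORDER BY id", (uid,))]
        for uid in unit_ids
    }


def samples_batched():
    # 1 query for the parents, then 1 bulk query for all children.
    unit_ids = [row[0] for row in run("SELECT id FROM unit ORDER BY id")]
    placeholders = ", ".join("?" for _ in unit_ids)
    rows = run(
        f"SELECT unit_id, id FROM sample WHERE unit_id IN ({placeholders}) ORDER BY id",
        unit_ids,
    )
    grouped = {uid: [] for uid in unit_ids}
    for unit_id, sample_id in rows:
        grouped[unit_id].append(sample_id)
    return grouped
```

With three units, the first function sends four statements and the second sends two. The first count grows with the number of parents, while the second stays constant.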

Dataloaders collect the individual lookups that resolvers make while a GraphQL document executes, batch them into a single bulk call, and cache the result for each key so that repeated lookups do not trigger another query. This would be extremely useful for our queries, where 30+ child relationships were being resolved sequentially.
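
The batching and caching idea can be sketched with a toy loader built on asyncio from the standard library. This is not the promise-based DataLoader we ended up using; `TinyLoader`, `fetch_samples`, and `batch_calls` are hypothetical names that exist only for this illustration.

```python
import asyncio


class TinyLoader:
    """Toy loader: queues keys requested in the same event-loop tick, then batches them."""

    def __init__(self, batch_fn):
        self.batch_fn = batch_fn  # takes a list of keys, returns values in the same order
        self.cache = {}           # key -> Future, so a repeated key is fetched only once
        self.pending = []         # keys waiting for the next batch

    def load(self, key):
        if key in self.cache:
            return self.cache[key]
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self.cache[key] = future
        if not self.pending:
            # First key of this tick: schedule a single dispatch for everything queued.
            loop.call_soon(self._dispatch)
        self.pending.append(key)
        return future

    def _dispatch(self):
        keys, self.pending = self.pending, []
        for key, value in zip(keys, self.batch_fn(keys)):
            self.cache[key].set_result(value)


batch_calls = []  # records the keys passed to each bulk fetch


def fetch_samples(unit_ids):
    # Stand-in for one bulk database query.
    batch_calls.append(list(unit_ids))
    return [f"samples-for-unit-{uid}" for uid in unit_ids]


async def main():
    loader = TinyLoader(fetch_samples)
    # Three loads, one of them repeated, all requested before the batch runs.
    return await asyncio.gather(loader.load(1), loader.load(2), loader.load(1))


results = asyncio.run(main())
```

Here the three `load` calls result in one call to `fetch_samples` with the keys `[1, 2]`: the two distinct keys are batched, and the repeated key is served from the cache.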

However, we also had to keep in mind the limitations of the library we were using, Graphene-SQLAlchemy. While it provides a simple integration that makes it easy to map columns on an ORM object to resolvers, it didn’t support some custom modifications of SQLAlchemy objects. Particularly, for fields that reached other tables via foreign keys, it only supported select-in load and lazy load relationships. This is pretty useful for base queries that do not have many filter conditions. However, we added a lot of extra filter conditions so that we could have more custom queries. To make this possible, many of our relationships were dynamic relationships, and we also used hybrid properties to call the dynamic relationship with the added filter conditions.

Solution

We found a way to implement dataloaders at the request level after looking into the documentation. Our general framework ended up similar to the response here, structured as follows:

# dataloaders.py
from collections import defaultdict

from promise import Promise
from promise.dataloader import DataLoader
from some.internal.library import UnitQuery

class UnitSamplesLoader(DataLoader):
    def batch_load_fn(self, ids):
        samples = UnitQuery.get_samples(unit_ids=ids) # bulk query we wrote
        # Group the samples by the unit they belong to.
        samples_dict = defaultdict(list)
        for sample in samples:
            samples_dict[sample.unit_id].append(sample)
        # Return one list per requested id, in the same order as ids.
        return Promise.resolve([samples_dict.get(unit_id, []) for unit_id in ids])

# context.py
from typing import Dict
from dataloaders import UnitSamplesLoader
from promise.dataloader import DataLoader

def construct_dataloaders() -> Dict[str, DataLoader]:
    # Here, we looped through our registry to figure out which dataloaders to instantiate.
    dataloaders = {"unit__samples__loader": UnitSamplesLoader()}  # type: Dict[str, DataLoader]
    return dataloaders

def get_graphql_context() -> Dict[str, Dict[str, DataLoader]]:
    return {"dataloaders": construct_dataloaders()}

# app.py
from context import get_graphql_context
from flask_graphql import GraphQLView

graphql_view = GraphQLView.as_view(
    "graphql",
    schema=schema,
    middleware=middleware,
    get_context=get_graphql_context,
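)

# schema.py
import graphene

class Unit(graphene.ObjectType):
    def resolve_samples_loader(self, info, **kwargs):
        # The per-request loaders are placed on the context by get_graphql_context().
        dataloaders = info.context["dataloaders"]
        # load() returns a promise that resolves once the batch has run.
        return dataloaders["unit__samples__loader"].load(self.id)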

That gave us the general framework and made it possible to access dataloaders from the GraphQL context in Flask.
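
One detail of the batch function above is easy to miss: a DataLoader expects the batch result to have the same length and order as the requested keys, including an entry for keys that matched nothing. The grouping step can be sketched on its own with plain dictionaries; `group_in_key_order` and the sample rows are hypothetical and stand in for our ORM objects.

```python
from collections import defaultdict


def group_in_key_order(keys, rows, key_of):
    """Return one list of rows per key, aligned with the order of keys."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[key_of(row)].append(row)
    # Keys without any rows still get an (empty) slot so positions line up.
    return [grouped.get(key, []) for key in keys]


# The bulk query may return rows in any order and skip units without samples.
rows = [
    {"id": 10, "unit_id": 1},
    {"id": 11, "unit_id": 3},
    {"id": 12, "unit_id": 1},
]
aligned = group_in_key_order([3, 1, 2], rows, key_of=lambda row: row["unit_id"])
```

If the list were built by iterating over the grouped dictionary instead of over the keys, unit 2 would be dropped and the remaining results would be paired with the wrong units.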

We also needed to handle the other hiccup: hybrid properties. We decided to convert most object relationships back to the default lazy type and move the dynamic queries into another abstraction, query objects, which take the specific ORM object so that its id can be used to return the dynamically filtered objects. The id then serves as the cache key in place of the relationship. Additionally, we decided to override our hybrid property resolvers with the dataloader implementation.

Overall, with these changes in place, we saw massive performance improvements! Using Honeycomb and Scout, we measured a 70% decrease on some of our queries and saw sub-500 ms p50 and p95 response times, which was a huge relief.