dbt documentation: A comprehensive guide to generating and hosting dbt docs

Praneeth Kumar Reddy Ballarapu
4 min read · Mar 30, 2024


What is dbt?

dbt is a transformation workflow that helps you get more work done while producing higher-quality results. You can use dbt to modularize and centralize your analytics code, while also providing your data team with guardrails typically found in software engineering workflows.

The official dbt documentation has a fuller introduction.

dbt documentation

Good documentation for your dbt models will help downstream consumers discover and understand the datasets which you curate for them.

dbt provides a way to generate documentation for your dbt project and render it as a website.

Keeping the dbt documentation up to date and hosting it somewhere for the users or developers is a must.

In this guide, we will explore how to generate the dbt docs for a project. Sometimes a repository holds more than one dbt project, and we want to keep the documentation for all of them in one place instead of maintaining a separate site for each project.

There is no direct way to generate the docs for multiple projects in one run. To overcome this problem, I created a GitHub Action that generates the docs for all the projects and prepares the files for hosting on GitHub Pages.
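To make the multi-project problem concrete, here is a minimal Python sketch of the per-project loop. It is not the action's actual implementation, which this article does not show. It assumes each dbt project is an immediate subdirectory of a `projects` directory and contains a `dbt_project.yml`; the function names `find_dbt_projects`, `docs_command`, and `generate_all` are hypothetical and chosen only for illustration.

```python
import subprocess
from pathlib import Path


def find_dbt_projects(projects_dir: str) -> list[Path]:
    """Return immediate subdirectories of projects_dir that hold a dbt_project.yml.

    Assumption: projects are not nested deeper than one level.
    """
    return sorted(p.parent for p in Path(projects_dir).glob("*/dbt_project.yml"))


def docs_command(project: Path) -> list[str]:
    """Build the dbt CLI call that generates docs for a single project."""
    return ["dbt", "docs", "generate", "--project-dir", str(project)]


def generate_all(projects_dir: str) -> None:
    """Run `dbt docs generate` once per discovered project, stopping on the first failure."""
    for project in find_dbt_projects(projects_dir):
        subprocess.run(docs_command(project), check=True)
```

Calling `generate_all("projects")` from the repository root would run one `dbt docs generate` per project, provided dbt is installed and a profile is reachable (for example through `DBT_PROFILES_DIR`).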

Generating and deploying documentation

In the GitHub repository of your dbt project, go to Settings and create an environment to store the environment variables.

Environments configuration for GitHub repository
Defining environment variables

Create a new GitHub workflow to generate the dbt docs and deploy them to GitHub Pages. We will trigger the workflow on every push to the main branch.

We will use the generate-dbt-docs GitHub Action in the workflow to generate the docs; check its README for all the inputs it supports. For this demo, we will use two inputs:

projects_dir - directory where dbt projects are present in the repository

docs_dir - directory where the generated dbt docs will be written. The next step uploads this same directory as the Pages artifact.
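As a rough illustration of what has to end up in `docs_dir`, the sketch below copies each project's generated docs into its own subfolder and writes a small landing page that links to them. This is an assumption about one workable layout, not a description of what the generate-dbt-docs action does internally. It relies on `dbt docs generate` writing `index.html`, `manifest.json`, and `catalog.json` into each project's `target/` directory; the name `collect_docs` is hypothetical.

```python
import shutil
from html import escape
from pathlib import Path

# Files that `dbt docs generate` writes to target/ and the docs site needs.
DOC_FILES = ("index.html", "manifest.json", "catalog.json")


def collect_docs(projects_dir: str, docs_dir: str) -> list[str]:
    """Copy each project's docs to docs_dir/<project>/ and write a landing page.

    Returns the names of the projects that were copied.
    """
    out = Path(docs_dir)
    out.mkdir(parents=True, exist_ok=True)
    names = []
    for target in sorted(Path(projects_dir).glob("*/target")):
        if not all((target / f).is_file() for f in DOC_FILES):
            continue  # docs have not been generated for this project
        name = target.parent.name
        dest = out / name
        dest.mkdir(exist_ok=True)
        for f in DOC_FILES:
            shutil.copy2(target / f, dest / f)
        names.append(name)
    # Landing page: one link per project, pointing at its own index.html.
    links = "\n".join(
        f'<li><a href="{escape(n)}/index.html">{escape(n)}</a></li>' for n in names
    )
    (out / "index.html").write_text(
        f"<h1>dbt projects</h1>\n<ul>\n{links}\n</ul>\n", encoding="utf-8"
    )
    return names
```

With a layout like this, uploading `docs_dir` as the Pages artifact would serve the landing page at the site root and each project's docs under its own path.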

name: "dbt-docs-publish"

on:
  push:
    branches:
      - main
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest
    environment: prod # environment name defined in the GitHub repository settings

    # use the variables defined in the environment to set the env vars for the runner
    env:
      DBT_TARGET: ${{ vars.DBT_TARGET }}
      DBT_PROFILES_DIR: ${{ github.workspace }}/projects
      CLICKHOUSE_USER: ${{ vars.CLICKHOUSE_USER }}
      CLICKHOUSE_PASSWORD: ${{ vars.CLICKHOUSE_PASSWORD }}
      CLICKHOUSE_DATABASE: ${{ vars.CLICKHOUSE_DATABASE }}

    steps:
      - name: "Step 01 - Checkout current branch"
        id: step01
        uses: actions/checkout@v3

      - name: "Step 02 - Install dbt"
        id: step02
        run: |
          pip3 install dbt-core dbt-clickhouse
          dbt --version

      - name: "Step 03 - Setup ClickHouse"
        id: step03
        uses: praneeth527/clickhouse-server-action@v1.0.1
        with:
          tag: '23.3.18.15-alpine'

      # https://github.com/marketplace/actions/generate-dbt-docs
      - name: "Step 04 - Generate dbt docs"
        id: step04
        uses: praneeth527/generate-dbt-docs@v1
        with:
          projects_dir: projects
          docs_dir: ${{ github.workspace }}/docs

      - name: "Step 05 - Upload pages to artifact"
        id: step05
        uses: actions/upload-pages-artifact@v3
        with:
          path: ${{ github.workspace }}/docs

  # https://github.com/marketplace/actions/deploy-github-pages-site
  deploy-to-github-pages:
    needs: build

    permissions:
      pages: write
      id-token: write

    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}

    runs-on: ubuntu-latest
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4

The above workflow has two jobs:

  1. build: installs the required dbt packages (dbt-core and dbt-clickhouse) and the other external dependencies (here, a ClickHouse server), then generates the dbt docs and uploads them as a GitHub Pages artifact
  2. deploy-to-github-pages: deploys the artifact uploaded by the previous job to GitHub Pages

Workflow actions page

Demo

Repository: https://github.com/praneeth527/dbt-docs-demo

Docs link: https://praneeth527.github.io/dbt-docs-demo/

Projects home page
demo_1 project documentation
demo_2 project documentation

Summary

Hosting dbt documentation is essential for collaboration and knowledge sharing across teams. We have explored how to use the generate-dbt-docs action to generate docs for multiple projects and host them on GitHub Pages.

Hope this article was helpful. Corrections/suggestions are welcome. Thanks for reading.
