Generating SDKs with Speakeasy

This guide is part of SDK Generation & Changelog Automation and covers Speakeasy — driven by the speakeasy CLI — for turning an OpenAPI spec into production-grade SDKs and keeping them current automatically. The scope here is the .speakeasy/ configuration directory (workflow.yaml and gen.yaml), spec validation with speakeasy lint, per-target generation, the x-speakeasy-* extensions that shape output, and the managed GitHub Action that opens SDK pull requests and cuts releases. For a fully declarative multi-registry alternative see Fern; for template-level customization of a single language see OpenAPI Generator. Building a docs site is out of scope.

Speakeasy’s model splits the world into sources (your specs, optionally merged and overlaid) and targets (the SDKs you emit from them). You describe that mapping once, and the CLI — locally or in the managed action — handles linting, generation, versioning, and publishing.

Two properties make this worth adopting over a hand-rolled generation script. First, the same configuration runs identically on your laptop and in CI: speakeasy run locally and the GitHub Action call the same engine, so “works on my machine” is not a failure mode. Second, versioning is automatic and change-aware — Speakeasy compares the new spec against the last generated state and bumps the SDK’s SemVer by the magnitude of the change, so a breaking removal forces a major bump while an added optional field is a minor one. That removes the most error-prone manual step in SDK maintenance and is why the managed action can open a correctly versioned release pull request unattended.

Prerequisites & Environment Setup

The CLI is distributed as a single binary; SDK toolchains are only needed when you want to compile the generated output locally. Pin the CLI version so generation is reproducible.

# macOS / Linux
brew install speakeasy-api/homebrew-tap/speakeasy

# or the version-pinned install script (CI-friendly)
curl -fsSL https://go.speakeasy.com/cli-install.sh | sh -s -- -v 1.456.1

speakeasy --version        # 1.456.1

Authenticate once. The CLI stores an API key tied to your Speakeasy workspace; in CI you pass it as an environment variable instead:

speakeasy auth login       # interactive, for local use
export SPEAKEASY_API_KEY="sk_xxx"   # CI: set as a repository secret

Scaffold the configuration. speakeasy quickstart walks you through choosing a source spec and a first target, writing the .speakeasy/ directory:

speakeasy quickstart

.speakeasy/
  workflow.yaml     # sources -> targets mapping
  gen.yaml          # generation settings; with several targets, each output dir has its own .speakeasy/gen.yaml

The split between these two files is deliberate and worth internalizing before editing either. workflow.yaml answers what flows where — which specs are sources, what overlays apply, and which targets they feed. gen.yaml answers how each language is shaped — class names, package identifiers, and the current version. A target’s gen.yaml lives inside that target’s output directory, not in the repository root, because it travels with the generated SDK. Keeping the two concerns separate means you can add a new language target without touching the existing targets’ settings.

A note on the API key: it ties generation to your Speakeasy workspace, where features like managed registries and lint baselines live. Local runs use the cached login from speakeasy auth login; CI runs read SPEAKEASY_API_KEY from the environment. Both validate and run need it, so set it as a repository secret before wiring up any workflow.

Core Configuration

workflow.yaml declares sources (input specs, optionally merged and overlaid) and targets (SDKs to emit). A source can list several inputs, in which case Speakeasy merges them into one logical API before generating — handy when a gateway stitches multiple services into a single public surface. The optional registry block uploads the resolved, bundled source to Speakeasy’s spec registry so other repos and the docs product can consume a versioned snapshot rather than chasing a moving main. Each significant key is annotated:

# .speakeasy/workflow.yaml
workflowVersion: 1.0.0
speakeasyVersion: 1.456.1          # pin so the action and local runs match
sources:
  acme-api:
    inputs:
      - location: ./openapi.yaml    # one or more specs; multiple are merged
    overlays:
      - location: ./overlay.yaml    # OpenAPI Overlay to patch the spec non-destructively
    registry:
      location: registry.speakeasyapi.dev/acme/acme/acme-api  # optional bundle store
targets:
  typescript:
    target: typescript              # generator target language
    source: acme-api                # which source feeds this target
    output: ./sdks/typescript       # where the SDK is written in the repo
    publish:
      npm:
        token: $NPM_TOKEN           # enables publish on the action's release step
  python:
    target: python
    source: acme-api
    output: ./sdks/python
    publish:
      pypi:
        token: $PYPI_TOKEN
  go:
    target: go
    source: acme-api
    output: ./sdks/go
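
A source that merges several specs, as described above, might look like the following sketch. The two service paths are hypothetical placeholders; only the inputs list differs from the single-spec source, and the targets keep pointing at acme-api unchanged.

```yaml
# .speakeasy/workflow.yaml (sources section only)
sources:
  acme-api:
    inputs:
      - location: ./services/users/openapi.yaml     # hypothetical path
      - location: ./services/billing/openapi.yaml   # hypothetical path
    overlays:
      - location: ./overlay.yaml                    # patches the merged document
```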

gen.yaml lives inside each target’s output directory and controls language-specific generation. Settings you change most often:

# sdks/typescript/.speakeasy/gen.yaml
configVersion: 2.0.0
generation:
  sdkClassName: AcmeSDK             # the top-level client class
  maintainOpenAPIOrder: true        # keep method order stable across regenerations
  usageSnippets:
    optionalPropertyRendering: withExample
typescript:
  version: 0.5.0                     # the SDK package SemVer (bumped automatically)
  packageName: "@acme/sdk"          # published npm name
  author: Acme
  responseFormat: flat              # return the body directly, not a wrapper
  enumFormat: union                 # TS string-literal unions instead of enums

The CLI manages the version field: each generation bumps it according to the magnitude of the spec change (a removed field is a major bump, an added optional field a minor one), which keeps SemVer honest without manual edits. You can still pin or override a version when you need to — set it explicitly in gen.yaml for a one-off coordinated release — but for steady-state maintenance, letting the CLI own it is the point.

Several gen.yaml keys change the ergonomics of the generated SDK enough to be worth setting deliberately. responseFormat: flat returns the response body directly instead of wrapping it in an envelope object, which is what most consumers expect. enumFormat: union emits TypeScript string-literal unions rather than enum declarations, avoiding the well-known footguns of TypeScript numeric enums. maintainOpenAPIOrder: true keeps methods in spec order across regenerations, which produces small, reviewable diffs in the SDK pull requests instead of noisy reorderings. The sdkClassName is the first thing a consumer types, so choose it to read well: new AcmeSDK(...) beats a generated default.

The same source can feed multiple targets, and overlays applied at the source level reach every target at once. That is the mechanism behind keeping TypeScript, Python, and Go SDKs in step: every target is generated from one resolved source, so their API surfaces cannot drift apart between regenerations.
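
Each target still carries its own gen.yaml, so the Python SDK is shaped independently of the TypeScript one. A minimal sketch for the Python target, assuming the package is published as acme (the name used in the verification commands later in this guide); the version shown is illustrative:

```yaml
# sdks/python/.speakeasy/gen.yaml
configVersion: 2.0.0
generation:
  sdkClassName: AcmeSDK      # same client class name as the TypeScript target
python:
  version: 0.5.0             # illustrative; bumped automatically like other targets
  packageName: acme          # assumed PyPI name
  author: Acme
```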

Integration Pattern

Run speakeasy lint as a fast PR gate, then let the managed GitHub Action handle regeneration and the release pull request. The workflow below validates on every PR that touches the spec or the .speakeasy/ configuration, and regenerates on a weekday schedule or on a manual dispatch.

# .github/workflows/speakeasy-sdks.yml
name: Speakeasy SDKs
on:
  pull_request:
    paths: ["openapi.yaml", ".speakeasy/**"]
  schedule:
    - cron: "0 6 * * 1-5"     # weekday regeneration check
  workflow_dispatch: {}       # allow manual runs

permissions:
  contents: write             # the action commits/opens the SDK PR
  pull-requests: write

jobs:
  lint:
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request'
    steps:
      - uses: actions/checkout@v4
      - uses: speakeasy-api/sdk-generation-action@v15
        with:
          action: validate     # runs speakeasy lint against the configured source
        env:
          SPEAKEASY_API_KEY: ${{ secrets.SPEAKEASY_API_KEY }}

  generate:
    runs-on: ubuntu-latest
    if: github.event_name != 'pull_request'
    steps:
      - uses: actions/checkout@v4
      - uses: speakeasy-api/sdk-generation-action@v15
        with:
          action: run          # regenerate every target, bump versions
          mode: pr             # open a pull request instead of pushing to main
        env:
          SPEAKEASY_API_KEY: ${{ secrets.SPEAKEASY_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          NPM_TOKEN: ${{ secrets.NPM_TOKEN }}
          PYPI_TOKEN: ${{ secrets.PYPI_TOKEN }}

With mode: pr, the action regenerates the SDKs, commits the diff to a branch, and opens a pull request titled with the new versions — your team reviews the SDK change like any other code review. This review-first posture is the safest default: a human sees the exact surface change, the version bump, and any new errors before anything reaches a registry. The alternative, mode: direct, commits regenerated SDKs straight to main and is appropriate only when you trust the spec gate completely and want zero-touch updates.

The permissions block is mandatory and easy to forget: the action needs contents: write to push the branch and pull-requests: write to open the PR. Without them the job fails late with a permissions error rather than a configuration error, which sends people looking in the wrong place. Scoping the pull_request trigger to the spec and .speakeasy/** paths keeps unrelated commits from spinning up the lint job, and the schedule trigger catches drift even when no one touched the spec, which is useful when a generator upgrade, not a spec change, is what produces a new SDK.

Merging that PR can trigger publishing when publish tokens are configured. For the full release flow including the companion publish workflow, see Automating SDK releases with the Speakeasy GitHub Action.

Advanced Options

Shaping output with x-speakeasy-* extensions. Like Fern’s extensions, these live in the spec and apply to every target. Rename methods, group them into namespaces, and mark pagination:

# openapi.yaml
paths:
  /users/{id}:
    get:
      operationId: getUser
      x-speakeasy-name-override: get          # client.users.get(...)
      x-speakeasy-group: users                # nest under a `users` namespace
  /users:
    get:
      operationId: listUsers
      x-speakeasy-pagination:                 # emit real pagination iterators
        type: offsetLimit
        inputs:
          - name: page
            in: parameters
            type: page
        outputs:
          results: $.data

Retries and the OpenAPI Overlay. Enable global retries once at the document level so every method gets backoff:

# openapi.yaml (root)
x-speakeasy-retries:
  strategy: backoff
  backoff:
    initialInterval: 500       # ms
    maxInterval: 60000
    maxElapsedTime: 3600000
    exponent: 1.5
  statusCodes: ["5XX", "429"]
  retryConnectionErrors: true

When you cannot edit the upstream spec, put x-speakeasy-* additions in an overlay.yaml referenced from workflow.yaml — the OpenAPI Overlay is applied at build time and leaves the source spec untouched. The Overlay format is a standard, tool-agnostic way to express patches with JSONPath targets, so the same overlay that adds Speakeasy extensions can also fix descriptions or examples without forking the upstream document. Because it is a separate, reviewable file, your customizations survive every upstream spec update cleanly.
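
As a rough sketch, an overlay that adds the naming extensions from the earlier example without touching openapi.yaml could look like this. The title, version, and description text are placeholders; the structure (overlay, info, and actions with target and update) follows the OpenAPI Overlay 1.0.0 format.

```yaml
# overlay.yaml
overlay: 1.0.0
info:
  title: Acme SDK customizations      # placeholder
  version: 0.0.1                      # placeholder
actions:
  # JSONPath target selects the operation; update is merged into it
  - target: '$["paths"]["/users/{id}"]["get"]'
    update:
      x-speakeasy-name-override: get
      x-speakeasy-group: users
  # The same file can also patch plain documentation fields
  - target: '$["info"]'
    update:
      description: Public API for Acme.   # placeholder text
```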

Method-level retries and timeouts. The document-level x-speakeasy-retries above sets a default, but individual operations can override it — mark a long-running export endpoint with a longer maxElapsedTime while keeping snappy defaults elsewhere. Pair retries with idempotency where the API supports it so retried mutations are safe; the SDK will replay the request with the same idempotency key rather than risk a duplicate write.
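
A per-operation override might be sketched as follows, assuming a hypothetical /exports endpoint; the intervals are illustrative rather than recommended values. The block has the same shape as the document-level one and applies only to this operation.

```yaml
# openapi.yaml
paths:
  /exports:                          # hypothetical long-running endpoint
    post:
      operationId: createExport      # hypothetical
      x-speakeasy-retries:
        strategy: backoff
        backoff:
          initialInterval: 2000      # ms; starts slower than the global 500
          maxInterval: 120000
          maxElapsedTime: 7200000    # two hours, versus one hour globally
          exponent: 1.5
        statusCodes: ["5XX", "429"]
        retryConnectionErrors: true
```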

Pagination is the extension that most improves day-to-day SDK usability. Marking an endpoint with x-speakeasy-pagination makes the generated method return an iterator that fetches subsequent pages on demand, so a consumer writes for page in client.users.list() instead of manually threading cursor or offset parameters through a loop. The type (offsetLimit, cursor, or url) tells the generator which paging discipline the API follows; getting it right means every language SDK paginates correctly with no per-language code.
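
For an API that pages by cursor rather than by page number, the annotation changes shape slightly. A sketch, assuming a hypothetical /events endpoint that accepts a cursor query parameter and returns the next cursor in a next_cursor field of the response body:

```yaml
# openapi.yaml
paths:
  /events:                                  # hypothetical endpoint
    get:
      operationId: listEvents
      x-speakeasy-group: events
      x-speakeasy-name-override: list       # client.events.list(...)
      x-speakeasy-pagination:
        type: cursor
        inputs:
          - name: cursor                    # request parameter carrying the cursor
            in: parameters
            type: cursor
        outputs:
          nextCursor: $.next_cursor         # JSONPath to the cursor for the next page
```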

Lint rulesets and severities. Tighten or relax validation with a ruleset so CI fails only on what matters:

# .speakeasy/lint.yaml
lintVersion: 1.0.0
defaultRuleset: acme               # apply the custom ruleset below by default
rulesets:
  acme:
    rulesets:
      - speakeasy-recommended      # the curated baseline
    rules:
      operation-operationId:
        severity: error            # every operation must have an operationId
      duplicated-entry-in-enum:
        severity: warn             # report duplicate enum values without failing

Verification & Testing

Verification happens in two layers: lint the spec to catch problems that would produce a broken SDK, then generate and compile to prove the output is sound. Both layers run in CI, but running them locally first turns a slow round-trip through the action into a fast inner loop.

Lint first — this is the gate the managed action also enforces:

speakeasy lint openapi -s openapi.yaml

Expected clean output:

INFO	Linting OpenAPI spec...
✓ OpenAPI spec is valid
0 errors, 0 warnings, 0 hints

Then generate locally and confirm targets are produced:

speakeasy run

│ Source acme-api ......... ✓ linted, validated
│ Target typescript ....... ✓ generated  sdks/typescript  v0.6.0
│ Target python ........... ✓ generated  sdks/python      v0.6.0
│ Target go ............... ✓ generated  sdks/go          v0.6.0

Reading the speakeasy run summary closely pays off: the reported version per target tells you what the next release would publish, and a target shown as unchanged means the spec edit did not alter that language’s surface. If you expected a change and see none, the edit was cosmetic or landed on a node the generator ignores.

Compile the generated packages and run their built-in tests to confirm the spec change is sound:

cd sdks/typescript && npm install && npm run build && npm test
cd ../python       && pip install -e . && python -m pytest

This compile-and-test step is the real gate. A spec change that breaks the generated code fails here, long before it can reach a registry, while the validate job on every pull request catches spec-level problems earlier still. Never patch the generated code to make the build pass; fix the spec or the gen.yaml setting that produced it, because the next regeneration overwrites any manual edit.

After a release, confirm the published versions match the ones the action reported:

npm view @acme/sdk version
pip index versions acme

Troubleshooting

speakeasy lint fails with operation-operationId errors — one or more operations lack an operationId. Speakeasy uses it to name methods, so a missing id blocks generation. Add a unique operationId to every operation; if you must defer, lower the rule to warn in .speakeasy/lint.yaml, but expect autogenerated method names.

SPEAKEASY_API_KEY is not set in CI — the action authenticates against your workspace and cannot run without the key. Add it as a repository secret and reference it in the job’s env: block. Note that validate and run both require it, even though validate only lints.

The action opens no PR even though the spec changed — the regenerated SDK was byte-identical (a comment-only or non-functional spec edit), or mode was left at the default direct-commit. Confirm mode: pr is set, and check the action log for No changes detected; cosmetic spec edits that do not alter the SDK surface will not produce a release.

npm publish returns 403 Forbidden on merge — the NPM_TOKEN lacks publish scope for the package, or the package name in gen.yaml is already owned by another account. Use an npm automation token with publish rights and verify packageName matches a scope you control. The same applies to PYPI_TOKEN for the Python target.

FAQ

What is the difference between workflow.yaml and gen.yaml?

workflow.yaml defines the sources and targets — which spec feeds which SDK and where it goes — while gen.yaml holds per-language generation settings like package names and versioning. The CLI reads both from the .speakeasy/ directory, with one gen.yaml per target output directory.

Do I have to lint my spec before generating an SDK?

It is not strictly required, but speakeasy lint catches the spec problems that produce broken SDKs before generation runs. The managed GitHub Action runs linting as a gate, so failing rules block the SDK pull request from being opened.

How does the Speakeasy GitHub Action ship SDK updates?

The managed action regenerates each target on a schedule or on spec change, bumps the SemVer version, and opens a pull request with the regenerated SDK. Merging the PR can trigger publishing to npm, PyPI, and other registries when publish tokens are configured.

What do x-speakeasy-* extensions control?

They shape generated output without changing API behaviour — renaming methods, grouping operations into namespaces, marking pagination, and enabling retries. They live inline in the OpenAPI spec so every target inherits the same ergonomics in one place.

SDK Generation & Changelog Automation — the parent overview of SDK and changelog tooling
Automating SDK releases with the Speakeasy GitHub Action — the full release-and-publish flow
Fern — declarative multi-registry SDK generation
OpenAPI Generator — template-driven, single-binary alternative