Algolia DocSearch Integration for API Portals

Algolia DocSearch adds fast, typo-tolerant, keyboard-driven search to a developer portal by crawling rendered docs into an Algolia index and rendering a search modal with a drop-in widget. This guide is part of Developer Portal Frameworks & UI Setup and covers the crawler configuration, the @docsearch/react and @docsearch/js widgets, the appId / apiKey / indexName triad, and wiring reindexing into CI so search stays current.

Key objectives:

  • Configure the DocSearch crawler to index the right content and ignore navigation chrome
  • Mount the search widget with the correct credentials
  • Keep the Search-Only key in the browser and the Write key in CI secrets
  • Trigger a reindex automatically after each docs deploy

For a step-by-step install on a single portal, see Adding Algolia DocSearch to a docs portal.

Figure: DocSearch crawl and query flow. The crawler reads the deployed portal's rendered HTML and writes records to an Algolia index (indexName) using the write key; the browser widget (@docsearch/react) queries the same index using the search-only key.

Prerequisites & Environment Setup

DocSearch has two halves: a crawler that reads your deployed docs and writes records to an Algolia index, and a widget that queries that index from the browser. You need an Algolia application and two of its keys.

Requirements:

  • An Algolia application with an appId. Create one in the Algolia dashboard, or apply to the hosted DocSearch program if your docs are public and open-source.
  • A Search-Only API key — safe to ship in client-side JavaScript. It can only run queries.
  • A Write (Admin) API key — used by the crawler to push records. Keep it in CI secrets only; never expose it to the browser.
  • A publicly reachable, fully rendered docs site. The crawler reads server-rendered or pre-rendered HTML. If your portal renders content only after client-side hydration, ensure the crawler can still see the text (static export, or DocSearch’s JS rendering mode).
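One way to check the last requirement is to look at the raw HTML the server returns, before any JavaScript runs. The sketch below is a rough, regex-based smoke test rather than a real HTML parser; the function names and the 200-character threshold are assumptions chosen for illustration, and it assumes your content lives inside an article element.

```javascript
// Rough check: does the raw HTML (no JavaScript executed) already contain
// readable text inside <article>? Regex-based, so treat it as a smoke test only.
function articleTextLength(html) {
  const match = html.match(/<article\b[^>]*>([\s\S]*?)<\/article>/i);
  if (!match) return 0;
  const text = match[1]
    .replace(/<(script|style)\b[\s\S]*?<\/\1>/gi, ' ') // drop inline scripts and styles
    .replace(/<[^>]+>/g, ' ')                           // drop the remaining tags
    .replace(/\s+/g, ' ')
    .trim();
  return text.length;
}

// Hypothetical threshold: pages with less text than this are treated as empty shells.
function looksPrerendered(html, minChars = 200) {
  return articleTextLength(html) >= minChars;
}
```

On Node 18 or later you could feed it the result of fetch(url).then((r) => r.text()). A false result suggests the crawler would see an empty shell unless JavaScript rendering is enabled.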

Install the widget for a React-based portal:

npm install @docsearch/react@<version> @docsearch/css@<version>

Or for a non-React site:

npm install @docsearch/js@<version> @docsearch/css@<version>

Pin the version. The widget’s CSS class names and modal markup are stable within a major version but can shift across minors, which matters if you override the styling.

If you self-host the crawler, install the open-source scraper image. It runs as a one-shot container that reads a config file and exits:

docker pull algolia/docsearch-scraper:latest

Core Configuration

The crawler is driven by a JSON config that tells it which URLs to start from, which DOM selectors map to record fields, and which index to write. The selectors block is the part that determines search quality: it splits each page into hierarchical records (lvl0 through lvl4, plus text) so results group by section.

{
  "index_name": "api-portal",
  "start_urls": ["https://docs.example.com/"],
  "sitemap_urls": ["https://docs.example.com/sitemap.xml"],
  "stop_urls": ["https://docs.example.com/changelog/"],
  "selectors": {
    "lvl0": {
      "selector": ".sidebar .menu__link--active",
      "global": true,
      "default_value": "Documentation"
    },
    "lvl1": "article h1",
    "lvl2": "article h2",
    "lvl3": "article h3",
    "lvl4": "article h4",
    "text": "article p, article li, article td"
  },
  "custom_settings": {
    "attributesForFaceting": ["lang", "version"]
  }
}

What each key controls:

  • index_name is the Algolia index the crawler writes to. This exact string must match the widget’s indexName — a mismatch is the most common reason search returns nothing.
  • start_urls are the entry points the crawler follows links from. sitemap_urls speeds up discovery by giving it the full URL list directly.
  • stop_urls is a list of regular expressions; any URL that matches one is excluded from indexing. Use it to skip a high-churn changelog or generated reference that would otherwise pollute results.
  • selectors map page structure to record levels. lvl0 is usually the active navigation category (set global: true so it applies to the whole page), and text captures the body. Scope every selector to the content region (article …) so navigation, footer, and sidebar text do not enter the index.
  • custom_settings.attributesForFaceting declares attributes you can later filter on, such as version for a multi-version portal.
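To make the hierarchy idea concrete, here is a simplified illustration of how heading levels could turn into records. It is not the crawler's actual code: the nodes input is a hypothetical, already-extracted list of matched elements in document order, and the record shape (type, hierarchy, content) is reduced to the fields discussed here.

```javascript
// Simplified illustration of how heading levels become a record hierarchy.
// `nodes` is a hypothetical list of matched elements in document order;
// the real crawler performs the DOM matching itself.
function buildRecords(nodes, lvl0 = 'Documentation') {
  const levels = ['lvl1', 'lvl2', 'lvl3', 'lvl4'];
  const current = { lvl0, lvl1: null, lvl2: null, lvl3: null, lvl4: null };
  const records = [];
  for (const node of nodes) {
    const depth = levels.indexOf(node.level);
    if (depth >= 0) {
      current[node.level] = node.text;
      // A new heading resets every deeper level.
      for (const deeper of levels.slice(depth + 1)) current[deeper] = null;
      records.push({ type: node.level, hierarchy: { ...current }, content: null });
    } else {
      // Body text inherits the hierarchy of the nearest preceding headings.
      records.push({ type: 'content', hierarchy: { ...current }, content: node.text });
    }
  }
  return records;
}
```

Each paragraph record carries the headings above it, which is what lets the modal group hits by page and section. It also shows why an unscoped text selector hurts: sidebar text would inherit whatever heading happened to precede it.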

Run the crawler with the application credentials passed as environment variables, never in the config file:

docker run -it --env-file=.env \
  -e "CONFIG=$(cat docsearch.json | jq -r tostring)" \
  algolia/docsearch-scraper

The .env file holds APPLICATION_ID and API_KEY (the Write key). Keeping credentials out of docsearch.json lets you commit the config to version control safely.
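Because the config is committed, it can be linted like any other file. The following is a minimal sketch of such a check, assuming the content root is article and that lvl0 is the only selector allowed to read navigation; lintCrawlerConfig and the credential field names it scans for are hypothetical, not part of DocSearch.

```javascript
// Hypothetical pre-commit check for docsearch.json. The rules mirror the
// guidance in this section; none of this is part of the DocSearch tooling.
function lintCrawlerConfig(config, widgetIndexName, contentRoot = 'article') {
  const problems = [];
  if (config.index_name !== widgetIndexName) {
    problems.push(`index_name "${config.index_name}" does not match widget indexName "${widgetIndexName}"`);
  }
  for (const [level, value] of Object.entries(config.selectors || {})) {
    if (level === 'lvl0') continue; // lvl0 intentionally reads the navigation
    const selector = typeof value === 'string' ? value : value.selector;
    // Every comma-separated part must start with the content root.
    const unscoped = String(selector)
      .split(',')
      .map((part) => part.trim())
      .filter((part) => !part.startsWith(contentRoot + ' '));
    if (unscoped.length > 0) {
      problems.push(`${level} is not scoped to ${contentRoot}: ${unscoped.join(', ')}`);
    }
  }
  // Credentials belong in the environment, not in the committed file.
  for (const key of Object.keys(config)) {
    if (/^(api_key|application_id|app_id)$/i.test(key)) {
      problems.push(`credential field "${key}" found in config`);
    }
  }
  return problems;
}
```

An empty array means the config passes. In CI you might fail the job when the array is non-empty, before the crawler ever runs.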

Integration Pattern

In CI, run the crawler after the docs deploy so the index reflects the live site. The widget then needs no rebuild — it always queries the current index. The workflow below assumes a preceding deploy job has published the portal.

# .github/workflows/reindex-docsearch.yml
name: Reindex DocSearch
on:
  workflow_run:
    workflows: ["Deploy Docs Portal"]
    types: [completed]
jobs:
  reindex:
    # Only reindex if the deploy succeeded
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Wait for CDN to serve new content
        run: sleep 30
      - name: Run DocSearch crawler
        run: |
          docker run \
            -e "APPLICATION_ID=${{ secrets.ALGOLIA_APP_ID }}" \
            -e "API_KEY=${{ secrets.ALGOLIA_WRITE_KEY }}" \
            -e "CONFIG=$(cat docsearch.json | jq -r tostring)" \
            algolia/docsearch-scraper:latest

The workflow_run trigger chains this job onto the deploy workflow, and the conclusion == 'success' guard prevents reindexing a failed deploy. The ALGOLIA_WRITE_KEY lives only in repository secrets — it is the one credential that must never reach the browser. The short sleep gives the CDN time to serve the new pages before the crawler reads them; tune it to your cache propagation time.
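If a fixed sleep proves too short or wastefully long, one alternative is to poll the live site until it reports the build you just deployed. This sketch assumes the deploy publishes some marker the script can read, for example a hypothetical build-id.txt containing the commit SHA; waitForBuild, fetchText, and sleep are illustrative names, with the last two injected so the logic can be exercised offline.

```javascript
// Poll until the live site reports the expected build, instead of a fixed sleep.
// `fetchText` and `sleep` are injected helpers (assumptions for this sketch).
async function waitForBuild({ url, expected, fetchText, sleep, attempts = 10, delayMs = 5000 }) {
  for (let i = 1; i <= attempts; i++) {
    let body = '';
    try {
      body = await fetchText(url);
    } catch {
      body = ''; // treat network errors as "not ready yet"
    }
    if (String(body).trim() === expected) return i; // number of attempts it took
    if (i < attempts) await sleep(delayMs);
  }
  throw new Error(`${url} did not serve build ${expected} after ${attempts} attempts`);
}
```

In a Node 18+ script, fetchText could be (u) => fetch(u).then((r) => r.text()) and sleep could be (ms) => new Promise((resolve) => setTimeout(resolve, ms)). The crawler step would then run only after the promise resolves.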

For portals built on a framework with a native search slot, wire the same index into that slot rather than mounting a second widget. Docusaurus for API Portals ships a DocSearch theme that reads appId, apiKey, and indexName from config; Mintlify Setup & Migration provides its own search that you can swap for DocSearch when you need cross-domain indexing.

Advanced Options

Mounting @docsearch/react. Render the DocSearch component anywhere in the tree, typically in the portal header. Import the CSS once at the app root:

import { DocSearch } from '@docsearch/react';
import '@docsearch/css';

export function SearchButton() {
  return (
    <DocSearch
      appId="YOUR_APP_ID"
      apiKey="YOUR_SEARCH_ONLY_KEY"
      indexName="api-portal"
      placeholder="Search the API docs"
    />
  );
}

The component renders a button that opens the search modal; Ctrl/Cmd+K opens it from anywhere. All three credentials are public-safe here because apiKey is the Search-Only key.

Mounting @docsearch/js. For a non-React portal, call docsearch() against a container element:

<div id="docsearch"></div>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@docsearch/css@<version>" />
<script src="https://cdn.jsdelivr.net/npm/@docsearch/js@<version>"></script>
<script>
  docsearch({
    container: '#docsearch',
    appId: 'YOUR_APP_ID',
    apiKey: 'YOUR_SEARCH_ONLY_KEY',
    indexName: 'api-portal',
  });
</script>

Filtering and ranking with searchParameters. Restrict results to the active version or language by passing Algolia query parameters — this is where the attributesForFaceting from the crawler config pays off:

<DocSearch
  appId="YOUR_APP_ID"
  apiKey="YOUR_SEARCH_ONLY_KEY"
  indexName="api-portal"
  searchParameters={{ facetFilters: ['version:v2', 'lang:en'] }}
/>

This keeps a user reading the v2 docs from getting v1 results, without maintaining separate indexes.
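Hard-coding the filter works for a single version; on a multi-version portal the values usually come from the page the reader is on. Here is a minimal sketch that derives them from the URL path. The /<version>/<lang>/ layout, the defaults, and the facetFiltersForPath name are assumptions for illustration, so adapt the pattern to your own routes.

```javascript
// Derive facetFilters from the current path. The /<version>/<lang>/... URL
// layout is an assumption for illustration; adapt the pattern to your routes.
function facetFiltersForPath(pathname, defaults = { version: 'v2', lang: 'en' }) {
  const match = pathname.match(/^\/(v\d+)\/([a-z]{2})(?:\/|$)/);
  const version = match ? match[1] : defaults.version;
  const lang = match ? match[2] : defaults.lang;
  // A flat array of strings is combined with AND by Algolia.
  return [`version:${version}`, `lang:${lang}`];
}
```

In the React widget this could be passed as searchParameters={{ facetFilters: facetFiltersForPath(window.location.pathname) }}. The filters only match if the indexed records actually carry version and lang attributes.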

Verification & Testing

After the crawler runs, confirm records actually landed in the index before trusting the widget. Query the index directly with the Search-Only key:

curl -s "https://YOUR_APP_ID-dsn.algolia.net/1/indexes/api-portal/query" \
  -H "X-Algolia-API-Key: YOUR_SEARCH_ONLY_KEY" \
  -H "X-Algolia-Application-Id: YOUR_APP_ID" \
  -d '{"query":"authentication"}' | jq '.nbHits'

A non-zero nbHits confirms the crawler populated the index and the Search-Only key can read it. A zero result for a term you know exists in the docs means one of three things: the crawler did not run, it wrote to a different index_name, or your selectors excluded the content region.
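If you would rather run this check from a Node script than from curl, the same request can be assembled as below. This is a sketch that mirrors the URL and headers of the curl command; buildQueryRequest is a hypothetical helper name, not an Algolia API.

```javascript
// Build the same request as the curl check, for use with fetch() in a script.
// buildQueryRequest is a hypothetical helper, not part of any Algolia client.
function buildQueryRequest({ appId, apiKey, indexName, query }) {
  return {
    url: `https://${appId}-dsn.algolia.net/1/indexes/${encodeURIComponent(indexName)}/query`,
    options: {
      method: 'POST',
      headers: {
        'X-Algolia-Application-Id': appId,
        'X-Algolia-API-Key': apiKey, // Search-Only key; never the write key
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ query }),
    },
  };
}
```

With Node 18 or later, const { url, options } = buildQueryRequest({ ... }) followed by fetch(url, options) returns the same JSON, and nbHits can be read from the parsed response.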

Then verify the widget mounts and opens in the browser:

npx playwright@<version> install --with-deps chromium
node -e "
const { chromium } = require('playwright');
(async () => {
  const b = await chromium.launch();
  const p = await b.newPage();
  await p.goto('https://docs.example.com/');
  await p.click('.DocSearch-Button');
  await p.fill('.DocSearch-Input', 'authentication');
  await p.waitForSelector('.DocSearch-Hit', { timeout: 10000 });
  await b.close();
  console.log('DocSearch returned hits in the modal');
})();
"

The waitForSelector('.DocSearch-Hit') fails the check if the modal opens but returns nothing — catching the credential or index-name mismatch that a static page load would not reveal.

Troubleshooting

  • Index does not exist or empty modal. The widget’s indexName does not match the crawler’s index_name, or the crawler has never run successfully. Confirm both strings are identical and check the crawler logs for a write confirmation.
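  • Method not allowed / 403 when the widget queries. The apiKey you shipped lacks search permission, which usually means a write-scoped key reached the browser instead of the Search-Only key. Replace apiKey with the Search-Only key from the Algolia dashboard. An Admin key may still return results, but exposing it lets anyone modify or delete your indexes, so rotate any write-capable key that has appeared in client code.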
  • Results include navigation, footer, or sidebar text. The selectors are not scoped to the content region. Prefix every selector with the article container (e.g. article h2, article p) so the crawler ignores chrome.
  • Crawler indexes nothing on a client-rendered SPA. The scraper read empty HTML before hydration. Pre-render or statically export the docs, or enable the crawler’s JavaScript rendering option so it waits for content to appear.
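The cases above can be folded into a small triage helper for a CI smoke test. The sketch below takes the HTTP status and nbHits from a direct index query; the mapping of 403 and 404 to specific causes is an assumption based on how these failures typically surface, and triage is a hypothetical name.

```javascript
// Map a direct index query result to the likely cause from the list above.
// The status-code mapping is an assumption, not a documented contract.
function triage({ status, nbHits }) {
  if (status === 403) return 'key rejected: check that apiKey is the Search-Only key for this appId';
  if (status === 404) return 'index missing: compare indexName with the crawler index_name';
  if (status !== 200) return `unexpected HTTP ${status}: check appId and network`;
  if (!nbHits) return 'index reachable but empty for this query: check crawler logs and selectors';
  return 'ok';
}
```

Anything other than 'ok' is a reason to fail the job and print the message, which points at the matching troubleshooting entry.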

FAQ

Do I have to run the DocSearch crawler myself?

No. Open-source and qualifying docs sites can apply for Algolia’s free hosted DocSearch program, where Algolia runs the crawler on a schedule. Self-hosted teams run the open-source crawler container themselves against their own Algolia application.

Which API key does the widget use?

The widget uses the Search-Only API key, which is safe to expose in client-side code because it can only run queries. The Admin or Write API key, used by the crawler to push records, must stay in CI secrets and never appear in the browser.

Should I use @docsearch/react or @docsearch/js?

Use @docsearch/react for React, Next.js, and Docusaurus portals so the modal integrates with the component tree. Use @docsearch/js for plain HTML, server-rendered pages, or any non-React stack, mounting it onto a container element.

Why does the search modal return zero results after deploy?

The most common cause is a mismatch between the indexName in the widget and the index the crawler wrote to, or a crawler that has not run since the content changed. Confirm both reference the same index and trigger a reindex in CI after each deploy.