Adding Algolia DocSearch to a Docs Portal

This guide is part of Algolia DocSearch Integration within Developer Portal Frameworks & UI Setup. It covers adding the DocSearch widget to an existing static docs portal end to end: obtaining credentials, writing the crawler configuration, seeding and verifying the index, and wiring the search box into your pages. This task comes up the moment a portal grows past a dozen pages and readers can no longer scan a sidebar to find an endpoint or guide.

Figure: DocSearch crawl and query flow. The crawler reads the deployed portal using a config JSON (with selectors), writes records into an Algolia index, and the browser search box queries that index with a search-only key.

Problem & Context

A static docs portal — built with a generator or hand-rolled HTML — has no server-side search. The usual stopgaps are a browser Ctrl+F (only searches the current page) or a client-side index like Lunr that bloats the bundle and degrades as the corpus grows. Algolia DocSearch solves this with a hosted index and a polished autocomplete widget, but the setup has a sequence that, done out of order, produces a search box that returns nothing.

The most common before-state is exactly that: the widget is on the page, the network tab shows successful 200 queries to Algolia, and yet zero results appear. The cause is almost always an empty index — the crawler was never run, or it ran against selectors that matched no content. The steps below put credentials, crawler config, seeding, and the front end in the right order so the index has records before the box ever queries it.

Step-by-Step Solution

1. Obtain Algolia credentials

There are two paths. For public technical docs, apply to the hosted DocSearch program; Algolia provisions the application and runs the crawler weekly. For private or commercial docs, create your own Algolia application and run the crawler yourself. Either way you end up with four values:

Application ID:     ABC123XYZ
Search-Only API Key: 1a2b3c...   # safe to ship in client JS
Admin API Key:       9z8y7x...   # crawler only — NEVER ship to the browser
Index name:          acme_docs

2. Write the crawler config

Create docsearch-config.json. The selectors map tells the crawler which DOM elements become the hierarchy (lvl0–lvl5) and body text:

{
  "index_name": "acme_docs",
  "start_urls": ["https://docs.acme.dev/"],
  "sitemap_urls": ["https://docs.acme.dev/sitemap.xml"],
  "stop_urls": ["https://docs.acme.dev/changelog/"],
  "selectors": {
    "lvl0": {
      "selector": ".sidebar .active-section",
      "default_value": "Documentation"
    },
    "lvl1": "article h1",
    "lvl2": "article h2",
    "lvl3": "article h3",
    "text": "article p, article li, article td"
  },
  "custom_settings": {
    "attributesForFaceting": ["lang", "version"]
  }
}

Match the selectors to your real markup. If your content sits in <main> rather than <article>, the article selectors capture nothing and the index stays empty.
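Before running a full crawl, it can help to check one saved page against the config. The sketch below is a rough heuristic in Node.js, not the crawler's own selector engine: the hypothetical helper `unmatchedSelectors` only checks that every tag name and class name in a selector appears somewhere in the HTML, and it ignores nesting, IDs, and attribute selectors. Treat a reported level as a hint to inspect the markup, not as proof either way.

```javascript
// Rough heuristic, NOT a CSS engine: a token "appears" if the HTML contains
// an opening tag with that name, or a class attribute containing that class.
function tokenAppears(token, html) {
  if (/^\.[\w-]+$/.test(token)) {
    const cls = token.slice(1);
    return new RegExp(`class="[^"]*\\b${cls}\\b[^"]*"`).test(html);
  }
  if (/^[a-z][a-z0-9]*$/i.test(token)) {
    return new RegExp(`<${token}[\\s>/]`, 'i').test(html);
  }
  return true; // unsupported token (ID, attribute, pseudo-class): assume present
}

// Returns the selector levels (lvl0, lvl1, text, ...) for which no
// comma-separated alternative has all of its tokens present in the HTML.
function unmatchedSelectors(selectors, html) {
  const missing = [];
  for (const [level, entry] of Object.entries(selectors)) {
    // A level is either a plain string or an object with a "selector" field.
    const selector = typeof entry === 'string' ? entry : entry.selector;
    const plausible = selector.split(',').some((alternative) =>
      alternative
        .trim()
        .split(/[\s>]+/)
        .filter(Boolean)
        .every((token) => tokenAppears(token, html))
    );
    if (!plausible) missing.push(level);
  }
  return missing;
}
```

Calling `unmatchedSelectors(config.selectors, html)` with the config above and a page whose content sits in <main> reports lvl1 and text, because no <article> tag is present.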

3. Seed the index

Run the crawler with the official Docker image, passing the admin key via environment variables (never in the JSON):

docker run -it --env-file=.env \
  -e "CONFIG=$(cat docsearch-config.json | jq -r tostring)" \
  algolia/docsearch-scraper:latest

.env holds the secrets:

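# .env  (git-ignored)
APPLICATION_ID=ABC123XYZ
# Admin key (write access). Keep comments on their own lines:
# Docker's --env-file does not strip trailing comments from values.
API_KEY=9z8y7x...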

Expected output ends with a record count:

> DocSearch: https://docs.acme.dev/ 142 records)
Nb hits: 142
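If you run the crawl from a script, you may want to read that count programmatically. The sketch below assumes the output ends with an `Nb hits: N` line exactly as shown above; the format may differ between scraper versions, and `parseRecordCount` is a hypothetical helper, not part of the scraper.

```javascript
// Extracts the record count from captured scraper output.
// Assumes a line of the form "Nb hits: 142"; returns null if none is found.
function parseRecordCount(scraperOutput) {
  const match = /^Nb hits:\s*(\d+)\s*$/m.exec(scraperOutput);
  return match ? Number(match[1]) : null;
}
```

A `null` result means the expected line was absent, which is worth treating as a failed crawl rather than as zero records.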

4. Verify the index

Confirm records exist before touching the front end. From the Algolia dashboard, open the acme_docs index and check the record count, or query the API directly:

curl -s "https://ABC123XYZ-dsn.algolia.net/1/indexes/acme_docs/query" \
  -H "X-Algolia-API-Key: 1a2b3c..." \
  -H "X-Algolia-Application-Id: ABC123XYZ" \
  -d '{"query":"authentication"}' | jq '.nbHits'

Expected:

17

A non-zero nbHits confirms the index is populated and the search key works.
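The same check can be scripted in Node.js. This is a minimal sketch that mirrors the curl call above (same host pattern, headers, and request body); it assumes Node 18 or later for the global `fetch`, and the helper names `buildQueryRequest` and `countHits` are illustrative, not part of any Algolia library.

```javascript
// Builds the URL and fetch options for a query against one index,
// using the same endpoint and headers as the curl example.
function buildQueryRequest({ appId, apiKey, indexName, query }) {
  return {
    url: `https://${appId}-dsn.algolia.net/1/indexes/${encodeURIComponent(indexName)}/query`,
    init: {
      method: 'POST',
      headers: {
        'X-Algolia-API-Key': apiKey, // search-only key is enough for queries
        'X-Algolia-Application-Id': appId,
      },
      body: JSON.stringify({ query }),
    },
  };
}

// Sends the query and resolves to the nbHits field of the response.
async function countHits(credentials) {
  const { url, init } = buildQueryRequest(credentials);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Algolia responded with HTTP ${response.status}`);
  }
  const { nbHits } = await response.json();
  return nbHits;
}
```

Separating request construction from the network call keeps the first function easy to inspect or log when a query behaves unexpectedly.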

5. Add the search box

Add the DocSearch JS and CSS, then initialise the widget against a container element:

<!-- in your portal layout, before </body> -->
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@docsearch/css@3" />
<div id="docsearch"></div>
<script src="https://cdn.jsdelivr.net/npm/@docsearch/js@3"></script>
<script>
  docsearch({
    container: '#docsearch',
    appId: 'ABC123XYZ',
    apiKey: '1a2b3c...',   // SEARCH-ONLY key, safe in client code
    indexName: 'acme_docs',
  });
</script>

Reload the portal, click the search box, and type a term; the autocomplete dropdown shows grouped results.

Complete Working Example

A single deploy-time script that crawls on every release and a matching front-end snippet. The script reads secrets from the environment so nothing sensitive lands in the repo:

#!/usr/bin/env bash
# reindex.sh — run after each docs deploy to refresh the Algolia index
set -euo pipefail

: "${APPLICATION_ID:?set APPLICATION_ID}"
: "${API_KEY:?set admin API_KEY}"   # write access — keep out of client code

CONFIG_FILE="docsearch-config.json"

echo "Seeding Algolia index from ${CONFIG_FILE}..."
docker run --rm \
  -e "APPLICATION_ID=${APPLICATION_ID}" \
  -e "API_KEY=${API_KEY}" \
  -e "CONFIG=$(jq -r tostring < "${CONFIG_FILE}")" \
  algolia/docsearch-scraper:latest

echo "Verifying record count..."
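# An empty query matches every record, so nbHits is the total record count.
COUNT=$(curl -s "https://${APPLICATION_ID}-dsn.algolia.net/1/indexes/$(jq -r .index_name "${CONFIG_FILE}")/query" \
  -H "X-Algolia-API-Key: ${API_KEY}" \
  -H "X-Algolia-Application-Id: ${APPLICATION_ID}" \
  -d '{"query":"","hitsPerPage":0}' | jq '.nbHits // 0')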

if [ "${COUNT}" -eq 0 ]; then
  echo "ERROR: index is empty — check selectors in ${CONFIG_FILE}" >&2
  exit 1
fi
echo "Index holds ${COUNT} records. Done."

The front-end initialisation, kept in one file you include on every page:

<!-- docsearch.html — include in the portal <head>/footer partial -->
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@docsearch/css@3" />
<div id="docsearch"></div>
<script src="https://cdn.jsdelivr.net/npm/@docsearch/js@3"></script>
<script>
  docsearch({
    container: '#docsearch',
    appId: 'ABC123XYZ',
    apiKey: '1a2b3c...',     // search-only key
    indexName: 'acme_docs',
    insights: true,           // capture click/conversion analytics
    placeholder: 'Search the API docs',
  });
</script>

Wire reindex.sh into your deploy pipeline so the index never drifts from the published pages.

Gotchas & Edge Cases

Selectors that match nothing. The single biggest cause of an empty index is lvl1/text selectors pointing at elements that do not exist in the rendered HTML. Crawl one URL with the scraper’s verbose logs and confirm record counts per page before blaming the front end.

Shipping the admin key. The admin (write) key in client JavaScript lets anyone modify or wipe your index. Only the search-only key belongs in the browser. Keep the admin key in CI secrets and use it solely for crawling.
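One way to guard against this in CI is to scan the built site for the admin key before publishing. The following is a minimal Node.js sketch under stated assumptions: `findKeyLeaks` is a hypothetical helper, the build output is a directory of text files, and the key value is passed in from a CI secret.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Recursively walks a build directory and returns the paths of files
// whose contents include the given secret string.
function findKeyLeaks(dir, secret) {
  if (!secret) {
    throw new Error('secret must be a non-empty string');
  }
  const leaks = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      leaks.push(...findKeyLeaks(fullPath, secret));
    } else if (entry.isFile() && fs.readFileSync(fullPath, 'utf8').includes(secret)) {
      leaks.push(fullPath);
    }
  }
  return leaks;
}
```

Failing the deploy when the returned list is non-empty catches an admin key pasted into a layout or bundled script before it reaches readers.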

JS-rendered content. If your portal renders content client-side, the default crawler sees an empty shell. Either pre-render to static HTML at build time or enable the scraper’s js_render option with a Selenium endpoint so the DOM is populated before extraction.
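A quick way to tell whether a page falls into this category is to look at how much text its raw HTML carries before any script runs. The sketch below is a crude heuristic with a hypothetical helper name: it strips script and style blocks and all remaining tags with regular expressions, then measures what is left. A result near zero suggests a client-rendered shell.

```javascript
// Returns the length of the text left after removing <script>/<style>
// blocks and all tags. Regex-based, so treat the result as approximate.
function visibleTextLength(html) {
  return html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim()
    .length;
}
```

Run it against the HTML returned by a plain HTTP fetch of a docs page, not against the DOM in a browser, since the browser copy has already executed the page's scripts.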

FAQ

Do I need to pay for Algolia DocSearch?

The hosted DocSearch program is free for qualifying open-source and technical documentation, and Algolia runs the crawler for you. For private or commercial docs you self-host the crawler against your own Algolia application and pay standard Algolia usage.

Why does my search box show no results after setup?

The index is almost certainly empty because the crawler has not run, or it ran against selectors that match nothing. Trigger a crawl and confirm the record count is greater than zero in the Algolia dashboard before debugging the front end.

Can I run the DocSearch crawler myself?

Yes. Run the algolia/docsearch-scraper Docker image with your config JSON and an admin API key. This is the standard path for private docs or when you need to crawl on every deploy.