Set Up Your Own Crawler and Push to the Algolia Index

Algolia Search API Features

  • Thin, minimal low-level HTTP client for interacting with Algolia's API
  • Works in both the browser and Node.js
  • UMD compatible, so it can be used with any module loader
  • Built with TypeScript
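
The features above describe Algolia's JavaScript API client, which you can also use to push records to your index directly, without the crawler. A minimal sketch using the v4 client (the app ID, API key, index name, and record are placeholders):

// push-records.ts — illustrative only
import algoliasearch from 'algoliasearch';

const client = algoliasearch('YOUR_APP_ID', 'YOUR_ADMIN_API_KEY');
const index = client.initIndex('your_index_name');

// Each record needs a unique objectID.
index
  .saveObjects([
    { objectID: 'getting-started', title: 'Getting Started', url: '/docs/intro' },
  ])
  .then(({ objectIDs }) => console.log('Indexed:', objectIDs))
  .catch(console.error);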

Getting Started

You can find the repository for the crawler at https://github.com/algolia/docsearch-scraper.

Setup using Repo Secrets

To set up your own crawler and push to the Algolia index, configure the following environment variables as repository secrets (for example, GitHub Actions secrets):

  • ALGOLIA_API_KEY: Set this to your Algolia admin API key.
  • ALGOLIA_APP_ID: Set this to your Algolia Application ID.
  • ALGOLIA_INDEX_NAME: Set this to the name of your index.
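
These secrets are typically consumed from a CI workflow. One possible GitHub Actions sketch, assuming you run the scraper through its algolia/docsearch-scraper Docker image (the workflow name, trigger, and config path are placeholders; with this approach the index name comes from config.json rather than a secret):

# .github/workflows/docsearch.yml — illustrative only
name: docsearch
on:
  workflow_dispatch:

jobs:
  scrape:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run the DocSearch scraper
        run: |
          docker run \
            -e APPLICATION_ID="${{ secrets.ALGOLIA_APP_ID }}" \
            -e API_KEY="${{ secrets.ALGOLIA_API_KEY }}" \
            -e "CONFIG=$(cat config.json | jq -r tostring)" \
            algolia/docsearch-scraper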

Documentation

For detailed instructions, refer to the Algolia DocSearch documentation at https://docsearch.algolia.com/.

.env File

Create a .env file in the scraper's project directory and add the following variables (pipenv loads this file automatically when you enter the shell):

APPLICATION_ID=<your Algolia Application ID>
API_KEY=<your admin API Key>

Running the Command

To run the crawler, follow these steps:

  1. Install the necessary dependencies:

     pip install --user pipx
     pipx install pipenv

     # If you encounter an error with Python venv
     sudo apt install python3.10-venv

     # Install dependencies
     pipenv install

  2. Activate the virtual environment:

     pipenv shell

  3. Run the scraper (a sample config.json is sketched below):

     ./docsearch run config.json
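
The config.json passed to the scraper tells it which pages to crawl and how to turn them into index records. A minimal illustrative config (the index name, URLs, and selectors are placeholders; adapt them to your site's markup):

{
  "index_name": "your_index_name",
  "start_urls": ["https://example.com/docs/"],
  "sitemap_urls": ["https://example.com/sitemap.xml"],
  "selectors": {
    "lvl0": "header h1",
    "lvl1": "article h1",
    "lvl2": "article h2",
    "lvl3": "article h3",
    "text": "article p, article li"
  }
}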

Manually Update package.json

If needed, update your package.json file with the following dependencies:

package.json

{
  "dependencies": {
    "@docusaurus/core": "2.2.0",
    "@docusaurus/preset-classic": "2.2.0",
    // ...
  }
}
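
To surface the index in a Docusaurus site, the search credentials go under themeConfig.algolia in docusaurus.config.js. A minimal sketch (all values are placeholders; use a search-only API key here, never the admin key from your .env):

// docusaurus.config.js — illustrative only
module.exports = {
  // ...
  themeConfig: {
    algolia: {
      appId: 'YOUR_APP_ID',
      apiKey: 'YOUR_SEARCH_ONLY_API_KEY',
      indexName: 'your_index_name',
    },
  },
};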

Checking Docusaurus Version

To check the version of Docusaurus you have installed, run the following command:

npx docusaurus --version

By following these steps, you can set up your own crawler and push data to your Algolia index. If you run into trouble, refer to the DocSearch documentation linked above for further guidance.