Set Up Your Own Crawler and Push to an Algolia Index
Algolia Search API Features
- Thin and minimal low-level HTTP client to interact with Algolia's API
- Works in both the browser and Node.js
- UMD compatible, can be used with any module loader
- Built with TypeScript
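Before wiring up the crawler, it helps to see how records reach an index at all. Below is a minimal sketch assuming the algoliasearch JavaScript client v4 described above; the app ID, API key, index name, and records are placeholders, not values from this guide:

```typescript
// Minimal sketch: push a couple of records to an index with the
// algoliasearch v4 client. All credentials below are placeholders.
import algoliasearch from "algoliasearch";

const client = algoliasearch("YOUR_APP_ID", "YOUR_ADMIN_API_KEY");
const index = client.initIndex("YOUR_INDEX_NAME");

async function pushRecords(): Promise<void> {
  const records = [
    { title: "Getting Started", url: "https://example.com/docs/start" },
    { title: "Configuration", url: "https://example.com/docs/config" },
  ];
  // saveObjects upserts records; autoGenerateObjectIDIfNotExist lets
  // Algolia assign an objectID to records that do not carry one.
  const { objectIDs } = await index.saveObjects(records, {
    autoGenerateObjectIDIfNotExist: true,
  });
  console.log(`Indexed ${objectIDs.length} records`);
}

pushRecords().catch(console.error);
```

The DocSearch scraper performs this same push for you; the snippet only illustrates what happens under the hood.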
Getting Started
You can find the repository for the crawler at https://github.com/algolia/docsearch-scraper.
Setup Using Repo Secrets
To set up your own crawler and push to the Algolia index, configure the following environment variables in your repository secrets:

- `ALGOLIA_API_KEY`: your Algolia admin API key.
- `ALGOLIA_APP_ID`: your Algolia application ID.
- `ALGOLIA_INDEX_NAME`: the name of your index.
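If the crawler runs in CI (for example GitHub Actions), these secrets can be exposed to the job as environment variables. The workflow below is a hypothetical sketch, not part of the official setup; only the secret names come from the list above, the run commands mirror the steps later in this guide, and the variable names should be adapted to whatever your crawler actually reads:

```yaml
# Hypothetical GitHub Actions workflow; trigger and job layout
# are illustrative.
name: docsearch-crawl
on:
  push:
    branches: [main]
jobs:
  crawl:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run the DocSearch scraper
        env:
          ALGOLIA_API_KEY: ${{ secrets.ALGOLIA_API_KEY }}
          ALGOLIA_APP_ID: ${{ secrets.ALGOLIA_APP_ID }}
          ALGOLIA_INDEX_NAME: ${{ secrets.ALGOLIA_INDEX_NAME }}
        run: |
          pip install --user pipx
          pipx install pipenv
          pipenv install
          pipenv run ./docsearch run config.json
```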
Documentation
For detailed instructions, refer to the Algolia DocSearch documentation at https://docsearch.algolia.com.
.env File
Create a `.env` file in your project directory and add the following variables:
```
APPLICATION_ID=<your Algolia Application ID>
API_KEY=<your admin API Key>
```
Running the Command
To run the crawler, follow these steps:
- Install the necessary dependencies by running the following commands:
```bash
pip install --user pipx
pipx install pipenv
# If you encounter an error with Python venv
sudo apt install python3.10-venv
# Install dependencies
pipenv install
```
- Activate the virtual environment by running:
```bash
pipenv shell
```
- Run the scraper using the following command:
```bash
./docsearch run config.json
```
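The `config.json` passed to the scraper describes what to crawl and how to slice pages into records. The snippet below is a minimal sketch with placeholder values; the full selector schema is covered in the DocSearch documentation:

```json
{
  "index_name": "YOUR_INDEX_NAME",
  "start_urls": ["https://example.com/docs/"],
  "selectors": {
    "lvl0": "h1",
    "lvl1": "h2",
    "lvl2": "h3",
    "text": "p, li"
  }
}
```

The `lvlN` selectors build the hierarchy shown in the search dropdown, while `text` selects the body content attached to each heading.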
Manually Update package.json
If needed, update your `package.json` file with the following dependencies:
```json
{
  "dependencies": {
    "@docusaurus/core": "2.2.0",
    "@docusaurus/preset-classic": "2.2.0",
    // ...
  }
}
```
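Once the index is populated, a Docusaurus site reads it through the `algolia` key in `themeConfig`. A sketch, assuming Docusaurus 2.x and placeholder credentials; note that this config ships to the browser, so use a search-only API key here, never the admin key:

```js
// docusaurus.config.js — placeholder values throughout.
module.exports = {
  // ...
  themeConfig: {
    algolia: {
      appId: "YOUR_APP_ID",
      apiKey: "YOUR_SEARCH_ONLY_API_KEY", // search-only, never admin
      indexName: "YOUR_INDEX_NAME",
      contextualSearch: true,
    },
  },
};
```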
Checking Docusaurus Version
To check the version of Docusaurus you have installed, run the following command:
```bash
npx docusaurus --version
```
By following these steps, you can set up your own crawler and push data to your Algolia index. Refer to the DocSearch documentation for any further guidance.