Indexing Documents with Opper GitHub Actions
This guide shows how to index your documentation in Opper Indexes automatically from your CI/CD pipeline using GitHub Actions. The actions re-index repository content whenever files change, so your documentation stays searchable and up to date. We cover our two official Opper actions: the Repository Indexer, for files taken directly from your repository, and the Web Indexer, for content from built and deployed websites.
Overview
Opper provides two GitHub Actions for different indexing needs:
- Repository Indexer Action - Indexes files directly from your repository
- Web Indexer Action - Scrapes and indexes content from websites
Let us explore how to implement each of these actions in your workflows.
Repository Indexer Action
The Repository Indexer Action is designed to automatically index files directly from your GitHub repository. This is ideal for documentation, markdown files, and other text content stored in your repo.
Basic Implementation
Here's a simple workflow file that indexes the markdown and text files in your repository whenever changes are pushed to the main branch:
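A minimal sketch of such a workflow follows. The action reference `opper-ai/repository-indexer-action@v1` and the secret name `OPPER_API_KEY` are assumptions made for illustration; take the real reference from the action's GitHub repository. Only `apikey` is set, so the remaining inputs fall back to their defaults.

```yaml
name: Index documentation

on:
  push:
    branches:
      - main

jobs:
  index:
    runs-on: ubuntu-latest
    steps:
      # Fetch the repository contents so the indexer can read the files.
      - name: Check out repository
        uses: actions/checkout@v4

      # Hypothetical action reference; replace with the published one.
      - name: Index repository files
        uses: opper-ai/repository-indexer-action@v1
        with:
          # Assumes the API key is stored as a repository secret with this name.
          apikey: ${{ secrets.OPPER_API_KEY }}
```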
This minimal configuration will:
- Run whenever code is pushed to the main branch
- Check out your repository code
- Index all markdown (.md, .mdx) and text (.txt) files using Opper
Customizing the Index
You can further customize what gets indexed with additional parameters:
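As a sketch, the workflow below sets each of the optional inputs `index`, `folder`, `file_types`, and `model`. The action reference, the secret name, and every input value shown (`product-docs`, `docs`, `your-model-name`) are hypothetical placeholders, and the space-separated `file_types` format is inferred from the documented default.

```yaml
name: Index product docs

on:
  push:
    branches:
      - main

jobs:
  index:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      # Hypothetical action reference; replace with the published one.
      - name: Index the docs folder
        uses: opper-ai/repository-indexer-action@v1
        with:
          apikey: ${{ secrets.OPPER_API_KEY }}
          # Placeholder index name (the default is repo-docs).
          index: product-docs
          # Only index files under this directory instead of the repository root.
          folder: docs
          # Restrict indexing to markdown files; format assumed from the default value.
          file_types: ".md .mdx"
          # Placeholder model name for metadata extraction.
          model: your-model-name
```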
Configuration Options
The Repository Indexer Action accepts the following parameters:
Parameter | Description | Default | Required |
---|---|---|---|
apikey | Your Opper API key | - | Yes |
index | Name of the Opper index | repo-docs | No |
folder | Directory to index | . (root) | No |
file_types | File extensions to index | .md .mdx .txt | No |
model | Custom model for metadata extraction | Default Opper model | No |
Web Indexer Action
The Web Indexer Action scrapes content from websites and adds it to your Opper index. This is perfect for indexing documentation sites that have been built and deployed, especially when the final website's content differs from the source files in your repository.
When to Use the Web Indexer
The Web Indexer Action is ideal when:
- Your documentation is generated or transformed during the build process
- You need to index the final rendered HTML content, not the source markdown
- You have a static site generator or documentation platform that produces a website
- The site structure in production differs from your repository structure
- Your documentation includes dynamically generated content not present in the source files
Basic Implementation
A common pattern is to trigger the Web Indexer after your documentation site has been successfully built and deployed. Here's a workflow file that demonstrates this:
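The sketch below shows one way to arrange this, using a separate indexing job that depends on the deploy job through `needs`. The action reference `opper-ai/web-indexer-action@v1`, the secret name, the deploy script path, the index name, and the URL are all assumptions made for illustration.

```yaml
name: Deploy and index docs site

on:
  push:
    branches:
      - main

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      # Placeholder: replace with your site's real build and deploy commands.
      - name: Build and deploy site
        run: ./scripts/deploy-docs.sh

  index:
    # Run only after the deploy job has completed successfully.
    needs: deploy
    runs-on: ubuntu-latest
    steps:
      # Hypothetical action reference; replace with the published one.
      - name: Index deployed site
        uses: opper-ai/web-indexer-action@v1
        with:
          apikey: ${{ secrets.OPPER_API_KEY }}
          # Placeholder index name and start URL.
          index: docs-site
          url: https://docs.example.com
```

Splitting indexing into its own job means a failed deployment skips the scrape, so the index is not refreshed from a stale or broken site.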
This workflow:
- Builds and deploys your documentation site when changes are pushed to the main branch
- Uses the Web Indexer Action to scrape and index the deployed site
Configuration Options
The Web Indexer Action accepts the following parameters:
Parameter | Description | Required |
---|---|---|
apikey | Your Opper API key | Yes |
index | Name of the Opper index | Yes |
url | The URL to start scraping from | Yes |
Advanced Use Cases
Combining Both Actions
You can use both actions together to create a comprehensive documentation index:
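One possible arrangement is sketched below: a job indexes the source files, while a second indexing job scrapes the site once it has been deployed. Both action references, the secret name, the deploy script, the URL, and the shared index name `docs` are hypothetical, and the sketch assumes both actions can write to the same index; use two index names if you prefer to keep the sources apart.

```yaml
name: Index docs from repository and website

on:
  push:
    branches:
      - main

jobs:
  index-repository:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      # Hypothetical action reference; replace with the published one.
      - name: Index source files
        uses: opper-ai/repository-indexer-action@v1
        with:
          apikey: ${{ secrets.OPPER_API_KEY }}
          index: docs
          folder: docs

  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      # Placeholder: replace with your site's real build and deploy commands.
      - name: Build and deploy site
        run: ./scripts/deploy-docs.sh

  index-website:
    # Scrape the site only after it has been deployed.
    needs: deploy
    runs-on: ubuntu-latest
    steps:
      # Hypothetical action reference; replace with the published one.
      - name: Index rendered site
        uses: opper-ai/web-indexer-action@v1
        with:
          apikey: ${{ secrets.OPPER_API_KEY }}
          index: docs
          url: https://docs.example.com
```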
Conclusion
GitHub Actions provide a straightforward way to automate indexing your documentation into Opper. Whether you're working with repository files or deployed websites, these actions keep your knowledge base searchable and up to date.
With these workflows in place, your documentation is re-indexed whenever it changes, so updates become searchable through Opper-powered applications and tools without a manual step.
For more information, visit the GitHub repositories for the Repository Indexer Action and Web Indexer Action.