Indexing Documents with Opper GitHub Actions

This guide shows how to index your documentation in Opper Indexes automatically from your CI/CD pipeline using GitHub Actions. The actions re-index your content whenever files change, so your documentation stays searchable and up to date. We cover the two official Opper actions: the Repository Indexer, which indexes files directly from your repository, and the Web Indexer, which indexes content from built and deployed websites.

Overview

Opper provides two GitHub Actions for different indexing needs:

  1. Repository Indexer Action - Indexes files directly from your repository
  2. Web Indexer Action - Scrapes and indexes content from websites

Let us explore how to implement each of these actions in your workflows.

Repository Indexer Action

The Repository Indexer Action is designed to automatically index files directly from your GitHub repository. This is ideal for documentation, markdown files, and other text content stored in your repo.

Basic Implementation

Here's a simple workflow file that indexes the markdown and text files in your repository whenever changes are pushed to the main branch:
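A minimal sketch of such a workflow follows. It makes several assumptions: the action reference `OWNER/REPOSITORY-INDEXER-ACTION@VERSION` is a placeholder to be replaced with the path and version from the Repository Indexer Action's GitHub repository, the workflow file name is arbitrary, and the API key is assumed to be stored as a repository secret named `OPPER_API_KEY`. Only the required `apikey` parameter is set, so the documented defaults apply to everything else.

```yaml
# .github/workflows/index-docs.yml (file name is an assumption)
name: Index documentation

on:
  push:
    branches:
      - main

jobs:
  index:
    runs-on: ubuntu-latest
    steps:
      # Fetch the repository contents so the indexer can read the files
      - name: Check out repository
        uses: actions/checkout@v4

      # Placeholder reference: replace with the real action path and version
      - name: Index repository files in Opper
        uses: OWNER/REPOSITORY-INDEXER-ACTION@VERSION
        with:
          apikey: ${{ secrets.OPPER_API_KEY }}  # assumed secret name
```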

This minimal configuration will:

  • Run whenever code is pushed to the main branch
  • Check out your repository code
  • Index all markdown (.md, .mdx) and text (.txt) files using Opper

Customizing the Index

You can further customize what gets indexed with additional parameters:
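As an illustration, the step sketched here overrides the index name, the folder, and the file types, using the parameter names documented under Configuration Options. The action reference is a placeholder, the index and folder names are hypothetical, and the space-separated format of `file_types` is an assumption based on how its default value is written.

```yaml
steps:
  - name: Check out repository
    uses: actions/checkout@v4

  - name: Index the docs folder in Opper
    uses: OWNER/REPOSITORY-INDEXER-ACTION@VERSION  # placeholder reference
    with:
      apikey: ${{ secrets.OPPER_API_KEY }}  # assumed secret name
      index: product-docs      # hypothetical name; the default is repo-docs
      folder: docs             # hypothetical folder; the default is the repository root
      file_types: ".md .mdx"   # assumed space-separated, mirroring the default
```

Restricting `folder` and `file_types` keeps unrelated files, such as changelogs or license text, out of the index.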

Configuration Options

The Repository Indexer Action accepts the following parameters:

| Parameter | Description | Default | Required |
|---|---|---|---|
| apikey | Your Opper API key | - | Yes |
| index | Name of the Opper index | repo-docs | No |
| folder | Directory to index | . (root) | No |
| file_types | File extensions to index | .md .mdx .txt | No |
| model | Custom model for metadata extraction | Default Opper model | No |

Web Indexer Action

The Web Indexer Action scrapes content from websites and adds it to your Opper index. This is perfect for indexing documentation sites that have been built and deployed, especially when the final website's content differs from the source files in your repository.

When to Use the Web Indexer

The Web Indexer Action is ideal when:

  • Your documentation is generated or transformed during the build process
  • You need to index the final rendered HTML content, not the source markdown
  • You have a static site generator or documentation platform that produces a website
  • The site structure in production differs from your repository structure
  • Your documentation includes dynamically generated content not present in the source files

Basic Implementation

A common pattern is to trigger the Web Indexer after your documentation site has been successfully built and deployed. Here's a workflow file that demonstrates this:
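One way to express this pattern is a two-job workflow in which the indexing job declares `needs: deploy`, so it runs only after the deploy job succeeds. This is a sketch under stated assumptions: the action reference is a placeholder, the build and deploy step is a stand-in for your own commands, and the index name, site URL, and secret name are hypothetical.

```yaml
name: Deploy and index documentation site

on:
  push:
    branches:
      - main

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      # Stand-in for your real build and deploy commands
      - name: Build and deploy site
        run: |
          echo "Build the documentation site here"
          echo "Deploy the built site here"

  index:
    # Run only after the deploy job has finished successfully
    needs: deploy
    runs-on: ubuntu-latest
    steps:
      - name: Scrape and index the deployed site
        uses: OWNER/WEB-INDEXER-ACTION@VERSION  # placeholder reference
        with:
          apikey: ${{ secrets.OPPER_API_KEY }}  # assumed secret name
          index: docs-site                      # hypothetical index name
          url: https://docs.example.com         # hypothetical site URL
```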

This workflow:

  1. Builds and deploys your documentation site when changes are pushed to the main branch
  2. Uses the Web Indexer Action to scrape and index the deployed site

Configuration Options

The Web Indexer Action accepts the following parameters:

| Parameter | Description | Required |
|---|---|---|
| apikey | Your Opper API key | Yes |
| index | Name of the Opper index | Yes |
| url | The URL to start scraping from | Yes |

Advanced Use Cases

Combining Both Actions

You can use both actions together to create a comprehensive documentation index:
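A combined workflow might look like the following sketch, in which both actions write to the same index. Both action references are placeholders, the shared index name, site URL, and secret name are hypothetical, and the sketch assumes the site at `url` has already been deployed by the time the workflow runs.

```yaml
name: Index repository and website

on:
  push:
    branches:
      - main

jobs:
  index-repository:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      - name: Index source files
        uses: OWNER/REPOSITORY-INDEXER-ACTION@VERSION  # placeholder reference
        with:
          apikey: ${{ secrets.OPPER_API_KEY }}  # assumed secret name
          index: documentation                  # hypothetical shared index name

  index-website:
    runs-on: ubuntu-latest
    steps:
      - name: Index deployed site
        uses: OWNER/WEB-INDEXER-ACTION@VERSION  # placeholder reference
        with:
          apikey: ${{ secrets.OPPER_API_KEY }}  # assumed secret name
          index: documentation                  # same hypothetical index as above
          url: https://docs.example.com         # hypothetical site URL
```

The two jobs have no dependency on each other, so GitHub Actions can run them in parallel. If the site is deployed by this same workflow, adding a `needs` entry on the deploy job to `index-website` keeps the scrape from running against a stale site.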

Conclusion

GitHub Actions provide a powerful way to automate the indexing of your documentation into Opper. Whether you're working with repository files or external websites, these actions make it easy to keep your knowledge base searchable and up-to-date.

By setting up these workflows, you can ensure that your documentation is automatically indexed whenever it changes, making it immediately searchable through Opper-powered applications and tools.

For more information, visit the GitHub repositories for the Repository Indexer Action and Web Indexer Action.