GitHub Connector

Index pull requests, commits, comments, and markdown in your GitHub repositories.

Create Access Token

To install the GitHub connector, use a GitHub account with admin access to the organization (read access to teams is required). Using this account, create a personal access token (currently Classic only) according to these instructions. Ensure that the repo and user scopes are set.

Note: For GitHub Enterprise Server installations, you also need to enable the site_admin scope.
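
One way to confirm that a classic token carries the scopes listed above is to inspect the X-OAuth-Scopes response header that GitHub returns on API requests made with that token. The sketch below only parses such a header value; the helper name missing_scopes and the REQUIRED_SCOPES constant are illustrative and not part of the connector.

```python
# Scopes this guide asks for on the personal access token.
REQUIRED_SCOPES = {"repo", "user"}


def missing_scopes(scopes_header: str, enterprise_server: bool = False) -> set[str]:
    """Return the required scopes that are absent from an X-OAuth-Scopes value.

    scopes_header is the raw header value, e.g. "repo, user".
    """
    granted = {scope.strip() for scope in scopes_header.split(",") if scope.strip()}
    required = set(REQUIRED_SCOPES)
    if enterprise_server:
        # GitHub Enterprise Server installations also need site_admin.
        required.add("site_admin")
    return required - granted
```

An empty result means that every scope named in this section is present on the token.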

(Optional) Create GitHub App

Note: The GitHub App flow does not work with GitHub Enterprise Server installations. For those, use a personal access token as described above.

Instead of using a personal access token, you can create a GitHub app to install the GitHub connector. This provides better rate limits, especially if your org is on Enterprise Cloud. You can create a GitHub app according to these instructions.

You will need to ensure the following app permissions are set:

Repository Permissions:

Permission        Value
Administration    Read-only
Contents          Read-only
Issues            Read-only
Metadata          Read-only
Pull Requests     Read-only

Organization Permissions:

Permission        Value
Administration    Read-only
Members           Read-only

Setup Webhooks

If you want live indexing of GitHub content, set up webhooks. Go to Settings->Webhooks->Add Webhook. After authenticating, set up the webhook with the following configuration:

Config                                                  Value
Payload URL                                             https://search.example.com/callback/github/
Content type                                            application/json
Secret                                                  Generated value (e.g. openssl rand -base64 32)
SSL verification                                        Enable SSL verification
Active                                                  Checked
Which events would you like to trigger this webhook?    Let me select individual events
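
For illustration, the secret can also be generated in Python, and the same secret is what GitHub uses to sign each delivery: the X-Hub-Signature-256 header carries "sha256=" followed by the HMAC-SHA256 hex digest of the raw request body. The sketch below shows both steps. The function names are hypothetical, and whether the connector validates deliveries in exactly this way is an assumption.

```python
import base64
import hashlib
import hmac
import secrets


def generate_webhook_secret(num_bytes: int = 32) -> str:
    """Rough Python equivalent of `openssl rand -base64 32`."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def signature_is_valid(secret: str, body: bytes, signature_header: str) -> bool:
    """Check a GitHub X-Hub-Signature-256 header against the raw request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature_header)
```

The signature must be computed over the exact bytes received, before any JSON parsing or re-serialization.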

The following events are indexed at the repository level:

  • Collaborator add, remove, or changed
  • Commit comments
  • Issues
  • Issue comments
  • Pull requests
  • Pull request reviews
  • Pull request review comments
  • Pushes
  • Repositories

The following events are indexed at the organization level:

  • Team

Please manually check the boxes for each of the events above.

Warning: Setting up webhooks at the organization level will send webhook events for all repositories within the organization. If this is not desirable, change the configuration so that only specific repositories are processed and indexed.

Provide Configuration

Provide these configuration values to your Deployment Engineer:

  • OrgName should be the identifier for your org (e.g. atolio)
  • ApplicationID is your GitHub app ID, which can be found on your GitHub app settings page. This field applies only to GitHub and GitHub Enterprise Cloud installations and has no effect on GitHub Enterprise Server installations
  • InstallationID is created when the GitHub app is first installed on your org. It can be found at the end of the URL for your installation: https://github.com/organizations/{ORG}/settings/installations/{INSTALLATION_ID}. This field applies only to GitHub and GitHub Enterprise Cloud installations and has no effect on GitHub Enterprise Server installations
  • SamlAuthenticationLevel is the level at which you enabled SAML authentication. There are two options: enterprise or org
  • EnterpriseBaseURL is the hostname of your GitHub Enterprise Server instance
  • EnterpriseSlug is your GitHub Enterprise name
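
As a small convenience, the OrgName and InstallationID values can be read off the installation URL described above. The following is a minimal sketch that assumes the URL has exactly the documented shape; parse_installation_url is a hypothetical helper, not part of the product.

```python
import re

# Matches https://github.com/organizations/{ORG}/settings/installations/{INSTALLATION_ID}
_INSTALLATION_URL = re.compile(
    r"^https://github\.com/organizations/(?P<org>[^/]+)"
    r"/settings/installations/(?P<installation_id>\d+)/?$"
)


def parse_installation_url(url: str) -> tuple[str, int]:
    """Return (org name, installation ID) from a GitHub app installation URL."""
    match = _INSTALLATION_URL.match(url.strip())
    if match is None:
        raise ValueError(f"not a GitHub app installation URL: {url!r}")
    return match.group("org"), int(match.group("installation_id"))
```

For example, a URL ending in /organizations/atolio/settings/installations/12345678 (a made-up ID) yields the pair ("atolio", 12345678).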

Note: If you have many GitHub repositories, it may be worth using the includes or excludes feature in Atolio configuration to control the relevant repositories that should be indexed.

The GitHub connector supports two types of inclusions/exclusions in the resources field of the configuration, as illustrated in the example below:

resources:
    repository:
        included:
            - lumen-infra
    file:
        included:
            - .md

The repository name lumen-infra in this example is the repo part of the repository’s URL (which has the following generic format: https://github.com/{owner}/{repo}). The example above will only index the lumen-infra repository.

File filtering is done on the basis of file extensions (including the preceding dot .). The example above configures the connector to index all markdown files. Note that file filtering is completely disabled by default because it is a relatively time-consuming process.
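
To make the intended effect of the example configuration concrete, the sketch below models the two filters as a single predicate. It is an illustration only: the function should_index, the exact-match comparison, and the treatment of an empty list as "no filtering" are assumptions, not the connector's actual implementation.

```python
from pathlib import PurePosixPath


def should_index(
    repo: str,
    path: str,
    included_repos: list[str],
    included_extensions: list[str],
) -> bool:
    """Decide whether a file in a repository passes the include filters.

    An empty list is treated as "filter disabled" (an assumption).
    """
    if included_repos and repo not in included_repos:
        return False
    # suffix includes the leading dot, e.g. ".md" for "docs/README.md".
    extension = PurePosixPath(path).suffix
    if included_extensions and extension not in included_extensions:
        return False
    return True
```

With included_repos set to ["lumen-infra"] and included_extensions set to [".md"], docs/README.md in lumen-infra passes, while main.go in the same repository and any file in another repository do not.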

Provide Secrets

Provide these secrets to your Deployment Engineer:

  • WebhookSecret is the webhook secret previously created
  • Token is the personal access token
  • AppPrivateKey is the GitHub App private key in PEM format

(Optional) Configure GitHub user mappings

If employees have brought pre-existing GitHub accounts to their company’s GitHub organization, the identity resolver cannot automatically map them to company user identities. For this reason, the GitHub configuration contains a user mapping table. For example, the entries below map two GitHub user IDs to the user’s username (i.e. email address) in the organization:

12345678:  user1@example.com # "user1"
87654321:  user2@example.com # "user2"

To obtain the numerical GitHub ID, open https://api.github.com/users/{USERNAME} in a browser, replacing {USERNAME} with the GitHub username. The username can be found in the user’s profile, under the full name printed below the user’s avatar. The ID is in the id field of the JSON response. Adding an entry like this for every user allows Atolio to link the GitHub user IDs to other sources.
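
When many users need to be mapped, the lookup can be scripted. The sketch below uses the same https://api.github.com/users/{USERNAME} endpoint and formats an entry in the style of the example mapping above. The function names are hypothetical, and unauthenticated requests to the GitHub API are rate limited, so a large org may need to add authentication or pacing.

```python
import json
import urllib.parse
import urllib.request


def user_api_url(username: str) -> str:
    """Build the GitHub API URL for a username."""
    return "https://api.github.com/users/" + urllib.parse.quote(username, safe="")


def extract_user_id(response_body: str) -> int:
    """Read the numerical ID from the id field of the JSON response."""
    return int(json.loads(response_body)["id"])


def fetch_user_id(username: str) -> int:
    """Look up a user's numerical ID over the network."""
    with urllib.request.urlopen(user_api_url(username), timeout=10) as response:
        return extract_user_id(response.read().decode("utf-8"))


def mapping_line(user_id: int, email: str, username: str) -> str:
    """Format one entry of the user mapping table."""
    return f'{user_id}:  {email} # "{username}"'
```

For instance, mapping_line(fetch_user_id("user1"), "user1@example.com", "user1") produces a line that can be pasted into the mapping table.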