GitHub

Overview

GitHub is an online tool for software collaboration.

This GitHub class uses the PyGithub library to make requests to the GitHub REST API. The class provides methods to:

  • Get an individual user, organization, repo, issue, or pull request

  • Get lists of user or organization repos for a given username or organization_name

  • Get lists of repo issues, pull requests, and contributors for a given repo_name

  • Download files and tables

Note

API Credentials
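  • The class accepts a GitHub username and password. Note that GitHub stopped accepting account passwords for REST API requests in November 2020, so this option may be rejected by the server.

  • You can also use a personal access token. This is the more reliable choice, and it is the only credential download_file can use to authenticate.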

Quickstart

To instantiate the GitHub class with a username and password, either store them as environment variables (GITHUB_USERNAME and GITHUB_PASSWORD) or pass them in as arguments. Alternatively, provide a personal access token as an environment variable (GITHUB_ACCESS_TOKEN) or as an argument.

from parsons import GitHub

# Authenticate by passing username and password arguments
github = GitHub(username='my_username', password='my_password')

# Authenticate by passing an access token as an argument
github = GitHub(access_token='my_access_token')

# Authenticate with environment variables
github = GitHub()

With the class instantiated, you can now call various endpoints.

# Get repo by its full name (account/name)
parsons_repo = github.get_repo("move-coop/parsons")

# Get a repo's open issues as a Table (all results are returned because page is not set)
parsons_issues_table = github.list_repo_issues("move-coop/parsons")

# Download Parsons README.md to local "/tmp/README.md"
parsons_readme_path = github.download_file("move-coop/parsons", "README.md", local_path="/tmp/README.md")

API

class parsons.GitHub(username=None, password=None, access_token=None)

Creates a GitHub class for accessing the GitHub API.

Uses parsons.utilities.check_env to load credentials from environment variables if not supplied. Supports either a username and password or an access token for authentication. The client also supports unauthenticated access.

Parameters:
  • username – Optional[str] Username of account to use for credentials. Can be set with GITHUB_USERNAME environment variable.

  • password – Optional[str] Password of account to use for credentials. Can be set with GITHUB_PASSWORD environment variable.

  • access_token – Optional[str] Access token to use for credentials. Can be set with GITHUB_ACCESS_TOKEN environment variable.
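To make the credential fallback concrete, here is a minimal sketch of the lookup order described above. It is not the library's implementation: the function name resolve_github_auth and the env parameter are hypothetical, and the precedence of a username/password pair over a token is an assumption.

```python
import os


def resolve_github_auth(username=None, password=None, access_token=None, env=None):
    """Return (mode, credentials) describing which authentication would be used.

    Hypothetical helper for illustration only; Parsons itself resolves
    credentials with parsons.utilities.check_env.
    """
    env = os.environ if env is None else env

    # Explicit arguments win; otherwise fall back to the documented variables.
    username = username or env.get("GITHUB_USERNAME")
    password = password or env.get("GITHUB_PASSWORD")
    access_token = access_token or env.get("GITHUB_ACCESS_TOKEN")

    # Assumption: a complete username/password pair is tried before a token.
    if username and password:
        return "password", {"username": username, "password": password}
    if access_token:
        return "token", {"access_token": access_token}

    # No credentials at all: the client falls back to unauthenticated access.
    return "anonymous", {}
```

Passing a plain dict as env makes the fallback easy to check without touching the real environment.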

get_user(username)

Loads a GitHub user by username

Parameters:

username – str Username of user to load

Returns:

dict

User information

get_organization(organization_name)

Loads a GitHub organization by name

Parameters:

organization_name – str Name of organization to load

Returns:

dict

Organization information

get_repo(repo_name)

Loads a GitHub repo by name

Parameters:

repo_name – str Full repo name (account/name)

Returns:

dict

Repo information
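Since get_repo returns a plain dict, individual fields can be read with ordinary dict access. The sketch below assumes the dict mirrors the GitHub REST API repository payload (keys such as full_name and stargazers_count); that mapping is an assumption, and summarize_repo is a hypothetical helper, not part of Parsons.

```python
def summarize_repo(github, repo_name):
    """Return a few commonly used fields from the dict that get_repo returns.

    `github` is any object with a get_repo(repo_name) method, such as an
    instantiated parsons.GitHub client.
    """
    repo = github.get_repo(repo_name)

    # .get() returns None instead of raising KeyError if a field is absent.
    # Assumption: keys follow the GitHub REST API repository object.
    return {
        "full_name": repo.get("full_name"),
        "default_branch": repo.get("default_branch"),
        "stars": repo.get("stargazers_count"),
        "open_issues": repo.get("open_issues_count"),
    }


# Example call, assuming parsons is installed and credentials are configured:
#   from parsons import GitHub
#   summarize_repo(GitHub(), "move-coop/parsons")
```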

list_user_repos(username, page=None, page_size=100)

List user repos with pagination, returning a Table

Parameters:
  • username – str GitHub username

  • page – Optional[int] Page number. All results are returned if not set.

  • page_size – int Page size. Defaults to 100.

Returns:

Table

Table with page of user repos
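When a listing is too large to load at once, the page and page_size parameters can be driven from a loop. The following is a rough sketch, assuming pages are numbered from 1 and that the returned Table exposes num_rows; iter_pages is a hypothetical helper that works with any of the list_* methods, not a Parsons function.

```python
def iter_pages(list_method, *args, page_size=100, **kwargs):
    """Yield one Table per page until the listing is exhausted.

    `list_method` is a bound method such as github.list_user_repos.
    """
    page = 1  # Assumption: the first page is page 1, not page 0.
    while True:
        table = list_method(*args, page=page, page_size=page_size, **kwargs)
        if table.num_rows == 0:
            break  # An empty page means there is nothing left.
        yield table
        if table.num_rows < page_size:
            break  # A short page is the last one, so skip the extra request.
        page += 1


# Example call, assuming `github` is an instantiated parsons.GitHub client:
#   for repos in iter_pages(github.list_user_repos, "my_username", page_size=50):
#       print(repos.num_rows)
```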

list_organization_repos(organization_name, page=None, page_size=100)

List organization repos with pagination, returning a Table

Parameters:
  • organization_name – str GitHub organization name

  • page – Optional[int] Page number. All results are returned if not set.

  • page_size – int Page size. Defaults to 100.

Returns:

Table

Table with page of organization repos

get_issue(repo_name, issue_number)

Loads a GitHub issue

Parameters:
  • repo_name – str Full repo name (account/name)

  • issue_number – int Number of issue to load

Returns:

dict

Issue information

list_repo_issues(repo_name, state='open', assignee=None, creator=None, mentioned=None, labels=None, sort='created', direction='desc', since=None, page=None, page_size=100)

List issues for a given repo

Parameters:
  • repo_name – str Full repo name (account/name)

  • state – str State of issues to return. One of “open”, “closed”, “all”. Defaults to “open”.

  • assignee – Optional[str] Name of assigned user, “none”, or “*”.

  • creator – Optional[str] Name of user that created the issue.

  • mentioned – Optional[str] Name of user mentioned in the issue.

  • labels – Optional[list[str]] List of label names. Defaults to None, which is equivalent to an empty list.

  • sort – str What to sort results by. One of “created”, “updated”, “comments”. Defaults to “created”.

  • direction – str Direction to sort. One of “asc”, “desc”. Defaults to “desc”.

  • since – Optional[Union[datetime.datetime, datetime.date]] Timestamp to pull issues since. Defaults to None.

  • page – Optional[int] Page number. All results are returned if not set.

  • page_size – int Page size. Defaults to 100.

Returns:

Table

Table with page of repo issues
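The filter parameters above can be combined in a single call. As an illustration, this sketch fetches issues in any state that were updated within the last few days; recently_updated_issues is a hypothetical wrapper, and the seven-day window and the "bug" label in the usage comment are arbitrary example values.

```python
from datetime import datetime, timedelta, timezone


def recently_updated_issues(github, repo_name, days=7, labels=None):
    """Return a Table of issues updated in the last `days` days.

    `github` is any object with a list_repo_issues method, such as an
    instantiated parsons.GitHub client.
    """
    # `since` accepts a datetime; build a UTC timestamp `days` in the past.
    since = datetime.now(timezone.utc) - timedelta(days=days)

    return github.list_repo_issues(
        repo_name,
        state="all",            # include both open and closed issues
        labels=labels or [],    # no label filter unless one is given
        sort="updated",
        direction="desc",       # most recently updated first
        since=since,
    )


# Example call, assuming `github` is an instantiated parsons.GitHub client:
#   recently_updated_issues(github, "move-coop/parsons", days=14, labels=["bug"])
```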

get_pull_request(repo_name, pull_request_number)

Loads a GitHub pull request

Parameters:
  • repo_name – str Full repo name (account/name)

  • pull_request_number – int Pull request number

Returns:

dict

Pull request information

list_repo_pull_requests(repo_name, state='open', base=None, sort='created', direction='desc', page=None, page_size=100)

Lists pull requests for a given repo

Parameters:
  • repo_name – str Full repo name (account/name)

  • state – str State of pull requests to return. One of “open”, “closed”, “all”. Defaults to “open”.

  • base – Optional[str] Base branch to filter pull requests by.

  • sort – str How to sort pull requests. One of “created”, “updated”, “popularity”. Defaults to “created”.

  • direction – str Direction to sort by. One of “asc”, “desc”. Defaults to “desc”.

  • page – Optional[int] Page number. All results are returned if not set.

  • page_size – int Page size. Defaults to 100.

Returns:

Table

Table with page of repo pull requests
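Sorting and paging can be combined to pull a bounded slice of results. A small sketch, assuming pages are numbered from 1: it requests only the open pull requests that have gone longest without an update. The helper name stalest_pull_requests and the default limit of 20 are hypothetical.

```python
def stalest_pull_requests(github, repo_name, base=None, limit=20):
    """Return a Table of up to `limit` open pull requests, oldest update first.

    `github` is any object with a list_repo_pull_requests method, such as an
    instantiated parsons.GitHub client.
    """
    return github.list_repo_pull_requests(
        repo_name,
        state="open",
        base=base,           # optionally restrict to one base branch
        sort="updated",
        direction="asc",     # least recently updated first
        page=1,              # assumption: the first page is page 1
        page_size=limit,     # one page of `limit` rows is all we need
    )


# Example call, assuming `github` is an instantiated parsons.GitHub client:
#   stalest_pull_requests(github, "move-coop/parsons", limit=10)
```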

list_repo_contributors(repo_name, page=None, page_size=100)

Lists contributors for a given repo

Parameters:
  • repo_name – str Full repo name (account/name)

  • page – Optional[int] Page number. All results are returned if not set.

  • page_size – int Page size. Defaults to 100.

Returns:

Table

Table with page of repo contributors

download_file(repo_name, path, branch=None, local_path=None)

Download a file from a repo by path and branch. Defaults to the repo’s default branch if branch is not supplied.

Uses the download_url directly rather than the API because the API only supports contents up to 1MB from a repo directly, and the process for downloading larger files through the API is much more involved.

Because download_url does not go through the API, it does not support username / password authentication, and requires a token to authenticate.

Parameters:
  • repo_name – str Full repo name (account/name)

  • path – str Path from the repo base directory

  • branch – Optional[str] Branch to download file from. Defaults to repo default branch

  • local_path – Optional[str] Local file path to download file to. Will create a temp file if not supplied.

Returns:

str

File path of downloaded file
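Because download_file returns a local path rather than the file contents, reading the file is a separate step. A minimal sketch, assuming the file is UTF-8 text; read_repo_text is a hypothetical helper, not part of Parsons.

```python
from pathlib import Path


def read_repo_text(github, repo_name, path, branch=None, encoding="utf-8"):
    """Download a file from a repo and return its contents as a string.

    `github` is any object with a download_file method, such as an
    instantiated parsons.GitHub client.
    """
    # With no local_path, the file lands in a temp file and its path is returned.
    local_path = github.download_file(repo_name, path, branch=branch)

    # Assumption: the file is text in the given encoding (UTF-8 by default).
    return Path(local_path).read_text(encoding=encoding)


# Example call, assuming `github` was created with an access token:
#   readme = read_repo_text(github, "move-coop/parsons", "README.md")
```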

download_table(repo_name, path, branch=None, local_path=None, delimiter=',')

Download a CSV file from a repo by path and branch as a Parsons Table.

Parameters:
  • repo_name – str Full repo name (account/name)

  • path – str Path from the repo base directory

  • branch – Optional[str] Branch to download file from. Defaults to repo default branch

  • local_path – Optional[str] Local file path to download file to. Will create a temp file if not supplied.

  • delimiter – Optional[str] The CSV delimiter to use to parse the data. Defaults to ‘,’

Returns:

Parsons Table

See Parsons Table for output options.
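For delimited files that are not comma-separated, pass the delimiter explicitly. The sketch below also checks that the expected columns are present before the Table is used. It assumes the Table exposes a columns list; download_table_checked is a hypothetical helper, and the file path in the usage comment is made up for illustration.

```python
def download_table_checked(github, repo_name, path, required_columns,
                           delimiter=",", branch=None):
    """Download a delimited file as a Table and verify its header.

    `github` is any object with a download_table method, such as an
    instantiated parsons.GitHub client.
    """
    table = github.download_table(
        repo_name, path, branch=branch, delimiter=delimiter
    )

    # A wrong delimiter usually shows up as one merged column, so a header
    # check catches it early. Assumption: Table.columns lists column names.
    missing = [name for name in required_columns if name not in table.columns]
    if missing:
        raise ValueError(f"{path} is missing columns: {missing}")
    return table


# Example call with a tab-separated file (hypothetical repo and path):
#   download_table_checked(github, "my_org/my_repo", "data/lookup.tsv",
#                          ["id", "name"], delimiter="\t")
```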