GitHub

Overview

GitHub is an online tool for software collaboration.

This GitHub class uses the PyGithub library to make requests to the GitHub REST API. The class provides methods to:

  • Get an individual user, organization, repo, issue, or pull request

  • Get lists of user or organization repos for a given username or organization_name

  • Get lists of repo issues, pull requests, and contributors for a given repo_name

  • Download files and tables

Note

API Credentials
  • If you have a GitHub account, you can use your normal username and password to authenticate with the API.

  • You can also use a personal access token.

Quickstart

To instantiate the GitHub class with a username and password, either store them as environment variables (GITHUB_USERNAME and GITHUB_PASSWORD) or pass them in as arguments. Alternatively, you can provide a personal access token as an environment variable (GITHUB_ACCESS_TOKEN) or as an argument.

from parsons import GitHub

# Authenticate by passing username and password arguments
github = GitHub(username='my_username', password='my_password')

# Authenticate by passing an access token as an argument
github = GitHub(access_token='my_access_token')

# Authenticate with environment variables
github = GitHub()

With the class instantiated, you can now call various endpoints.

# Get repo by its full name (account/name)
parsons_repo = github.get_repo("move-coop/parsons")

# Get a repo's open issues as a Table (all results are returned when no page is given)
parsons_issues_table = github.list_repo_issues("move-coop/parsons")

# Download Parsons README.md to local "/tmp/README.md"
parsons_readme_path = github.download_file("move-coop/parsons", "README.md", local_path="/tmp/README.md")

API

class parsons.GitHub(username=None, password=None, access_token=None)

Creates a GitHub class for accessing the GitHub API.

Uses parsons.utilities.check_env to load credentials from environment variables if not supplied. Supports either a username and password or an access token for authentication. The client also supports unauthenticated access.

Args:
username: Optional[str]

Username of account to use for credentials. Can be set with GITHUB_USERNAME environment variable.

password: Optional[str]

Password of account to use for credentials. Can be set with GITHUB_PASSWORD environment variable.

access_token: Optional[str]

Access token to use for credentials. Can be set with GITHUB_ACCESS_TOKEN environment variable.
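The fallback from arguments to environment variables described above can be sketched with the standard library alone. This is a minimal illustration, not the Parsons implementation: `resolve_credential` is a hypothetical helper, and the rule that an explicit argument takes precedence over the environment is an assumption.

```python
import os
from typing import Optional


def resolve_credential(value: Optional[str], env_name: str) -> Optional[str]:
    """Return the explicit argument if given, else the environment variable.

    Hypothetical stand-in for illustration; not the Parsons implementation.
    """
    if value is not None:
        return value
    # os.environ.get returns None when the variable is not set
    return os.environ.get(env_name)


# Example values only; real credentials should never be hard-coded.
os.environ["GITHUB_ACCESS_TOKEN"] = "token-from-env"

explicit = resolve_credential("token-from-arg", "GITHUB_ACCESS_TOKEN")
fallback = resolve_credential(None, "GITHUB_ACCESS_TOKEN")
missing = resolve_credential(None, "GITHUB_EXAMPLE_UNSET_VARIABLE")
```

Here `explicit` holds the argument value, `fallback` holds the value read from the environment, and `missing` is None, which corresponds to the unauthenticated case mentioned above.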

get_user(username)

Loads a GitHub user by username

Args:
username: str

Username of user to load

Returns:
dict

User information

get_organization(organization_name)

Loads a GitHub organization by name

Args:
organization_name: str

Name of organization to load

Returns:
dict

Organization information

get_repo(repo_name)

Loads a GitHub repo by name

Args:
repo_name: str

Full repo name (account/name)

Returns:
dict

Repo information
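Since get_repo returns a plain dict, individual fields can be picked out with ordinary dict access. The sketch below is hedged: the key names (`full_name`, `default_branch`, `open_issues_count`) are assumed to follow the GitHub REST API repository object and are not confirmed by this page, and `summarize_repo` is a hypothetical helper.

```python
from typing import Any, Dict


def summarize_repo(repo: Dict[str, Any]) -> Dict[str, Any]:
    """Pick a few fields out of a repo dict, tolerating missing keys.

    The key names are assumptions based on the GitHub REST API repo object.
    """
    return {
        "full_name": repo.get("full_name"),
        "default_branch": repo.get("default_branch"),
        "open_issues_count": repo.get("open_issues_count"),
    }


# Illustrative input only; a real dict would come from github.get_repo(...)
sample_repo = {"full_name": "move-coop/parsons", "default_branch": "main"}
summary = summarize_repo(sample_repo)
```

Using `.get` rather than indexing means a missing key yields None instead of raising KeyError, which is convenient when the exact shape of the dict is not guaranteed.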

list_user_repos(username, page=None, page_size=100)

List user repos with pagination, returning a Table

Args:
username: str

GitHub username

page: Optional[int]

Page number. All results are returned if not set.

page_size: int

Page size. Defaults to 100.

Returns:
Table

Table with page of user repos
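When a full listing is too large to fetch at once, the page and page_size arguments can be used to walk the results one page at a time. The following is a sketch under stated assumptions: `iter_repo_pages` is a hypothetical helper, page numbering is assumed to start at 1 (this page does not say), and the returned Table is assumed to expose a `num_rows` attribute.

```python
from typing import Any, Iterator


def iter_repo_pages(github: Any, username: str, page_size: int = 100) -> Iterator[Any]:
    """Yield one Table per page until a page comes back short or empty.

    Assumes page numbers start at 1 (not confirmed by the documentation).
    """
    page = 1
    while True:
        table = github.list_user_repos(username, page=page, page_size=page_size)
        if table.num_rows == 0:
            break
        yield table
        # A short page means there is nothing left to fetch
        if table.num_rows < page_size:
            break
        page += 1
```

A caller would pass an instantiated client, for example `for table in iter_repo_pages(github, "my_username", page_size=50): ...`. The same pattern applies to the other list methods that accept page and page_size.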

list_organization_repos(organization_name, page=None, page_size=100)

List organization repos with pagination, returning a Table

Args:
organization_name: str

GitHub organization name

page: Optional[int]

Page number. All results are returned if not set.

page_size: int

Page size. Defaults to 100.

Returns:
Table

Table with page of organization repos

get_issue(repo_name, issue_number)

Loads a GitHub issue

Args:
repo_name: str

Full repo name (account/name)

issue_number: int

Number of issue to load

Returns:
dict

Issue information

list_repo_issues(repo_name, state='open', assignee=None, creator=None, mentioned=None, labels=[], sort='created', direction='desc', since=None, page=None, page_size=100)

List issues for a given repo

Args:
repo_name: str

Full repo name (account/name)

state: str

State of issues to return. One of “open”, “closed”, “all”. Defaults to “open”.

assignee: Optional[str]

Name of assigned user, “none”, or “*”.

creator: Optional[str]

Name of user that created the issue.

mentioned: Optional[str]

Name of user mentioned in the issue.

labels: list[str]

List of label names. Defaults to []

sort: str

What to sort results by. One of “created”, “updated”, “comments”. Defaults to “created”.

direction: str

Direction to sort. One of “asc”, “desc”. Defaults to “desc”.

since: Optional[Union[datetime.datetime, datetime.date]]

Timestamp to pull issues since. Defaults to None.

page: Optional[int]

Page number. All results are returned if not set.

page_size: int

Page size. Defaults to 100.

Returns:
Table

Table with page of repo issues
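The filter arguments above can be combined in a single call. As a hedged sketch, the hypothetical helper below asks for issues in any state that carry a "bug" label (an assumed label name) and fall within a recent time window, most recently updated first. It takes an already instantiated client as its first argument.

```python
from datetime import datetime, timedelta, timezone
from typing import Any


def recent_bug_issues(github: Any, repo_name: str, days: int = 7) -> Any:
    """Return issues labelled "bug" from the last `days` days as a Table.

    The "bug" label is an assumption about the target repo's labels.
    """
    since = datetime.now(timezone.utc) - timedelta(days=days)
    return github.list_repo_issues(
        repo_name,
        state="all",          # include both open and closed issues
        labels=["bug"],
        sort="updated",
        direction="desc",
        since=since,
    )
```

For example, `recent_bug_issues(github, "move-coop/parsons", days=30)` would widen the window to thirty days. Because page is left unset, all matching results are returned.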

get_pull_request(repo_name, pull_request_number)

Loads a GitHub pull request

Args:
repo_name: str

Full repo name (account/name)

pull_request_number: int

Pull request number

Returns:
dict

Pull request information

list_repo_pull_requests(repo_name, state='open', base=None, sort='created', direction='desc', page=None, page_size=100)

Lists pull requests for a given repo

Args:
repo_name: str

Full repo name (account/name)

state: str

State of pull requests to return. One of “open”, “closed”, “all”. Defaults to “open”.

base: Optional[str]

Base branch to filter pull requests by.

sort: str

How to sort pull requests. One of “created”, “updated”, “popularity”. Defaults to “created”.

direction: str

Direction to sort by. One of “asc”, “desc”. Defaults to “desc”.

page: Optional[int]

Page number. All results are returned if not set.

page_size: int

Page size. Defaults to 100.

Returns:
Table

Table with page of repo pull requests
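As an illustration of how base, sort, and direction interact, the hypothetical helper below lists open pull requests against one base branch, ordered so that the least recently updated come first. The branch name "main" is an assumed default, not something this page specifies.

```python
from typing import Any


def stale_open_pull_requests(github: Any, repo_name: str, base: str = "main") -> Any:
    """Return open pull requests targeting `base`, oldest update first.

    The default base branch name "main" is an assumption.
    """
    return github.list_repo_pull_requests(
        repo_name,
        state="open",
        base=base,
        sort="updated",
        direction="asc",  # ascending by update time puts stale PRs at the top
    )
```

Passing `direction="desc"` instead would surface the most recently updated pull requests first.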

list_repo_contributors(repo_name, page=None, page_size=100)

Lists contributors for a given repo

Args:
repo_name: str

Full repo name (account/name)

page: Optional[int]

Page number. All results are returned if not set.

page_size: int

Page size. Defaults to 100.

Returns:
Table

Table with page of repo contributors

download_file(repo_name, path, branch=None, local_path=None)

Download a file from a repo by path and branch. Defaults to the repo’s default branch if branch is not supplied.

Uses the download_url directly rather than the API because the API only supports contents up to 1MB from a repo directly, and the process for downloading larger files through the API is much more involved.

Because download_url does not go through the API, it does not support username / password authentication, and requires a token to authenticate.

Args:
repo_name: str

Full repo name (account/name)

path: str

Path from the repo base directory

branch: Optional[str]

Branch to download file from. Defaults to repo default branch

local_path: Optional[str]

Local file path to download file to. Will create a temp file if not supplied.

Returns:
str

File path of downloaded file
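To make the token requirement concrete, here is a sketch of what a direct request to a download_url might look like. It uses only the standard library and does not send anything over the network. Both the `build_download_request` helper and the `Authorization: token ...` header format are assumptions for illustration, not a description of what Parsons does internally, and the URL is an example value.

```python
from typing import Dict, Optional
from urllib.request import Request


def build_download_request(download_url: str, access_token: Optional[str] = None) -> Request:
    """Build (but do not send) a GET request for a raw file URL.

    The token header format is an assumption; public files need no header.
    """
    headers: Dict[str, str] = {}
    if access_token:
        headers["Authorization"] = f"token {access_token}"
    return Request(download_url, headers=headers)


# Example URL only; a real download_url comes from the GitHub API response.
example_url = "https://raw.githubusercontent.com/move-coop/parsons/main/README.md"
authed_request = build_download_request(example_url, access_token="my_access_token")
anonymous_request = build_download_request(example_url)
```

Note that there is no slot for a username and password in this request, which is consistent with the restriction described above.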

download_table(repo_name, path, branch=None, local_path=None, delimiter=',')

Download a CSV file from a repo by path and branch as a Parsons Table.

Args:
repo_name: str

Full repo name (account/name)

path: str

Path from the repo base directory

branch: Optional[str]

Branch to download file from. Defaults to repo default branch

local_path: Optional[str]

Local file path to download file to. Will create a temp file if not supplied.

delimiter: Optional[str]

The CSV delimiter to use to parse the data. Defaults to ‘,’

Returns:
Parsons Table

See Parsons Table for output options.
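To show what the delimiter argument controls, the sketch below parses a small tab-separated file with Python's csv module. It is a standard-library stand-in for illustration, not the Parsons Table loader, and `read_delimited` and the sample data are hypothetical.

```python
import csv
import tempfile
from typing import Dict, List


def read_delimited(path: str, delimiter: str = ",") -> List[Dict[str, str]]:
    """Read a delimited text file into a list of row dicts keyed by header."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


# Write a tiny tab-separated sample file to parse
with tempfile.NamedTemporaryFile(
    "w", suffix=".tsv", delete=False, newline="", encoding="utf-8"
) as handle:
    handle.write("id\tlabel\n1\tfirst\n2\tsecond\n")
    sample_path = handle.name

rows = read_delimited(sample_path, delimiter="\t")
```

Parsing the same file with the default comma delimiter would produce a single column whose header is the whole first line, which is why a tab-separated file in a repo would need `delimiter='\t'` passed to download_table.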