Azure: Blob Storage

Overview

Azure Blob Storage is a cloud file storage system that uses storage accounts to organize containers (similar to “buckets” for other storage providers) in which to store arbitrary files referred to as ‘blobs’. This Parsons integration currently only implements block blobs, not page or append blobs.

Note

Authentication
This connector requires authentication credentials for an Azure storage account. The connector is built on the azure-storage-blob library, and that library's documentation includes examples of how to create and use several types of credentials.

Quickstart

This class requires a credential argument and either an account_name argument or an account_url argument that includes the account name. You can store these as environment variables (AZURE_CREDENTIAL, AZURE_ACCOUNT_NAME, and AZURE_ACCOUNT_URL, respectively) or pass them in as arguments:

Instantiate class

from parsons import AzureBlobStorage

# First approach: Use API credentials via environment variables
azure_blob = AzureBlobStorage()

# Second approach: Pass API credentials as arguments
azure_blob = AzureBlobStorage(account_name='my_account_name', credential='1234')
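The fallback from arguments to environment variables can be pictured with a small sketch. This is not the Parsons implementation: resolve_setting is a hypothetical helper, and the assumption that an explicit argument takes precedence over the environment is ours. Only the variable name AZURE_ACCOUNT_NAME comes from the text above.

```python
import os


def resolve_setting(value, env_var):
    """Return the explicit argument if given, otherwise the environment variable.

    Hypothetical helper for illustration; not part of Parsons.
    """
    if value is not None:
        return value
    return os.environ.get(env_var)


# Set the variable here only so the example is self-contained.
os.environ["AZURE_ACCOUNT_NAME"] = "account_from_env"

# An explicit argument is used as-is.
explicit = resolve_setting("my_account_name", "AZURE_ACCOUNT_NAME")

# With no argument, the value is read from the environment.
from_env = resolve_setting(None, "AZURE_ACCOUNT_NAME")
```

In practice the variables would be set in the shell or deployment environment rather than inside the script.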

List containers and blobs

# Get all container names for a storage account
container_names = azure_blob.list_containers()

# Get all blob names for a storage account and container
blob_names = azure_blob.list_blobs(container_names[0])

Create a blob from a file or Table

# Upload a CSV file from a local file path and set the content type
# (arguments are container name, blob name, local path)
azure_blob.put_blob('container_name', 'test1.csv', './test1.csv', content_type='text/csv')

# Create a Table and upload it as a JSON blob
from parsons import Table

table = Table([{'first': 'Test', 'last': 'Person'}])
azure_blob.upload_table(table, 'container_name', 'test2.json', data_type='json')

Download a blob

# Download to a temporary file path
# (arguments are container name, blob name)
temp_file_path = azure_blob.download_blob('container_name', 'test.csv')

# Download to a specific file path
azure_blob.download_blob('container_name', 'test.csv', local_path='/tmp/test.csv')

API

class parsons.AzureBlobStorage(account_name=None, credential=None, account_domain='blob.core.windows.net', account_url=None)

Instantiate AzureBlobStorage Class for a given Azure storage account.

Args:
account_name: str
The name of the Azure storage account to use. Not required if AZURE_ACCOUNT_NAME environment variable is set, or if account_url is supplied.
credential: str
An account shared access key with access to the Azure storage account, an SAS token string, or an instance of a TokenCredentials class. Not required if AZURE_CREDENTIAL environment variable is set.
account_domain: str
The domain of the Azure storage account, defaults to “blob.core.windows.net”. Not required if AZURE_ACCOUNT_DOMAIN environment variable is set or if account_url is supplied.
account_url: str
The account URL for the Azure storage account including the account name and domain. Not required if AZURE_ACCOUNT_URL environment variable is set.
Returns:
AzureBlobStorage
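The relationship between account_name, account_domain, and account_url can be sketched as follows. This assumes the usual Azure endpoint form of https://<account_name>.<account_domain>/; both helper functions are hypothetical and are not part of Parsons, and "myaccount" is a placeholder account name.

```python
from urllib.parse import urlparse


def build_account_url(account_name, account_domain="blob.core.windows.net"):
    """Combine an account name and domain into an account URL (assumed form)."""
    return f"https://{account_name}.{account_domain}/"


def split_account_url(account_url):
    """Recover the account name and domain from an account URL."""
    host = urlparse(account_url).netloc
    # The first label of the host is the account name; the rest is the domain.
    name, _, domain = host.partition(".")
    return name, domain


url = build_account_url("myaccount")
name, domain = split_account_url(url)
```

This is why account_name and account_domain are not required when account_url is supplied: the URL already carries both.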
list_containers()

Returns a list of container names for the storage account

Returns:
list[str]
List of container names
container_exists(container_name)

Verify that a container exists within the storage account

Args:
container_name: str
The name of the container
Returns:
bool
get_container(container_name)

Returns a container client

Args:
container_name: str
The name of the container
Returns:
ContainerClient
create_container(container_name, metadata=None, public_access=None, **kwargs)

Create a container

Args:
container_name: str
The name of the container
metadata: Optional[dict[str, str]]
A dict of metadata to associate with the container.
public_access: Optional[Union[PublicAccess, str]]
Settings for public access on the container, can be ‘container’ or ‘blob’ if not None
kwargs:
Additional arguments to be supplied to the Azure Blob Storage API. See Azure Blob Storage SDK documentation for more info.
Returns:
ContainerClient
delete_container(container_name)

Delete a container.

Args:
container_name: str
The name of the container
Returns:
None
list_blobs(container_name, name_starts_with=None)

List all of the names of blobs in a container

Args:
container_name: str
The name of the container
name_starts_with: Optional[str]
A prefix to filter blob names
Returns:
list[str]
A list of blob names
blob_exists(container_name, blob_name)

Verify that a blob exists in the specified container

Args:
container_name: str
The container name
blob_name: str
The blob name
Returns:
bool
get_blob(container_name, blob_name)

Get a blob object

Args:
container_name: str
The container name
blob_name: str
The blob name
Returns:
BlobClient
get_blob_url(container_name, blob_name, account_key=None, permission=None, expiry=None, start=None)

Get a URL with a shared access signature for a blob

Args:
container_name: str
The container name
blob_name: str
The blob name
account_key: Optional[str]
An account shared access key for the storage account. Will default to the key used on initialization if one was provided as the credential, but required if it was not.
permission: Optional[Union[BlobSasPermissions, str]]
Permissions associated with the blob URL. Can be either a BlobSasPermissions object or a string where ‘r’, ‘a’, ‘c’, ‘w’, and ‘d’ correspond to read, add, create, write, and delete permissions respectively.
expiry: Optional[Union[datetime, str]]
The datetime when the URL should expire. If no timezone is given, UTC is assumed.
start: Optional[Union[datetime, str]]
The datetime when the URL should become valid. If no timezone is given, UTC is assumed. If start is None, the URL is valid as soon as it is created.
Returns:
str
URL with shared access signature for blob
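As a minimal sketch of preparing the permission, start, and expiry arguments, the helper below builds timezone-aware UTC datetimes for a read-only link. read_only_link_args is a hypothetical helper, not part of Parsons, and the commented-out call assumes an already configured azure_blob client.

```python
from datetime import datetime, timedelta, timezone


def read_only_link_args(hours_valid=1):
    """Build keyword arguments for a read-only URL valid for `hours_valid` hours.

    Hypothetical helper for illustration; not part of Parsons.
    """
    now = datetime.now(timezone.utc)
    return {
        "permission": "r",  # 'r' corresponds to read permission
        "start": now,
        "expiry": now + timedelta(hours=hours_valid),
    }


sas_args = read_only_link_args(hours_valid=2)

# With a configured client, the call would look like:
# url = azure_blob.get_blob_url('container_name', 'test1.csv', **sas_args)
```

Using explicit UTC datetimes avoids any ambiguity about how naive datetimes are interpreted.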
put_blob(container_name, blob_name, local_path, **kwargs)

Puts a blob (aka file) in a container

Args:
container_name: str
The name of the container to store the blob
blob_name: str
The name of the blob to be stored
local_path: str
The local path of the file to upload
kwargs:
Additional arguments to be supplied to the Azure Blob Storage API. See Azure Blob Storage SDK documentation for more info. Any keys that belong to the ContentSettings object will be provided to that class directly.
Returns:
BlobClient
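The routing of kwargs described above can be illustrated with a short sketch. It is not the Parsons implementation: split_put_blob_kwargs is a hypothetical helper, and the key set below is the list of ContentSettings attributes in the azure-storage-blob SDK as we understand it.

```python
# Attribute names of azure.storage.blob.ContentSettings (assumed list).
CONTENT_SETTINGS_KEYS = {
    "content_type",
    "content_encoding",
    "content_language",
    "content_disposition",
    "cache_control",
    "content_md5",
}


def split_put_blob_kwargs(**kwargs):
    """Separate ContentSettings keys from the remaining upload arguments.

    Hypothetical helper for illustration; not part of Parsons.
    """
    content_settings = {k: v for k, v in kwargs.items() if k in CONTENT_SETTINGS_KEYS}
    upload_kwargs = {k: v for k, v in kwargs.items() if k not in CONTENT_SETTINGS_KEYS}
    return content_settings, upload_kwargs


content_settings, upload_kwargs = split_put_blob_kwargs(
    content_type="text/csv",
    metadata={"source": "example"},
)
```

Here content_type would go to ContentSettings, while metadata would be passed along to the upload call.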
download_blob(container_name, blob_name, local_path=None)

Downloads a blob from a container into the specified file path or a temporary file path

Args:
container_name: str
The container name
blob_name: str
The blob name
local_path: Optional[str]
The local path where the file will be downloaded. If not specified, a temporary file is created and its path is returned; that file is removed automatically when the script finishes running.
Returns:
str
The path of the downloaded file
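Because download_blob returns a file path, a typical next step is to open that path and parse the contents. The sketch below reads a CSV blob into a list of dicts. read_csv_blob is a hypothetical helper, and StubClient is a stand-in for AzureBlobStorage so that the example runs without credentials; neither is part of Parsons.

```python
import csv
import tempfile


def read_csv_blob(client, container_name, blob_name):
    """Download a CSV blob and return its rows as a list of dicts.

    `client` is any object with a download_blob(container_name, blob_name)
    method that returns a local file path.
    """
    path = client.download_blob(container_name, blob_name)
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


class StubClient:
    """Stand-in for AzureBlobStorage; writes fixed CSV content locally."""

    def download_blob(self, container_name, blob_name, local_path=None):
        if local_path is None:
            # Mimic the temporary-file behavior when no path is given.
            handle = tempfile.NamedTemporaryFile(suffix=".csv", delete=False)
            handle.close()
            local_path = handle.name
        with open(local_path, "w", newline="") as f:
            f.write("first,last\nTest,Person\n")
        return local_path


rows = read_csv_blob(StubClient(), "container_name", "test.csv")
```

With a real client, StubClient() would be replaced by a configured AzureBlobStorage instance.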
delete_blob(container_name, blob_name)

Delete a blob in a specified container.

Args:
container_name: str
The container name
blob_name: str
The blob name
Returns:
None
upload_table(table, container_name, blob_name, data_type='csv', **kwargs)

Load the data from a Parsons table into a blob.

Args:
table: obj
A Parsons Table
container_name: str
The container name to upload the data into
blob_name: str
The blob name to upload the data into
data_type: str
The file format to use when writing the data. One of: csv or json
kwargs:
Additional keyword arguments to supply to put_blob
Returns:
BlobClient