Azure: Blob Storage
Overview
Azure Blob Storage is a cloud file storage system that uses storage accounts to organize containers (similar to “buckets” for other storage providers) in which to store arbitrary files referred to as ‘blobs’. This Parsons integration currently only implements block blobs, not page or append blobs.
Note
Authentication
This connector requires authentication credentials for an Azure Blob Storage storage account. It uses the azure-storage-blob library, and that library's documentation includes examples of how to create and use multiple types of credentials.
Quickstart
This class requires a credential argument and either an account_name argument or an account_url argument that includes the account name. You can store these as environment variables (AZURE_CREDENTIAL, AZURE_ACCOUNT_NAME, and AZURE_ACCOUNT_URL, respectively) or pass them in as arguments:
Instantiate class
from parsons import AzureBlobStorage
# First approach: Use API credentials via environment variables
azure_blob = AzureBlobStorage()
# Second approach: Pass API credentials as arguments
azure_blob = AzureBlobStorage(account_name='my_account_name', credential='1234')
List containers and blobs
# Get all container names for a storage account
container_names = azure_blob.list_containers()
# Get all blob names for a storage account and container
blob_names = azure_blob.list_blobs(container_names[0])
Create a blob from a file or Table
# Upload a CSV file from a local file path and set the content type
# (arguments are: container name, blob name, local path)
azure_blob.put_blob('container_name', 'test1.csv', './test1.csv', content_type='text/csv')
# Create a Table and upload it as a JSON blob
from parsons import Table
table = Table([{'first': 'Test', 'last': 'Person'}])
azure_blob.upload_table(table, 'container_name', 'test2.json', data_type='json')
Download a blob
# Download to a temporary file path
temp_file_path = azure_blob.download_blob('container_name', 'test.csv')
# Download to a specific file path
azure_blob.download_blob('container_name', 'test.csv', local_path='/tmp/test.csv')
API
- class parsons.AzureBlobStorage(account_name=None, credential=None, account_domain='blob.core.windows.net', account_url=None)
Instantiate AzureBlobStorage Class for a given Azure storage account.
- Args:
- account_name: str
The name of the Azure storage account to use. Not required if the AZURE_ACCOUNT_NAME environment variable is set, or if account_url is supplied.
- credential: str
An account shared access key with access to the Azure storage account, an SAS token string, or an instance of a TokenCredentials class. Not required if the AZURE_CREDENTIAL environment variable is set.
- account_domain: str
The domain of the Azure storage account, defaults to “blob.core.windows.net”. Not required if the AZURE_ACCOUNT_DOMAIN environment variable is set or if account_url is supplied.
- account_url: str
The account URL for the Azure storage account including the account name and domain. Not required if the AZURE_ACCOUNT_URL environment variable is set.
- Returns:
AzureBlobStorage
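The way the account arguments and environment variables combine can be illustrated with a small standalone sketch. This is not the connector's own code: `resolve_account_url` is a hypothetical helper, and the precedence it applies (explicit argument first, then environment variable, then the default domain) and the `https://<account_name>.<account_domain>/` URL shape are assumptions based on the argument descriptions above.

```python
import os

DEFAULT_DOMAIN = "blob.core.windows.net"


def resolve_account_url(account_name=None, account_domain=None,
                        account_url=None, env=None):
    """Hypothetical helper: work out which account URL would be used.

    Assumed precedence: explicit argument, then environment variable,
    then (for the domain only) the documented default.
    """
    env = os.environ if env is None else env

    # A full URL already contains the account name and domain.
    account_url = account_url or env.get("AZURE_ACCOUNT_URL")
    if account_url:
        return account_url

    account_name = account_name or env.get("AZURE_ACCOUNT_NAME")
    if not account_name:
        raise ValueError("Provide account_name or account_url")

    account_domain = (
        account_domain or env.get("AZURE_ACCOUNT_DOMAIN") or DEFAULT_DOMAIN
    )
    return f"https://{account_name}.{account_domain}/"


if __name__ == "__main__":
    # Passing env={} ignores the real environment for a predictable result.
    print(resolve_account_url(account_name="my_account_name", env={}))
```

Passing `env` explicitly is only a convenience for experimenting; the connector itself reads the process environment.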
- list_containers()
Returns a list of container names for the storage account
- Returns:
- list[str]
List of container names
- container_exists(container_name)
Verify that a container exists within the storage account
- Args:
- container_name: str
The name of the container
- Returns:
bool
- get_container(container_name)
Returns a container client
- Args:
- container_name: str
The name of the container
- Returns:
ContainerClient
- create_container(container_name, metadata=None, public_access=None, **kwargs)
Create a container
- Args:
- container_name: str
The name of the container
- metadata: Optional[dict[str, str]]
A dict with metadata to associate with the container.
- public_access: Optional[Union[PublicAccess, str]]
Settings for public access on the container, can be ‘container’ or ‘blob’ if not None
- kwargs:
Additional arguments to be supplied to the Azure Blob Storage API. See Azure Blob Storage SDK documentation for more info.
- Returns:
ContainerClient
- delete_container(container_name)
Delete a container.
- Args:
- container_name: str
The name of the container
- Returns:
None
- list_blobs(container_name, name_starts_with=None)
List all of the names of blobs in a container
- Args:
- container_name: str
The name of the container
- name_starts_with: Optional[str]
A prefix to filter blob names
- Returns:
- list[str]
A list of blob names
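To make the effect of name_starts_with concrete, here is a minimal in-memory sketch of prefix filtering. `filter_by_prefix` is a hypothetical stand-in that works on a plain list of names; it does not call Azure, and where the real filtering happens (client or service) is not something this sketch asserts.

```python
def filter_by_prefix(blob_names, name_starts_with=None):
    """Hypothetical stand-in for prefix filtering of blob names."""
    if name_starts_with is None:
        # No prefix given: every name is returned.
        return list(blob_names)
    return [name for name in blob_names if name.startswith(name_starts_with)]


if __name__ == "__main__":
    names = ["exports/2024/a.csv", "exports/2023/b.csv", "logs/run.txt"]
    print(filter_by_prefix(names, "exports/"))
```

Because blob names are flat strings, a prefix such as `exports/` behaves like a folder filter even though containers have no real directories.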
- blob_exists(container_name, blob_name)
Verify that a blob exists in the specified container
- Args:
- container_name: str
The container name
- blob_name: str
The blob name
- Returns:
bool
- get_blob(container_name, blob_name)
Get a blob object
- Args:
- container_name: str
The container name
- blob_name: str
The blob name
- Returns:
BlobClient
- get_blob_url(container_name, blob_name, account_key=None, permission=None, expiry=None, start=None)
Get a URL with a shared access signature for a blob
- Args:
- container_name: str
The container name
- blob_name: str
The blob name
- account_key: Optional[str]
An account shared access key for the storage account. Will default to the key used on initialization if one was provided as the credential, but required if it was not.
- permission: Optional[Union[BlobSasPermissions, str]]
Permissions associated with the blob URL. Can be either a BlobSasPermissions object or a string where ‘r’, ‘a’, ‘c’, ‘w’, and ‘d’ correspond to read, add, create, write, and delete permissions respectively.
- expiry: Optional[Union[datetime, str]]
The datetime when the URL should expire. The timezone defaults to UTC.
- start: Optional[Union[datetime, str]]
The datetime when the URL should become valid. The timezone defaults to UTC. If it is None, the URL becomes active when it is first created.
- Returns:
- str
URL with shared access signature for blob
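The permission, expiry, and start arguments can be prepared with a few lines of standard-library code. The sketch below is illustrative only: `as_utc`, `expiry_in`, and `check_permission_string` are hypothetical helpers, and treating a naive datetime as UTC is an assumption drawn from the "defaults to UTC" wording above.

```python
from datetime import datetime, timedelta, timezone

# Permission letters named in the argument description above.
VALID_PERMISSION_FLAGS = set("racwd")


def as_utc(moment):
    """Assumption: a datetime without a timezone is interpreted as UTC."""
    if moment.tzinfo is None:
        return moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc)


def expiry_in(hours, now=None):
    """Return a UTC datetime `hours` from `now` (defaults to the current time)."""
    now = as_utc(now) if now is not None else datetime.now(timezone.utc)
    return now + timedelta(hours=hours)


def check_permission_string(permission):
    """Reject letters outside read/add/create/write/delete."""
    unknown = set(permission) - VALID_PERMISSION_FLAGS
    if unknown:
        raise ValueError(f"Unknown permission flags: {sorted(unknown)}")
    return permission


if __name__ == "__main__":
    print(check_permission_string("r"), expiry_in(1).isoformat())
    # The values could then be passed on, for example:
    # azure_blob.get_blob_url('container_name', 'test.csv',
    #                         permission='r', expiry=expiry_in(1))
```

A short expiry with read-only permission is a reasonable default when a URL is shared outside the team.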
- put_blob(container_name, blob_name, local_path, **kwargs)
Puts a blob (aka file) in a container
- Args:
- container_name: str
The name of the container to store the blob
- blob_name: str
The name of the blob to be stored
- local_path: str
The local path of the file to upload
- kwargs:
Additional arguments to be supplied to the Azure Blob Storage API. See Azure Blob Storage SDK documentation for more info. Any keys that belong to the ContentSettings object will be provided to that class directly.
- Returns:
BlobClient
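The routing of keyword arguments described above can be pictured with a small standalone sketch. `split_put_blob_kwargs` is a hypothetical helper, not the connector's implementation; the key set lists the attributes of the azure-storage-blob ContentSettings class, and `overwrite` is used only as an example of an argument that would go to the upload call instead.

```python
# Attribute names of the azure-storage-blob ContentSettings class.
CONTENT_SETTINGS_KEYS = {
    "content_type",
    "content_encoding",
    "content_language",
    "content_disposition",
    "cache_control",
    "content_md5",
}


def split_put_blob_kwargs(**kwargs):
    """Hypothetical helper: separate ContentSettings keys from other kwargs."""
    content_settings = {
        key: value for key, value in kwargs.items()
        if key in CONTENT_SETTINGS_KEYS
    }
    upload_kwargs = {
        key: value for key, value in kwargs.items()
        if key not in CONTENT_SETTINGS_KEYS
    }
    return content_settings, upload_kwargs


if __name__ == "__main__":
    settings, other = split_put_blob_kwargs(content_type="text/csv", overwrite=True)
    print(settings)  # keys destined for ContentSettings
    print(other)     # keys passed through to the upload call
```

In practice this means a call such as `put_blob(..., content_type='text/csv')` needs no explicit ContentSettings object.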
- download_blob(container_name, blob_name, local_path=None)
Downloads a blob from a container into the specified file path or a temporary file path
- Args:
- container_name: str
The container name
- blob_name: str
The blob name
- local_path: Optional[str]
The local path where the file will be downloaded. If not specified, a temporary file will be created and returned, and that file will be removed automatically when the script is done running.
- Returns:
- str
The path of the downloaded file
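The choice between a caller-supplied path and a temporary file can be sketched with the standard library. `resolve_download_path` is a hypothetical helper that only imitates the documented behavior (temporary file created when local_path is omitted, removed when the script exits); the connector's own temp-file handling may differ.

```python
import atexit
import os
import tempfile


def _remove_quietly(path):
    """Delete a file, ignoring the case where it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def resolve_download_path(blob_name, local_path=None):
    """Hypothetical helper: pick the file path a download would be written to."""
    if local_path is not None:
        return local_path

    # Keep the blob's extension so downstream tools recognise the format.
    suffix = os.path.splitext(blob_name)[1]
    fd, temp_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)

    # Imitate "removed automatically when the script is done running".
    atexit.register(_remove_quietly, temp_path)
    return temp_path


if __name__ == "__main__":
    print(resolve_download_path("test.csv"))
    print(resolve_download_path("test.csv", local_path="/tmp/test.csv"))
```

If the downloaded file must outlive the script, pass local_path explicitly instead of relying on the temporary file.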
- delete_blob(container_name, blob_name)
Delete a blob in a specified container.
- Args:
- container_name: str
The container name
- blob_name: str
The blob name
- Returns:
None
- upload_table(table, container_name, blob_name, data_type='csv', **kwargs)
Load the data from a Parsons table into a blob.
- Args:
- table: obj
The Parsons Table to upload
- container_name: str
The container name to upload the data into
- blob_name: str
The blob name to upload the data into
- data_type: str
The file format to use when writing the data. One of: csv or json
- kwargs:
Additional keyword arguments to supply to put_blob
- Returns:
BlobClient
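The data_type switch can be illustrated without Parsons or Azure by serializing a list of row dicts with the standard library. `serialize_rows` is a hypothetical helper; the exact CSV and JSON output produced by a Parsons Table may differ in details such as quoting or line endings.

```python
import csv
import io
import json


def serialize_rows(rows, data_type="csv"):
    """Hypothetical helper: turn a list of row dicts into CSV or JSON text."""
    if data_type == "json":
        return json.dumps(rows)

    if data_type == "csv":
        buffer = io.StringIO()
        # Column order follows the first row's keys.
        fieldnames = list(rows[0].keys()) if rows else []
        writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()

    raise ValueError(f"Unknown data_type value ({data_type}): must be one of: csv or json")


if __name__ == "__main__":
    rows = [{"first": "Test", "last": "Person"}]
    print(serialize_rows(rows, "csv"))
    print(serialize_rows(rows, "json"))
```

Choosing a blob name whose extension matches data_type (for example `test2.json` with `data_type='json'`) keeps the stored file self-describing.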