SFTP

The SFTP class allows you to interact with SFTP services, using the Paramiko SFTP library under the hood.

The class provides methods to:

  • Create SFTP connections

  • Make, remove, and list the contents of directories

  • Get, put, remove, and check the size of files

Note

Authentication

Depending on the server provider, SFTP may require either password or public key authentication. The SFTP class supports both methods via the password and rsa_private_key_file arguments.
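
As a minimal sketch of choosing between the two methods, the helper below builds the keyword arguments for the constructor. The name sftp_auth_kwargs and its validation are hypothetical, not part of Parsons, and the key path is a placeholder. Because the constructor signature in the API section lists password without a default, the sketch passes password=None explicitly when only a key file is used.

```python
def sftp_auth_kwargs(host, username, password=None, rsa_private_key_file=None):
    """Build keyword arguments for SFTP(...).

    Hypothetical helper for illustration; it is not part of Parsons.
    """
    if not (password or rsa_private_key_file):
        raise ValueError("Provide a password or a private key file")
    return {
        "host": host,
        "username": username,
        "password": password,  # stays None for key-only authentication
        "rsa_private_key_file": rsa_private_key_file,
    }


# Key-based authentication; the key path below is a placeholder.
kwargs = sftp_auth_kwargs(
    "my_hostname", "my_username", rsa_private_key_file="/path/to/id_rsa"
)
# With Parsons installed, the instance would be created with: SFTP(**kwargs)
```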

Quickstart

To instantiate SFTP, pass your host name, user name, and either a password or an authentication key file as keyword arguments:

from parsons import SFTP

sftp = SFTP(host='my_hostname', username='my_username', password='my_password')

# List contents of a directory
sftp.list_directory(remote_path='my_dir')

# Get a file
sftp.get_file(remote_path='my_dir/my_csv.csv', local_path='my_local_path/my_csv.csv')

To batch multiple methods using a single connection, you can create a connection and use it in a with block:

connection = sftp.create_connection()

with connection as conn:
    sftp.make_directory('my_dir', connection=conn)
    sftp.put_file('my_csv.csv', 'my_dir/my_csv.csv', connection=conn)
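
The same pattern can be wrapped in a small function when several files go to one directory. This is only a sketch: upload_batch is a hypothetical helper, sftp is expected to be an SFTP instance as created above, and the sketch assumes the remote directory does not exist yet.

```python
import os
import posixpath


def upload_batch(sftp, local_paths, remote_dir):
    """Create remote_dir, then upload every file over one shared connection.

    Hypothetical helper; `sftp` is expected to be a parsons.SFTP instance.
    """
    uploaded = []
    with sftp.create_connection() as conn:
        # Assumes remote_dir does not exist yet.
        sftp.make_directory(remote_dir, connection=conn)
        for local_path in local_paths:
            # Keep the local file name; remote paths use forward slashes.
            remote_path = posixpath.join(remote_dir, os.path.basename(local_path))
            sftp.put_file(local_path, remote_path, connection=conn)
            uploaded.append(remote_path)
    return uploaded
```

A call such as upload_batch(sftp, ['my_csv.csv'], 'my_dir') would then open a single connection for the directory creation and all uploads.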

API

class parsons.SFTP(host, username, password, port=22, rsa_private_key_file=None)

Instantiate SFTP Class

Args:
host: str

The host name

username: str

The user name

password: str

The password

rsa_private_key_file: str

Absolute path to a private RSA key used to authenticate the SFTP connection

port: int

The port to connect to, if different from the standard port 22

Returns:

SFTP Class

create_connection()

Create an SFTP connection.

Returns:

SFTP Connection object

list_directory(remote_path='.', connection=None)

List the contents of a directory

Args:
remote_path: str

The remote path of the directory

connection: obj

An SFTP connection object

Returns:

list of files and subdirectories in the provided directory

make_directory(remote_path, connection=None)

Makes a new directory on the SFTP server

Args:
remote_path: str

The remote path of the directory

connection: obj

An SFTP connection object

remove_directory(remote_path, connection=None)

Remove a directory from the SFTP server

Args:
remote_path: str

The remote path of the directory

connection: obj

An SFTP connection object

get_file(remote_path, local_path=None, connection=None)

Download a file from the SFTP server

Args:
remote_path: str

The remote path of the file to download

local_path: str

The local path where the file will be downloaded. If not specified, a temporary file will be created and returned, and that file will be removed automatically when the script is done running.

connection: obj

An SFTP connection object

Returns:
str

The path of the local file
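
A minimal sketch of relying on the temporary-file behavior: the hypothetical helper read_remote_text omits local_path, so it uses whatever path get_file returns. The helper name and the UTF-8 default are assumptions for illustration.

```python
def read_remote_text(sftp, remote_path, encoding="utf-8"):
    """Download remote_path to a temporary file and return its text.

    Hypothetical helper; `sftp` is expected to be a parsons.SFTP instance.
    """
    # No local_path is given, so get_file is expected to return the path
    # of a temporary file.
    local_path = sftp.get_file(remote_path)
    with open(local_path, encoding=encoding) as handle:
        return handle.read()
```

Since the temporary file is described as being removed when the script finishes, the content is read immediately rather than keeping the path for later.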

get_files(files_to_download=None, remote=None, connection=None, pattern=None, local_paths=None)

Download a list of files, either by providing the list explicitly, providing directories that contain files to download, or both.

Args:
files_to_download: list

A list of remote paths (absolute or relative) of the files to download

remote: str or list

A path to a remote directory or a list of paths

connection: obj

An SFTP connection object

pattern: str

A regex pattern with which to select file names. Defaults to None, in which case all files will be selected.

local_paths: list

A list of paths to which to save the selected files. Defaults to None. If its length differs from the number of files to be fetched, temporary files are used instead.

Returns:
list

Local paths where the files are saved.
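
A short sketch of combining both inputs in one call. The helper download_reports, its argument names, and the default pattern are hypothetical. How pattern interacts with files_to_download is not stated above, so the sketch makes no assumption about it.

```python
def download_reports(sftp, remote_dirs, extra_files=(), pattern=r"\.csv$"):
    """Download explicitly named files plus files found in remote_dirs.

    Hypothetical helper; `sftp` is expected to be a parsons.SFTP instance.
    """
    with sftp.create_connection() as conn:
        # local_paths is omitted, so temporary files are expected.
        return sftp.get_files(
            files_to_download=list(extra_files),
            remote=list(remote_dirs),
            pattern=pattern,
            connection=conn,
        )
```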

get_table(remote_path, connection=None)

Download a csv from the server and convert into a Parsons table.

The file may be compressed with gzip or zip, but the archive must not contain more than one file.

Args:
remote_path: str

The remote path of the file to download

connection: obj

An SFTP connection object

Returns:
Parsons Table

See Parsons Table for output options.
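
When several CSV files are needed, one connection can be shared across the calls. The helper load_tables below is a hypothetical sketch, not part of Parsons.

```python
def load_tables(sftp, remote_paths):
    """Return a dict mapping each remote path to the table built from it.

    Hypothetical helper; `sftp` is expected to be a parsons.SFTP instance.
    """
    tables = {}
    with sftp.create_connection() as conn:
        for remote_path in remote_paths:
            tables[remote_path] = sftp.get_table(remote_path, connection=conn)
    return tables
```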

put_file(local_path: str, remote_path: str, connection=None, verbose: bool = True) -> None

Put a file on the SFTP server

Args:
local_path: str

The local path of the source file

remote_path: str

The remote path of the new file

connection: obj

An SFTP connection object

verbose: bool

Log progress every 5MB. Defaults to True.

remove_file(remote_path, connection=None)

Delete a file on the SFTP server

Args:
remote_path: str

The remote path of the file

connection: obj

An SFTP connection object

get_file_size(remote_path, connection=None)

Get the size of a file in MB on the SFTP server. The file is not downloaded locally.

Args:
remote_path: str

The remote path of the file

connection: obj

An SFTP connection object

Returns:
int

The file size in MB.
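
Because the size is reported without downloading, it can be used to screen files first. The sketch below assumes a hypothetical helper files_within_limit and a caller-chosen max_mb threshold.

```python
def files_within_limit(sftp, remote_paths, max_mb):
    """Return the remote paths whose reported size does not exceed max_mb.

    Hypothetical helper; `sftp` is expected to be a parsons.SFTP instance.
    """
    with sftp.create_connection() as conn:
        return [
            path
            for path in remote_paths
            if sftp.get_file_size(path, connection=conn) <= max_mb
        ]
```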

list_subdirectories(remote_path, connection=None, pattern=None)

List the subdirectories of a directory on the remote server.

Args:
remote_path: str

The remote directory whose subdirectories will be listed

connection: obj

An SFTP connection object

pattern: str

A regex pattern with which to select full directory paths. Defaults to None, in which case all subdirectories will be selected.

Returns:
list

The subdirectories in remote_path.

list_files(remote_path, connection=None, pattern=None)

List the files in a directory on the remote server.

Args:
remote_path: str

The remote directory whose files will be listed

connection: obj

An SFTP connection object

pattern: str

A regex pattern with which to select file names. Defaults to None, in which case all files will be selected.

Returns:
list

The files in remote_path.
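
The two pattern arguments differ in what they are tested against: full directory paths for list_subdirectories, file names for list_files. The standalone sketch below illustrates that difference on sample paths. It does not call Parsons, the function select_entries is hypothetical, and the use of re.search is an assumption about how a pattern is applied.

```python
import posixpath
import re


def select_entries(paths, pattern=None, match_full_path=False):
    """Keep the paths that match pattern; None keeps everything.

    Illustration only: re.search is an assumption, not confirmed
    Parsons behavior.
    """
    if pattern is None:
        return list(paths)
    regex = re.compile(pattern)
    return [
        path
        for path in paths
        if regex.search(path if match_full_path else posixpath.basename(path))
    ]


files = ["reports/2023/summary.csv", "reports/2023/notes.txt"]
dirs = ["reports/2023", "reports/archive"]

# File patterns are tested against the file name only.
csv_files = select_entries(files, r"\.csv$")
# Directory patterns are tested against the full path.
dirs_2023 = select_entries(dirs, r"reports/\d{4}$", match_full_path=True)
```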

walk_tree(remote_path, connection=None, download=False, dir_pattern=None, file_pattern=None, max_depth=2)

Recursively walks a directory, collecting the subdirectories that match dir_pattern and the files that match file_pattern, without descending past the maximum directory depth. Optionally downloads the discovered files.

Args:
remote_path: str

The top level directory to walk

connection: obj

An SFTP connection object

download: bool

Whether to download discovered files

dir_pattern: str

A regex pattern with which to select directories. Defaults to None, in which case all directories will be selected.

file_pattern: str

A regex pattern with which to select files. Defaults to None, in which case all files will be selected.

max_depth: int

A limit on how many directories deep to traverse. The default, 2, will search the contents of remote_path and its subdirectories.

Returns:
tuple

A list of directories touched and a list of files. If the files were downloaded, the file list contains local paths; otherwise it contains remote paths.
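
A minimal sketch of calling walk_tree and unpacking its result. The wrapper find_files and its defaults are hypothetical; only the walk_tree arguments come from the signature above.

```python
def find_files(sftp, top_dir, file_pattern=None, max_depth=2, download=False):
    """Walk top_dir and return (directories, files) as reported by walk_tree.

    Hypothetical wrapper; `sftp` is expected to be a parsons.SFTP instance.
    """
    with sftp.create_connection() as conn:
        directories, files = sftp.walk_tree(
            top_dir,
            connection=conn,
            download=download,
            file_pattern=file_pattern,
            max_depth=max_depth,
        )
    # With download=True the file entries are local paths; otherwise they
    # are remote paths.
    return directories, files
```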