Transfer

class swpy.lib.transfer.Transfer

Bases: object

A class for transferring files over several protocols. Supported protocols include HTTP, HTTPS, FTP, FTPS and SFTP. Paths may contain datetime format codes (for example, %Y%m%d).

temp_dir

Directory where files are stored temporarily while they are transferred between protocols.

The default value is “./temp/”.

Type

str

log_path

Path of the log file written during operation.

The default value is “./log/%Y/%Y%m%d.log”.

Type

str

ftp_session

Object containing session information of an ftp connection.

The default value is None.

Type

ftplib.FTP

sftp_session

Object containing session information of an sftp connection.

The default value is None.

Type

paramiko.sftp_client.SFTPClient

sftp_transport

Object containing ssh transport information of an sftp connection.

The default value is None.

Type

paramiko.transport.Transport

Examples

>>> tf = Transfer()
>>> tf.set_temp_dir("local/dir1/temp/") # optional
>>> tf.set_log_path("local/log/%Y/%Y%m%d_log.txt") # optional
>>> tf.download("ftp://test.ftp.org/data/test_%Y%m%d_%H%M%S.txt", "local/data_test_%Y%m%d_%H%M%S.txt", src_id="id", src_pw="password")
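
The format codes in the paths above are the datetime format codes used by Python's strftime. The following is a minimal sketch of how such a path template expands. The helper name expand_path and the fixed timestamp are assumptions for illustration; how Transfer itself chooses the timestamp is not stated here.

```python
from datetime import datetime


def expand_path(path_template, when):
    """Fill datetime format codes such as %Y%m%d in a path template.

    expand_path is a hypothetical helper, not part of Transfer.
    """
    return when.strftime(path_template)


# A fixed timestamp is used so the result is reproducible.
when = datetime(2021, 3, 4, 5, 6, 7)
print(expand_path("local/data_test_%Y%m%d_%H%M%S.txt", when))
print(expand_path("local/log/%Y/%Y%m%d_log.txt", when))
```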

Notes

FTP, SFTP Common Features

  • Before calling a download/upload function, you must call the corresponding connect function.

  • Paths used with FTP/SFTP must exclude the address. (The address is passed to the connect function.)

  • Example: FTP download procedure

    >>> tf = Transfer()
    >>> tf.ftp_connect(addr, id, pw)
    >>> tf.ftp_download(src_path, dst_path)
    >>> tf.ftp_disconnect()
    
  • You can download/upload whole directories. In that case, the path must end with ‘/’.

  • The source path and destination path must be of the same type (both directories or both files):

    (o) Download {dir path} to {dir path}
    (o) Download {file path} to {file path}
    (x) Download {dir path} to {file path}
    (x) Download {file path} to {dir path}
    
  • After connecting by FTP/SFTP, a path can start from either the user home directory or the root directory by prefixing it with ‘~/’ or ‘/’. (~/: home directory of the logged-in user, /: root directory)

  • If a file download/upload fails, it is retried up to 3 times. If every retry fails, that file is skipped so that the remaining files are still processed.

  • If the file does not exist on the server (a 550 error occurs), it is skipped immediately without retrying.

  • The return value of a download/upload function tells you whether the input arguments were valid and whether the list of files to transfer was fetched successfully. It does not tell you whether an individual file transfer failed.

  • Individual file download/upload failures can be checked only in the log file.
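
The retry and skip rules above can be sketched as follows. This is an illustration only, not the library's implementation: transfer_with_retry and the logger name are hypothetical, and "3 times" is assumed here to mean three attempts in total.

```python
import ftplib
import logging

log = logging.getLogger("transfer_sketch")  # hypothetical logger name


def transfer_with_retry(transfer_one, file_name, retries=3):
    """Call transfer_one(file_name); return True on success, False if skipped."""
    for attempt in range(1, retries + 1):
        try:
            transfer_one(file_name)
            return True
        except ftplib.error_perm as exc:
            if str(exc).startswith("550"):
                # The file is missing on the server: skip without retrying.
                log.error("skipped %s: %s", file_name, exc)
                return False
            log.warning("attempt %d/%d failed for %s: %s",
                        attempt, retries, file_name, exc)
        except ftplib.all_errors as exc:
            # Other FTP, socket, or EOF errors: try again.
            log.warning("attempt %d/%d failed for %s: %s",
                        attempt, retries, file_name, exc)
    # Every attempt failed; the caller moves on to the next file.
    log.error("gave up on %s after %d attempts", file_name, retries)
    return False
```

Because a failed file only produces a log entry and a False for that one file, a loop over many files keeps going, which matches the note that failures are visible only in the log.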

Methods Summary

copy_file(src_path, dst_path[, overwrite])

Locally copies files.

download(src_path, dst_path[, src_id, …])

This is a generalized function for moving files between various protocols.

ftp_check_connection()

Checks the FTP connection status.

ftp_check_directory(dir)

Examines the directory path of the connected FTP server and creates new directories that do not exist.

ftp_check_directory_by_file_path(file_path)

Examines the file path of the connected FTP server and creates new directories that do not exist.

ftp_connect(addr[, id, pw])

Connects to given addr using FTP.

ftp_disconnect()

Disconnects the ftplib.FTP object saved in the ftp_session variable and initializes the value of ftp_session to None.

ftp_download(src_path, dst_path[, overwrite])

Locally downloads files provided by the FTP.

ftp_get_file_list(path[, recursive])

Returns a list of the files under path on the connected FTP server.

ftp_upload(src_path, dst_path)

Uploads local files to a path using the FTP.

http_download(src_path, dst_path[, id, pw, …])

Locally downloads a file provided by the HTTP.

set_log_path(log_path)

Sets the path of the log file written during operation.

set_temp_dir(temp_dir)

Sets temp directory.

sftp_check_connection()

Checks the SFTP connection status.

sftp_check_directory(dir)

Examines the directory path of the connected SFTP server and creates new directories that do not exist.

sftp_check_directory_by_file_path(file_path)

Examines the file path of the connected SFTP server and creates new directories that do not exist.

sftp_connect(addr[, id, pw])

Connects to given addr using SFTP.

sftp_disconnect()

Disconnects the paramiko.sftp_client.SFTPClient object and paramiko.transport.Transport object stored in the sftp_session variable and sftp_transport variable, and initializes each value to None.

sftp_download(src_path, dst_path[, overwrite])

Locally downloads files provided by the SFTP.

sftp_get_file_list(path[, recursive])

Returns a list of the files under path on the connected SFTP server.

sftp_upload(src_path, dst_path)

Uploads local files to a path using the SFTP.

Methods Documentation

copy_file(src_path, dst_path, overwrite=False)

Locally copies files. You can copy whole directories; in that case, the path must end with ‘/’. The source path and destination path must be of the same type (both directories or both files). If a source file does not exist or cannot be copied, it is skipped so that the remaining files are still copied.

The return value tells you whether the input arguments were valid and whether the list of files to copy was fetched successfully. It does not tell you whether an individual copy failed; copy failures can be checked only in the log file.

By default, a file that already exists at dst_path is not copied. Set overwrite to True to overwrite it.

Parameters
  • src_path (str) – Local source path.

  • dst_path (str) – Local destination path.

  • overwrite (bool, optional) –

    Overwriting setting. Default is False.

    (True: Overwrite / False: Not overwrite)

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.copy_file("local/data/", "local/group/data_copy/")
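
The rule that the source and destination must be of the same type can be checked from the trailing ‘/’ alone. A minimal sketch, assuming hypothetical helper names that are not part of Transfer:

```python
def is_dir_path(path):
    """A path ending in '/' denotes a directory; anything else is a file."""
    return path.endswith("/")


def same_path_kind(src_path, dst_path):
    """True when both paths are directories or both are files."""
    return is_dir_path(src_path) == is_dir_path(dst_path)


print(same_path_kind("local/data/", "local/group/data_copy/"))  # dir to dir
print(same_path_kind("local/data/", "local/group/data.txt"))    # dir to file
```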

download(src_path, dst_path, src_id='', src_pw='', dst_id='', dst_pw='')

This is a generalized function for moving files between various protocols. It parses the protocol of src_path and dst_path and moves the file accordingly. It does not keep the FTP/SFTP session open. When files are transferred between protocols, the copies stored in the temporary directory are deleted once the transfer is complete. If the FTP/SFTP connection fails, it is retried up to 3 times; if every retry fails, the return value is False. src_path and dst_path must clearly indicate the protocol.

Examples of path format in various protocols:

# Local
/dir1/dir2/...
# HTTP
http://host:port/...
# FTP
ftp://host:port/...
# SFTP
sftp://host:port/...

Parameters
  • src_path (str) – Source path.

  • dst_path (str) – Destination path.

  • src_id (str, optional) – ID to access source address.

  • src_pw (str, optional) – Password to access source address.

  • dst_id (str, optional) – ID to access destination address.

  • dst_pw (str, optional) – Password to access destination address.

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.download("http://{host}:{port}/data_%Y%m%d_%H%M%S.png", "ftp://{host}:{port}/test_data_%Y%m%d_%H%M%S.png", "src_id", "src_pw", "dst_id", "dst_pw")
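
Splitting a path into protocol, host, port and remote path, as described above, can be sketched with the standard library. The function parse_endpoint and its return shape are assumptions for illustration; the library's internal parsing may differ.

```python
from urllib.parse import urlsplit


def parse_endpoint(path):
    """Split a path into protocol, host, port and path.

    A path without a scheme is treated as local. parse_endpoint is a
    hypothetical helper, not part of Transfer.
    """
    parts = urlsplit(path)
    if not parts.scheme:
        return {"protocol": "local", "host": None, "port": None, "path": path}
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,   # None when the port is omitted
        "path": parts.path,   # datetime format codes are left untouched
    }


print(parse_endpoint("ftp://test.ftp.org:2121/data/test_%Y%m%d.txt"))
print(parse_endpoint("/dir1/dir2/test.txt"))
```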

ftp_check_connection()

Checks the FTP connection status. Returns False if the ftplib.FTP object saved in the ftp_session variable is disconnected or if ftp_session is None.

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.ftp_check_connection()

ftp_check_directory(dir)

Examines the directory path of the connected FTP server and creates new directories that do not exist. By default, dir is resolved from the current working directory of the connected FTP server. dir can be given as an absolute or a relative path by prefixing it with ‘/’ or ‘./’, and it must end with ‘/’. After the check completes, the working directory is restored to what it was before the check.

Parameters

dir (str) – Directory path of the connected FTP server.

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.ftp_check_directory("./dir1/dir2/")
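
One way to create missing directories and then restore the working directory is sketched below using ftplib's pwd, cwd and mkd calls. ensure_remote_dir is a hypothetical helper written for illustration; it is not the library's confirmed implementation.

```python
import ftplib


def ensure_remote_dir(ftp, dir_path):
    """Create every missing component of dir_path, then restore the cwd.

    ftp is a connected ftplib.FTP object (or anything with pwd/cwd/mkd).
    """
    original = ftp.pwd()                # remember the starting directory
    try:
        if dir_path.startswith("/"):
            ftp.cwd("/")                # absolute path: start from the root
        for part in dir_path.strip("/").split("/"):
            if part in ("", "."):
                continue
            try:
                ftp.cwd(part)           # the component already exists
            except ftplib.error_perm:
                ftp.mkd(part)           # create it, then step into it
                ftp.cwd(part)
        return True
    except ftplib.all_errors:
        return False
    finally:
        try:
            ftp.cwd(original)           # go back to where we started
        except ftplib.all_errors:
            pass
```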

ftp_check_directory_by_file_path(file_path)

Examines the file path of the connected FTP server and creates new directories that do not exist. By default, file_path is resolved from the current working directory of the connected FTP server. file_path can be given as an absolute or a relative path by prefixing it with ‘/’ or ‘./’, and it must not end with ‘/’. After the check completes, the working directory is restored to what it was before the check.

Parameters

file_path (str) – File path of the connected FTP server.

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.ftp_check_directory_by_file_path("./dir1/dir2/filename.txt")
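
Checking by file path amounts to checking the directory that contains the file. The sketch below shows that reduction; parent_dir is a hypothetical helper, and the relationship to ftp_check_directory is an assumption rather than something the source confirms.

```python
import posixpath


def parent_dir(file_path):
    """Return the directory part of a remote file path, ending in '/'."""
    head = posixpath.dirname(file_path)
    if not head:
        return "./"                  # bare file name: current directory
    return head.rstrip("/") + "/"    # also keeps the root as a single '/'


print(parent_dir("./dir1/dir2/filename.txt"))
```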

ftp_connect(addr, id='', pw='')

Connects to given addr using FTP. Saves the successfully connected ftplib.FTP object in the ftp_session variable. The default port is 21; a different port can be set by writing addr as “host:port”. (The port is optional.) The connection timeout is 30 seconds. You can log in to addr by setting id and pw.

Parameters
  • addr (str) – address using FTP in the form of “host[:port]”.

  • id (str, optional) – ID to access given address.

  • pw (str, optional) – Password to access given address.

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.ftp_connect("host:port", "id", "pw")
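
The “host[:port]” form, the default port and the 30-second timeout can be sketched with ftplib directly. split_addr and open_ftp are hypothetical names used for illustration; this sketch does not handle IPv6 literals.

```python
import ftplib


def split_addr(addr, default_port=21):
    """Split 'host[:port]' into (host, port)."""
    host, sep, port = addr.partition(":")
    return host, (int(port) if sep and port else default_port)


def open_ftp(addr, user="", password="", timeout=30):
    """Connect and log in; raises an ftplib/socket error on failure."""
    host, port = split_addr(addr)
    ftp = ftplib.FTP()
    ftp.connect(host, port, timeout=timeout)
    ftp.login(user, password)   # an empty user logs in as anonymous
    return ftp


print(split_addr("test.ftp.org"))
print(split_addr("test.ftp.org:2121"))
```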

ftp_disconnect()

Disconnects the ftplib.FTP object saved in the ftp_session variable and initializes the value of ftp_session to None.

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.ftp_disconnect()

ftp_download(src_path, dst_path, overwrite=False)

Locally downloads files provided by the FTP. ftp_connect must be called first. src_path must exclude the address. (The address is passed as an argument to ftp_connect.) If a problem occurs while writing a file to dst_path, the file is deleted. By default, a file that already exists at dst_path is not downloaded; set overwrite to True to overwrite it. See also the ‘FTP, SFTP Common Features’ note of Transfer.

Parameters
  • src_path (str) – [[~]/] Source path using FTP. (Excluding address.)

  • dst_path (str) – Local destination path.

  • overwrite (bool, optional) –

    Overwriting setting. Default is False.

    (True: Overwrite / False: Not overwrite)

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.ftp_download("data/data_%Y%m%d.png", "local/data/data_%Y%m%d.png")

ftp_get_file_list(path, recursive=False)

Returns a list of the files under path on the connected FTP server. By default, only files directly under path (depth 1) are listed. If recursive is True, files in all subdirectories are included. Entries in the list include the file extension and the directory structure. If the file list cannot be fetched, None is returned. If path is a file, the list contains only that file.

Parameters
  • path (str) – Path to get file list.

  • recursive (bool, optional) –

    Whether to include all files in subdirectories. Default is False.

    (True: Included / False: Not included)

Returns

list – File name list

Examples

>>> tf.ftp_get_file_list("dir1/", False)
["file1.txt", "file2.txt"]
>>> tf.ftp_get_file_list("dir1/", True)
["file1.txt", "file2.txt", "dir2/fileA.txt", "dir2/dir3/file_a.txt"]
>>> tf.ftp_get_file_list("dir1/file1.txt", False)
["file1.txt"]
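
The difference between the depth-1 and recursive listings can be illustrated on a local directory tree. This sketch uses os.walk on the local filesystem as a stand-in for the server listing; list_files is a hypothetical helper and does not talk to an FTP server.

```python
import os


def list_files(root, recursive=False):
    """Return file paths relative to root, using '/' as the separator.

    Returns None when root does not exist, mirroring the documented
    behaviour on failure.
    """
    if not os.path.exists(root):
        return None
    if os.path.isfile(root):
        return [os.path.basename(root)]   # a file path lists only itself
    found = []
    for current, dirs, files in os.walk(root):
        dirs.sort()                        # visit subdirectories in order
        for name in sorted(files):
            rel = os.path.relpath(os.path.join(current, name), root)
            found.append(rel.replace(os.sep, "/"))
        if not recursive:
            break                          # depth 1 only
    return found
```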

ftp_upload(src_path, dst_path)

Uploads local files to a path using the FTP. ftp_connect must be called first. dst_path must exclude the address. (The address is passed as an argument to ftp_connect.) See also the ‘FTP, SFTP Common Features’ note of Transfer.

Parameters
  • src_path (str) – Local source path.

  • dst_path (str) – [[~]/] Destination path using FTP. (Excluding address.)

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.ftp_upload("local/data/data_%Y%m%d.png", "data/data_%Y%m%d.png")

http_download(src_path, dst_path, id='', pw='', overwrite=False)

Locally downloads a file provided by the HTTP. The connection timeout is 30 seconds. If the connection fails, it is retried up to 3 times; if every retry fails, False is returned. If the file does not exist at the source path (a 404 error occurs), False is returned immediately without retrying. If a problem occurs while writing the file to dst_path, the file is deleted. By default, a file that already exists at dst_path is not downloaded; set overwrite to True to overwrite it. Directories cannot be downloaded.

Parameters
  • src_path (str) – Source path using HTTP.

  • dst_path (str) – Local destination path.

  • id (str, optional) – ID to access source address.

  • pw (str, optional) – Password to access source address.

  • overwrite (bool, optional) –

    Overwriting setting. Default is False.

    (True: Overwrite / False: Not overwrite)

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.http_download("http://data_site.com/data/%Y", "local/data/%Y")
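
The overwrite rule and the cleanup after a failed write can be sketched independently of the network code. save_stream is a hypothetical helper that takes any iterable of bytes chunks; the broad exception handling is a simplification for the sketch.

```python
import os


def save_stream(chunks, dst_path, overwrite=False):
    """Write an iterable of bytes chunks to dst_path.

    Returns True if the file was written, False if it was skipped
    (already exists) or if writing failed.
    """
    if os.path.exists(dst_path) and not overwrite:
        return False                       # keep the existing file
    parent = os.path.dirname(dst_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    try:
        with open(dst_path, "wb") as out:
            for chunk in chunks:
                out.write(chunk)
        return True
    except Exception:
        # Do not leave a partially written file behind.
        if os.path.exists(dst_path):
            os.remove(dst_path)
        return False
```

Note that with overwrite=True a failed write also removes the file that was being replaced, since the target is opened for writing before any data arrives.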

set_log_path(log_path)

Sets the path of the log file written during operation.

Parameters

log_path (str) – Log path.

Examples

>>> tf.set_log_path("local/log/%Y/%Y%m%d_log.txt")

set_temp_dir(temp_dir)

Sets temp directory. The temp directory is where files are stored temporarily while they are transferred between protocols. The default value is “./temp/”.

Parameters

temp_dir (str) – Temp directory path.

Examples

>>> tf.set_temp_dir("local/dir1/temp/")
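
The role of the temp directory in a transfer between two protocols can be sketched as stage, forward, then clean up. relay_via_temp and its fetch/push callables are hypothetical names used for illustration, not part of Transfer.

```python
import os


def relay_via_temp(fetch, push, temp_dir, file_name):
    """Stage a file in temp_dir, forward it, then delete the staged copy.

    fetch(local_path) downloads from the source into local_path;
    push(local_path) uploads local_path to the destination.
    """
    os.makedirs(temp_dir, exist_ok=True)
    staged = os.path.join(temp_dir, file_name)
    try:
        fetch(staged)
        push(staged)
        return True
    except OSError:
        return False
    finally:
        # The staged copy is removed whether or not the transfer worked.
        if os.path.exists(staged):
            os.remove(staged)
```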

sftp_check_connection()

Checks the SFTP connection status. Returns False if the paramiko.sftp_client.SFTPClient object saved in the sftp_session variable is disconnected or if sftp_session is None.

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.sftp_check_connection()

sftp_check_directory(dir)

Examines the directory path of the connected SFTP server and creates new directories that do not exist. By default, dir is resolved from the current working directory of the connected SFTP server. dir can be given as an absolute or a relative path by prefixing it with ‘/’ or ‘./’, and it must end with ‘/’. After the check completes, the working directory is restored to what it was before the check.

Parameters

dir (str) – Directory path of the connected SFTP server.

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.sftp_check_directory("./dir1/dir2/")

sftp_check_directory_by_file_path(file_path)

Examines the file path of the connected SFTP server and creates new directories that do not exist. By default, file_path is resolved from the current working directory of the connected SFTP server. file_path can be given as an absolute or a relative path by prefixing it with ‘/’ or ‘./’, and it must not end with ‘/’. After the check completes, the working directory is restored to what it was before the check.

Parameters

file_path (str) – File path of the connected SFTP server.

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.sftp_check_directory_by_file_path("./dir1/dir2/filename.txt")

sftp_connect(addr, id='', pw='')

Connects to given addr using SFTP. Saves the successfully connected paramiko.sftp_client.SFTPClient object and paramiko.transport.Transport object in the sftp_session variable and the sftp_transport variable, respectively. The default port is 22; a different port can be set by writing addr as “host:port”. (The port is optional.) You can log in to addr by setting id and pw.

Parameters
  • addr (str) – address using SFTP in the form of “host[:port]”.

  • id (str, optional) – ID to access given address.

  • pw (str, optional) – Password to access given address.

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.sftp_connect("host:port", "id", "pw")

sftp_disconnect()

Disconnects the paramiko.sftp_client.SFTPClient object and paramiko.transport.Transport object stored in the sftp_session variable and sftp_transport variable, and initializes each value to None.

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.sftp_disconnect()

sftp_download(src_path, dst_path, overwrite=False)

Locally downloads files provided by the SFTP. sftp_connect must be called first. src_path must exclude the address. (The address is passed as an argument to sftp_connect.) If a problem occurs while writing a file to dst_path, the file is deleted. By default, a file that already exists at dst_path is not downloaded; set overwrite to True to overwrite it. See also the ‘FTP, SFTP Common Features’ note of Transfer.

Parameters
  • src_path (str) – [[~]/] Source path using SFTP. (Excluding address.)

  • dst_path (str) – Local destination path.

  • overwrite (bool, optional) –

    Overwriting setting. Default is False.

    (True: Overwrite / False: Not overwrite)

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.sftp_download("data_%Y/", "local/data_%Y/")

sftp_get_file_list(path, recursive=False)

Returns a list of the files under path on the connected SFTP server. By default, only files directly under path (depth 1) are listed. If recursive is True, files in all subdirectories are included. Entries in the list include the file extension and the directory structure. If the file list cannot be fetched, None is returned. If path is a file, the list contains only that file.

Parameters
  • path (str) – Path to get file list.

  • recursive (bool, optional) –

    Whether to include all files in subdirectories. Default is False.

    (True: Included / False: Not included)

Returns

list – File name list

Examples

>>> tf.sftp_get_file_list("dir1/", False)
["file1.txt", "file2.txt"]
>>> tf.sftp_get_file_list("dir1/", True)
["file1.txt", "file2.txt", "dir2/fileA.txt", "dir2/dir3/file_a.txt"]
>>> tf.sftp_get_file_list("dir1/file1.txt", False)
["file1.txt"]

sftp_upload(src_path, dst_path)

Uploads local files to a path using the SFTP. sftp_connect must be called first. dst_path must exclude the address. (The address is passed as an argument to sftp_connect.) See also the ‘FTP, SFTP Common Features’ note of Transfer.

Parameters
  • src_path (str) – Local source path.

  • dst_path (str) – [[~]/] Destination path using SFTP. (Excluding address.)

Returns

bool – True/False (Success/Fail)

Examples

>>> tf.sftp_upload("local/data/%Y/test_%Y%m%d.txt", "data/%Y/test_%Y%m%d.txt")