---
layout: post
title: PYTHON SDK
permalink: /docs/python-sdk
redirect_from:
---
The AIStore Python SDK is a growing set of client-side objects and methods for accessing and utilizing AIS clusters. For PyTorch integration and usage examples, please refer to the AIS Python SDK available via the Python Package Index (PyPI), or see https://github.com/NVIDIA/aistore/tree/master/python/aistore. A quick-start sketch follows the module list below.
- client
- cluster
- bucket
- object
- multiobj.object_group
- multiobj.object_names
- multiobj.object_range
- multiobj.object_template
- job
- object_reader
- object_iterator
- etl
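A minimal quick-start sketch, assuming the SDK has been installed from PyPI (`pip install aistore`) and that an AIS proxy is reachable at a hypothetical endpoint:

```python
from aistore import Client

# The endpoint below is an assumption -- point it at your own AIS proxy URL
client = Client("http://localhost:8080")
```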
class Client()
AIStore client for managing buckets, objects, ETL jobs
Arguments:
- `endpoint` (str): AIStore endpoint
def bucket(bck_name: str,
provider: str = PROVIDER_AIS,
namespace: Namespace = None)
Factory constructor for bucket object. Does not make any HTTP request, only instantiates a bucket object.
Arguments:
- `bck_name` (str): Name of bucket
- `provider` (str, optional): Provider of bucket, one of "ais", "aws", "gcp", ... Defaults to "ais".
- `namespace` (Namespace, optional): Namespace of bucket. Defaults to None.
Returns:
The bucket object created.
def cluster()
Factory constructor for cluster object. Does not make any HTTP request, only instantiates a cluster object.
Returns:
The cluster object created.
def job(job_id: str = "", job_kind: str = "")
Factory constructor for job object, which contains job-related functions. Does not make any HTTP request, only instantiates a job object.
Arguments:
- `job_id` (str, optional): Optional ID for interacting with a specific job
- `job_kind` (str, optional): Optional specific type of job; empty for all kinds
Returns:
The job object created.
def etl(etl_name: str)
Factory constructor for ETL object. Contains APIs related to AIStore ETL operations. Does not make any HTTP request, only instantiates an ETL object.
Arguments:
- `etl_name` (str): Name of the ETL
Returns:
The ETL object created.
def dsort(dsort_id: str = "")
Factory constructor for dSort object. Contains APIs related to AIStore dSort operations. Does not make any HTTP request, only instantiates a dSort object.
Arguments:
- `dsort_id` (str, optional): ID of the dSort job
Returns:
dSort object created
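A short sketch of the factory constructors above; the endpoint, bucket name, job kind, and ETL name are hypothetical:

```python
from aistore import Client

client = Client("http://localhost:8080")    # assumed endpoint

bucket = client.bucket("my-bck")            # bucket handle (no HTTP request yet)
cluster = client.cluster()                  # cluster-level API
job = client.job(job_kind="lru")            # job handle by kind
etl = client.etl("my-etl")                  # ETL handle by name
```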
class Cluster()
A class representing a cluster bound to an AIS client.
@property
def client()
Client this cluster uses to make requests
def get_info() -> Smap
Returns state of AIS cluster, including the detailed information about its nodes.
Returns:
- `aistore.sdk.types.Smap`: Smap containing cluster information
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
def list_buckets(provider: str = PROVIDER_AIS)
Returns list of buckets in AIStore cluster.
Arguments:
- `provider` (str, optional): Name of bucket provider, one of "ais", "aws", "gcp", "az", "hdfs" or "ht". Defaults to "ais". Empty provider returns buckets of all providers.
Returns:
- `List[BucketModel]`: A list of buckets
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
def list_jobs_status(job_kind="", target_id="") -> List[JobStatus]
List the status of jobs on the cluster
Arguments:
- `job_kind` (str, optional): Only show jobs of a particular type
- `target_id` (str, optional): Limit to jobs on a specific target node
Returns:
List of JobStatus objects
def list_running_jobs(job_kind="", target_id="") -> List[str]
List the currently running jobs on the cluster
Arguments:
- `job_kind` (str, optional): Only show jobs of a particular type
- `target_id` (str, optional): Limit to jobs on a specific target node
Returns:
List of jobs in the format job_kind[job_id]
def list_running_etls() -> List[ETLInfo]
Lists all running ETLs.
Note: Does not list ETLs that have been stopped or deleted.
Returns:
- `List[ETLInfo]`: A list of details on running ETLs
def is_aistore_running() -> bool
Checks if cluster is ready or still setting up.
Returns:
- `bool`: True if the cluster is ready, False if the cluster is still setting up
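For example, a hedged sketch of checking cluster readiness and listing buckets (`client` as constructed in the earlier sketch):

```python
cluster = client.cluster()

if cluster.is_aistore_running():
    # Defaults to the "ais" provider; an empty provider returns buckets of all providers
    for bck in cluster.list_buckets():
        print(bck.name)
```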
class Bucket()
A class representing a bucket that contains user data.
Arguments:
- `client` (RequestClient): Client for interfacing with AIS cluster
- `name` (str): Name of bucket
- `provider` (str, optional): Provider of bucket (one of "ais", "aws", "gcp", ...), defaults to "ais"
- `namespace` (Namespace, optional): Namespace of bucket, defaults to None
@property
def client() -> RequestClient
The client bound to this bucket.
@property
def qparam() -> Dict
Default query parameters to use with API calls from this bucket.
@property
def provider() -> str
The provider for this bucket.
@property
def name() -> str
The name of this bucket.
@property
def namespace() -> Namespace
The namespace for this bucket.
def create(exist_ok=False)
Creates a bucket in AIStore cluster. Can only create a bucket for AIS provider on localized cluster. Remote cloud buckets do not support creation.
Arguments:
- `exist_ok` (bool, optional): Ignore error if the cluster already contains this bucket
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `aistore.sdk.errors.InvalidBckProvider`: Invalid bucket provider for requested operation
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
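A small sketch of creating an AIS bucket and writing one object into it (`client` as constructed earlier; bucket and object names are hypothetical):

```python
bucket = client.bucket("my-bck")
bucket.create(exist_ok=True)                       # no error if the bucket already exists
bucket.object("hello.txt").put_content(b"hello world")
```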
def delete(missing_ok=False)
Destroys bucket in AIStore cluster. In all cases removes both the bucket's content and the bucket's metadata from the cluster. Note: AIS will not call the remote backend provider to delete the corresponding Cloud bucket (iff the bucket in question is, in fact, a Cloud bucket).
Arguments:
- `missing_ok` (bool, optional): Ignore error if bucket does not exist
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `aistore.sdk.errors.InvalidBckProvider`: Invalid bucket provider for requested operation
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
def rename(to_bck_name: str) -> str
Renames bucket in AIStore cluster. Only works on AIS buckets. Returns job ID that can be used later to check the status of the asynchronous operation.
Arguments:
- `to_bck_name` (str): New bucket name for bucket to be renamed as
Returns:
Job ID (as str) that can be used to check the status of the operation
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `aistore.sdk.errors.InvalidBckProvider`: Invalid bucket provider for requested operation
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
def evict(keep_md: bool = False)
Evicts bucket in AIStore cluster. NOTE: only Cloud buckets can be evicted.
Arguments:
- `keep_md` (bool, optional): If true, evicts objects but keeps the bucket's metadata (i.e., the bucket's name and its properties)
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `aistore.sdk.errors.InvalidBckProvider`: Invalid bucket provider for requested operation
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
def head() -> Header
Requests bucket properties.
Returns:
Response header with the bucket properties
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
def copy(to_bck: Bucket,
prefix_filter: str = "",
prepend: str = "",
dry_run: bool = False,
force: bool = False) -> str
Copies the contents of this bucket to the destination bucket and returns the job ID that can be used later to check the status of the asynchronous operation.
Arguments:
- `to_bck` (Bucket): Destination bucket
- `prefix_filter` (str, optional): Only copy objects with names starting with this prefix
- `prepend` (str, optional): Value to prepend to the name of copied objects
- `dry_run` (bool, optional): Determines if the copy should actually happen or not
- `force` (bool, optional): Override existing destination bucket
Returns:
Job ID (as str) that can be used to check the status of the operation
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
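A hedged sketch of copying one bucket into another and waiting for the asynchronous job (`client` as constructed earlier; bucket names assumed):

```python
src = client.bucket("src-bck")
dst = client.bucket("dst-bck")

copy_job_id = src.copy(to_bck=dst, force=True)     # returns the job ID
client.job(job_id=copy_job_id).wait()              # block until the copy completes
```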
def list_objects(prefix: str = "",
props: str = "",
page_size: int = 0,
uuid: str = "",
continuation_token: str = "",
flags: List[ListObjectFlag] = None,
target: str = "") -> BucketList
Returns a structure that contains a page of objects, job ID, and continuation token (to read the next page, if available).
Arguments:
- `prefix` (str, optional): Return only objects that start with the prefix
- `props` (str, optional): Comma-separated list of object properties to return. Default value is "name,size". Properties: "name", "size", "atime", "version", "checksum", "cached", "target_url", "status", "copies", "ec", "custom", "node".
- `page_size` (int, optional): Return at most "page_size" objects. The maximum number of objects in a response depends on the bucket backend, e.g. an AWS bucket cannot return more than 5,000 objects in a single page. NOTE: if "page_size" is greater than a backend maximum, the backend maximum objects are returned. Defaults to "0" - return the maximum number of objects.
- `uuid` (str, optional): Job ID, required to get the next page of objects
- `continuation_token` (str, optional): Marks the object to start reading the next page
- `flags` (List[ListObjectFlag], optional): Optional list of ListObjectFlag enums to include as flags in the request
- `target` (str, optional): Only list objects on this specific target node
Returns:
- `BucketList`: The page of objects in the bucket and the continuation token to get the next page. An empty continuation token marks the final page of the object list.
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
def list_objects_iter(prefix: str = "",
props: str = "",
page_size: int = 0,
flags: List[ListObjectFlag] = None,
target: str = "") -> ObjectIterator
Returns an iterator for all objects in bucket
Arguments:
- `prefix` (str, optional): Return only objects that start with the prefix
- `props` (str, optional): Comma-separated list of object properties to return. Default value is "name,size". Properties: "name", "size", "atime", "version", "checksum", "cached", "target_url", "status", "copies", "ec", "custom", "node".
- `page_size` (int, optional): Return at most "page_size" objects. The maximum number of objects in a response depends on the bucket backend, e.g. an AWS bucket cannot return more than 5,000 objects in a single page. NOTE: if "page_size" is greater than a backend maximum, the backend maximum objects are returned. Defaults to "0" - return the maximum number of objects.
- `flags` (List[ListObjectFlag], optional): Optional list of ListObjectFlag enums to include as flags in the request
- `target` (str, optional): Only list objects on this specific target node
Returns:
- `ObjectIterator`: Object iterator
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
def list_all_objects(prefix: str = "",
props: str = "",
page_size: int = 0,
flags: List[ListObjectFlag] = None,
target: str = "") -> List[BucketEntry]
Returns a list of all objects in bucket
Arguments:
- `prefix` (str, optional): Return only objects that start with the prefix
- `props` (str, optional): Comma-separated list of object properties to return. Default value is "name,size". Properties: "name", "size", "atime", "version", "checksum", "cached", "target_url", "status", "copies", "ec", "custom", "node".
- `page_size` (int, optional): Return at most "page_size" objects. The maximum number of objects in a response depends on the bucket backend, e.g. an AWS bucket cannot return more than 5,000 objects in a single page. NOTE: if "page_size" is greater than a backend maximum, the backend maximum objects are returned. Defaults to "0" - return the maximum number of objects.
- `flags` (List[ListObjectFlag], optional): Optional list of ListObjectFlag enums to include as flags in the request
- `target` (str, optional): Only list objects on this specific target node
Returns:
- `List[BucketEntry]`: List of objects in the bucket
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
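A short sketch contrasting the paged iterator with the all-at-once listing (`client` as constructed earlier; bucket name and prefix are hypothetical):

```python
bucket = client.bucket("my-bck")

# Stream entries lazily, page by page
for entry in bucket.list_objects_iter(prefix="train/", props="name,size"):
    print(entry.name, entry.size)

# Or fetch everything into one list (fine for smaller buckets)
entries = bucket.list_all_objects(prefix="train/")
```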
def transform(etl_name: str,
to_bck: Bucket,
timeout: str = DEFAULT_ETL_TIMEOUT,
prefix_filter: str = "",
prepend: str = "",
ext: Dict[str, str] = None,
force: bool = False,
dry_run: bool = False) -> str
Visits all selected objects in the source bucket and for each object, puts the transformed result to the destination bucket
Arguments:
- `etl_name` (str): Name of ETL to be used for transformations
- `to_bck` (Bucket): Destination bucket for transformations
- `timeout` (str, optional): Timeout of the ETL job (e.g. 5m for 5 minutes)
- `prefix_filter` (str, optional): Only transform objects with names starting with this prefix
- `prepend` (str, optional): Value to prepend to the name of resulting transformed objects
- `ext` (Dict[str, str], optional): Dict of new extension followed by extension to be replaced (i.e. {"jpg": "txt"})
- `dry_run` (bool, optional): Determines if the copy should actually happen or not
- `force` (bool, optional): Override existing destination bucket
Returns:
Job ID (as str) that can be used to check the status of the operation
def put_files(path: str,
prefix_filter: str = "",
pattern: str = "*",
basename: bool = False,
prepend: str = None,
recursive: bool = False,
dry_run: bool = False,
verbose: bool = True) -> List[str]
Puts files found in a given filepath as objects to a bucket in AIS storage.
Arguments:
path
str - Local filepath, can be relative or absoluteprefix_filter
str, optional - Only put files with names starting with this prefixpattern
str, optional - Regex pattern to filter filesbasename
bool, optional - Whether to use the file names only as object names and omit the path informationprepend
str, optional - Optional string to use as a prefix in the object name for all objects uploaded No delimiter ("/", "-", etc.) is automatically applied between the prepend value and the object namerecursive
bool, optional - Whether to recurse through the provided path directoriesdry_run
bool, optional - Option to only show expected behavior without an actual put operationverbose
bool, optional - Whether to print upload info to standard output
Returns:
List of object names put to a bucket in AIS
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
- `ValueError`: The path provided is not a valid directory
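A hedged sketch of uploading a local directory (`client` as constructed earlier; the path and bucket name are hypothetical):

```python
bucket = client.bucket("my-bck")

uploaded = bucket.put_files(
    "/tmp/dataset",        # hypothetical local directory
    basename=True,         # use file names only as object names
    recursive=True,
)
print(f"Uploaded {len(uploaded)} objects")
```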
def object(obj_name: str) -> Object
Factory constructor for an object in this bucket. Does not make any HTTP request, only instantiates an object in a bucket owned by the client.
Arguments:
- `obj_name` (str): Name of object
Returns:
The object created.
def objects(obj_names: list = None,
obj_range: ObjectRange = None,
obj_template: str = None) -> ObjectGroup
Factory constructor for multiple objects belonging to this bucket.
Arguments:
- `obj_names` (list): Names of objects to include in the group
- `obj_range` (ObjectRange): Range of objects to include in the group
- `obj_template` (str): String template defining objects to include in the group
Returns:
The ObjectGroup created
def make_request(method: str,
action: str,
value: dict = None,
params: dict = None) -> requests.Response
Use the bucket's client to make a request to the bucket endpoint on the AIS server
Arguments:
- `method` (str): HTTP method to use, e.g. POST/GET/DELETE
- `action` (str): Action string used to create an ActionMsg to pass to the server
- `value` (dict): Additional value parameter to pass in the ActionMsg
- `params` (dict, optional): Optional parameters to pass in the request
Returns:
Response from the server
def verify_cloud_bucket()
Verify the bucket provider is a cloud provider
def get_path() -> str
Get the path representation of this bucket
def as_model() -> BucketModel
Return a data-model of the bucket
Returns:
BucketModel representation
class Object()
A class representing an object of a bucket bound to a client.
Arguments:
- `bucket` (Bucket): Bucket to which this object belongs
- `name` (str): Name of object
@property
def bucket()
Bucket containing this object
@property
def name()
Name of this object
def head() -> Header
Requests object properties.
Returns:
Response header with the object properties.
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
- `requests.exceptions.HTTPError(404)`: The object does not exist
def get(archpath: str = "",
chunk_size: int = DEFAULT_CHUNK_SIZE,
etl_name: str = None,
writer: BufferedWriter = None) -> ObjectReader
Reads an object
Arguments:
- `archpath` (str, optional): If the object is an archive, use `archpath` to extract a single file from the archive
- `chunk_size` (int, optional): Chunk size to use while reading from stream
- `etl_name` (str, optional): Transforms an object based on ETL with etl_name
- `writer` (BufferedWriter, optional): User-provided writer for writing content output. User is responsible for closing the writer.
Returns:
The stream of bytes to read an object or a file inside an archive.
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
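A small sketch of reading an object, either all at once or streamed in chunks (`client` as constructed earlier; names assumed):

```python
obj = client.bucket("my-bck").object("hello.txt")

data = obj.get().read_all()          # whole object in memory as bytes

with open("/tmp/hello.txt", "wb") as f:
    for chunk in obj.get():          # or stream the content in chunks
        f.write(chunk)
```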
def put_content(content: bytes) -> Header
Puts bytes as an object to a bucket in AIS storage.
Arguments:
- `content` (bytes): Bytes to put as an object
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
def put_file(path: str = None)
Puts a local file as an object to a bucket in AIS storage.
Arguments:
- `path` (str): Path to local file
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
- `ValueError`: The path provided is not a valid file
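For example (`client` as constructed earlier; object name and local path hypothetical):

```python
obj = client.bucket("my-bck").object("model.pt")
obj.put_file("/tmp/model.pt")        # upload a local file as this object
```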
def promote(path: str,
target_id: str = "",
recursive: bool = False,
overwrite_dest: bool = False,
delete_source: bool = False,
src_not_file_share: bool = False) -> Header
Promotes a file or folder an AIS target can access to a bucket in AIS storage. These files can be either on the physical disk of an AIS target itself or on a network file system the cluster can access. See more info here: https://aiatscale.org/blog/2022/03/17/promote
Arguments:
- `path` (str): Path to file or folder the AIS cluster can reach
- `target_id` (str, optional): Promote files from a specific target node
- `recursive` (bool, optional): Recursively promote objects from files in directories inside the path
- `overwrite_dest` (bool, optional): Overwrite objects already on AIS
- `delete_source` (bool, optional): Delete the source files when done promoting
- `src_not_file_share` (bool, optional): Optimize if the source is guaranteed to not be on a file share
Returns:
Object properties
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
- `AISError`: Path does not exist on the AIS cluster storage
def delete()
Delete an object from a bucket.
Returns:
None
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
- `requests.exceptions.HTTPError(404)`: The object does not exist
class ObjectGroup()
A class representing multiple objects within the same bucket. Only one of obj_names, obj_range, or obj_template should be provided.
Arguments:
- `bck` (Bucket): Bucket the objects belong to
- `obj_names` (list[str], optional): List of object names to include in this collection
- `obj_range` (ObjectRange, optional): Range defining which object names in the bucket should be included
- `obj_template` (str, optional): String argument to pass as template value directly to api
def delete()
Deletes a list or range of objects in a bucket
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
Returns:
Job ID (as str) that can be used to check the status of the operation
def evict()
Evicts a list or range of objects in a bucket so that they are no longer cached in AIS NOTE: only Cloud buckets can be evicted.
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
Returns:
Job ID (as str) that can be used to check the status of the operation
def prefetch()
Prefetches a list or range of objects in a bucket so that they are cached in AIS NOTE: only Cloud buckets can be prefetched.
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
Returns:
Job ID (as str) that can be used to check the status of the operation
def copy(to_bck: "Bucket",
prepend: str = "",
continue_on_error: bool = False,
dry_run: bool = False,
force: bool = False)
Copies a list or range of objects in a bucket
Arguments:
- `to_bck` (Bucket): Destination bucket
- `prepend` (str, optional): Value to prepend to the name of copied objects
- `continue_on_error` (bool, optional): Whether to continue if there is an error copying a single object
- `dry_run` (bool, optional): Skip performing the copy and just log the intended actions
- `force` (bool, optional): Force this job to run over others in case it conflicts (see "limited coexistence" and xact/xreg/xreg.go)
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
Returns:
Job ID (as str) that can be used to check the status of the operation
def transform(to_bck: "Bucket",
etl_name: str,
timeout: str = DEFAULT_ETL_TIMEOUT,
prepend: str = "",
continue_on_error: bool = False,
dry_run: bool = False,
force: bool = False)
Performs ETL operation on a list or range of objects in a bucket, placing the results in the destination bucket
Arguments:
- `to_bck` (Bucket): Destination bucket
- `etl_name` (str): Name of existing ETL to apply
- `timeout` (str): Timeout of the ETL job (e.g. 5m for 5 minutes)
- `prepend` (str, optional): Value to prepend to the name of resulting transformed objects
- `continue_on_error` (bool, optional): Whether to continue if there is an error transforming a single object
- `dry_run` (bool, optional): Skip performing the transform and just log the intended actions
- `force` (bool, optional): Force this job to run over others in case it conflicts (see "limited coexistence" and xact/xreg/xreg.go)
Raises:
- `aistore.sdk.errors.AISError`: All other types of errors with AIStore
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.exceptions.HTTPError`: Service unavailable
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ReadTimeout`: Timed out receiving response from AIStore
Returns:
Job ID (as str) that can be used to check the status of the operation
def archive(archive_name: str,
mime: str = "",
to_bck: "Bucket" = None,
include_source_name: bool = False,
allow_append: bool = False,
continue_on_err: bool = False)
Create or append to an archive
Arguments:
- `archive_name` (str): Name of archive to create or append
- `mime` (str, optional): MIME type of the content
- `to_bck` (Bucket, optional): Destination bucket, defaults to current bucket
- `include_source_name` (bool, optional): Include the source bucket name in the archived objects' names
- `allow_append` (bool, optional): Allow appending to an existing archive
- `continue_on_err` (bool, optional): Whether to continue if there is an error archiving a single object
Returns:
Job ID (as str) that can be used to check the status of the operation
def list_names() -> List[str]
List all the object names included in this group of objects
Returns:
List of object names
class ObjectNames(ObjectCollection)
A collection of object names, provided as a list of strings
Arguments:
- `names` (List[str]): A list of object names
class ObjectRange(ObjectCollection)
Class representing a range of object names
Arguments:
- `prefix` (str): Prefix contained in all names of objects
- `min_index` (int): Starting index in the name of objects
- `max_index` (int): Last index in the name of all objects
- `pad_width` (int, optional): Left-pad indices with zeros up to the width provided, e.g. pad_width = 3 will transform 1 to 001
- `step` (int, optional): Size of iterator steps between each item
- `suffix` (str, optional): Suffix at the end of all object names
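A hedged sketch of operating on a range of objects (`client` as constructed earlier; the import path, bucket, and object names are assumptions):

```python
from aistore.sdk.multiobj import ObjectRange   # assumed import path

bucket = client.bucket("my-aws-bck", provider="aws")

# Matches shard-000.tar ... shard-099.tar
shards = ObjectRange(prefix="shard-", min_index=0, max_index=99,
                     pad_width=3, suffix=".tar")

prefetch_id = bucket.objects(obj_range=shards).prefetch()   # Cloud buckets only
client.job(job_id=prefetch_id).wait()
```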
class ObjectTemplate(ObjectCollection)
A collection of object names specified by a template in the bash brace expansion format
Arguments:
- `template` (str): A string template that defines the names of objects to include in the collection
class Job()
A class containing job-related functions.
Arguments:
- `client` (RequestClient): Client for interfacing with AIS cluster
- `job_id` (str, optional): ID of a specific job, empty for all jobs
- `job_kind` (str, optional): Specific kind of job, empty for all kinds
@property
def job_id()
Return job id
@property
def job_kind()
Return job kind
def status() -> JobStatus
Return status of a job
Returns:
The job status including id, finish time, and error info.
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
def wait(timeout: int = DEFAULT_JOB_WAIT_TIMEOUT, verbose: bool = True)
Wait for a job to finish
Arguments:
- `timeout` (int, optional): The maximum time to wait for the job, in seconds. Default timeout is 5 minutes.
- `verbose` (bool, optional): Whether to log wait status to standard output
Returns:
None
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
- `errors.Timeout`: Timeout while waiting for the job to finish
def wait_for_idle(timeout: int = DEFAULT_JOB_WAIT_TIMEOUT,
verbose: bool = True)
Wait for a job to reach an idle state
Arguments:
- `timeout` (int, optional): The maximum time to wait for the job, in seconds. Default timeout is 5 minutes.
- `verbose` (bool, optional): Whether to log wait status to standard output
Returns:
None
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
- `errors.Timeout`: Timeout while waiting for the job to finish
def start(daemon_id: str = "",
force: bool = False,
buckets: List[Bucket] = None) -> str
Start a job and return its ID.
Arguments:
- `daemon_id` (str, optional): For running a job that must run on a specific target node (e.g. resilvering)
- `force` (bool, optional): Override existing restrictions for a bucket (e.g., run LRU eviction even if the bucket has LRU disabled)
- `buckets` (List[Bucket], optional): List of one or more buckets; applicable only for jobs that have bucket scope (for details on job types, see the Table in xact/api.go)
Returns:
The running job ID.
Raises:
- `requests.RequestException`: "There was an ambiguous exception that occurred while handling..."
- `requests.ConnectionError`: Connection error
- `requests.ConnectionTimeout`: Timed out connecting to AIStore
- `requests.ReadTimeout`: Timed out waiting response from AIStore
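A brief sketch of starting a job by kind and waiting for it (`client` as constructed earlier; the job kind shown is only an example):

```python
job = client.job(job_kind="lru")                 # hypothetical job kind
job_id = job.start()
client.job(job_id=job_id).wait(timeout=300, verbose=False)
print(client.job(job_id=job_id).status())
```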
class ObjectReader()
Represents the data returned by the API when getting an object, including access to the content stream and object attributes
@property
def attributes() -> ObjectAttributes
Object metadata attributes
Returns:
Object attributes parsed from the headers returned by AIS
def read_all() -> bytes
Read all byte data from the object content stream. This uses a bytes cast which makes it slightly slower and requires all object content to fit in memory at once
Returns:
Object content as bytes
def raw() -> bytes
Returns: Raw byte stream of object content
def __iter__() -> Iterator[bytes]
Creates a generator to read the stream content in chunks
Returns:
An iterator with access to the next chunk of bytes
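For instance, a hedged sketch using the reader returned by Object.get():

```python
reader = client.bucket("my-bck").object("hello.txt").get()
print(reader.attributes.size)        # object size parsed from response headers (assumed field)
for chunk in reader:                 # generator over the content stream
    handle(chunk)                    # `handle` is a hypothetical consumer
```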
class ObjectIterator()
Represents an iterable that will fetch all objects from a bucket, querying as needed with the specified function
Arguments:
- `list_objects` (Callable): Function returning a BucketList from an AIS cluster
class Etl()
A class containing ETL-related functions.
@property
def name() -> str
Name of the ETL
def init_spec(template: str,
communication_type: str = DEFAULT_ETL_COMM,
timeout: str = DEFAULT_ETL_TIMEOUT) -> str
Initializes ETL based on Kubernetes pod spec template. Returns etl_name.
Arguments:
- `template` (str): Kubernetes pod spec template. Existing templates can be found at `sdk.etl_templates`. For more information visit: https://github.com/NVIDIA/ais-etl/tree/master/transformers
- `communication_type` (str): Communication type of the ETL (options: hpull, hrev, hpush)
- `timeout` (str): Timeout of the ETL job (e.g. 5m for 5 minutes)
Returns:
Job ID string associated with this ETL
def init_code(transform: Callable,
dependencies: List[str] = None,
preimported_modules: List[str] = None,
runtime: str = _get_default_runtime(),
communication_type: str = DEFAULT_ETL_COMM,
timeout: str = DEFAULT_ETL_TIMEOUT,
chunk_size: int = None,
transform_url: bool = False) -> str
Initializes ETL based on the provided source code. Returns etl_name.
Arguments:
- `transform` (Callable): Transform function of the ETL
- `dependencies` (list[str]): Python dependencies to install
- `preimported_modules` (list[str]): Modules to import before running the transform function. This can be necessary in cases where the modules used both attempt to import each other circularly
- `runtime` (str, optional): Runtime environment of the ETL. Defaults to the V2 implementation of the current python version if supported, else python3.8v2. Choose from: python3.8v2, python3.10v2, python3.11v2 (see ext/etl/runtime/all.go)
- `communication_type` (str, optional): Communication type of the ETL (options: hpull, hrev, hpush, io). Defaults to "hpush".
- `timeout` (str, optional): Timeout of the ETL job (e.g. 5m for 5 minutes). Defaults to "5m".
- `chunk_size` (int): Chunk size in bytes if the transform function processes streaming data (whole object is read by default)
- `transform_url` (bool, optional): If True, the runtime will provide the transform function with the URL to the object on the target rather than the raw bytes read from the object
Returns:
Job ID string associated with this ETL
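A hedged sketch of registering a small ETL from code and applying it bucket-to-bucket (`client` as constructed earlier; the ETL and bucket names are hypothetical):

```python
def to_upper(data: bytes) -> bytes:
    # Transform function applied to each object's bytes inside the ETL runtime
    return data.upper()

client.etl("upper-etl").init_code(transform=to_upper)

xform_id = client.bucket("src-bck").transform(etl_name="upper-etl",
                                              to_bck=client.bucket("dst-bck"))
client.job(job_id=xform_id).wait()
```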
def view() -> ETLDetails
View ETL details
Returns:
ETLDetails
- details of the ETL
def start()
Resumes a stopped ETL with given ETL name.
Note: Deleted ETLs cannot be started.
def stop()
Stops ETL. Stops (but does not delete) all the pods created by Kubernetes for this ETL and terminates any transforms.
def delete()
Delete ETL. Deletes pods created by Kubernetes for this ETL and specifications for this ETL in Kubernetes.
Note: Running ETLs cannot be deleted.